Tessa Davis. Screening tests and statistics – we’re only doctors, what do we know?, Don't Forget the Bubbles, 2014. Available at:
A letter published in JAMA this week confirms what we already feared in medicine – we really do not understand statistics. Somehow throughout our university and training we have persuaded ourselves that statistics aren’t relevant unless you are conducting your own research. And yet we are delivering advice to patients every day based on our understanding of test accuracy.
We need to know what we are talking about. And we don’t…
Manrai et al asked 24 attendings, 26 house officers, 10 medical students and 1 retired doctor this question:
“If a test to detect a disease, whose prevalence is 1/1,000, has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease?”
Only 23% of those asked got the answer correct, and the most common incorrect answer was 95% (an answer given by 44% of respondents).
What’s the explanation?
This question is asking for the positive predictive value, which takes into account the prevalence. Most people assume that the false positive rate directly predicts the chances of the person having the disease if their test is positive.
In a population of 1000 people, there will be 1 person with the disease (prevalence is 1 in 1000).
If you tested the whole population then you would get roughly 50 false positives (5%). And we know that 1 person has the disease (so we can assume they will have a positive test result). So in that population, there will be 51 positive test results.
Therefore if your test comes back positive then there is roughly a 1 in 51 chance that you have the disease – approximately 2%.
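The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and its parameters are illustrative, and, as in the working above, it assumes that the one person with the disease tests positive (a sensitivity of 100%), which the original question does not state.

```python
def positive_predictive_value(prevalence, false_positive_rate,
                              sensitivity=1.0, population=1000):
    """Share of positive results that are true positives, by counting people."""
    diseased = population * prevalence
    healthy = population - diseased
    # Assumption: sensitivity defaults to 100%, as in the worked example.
    true_positives = diseased * sensitivity
    false_positives = healthy * false_positive_rate
    return true_positives / (true_positives + false_positives)


ppv = positive_predictive_value(prevalence=1 / 1000, false_positive_rate=0.05)
print(f"PPV: {ppv:.1%}")  # PPV: 2.0%
```

Counting 5% of the 999 healthy people gives 49.95 false positives rather than a round 50, but the answer is still approximately 2%.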
Does this research letter represent a real problem?
Most people think that patients have a 95% chance of having the disease if the test is positive, when actually they only have a 2% chance. This has huge implications for treatment following screening tests.
Although this research letter is based on a survey, we have recently seen the implications of this in real life too. In January a study was published which investigated use of the 15-minute SAGE tool (self-administered gerocognitive examination) in detecting Alzheimer’s. The newspapers became very excited.
Four out of five people (80 percent) with mild thinking and memory (cognitive) issues will be detected by this test, and 95 percent of people without issues will have normal SAGE scores
SAGE could spot mild thinking and memory issues in 80% of those tested
And to be honest, it doesn’t seem reasonable to expect the press to bust the stats when many of us are equally confused.
What’s the explanation?
As the Telegraph correctly stated, for 80% of people who have mild cognitive impairment (MCI), the test will be positive. This is a sensitivity of 80%.
And 95% of healthy people will get the correct diagnosis, i.e. the test will be negative. This is a specificity of 95%.
Whilst this looks good at first glance, to assess the usefulness of this as a screening test, the prevalence is crucial.
The estimated prevalence of MCI in the over 60s is 5%.
If there are 10,000 people, 500 (5% prevalence) will have MCI. Of these, 400 will test positive (sensitivity of 80%).
Out of the remaining 9,500 healthy people, 475 will get a positive test result (even though they are healthy). This follows from the specificity of 95%: the other 5% of healthy people are false positives.
David Colquhoun provides some nice charts to explain this visually in his DC’s Improbable Science blog.
So, in a population of 10,000 people (over 60 years old), 875 will test positive. Of these, only 400 will actually have the disease (roughly 46%). If your patient has a positive test result, there is only about a 46% chance that they actually have the disease. And if you change that to include the whole population regardless of age, only 14% of those who test positive will have the disease.
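The same counts can be reproduced step by step. This is a sketch using only the figures quoted above (10,000 people, 5% prevalence, 80% sensitivity, 95% specificity); the variable names are illustrative.

```python
population = 10_000
prevalence = 0.05    # estimated prevalence of MCI in the over 60s
sensitivity = 0.80   # share of people with MCI who test positive
specificity = 0.95   # share of healthy people who test negative

with_mci = population * prevalence
without_mci = population - with_mci

true_positives = with_mci * sensitivity
false_positives = without_mci * (1 - specificity)
all_positives = true_positives + false_positives

ppv = true_positives / all_positives

print(round(true_positives), round(false_positives), round(all_positives))
# 400 475 875
print(f"{ppv:.1%}")
# 45.7%
```

More than half of the positive results (475 of 875) belong to healthy people, even though the test is right about 95% of them.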
Who is Bayes and what does he have to do with screening?
In the 1700s, Thomas Bayes was the first person to describe using evidence to update beliefs. He described probability as the ‘degree of belief’. When new evidence comes to light then the probability changes.
If your 8-year-old daughter tells you she did not eat the chocolate ice cream in the freezer then you would be inclined to believe her. But if you then notice chocolate down her dress, your belief might change somewhat. In many ways, Bayes' theorem is just logical thinking.
In screening we aren’t simply asking ‘do you have the disease?’. We are asking ‘given that your test is positive, do you have the disease?’. And this must take into account the false positives and the prevalence.
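Written out as a formula, that question is Bayes' theorem applied to a test result. The notation is shorthand introduced here for illustration and is not from the studies discussed: D means having the disease, and + means a positive test.

```latex
P(D \mid +) =
  \frac{P(+ \mid D)\, P(D)}
       {P(+ \mid D)\, P(D) + P(+ \mid \neg D)\, \bigl(1 - P(D)\bigr)}
```

Here P(+ | D) is the sensitivity, P(+ | ¬D) is the false positive rate (1 minus the specificity), and P(D) is the prevalence. With the SAGE figures this gives (0.80 × 0.05) / (0.80 × 0.05 + 0.05 × 0.95) = 0.04 / 0.0875, or about 0.46.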
In Graeme Archer’s Telegraph blog, he creates a lovely graph showing clearly that as prevalence increases, so does the positive predictive value.
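A rough way to see the same relationship without a graph is to hold the test fixed and vary only the prevalence. The sketch below assumes the SAGE figures quoted earlier (80% sensitivity, 95% specificity); the prevalence values are arbitrary examples chosen for illustration, not figures from any study.

```python
def ppv(prevalence, sensitivity=0.80, specificity=0.95):
    """Positive predictive value for a given prevalence (all as fractions)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical prevalences, from very rare to very common.
prevalences = [0.001, 0.01, 0.05, 0.20, 0.50]

for p in prevalences:
    print(f"prevalence {p:>5.1%} -> PPV {ppv(p):.1%}")
```

The test itself never changes here, yet the meaning of a positive result does: at a prevalence of 1 in 1,000 fewer than 2% of positives are true positives, while at a prevalence of 20% the figure is 80%.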
Where does that leave us with screening tests?
Screening tests need to be considered in context. Specifically, we need to ask which section of the population is being screened, what the prevalence of the disease is in that population, and how this affects the usefulness of the test.
Perhaps the more relevant question is – where does it leave us with statistics?
For those of us who were not taught it in med school, we need to take responsibility and self-educate. Bob Phillips is doing this in beautiful, bite-sized pieces for Archives of Disease in Childhood with his StatsMiniBlog; and Simon Carley is doing it eloquently as ever at St Emlyn’s – see his Risky Business posts for starters.
Manrai AK, Bhatia G, Strymish J, Kohane IS, Jain SH, Medicine’s uncomfortable relationship with math: calculating positive predictive value, JAMA, online first, 21st April 2014, doi:10.1001/jamainternmed.2014.1059.
Scharre DW, Chang SI, Nagaraja HN, Yager-Schweller J, Mirden RA, Community cognitive screening using the Self-Administered Gerocognitive Examination (SAGE), The Journal of Neuropsychiatry and Clinical Neurosciences 2014; 00:1–7.