Screening tests and statistics – we’re only doctors, what do we know?


A letter published in JAMA Internal Medicine this week confirms what we already feared in medicine – we really do not understand statistics. Somehow, throughout university and training, we have persuaded ourselves that statistics aren't relevant unless you are conducting your own research. And yet we deliver advice to patients every day based on our understanding of test accuracy.

We need to know what we are talking about. And we don’t…

Manrai et al asked 24 attendings, 26 house officers, 10 medical students and 1 retired doctor the following question:

“If a test to detect a disease, whose prevalence is 1/1,000, has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease?”

Only 23% of those asked answered correctly, and the most common incorrect answer was 95% (given by 44% of respondents).

What's the explanation?

This question is asking for the positive predictive value (PPV), which takes the prevalence into account. Most people assume that the false positive rate directly gives the chance that a person with a positive test has the disease.

In a population of 1000 people, there will be 1 person with the disease (prevalence is 1 in 1000).

If you tested the whole population, you would get roughly 50 false positives (5% of the 999 people without the disease). And we know that 1 person has the disease (so, assuming perfect sensitivity, they will have a positive test result). So in that population there will be 51 positive test results.

Therefore, if your test comes back positive, there is roughly a 1 in 51 chance that you have the disease – approximately 2%.
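The arithmetic above can be sketched in a few lines of Python – a quick check rather than anything rigorous, keeping the post's best-case assumption that the one true case always tests positive:

```python
# PPV for the JAMA question: prevalence 1/1,000, false positive rate 5%.
# Best-case assumption (as in the post): the 1 true case always tests positive.
population = 1000
prevalence = 1 / 1000
false_positive_rate = 0.05

true_positives = population * prevalence                                # 1 person
false_positives = (population - true_positives) * false_positive_rate  # ~50 people

ppv = true_positives / (true_positives + false_positives)
print(f"PPV = {ppv:.1%}")  # prints "PPV = 2.0%", not 95%
```

So roughly 1 positive result in 51 is real – the 2% figure above, not the 95% most respondents gave.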

Does this research letter represent a real problem?

Most respondents thought that patients have a 95% chance of having the disease if the test is positive, when actually they only have a 2% chance. This has huge implications for treatment decisions following screening tests.

Although this research letter is based on a survey, we have recently seen the implications in real life too. In January, a study was published investigating the use of the 15-minute SAGE tool (Self-Administered Gerocognitive Examination) in detecting Alzheimer's. The newspapers became very excited.

The Telegraph:

Four out of five people (80 percent) with mild thinking and memory (cognitive) issues will be detected by this test, and 95 percent of people without issues will have normal SAGE scores

Sky News:

SAGE could spot mild thinking and memory issues in 80% of those tested

And to be honest, it doesn’t seem reasonable to expect the press to bust the stats when many of us are equally confused.

What's the explanation?

As the Telegraph correctly stated, for 80% of people who have MCI (mild cognitive impairment), the test will be positive. This is a sensitivity of 80%.

And 95% of healthy people will get the correct result, i.e. the test will be negative. This is a specificity of 95%.

Whilst this looks good at first glance, to assess its usefulness as a screening test, the prevalence is crucial.

The estimated prevalence of MCI in the over 60s is 5%.

If there are 10,000 people, 500 (5% prevalence) will have MCI. Of these, 400 will test positive (sensitivity of 80%).

Out of the remaining 9,500 healthy people, 475 will get a positive test result even though they are healthy. This is the 5% false positive rate left over by the 95% specificity.

David Colquhoun provides some nice charts to explain this visually in his DC’s Improbable Science blog.

So, in a population of 10,000 people over 60 years old, 875 will test positive. Of these, only 400 will actually have the disease (around 45%). If your patient has a positive test result, there is only a 45% chance that they actually have the disease. And if you screen the whole population regardless of age, only 14% of those who test positive will have the disease.
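The same arithmetic, sketched with whole-number counts (the inputs are the figures quoted in the post; the final percentage works out at 45.7%, which the post rounds to 45%):

```python
# SAGE as a screening test in the over-60s: 10,000 people, 5% prevalence,
# sensitivity 80%, specificity 95%.
population = 10_000
prevalence = 0.05
sensitivity = 0.80
specificity = 0.95

with_mci = int(population * prevalence)             # 500 people with MCI
true_positives = int(with_mci * sensitivity)        # 400 of them test positive
healthy = population - with_mci                     # 9,500 healthy people
false_positives = int(healthy * (1 - specificity))  # 475 test positive anyway

total_positives = true_positives + false_positives  # 875 positives in total
print(f"PPV = {true_positives / total_positives:.1%}")  # prints "PPV = 45.7%"
```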

Who is Bayes and what does he have to do with screening?

In the 1700s, Thomas Bayes described how to use evidence to update beliefs. He described probability as the 'degree of belief': when new evidence comes to light, the probability changes.

If your 8-year-old tells you she did not eat the chocolate ice cream in the freezer, you would be inclined to believe her. But if you then notice chocolate down her dress, your belief might change somewhat. In many ways, Bayes' theorem is just logical thinking.

In screening, we aren't simply asking 'do you have the disease?'. We are asking 'given that your test is positive, do you have the disease?'. And this must take into account the false positive rate and the prevalence.
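Written out formally, this is Bayes' theorem. A minimal sketch with the JAMA question's numbers (again assuming perfect sensitivity, as the post does for simplicity):

```python
# Bayes' theorem for the screening question, using the JAMA numbers:
#
#   P(disease | positive) = P(positive | disease) * P(disease) / P(positive)

p_disease = 1 / 1000        # prevalence: the prior 'degree of belief'
p_pos_given_disease = 1.0   # sensitivity (assumed perfect, as in the post)
p_pos_given_healthy = 0.05  # false positive rate

# Total probability of testing positive, with or without the disease:
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

p_disease_given_pos = p_pos_given_disease * p_disease / p_positive
print(f"{p_disease_given_pos:.1%}")  # prints "2.0%"
```

The prior (prevalence) is updated by the test result into a posterior of about 2% – the same answer as counting people directly.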

In his Telegraph blog, Graeme Archer creates a lovely graph showing clearly that as prevalence increases, so does the positive predictive value.
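To see the shape of that relationship without the graph, here is a sketch that tabulates PPV at a few prevalences. The prevalence values are illustrative choices of mine; the 80%/95% test characteristics are the SAGE figures quoted above:

```python
# PPV as a function of prevalence, for a test with 80% sensitivity
# and 95% specificity (the SAGE figures quoted above).
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Illustrative prevalence values only; PPV climbs steeply as prevalence rises.
for prev in (0.001, 0.01, 0.05, 0.10, 0.25, 0.50):
    print(f"prevalence {prev:6.1%}  ->  PPV {ppv(0.80, 0.95, prev):5.1%}")
```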

 

Where does that leave us with screening tests?

Screening tests need to be considered in context: which section of the population is being screened, what the prevalence of the disease is in that population, and how this affects the usefulness of the test.

Perhaps the more relevant question is – where does it leave us with statistics?

For those of us who were not taught it at med school, we need to take responsibility and self-educate. Bob Phillips is doing this in beautiful, bite-sized pieces for Archives of Disease in Childhood with his StatsMiniBlog; and Simon Carley is doing it, eloquently as ever, at St Emlyn's – see his Risky Business posts for starters.

 

References

Manrai AK, Bhatia G, Strymish J, Kohane IS, Jain SH. Medicine's uncomfortable relationship with math: calculating positive predictive value. JAMA Internal Medicine, online first, 21 April 2014. doi:10.1001/jamainternmed.2014.1059.

Scharre DW, Chang SI, Nagaraja HN, Yager-Schweller J, Murden RA. Community cognitive screening using the Self-Administered Gerocognitive Examination (SAGE). The Journal of Neuropsychiatry and Clinical Neurosciences 2014; 00:1–7.

Colquhoun D. On the hazards of significance testing. Part 1: the screening problem. DC's Improbable Science, 2014.

Archer G. False positives and Bayesian reasoning: have I really got dementia? The Telegraph Blogs, 2014.

Evaluating screening tests: the role of probability. Boston University School of Public Health, 2014.

Siegfried T. Doctors flunk quiz on screening-test math. Science News, 2014.


About 

Tessa Davis is a paediatric emergency registrar from Glasgow and Sydney, currently living in London. Tessa tries to spend time with her 3 kids in between shifts. @tessardavis

10 Responses to "Screening tests and statistics – we’re only doctors, what do we know?"

  1. James Winton, 3 years ago:

    Great post. Has inspired me to read some more about stats……..weird.

    • Tessa Davis, 3 years ago:

      Thanks James.

  2. ben lawton, 3 years ago:

    Excellent post Tessa. This is a critical concept in emergency medicine – that the prevalence of the disease in the population affects the positive (and negative) predictive values of a test – and we need to think about it before we do tests. Consider a well-looking febrile 2-year-old who looks like you could probably send home. The usefulness of a blood culture or a white cell count in a child in 2014 who is up to date with their immunisations is totally different to what it was in the pre-vaccination era, not because the test has changed a lot but because the background rate of disease in the population has decreased enormously. A positive test result therefore does not give you the same confidence that the child actually has the disease (the positive predictive value is much lower), so does it really help your decision making? It's a really important concept, thanks for explaining it in such an accessible way.

  3. […] Davis’ post “Screening tests and statistics – We’re only doctors, what do we know?” on the Don’t Forget The Bubbles […]


  5. RayS, 3 years ago:

    Interesting post, Tessa. A comment: you said we can assume the 1 person in a thousand with the disease will have a positive test – this, however, assumes 100% sensitivity for the test. If the sensitivity is lower, it will reduce your PPV below 2%. A sensitivity of 50%, for example, will result in a PPV of 1%. Maybe splitting hairs in this example, but the point is that you need the sensitivity to calculate the PPV. Thanks again.

    • Tessa Davis, 3 years ago:

      Yes, you're absolutely right. I was trying to simplify it to make it understandable, so I gave the best-case scenario: that the 1 person in 1000 will have a positive test.

      But you are correct that the sensitivity is unlikely to be 100%, so the test is even less useful.

  6. Damian Roland, 3 years ago:

    Thanks Tessa – prompted me to do this for the juniors in my department!
    http://vimeo.com/damianroland/sensitivityandspecificity

  7. […] Forget the Bubbles reminds us that we need to understand our stats to be able to understand the results of the screening tests we order. […]

  8. Casey Parker, 2 months ago:

    Thanks Tessa. Missed this when you posted – great reminder.
    I tried to explain Bayes at SMACCDUB:
    http://broomedocs.com/2016/11/smaccdub-bayes-2016-diagnostic-odyssey/

    Also suggest reading Gigerenzer's book "Risk Savvy".
