Michael DeWine, governor of the state of Ohio in the United States, provided a visible example of the properties of screening tests.

*Gov. Mike DeWine of Ohio Tests Positive, Then Negative, for Coronavirus* by Sarah Mervosh, The New York Times (6 August 2020 with 7 August 2020 update).

After testing positive for COVID-19 on a rapid antigen test, he missed an opportunity to meet with the US president, who was visiting DeWine’s state. When DeWine was tested again using a slower, more accurate (RT-PCR) test, he was negative for COVID-19. An additional test administered a day later was also negative.

Are there benefits in having a rapid, less accurate test as well as having a slower, more accurate test? Let’s consider what accuracy means in these tests and why you might be willing to tolerate different errors at different times.

I won’t address how these tests are evaluating different biological endpoints. I’ve been impressed at how national and local sources have worked to explain the differences between tests that look for particular protein segments or for genetic material characteristic of the virus. Richard Harris (National Public Radio in the US) also provided a nice discussion of reliability of COVID-19 tests that might be of interest.

I want to talk about mistakes, errors in testing. No test is perfectly accurate. Accuracy is good but accuracy can be defined in different ways, particularly in ways that reflect errors in decisions that are made. Two simple errors are commonly used when describing screening tests – saying someone has a disease when, in truth, they don’t (sorry Governor DeWine) or saying someone is disease free when, in truth, they have the disease. Governor DeWine had 3 COVID-19 tests – the first rapid test was positive, the second and third tests were negative. Thus, we assume his true health status is disease free.

These errors are called false positive and false negative errors. (For those of you who took an introductory statistics class in a past life, these errors may have been labeled differently: false positive error = Type I error and false negative error = Type II error.) Testing concepts include the complements of these errors – sensitivity is the probability a test is positive for people with the disease (1 – false negative error rate) and specificity is the probability a test is negative for disease-free people (1 – false positive error rate). If error rates are low, sensitivity and specificity are high.

It is important to recognize these errors can only be made when testing distinct groups of people. A false positive error can *only* be made when testing disease-free people. A false negative error can *only* be made when testing people with the disease. An additional challenge is that the real questions people want to ask are “Do I have the disease if I test positive?” and “Am I disease free if my test is negative?” Notice these questions involve the consideration of two other groups – people who *test* positive and people who *test* negative!
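
To make the difference between these two pairs of questions concrete, here is a minimal Python sketch. The counts are made up purely for illustration (they are not data from any real test), and the function name `rates_from_counts` is my own.

```python
def rates_from_counts(true_pos, false_pos, false_neg, true_neg):
    """Summarize a 2x2 screening table given as four counts."""
    diseased = true_pos + false_neg        # people who truly have the disease
    disease_free = false_pos + true_neg    # people who truly do not
    tested_pos = true_pos + false_pos      # people whose test was positive
    tested_neg = false_neg + true_neg      # people whose test was negative
    return {
        # Conditioned on TRUE disease status (properties of the test)
        "sensitivity": true_pos / diseased,
        "specificity": true_neg / disease_free,
        # Conditioned on the TEST RESULT (what a tested person wants to know)
        "pr_disease_given_pos": true_pos / tested_pos,
        "pr_no_disease_given_neg": true_neg / tested_neg,
    }


# Hypothetical counts for 1,000 people, chosen only for illustration.
summary = rates_from_counts(true_pos=45, false_pos=95, false_neg=5, true_neg=855)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this made-up table the test has 90% sensitivity and 90% specificity, yet fewer than a third of the people who test positive have the disease, because the two questions divide by different groups.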

## Understanding the probabilities

Probability calculations can be used to understand the probability of having a disease given a positive test result — if you know the false positive error rate, the false negative error rate and the percentage of the population with the disease, along with testing status of a hypothetical population. The British Medical Journal (BMJ) provides a nice web calculator for exploring the probability that a randomly selected person from a population has the disease for different test characteristics. In addition, the app interprets the probabilities in terms of counts of individuals from a hypothetical population of 100 people classified into 4 groups based upon true disease status (disease, no disease) and screening test result (positive, negative).
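
The same kind of calculation can be sketched in a few lines of Python. This is only my rough approximation of what such a calculator does, not the BMJ implementation; the function names and the input values below are assumptions chosen for illustration.

```python
def natural_frequencies(prevalence, false_pos_rate, false_neg_rate, population=100):
    """Expected (unrounded) counts in each cell of the 2x2 table."""
    diseased = population * prevalence
    disease_free = population - diseased
    true_pos = diseased * (1 - false_neg_rate)      # diseased, test positive
    false_neg = diseased * false_neg_rate           # diseased, test negative
    false_pos = disease_free * false_pos_rate       # disease free, test positive
    true_neg = disease_free * (1 - false_pos_rate)  # disease free, test negative
    return true_pos, false_pos, false_neg, true_neg


def pr_disease_given_positive(prevalence, false_pos_rate, false_neg_rate):
    """Pr(disease GIVEN + test): diseased positives over all positives."""
    true_pos, false_pos, _, _ = natural_frequencies(
        prevalence, false_pos_rate, false_neg_rate
    )
    return true_pos / (true_pos + false_pos)


# Illustrative inputs only: 5% prevalence, 5% false positives, 10% false negatives.
cells = natural_frequencies(0.05, 0.05, 0.10)
chance = pr_disease_given_positive(0.05, 0.05, 0.10)
print("TP, FP, FN, TN per 100 people:", [round(c, 2) for c in cells])
print(f"Pr(disease GIVEN + test) = {chance:.3f}")
```

The counts here are left as fractions of a person; rounding them to whole people, as a hypothetical population of 100 requires, can shift the final percentage by a point or two.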

It is worth noting that these probabilities are rarely (if ever) known and can be very hard to estimate – particularly when they are changing. In real life, there are serious challenges in estimating the numbers that get fed into calculators such as this – but that’s beyond the scope of this post. Regardless, it is fun and educational to play around with the calculator to understand how things work.

These error rates vary between different test types and even for tests of the same type. One challenge that I had in writing this blog post was obtaining error rates for these different tests. Richard Harris (NPR) reported that the false positive rate for the PCR test was approximately 2%, with variation attributable to the laboratory conducting the study and to the test itself. National Public Radio reported that one rapid COVID-19 test had a false negative error rate of approximately 15% while better tests have false negative error rates of less than 3%. One complicating factor is that error rates appear to depend on when the test is given in the course of disease.

## Examples

The following examples illustrate a comparison of tests with different accuracies in communities with different disease prevalence.

### Community with low rate of infection

A recent story about testing in my local paper reported 1.4% to 1.8% of donors to the American Red Cross had COVID-19. Considering a hypothetical population with 100 people, only 2 people in the population would have the disease and 98 would be disease free.

**Rapid, less accurate test**: Suppose we have a rapid test with a 10% false positive error rate (90% specificity) and a 15% false negative error rate (85% sensitivity), and 2% of people tested are truly positive. With these error rates, and rounding to whole people, both of the people with the disease test positive (2 x .85 = 1.7) and 10 of the 98 disease-free people test positive (98 x .10 = 9.8). Based on this, a person with a positive test (2 + 10 = 12) has about a 16% (2/12) chance of having the disease, absent any other information about exposure.

| | Disease | No Disease | Total | Pr(disease GIVEN + test) |
|---|---|---|---|---|
| Test + | 2 | 10 (98 x .10) | 12 | 2/12 (16%) |
| Test – | 0 (2 x .15) | 88 | 88 | |
| Total | 2 | 98 | 100 | |

*For a hypothetical population of 100 people with 2% infected, a false positive rate of 0.10, and a false negative rate of 0.15, the chance of having the disease given a positive test is about 16%.*

**Slower, more accurate test**: Now, suppose we have a more accurate test with a 2% false positive error rate (98% specificity) and a 1% false negative error rate (99% sensitivity). With these error rates, both of the people with the disease test positive and 2 of the 98 disease-free people test positive. Based on this, a person with a positive test (2 + 2 = 4) has about a 50% (2/4) chance of having the disease.

| | Disease | No Disease | Total | Pr(disease GIVEN + test) |
|---|---|---|---|---|
| Test + | 2 | 2 (98 x .02) | 4 | 2/4 (50%) |
| Test – | 0 (2 x .01) | 96 | 96 | |
| Total | 2 | 98 | 100 | |

*For a hypothetical population of 100 people with 2% infected, a false positive rate of 0.02, and a false negative rate of 0.01, the chance of having the disease given a positive test is about 50%.*

### Community with a higher rate of infection

Now suppose we test in a community where 20% have the disease. Here, 20 people in the hypothetical population of 100 have the disease and 80 are disease free. This 20% was based on a different news source suggesting that 20% was one of the highest proportions of COVID-19 in a community in the US.

**Rapid, less accurate test**: Consider what happens when we use a rapid test with a 10% false positive error rate (90% specificity) and a 15% false negative error rate (85% sensitivity) in this population. With the error rates described for this test, 17 of the 20 people with the disease test positive and 8 of the 80 disease-free people test positive. Based on this, a person with a positive test (17 + 8 = 25) has about a 68% (17/25) chance of having the disease without any additional information about exposure.

| | Disease | No Disease | Total | Pr(disease GIVEN + test) |
|---|---|---|---|---|
| Test + | 17 | 8 (80 x .10) | 25 | 17/25 (68%) |
| Test – | 3 (20 x .15) | 72 | 75 | |
| Total | 20 | 80 | 100 | |

*For a hypothetical population of 100 people with 20% infected, a false positive rate of 0.10, and a false negative rate of 0.15, the chance of having the disease given a positive test is about 68%.*

**Slower, more accurate test**: Now suppose we apply a more accurate test with a 2% false positive error rate (98% specificity) and 1% false negative error rate (99% sensitivity) to the same population. In this case, all 20 people with the disease test positive and 2 of the 80 disease-free people test positive. Based on this, a person with a positive test (20 + 2 = 22) has about a 90% (20/22) chance of having the disease.

| | Disease | No Disease | Total | Pr(disease GIVEN + test) |
|---|---|---|---|---|
| Test + | 20 | 2 (80 x .02) | 22 | 20/22 (90%) |
| Test – | 0 (20 x .01) | 78 | 78 | |
| Total | 20 | 80 | 100 | |

*For a hypothetical population of 100 people with 20% infected, a false positive rate of 0.02, and a false negative rate of 0.01, the chance of having the disease given a positive test is about 90%.*

## Returning to the big question

Returning to the question posed in the title of this blog post …

So, if you test positive for COVID-19, do you have it? If you live in a community with little disease and use a less accurate rapid test, then you may only have a 1 in 6 chance (16%) of having the disease (absent any additional information about exposure). If you have a more accurate test, then the same test result may be associated with a 50-50 chance of having the disease. Here, you might want to have a more accurate follow-up test if you test positive on the rapid, less accurate test. If you live in a community with more people who have the disease, both tests suggest you are more likely than not to have the disease. Keep in mind that these tests are applied in situations where additional information is available, including whether people exhibit COVID-19 symptoms and/or live or work in communities with others who have tested positive.
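
For readers who want to check the four scenarios without rounding to whole people, here is a minimal Python sketch that applies Bayes’ rule directly to the illustrative error rates used in the examples. The function name `ppv` (short for positive predictive value) and the dictionary labels are my own.

```python
def ppv(prevalence, false_pos_rate, false_neg_rate):
    """Pr(disease GIVEN + test) via Bayes' rule."""
    true_pos = prevalence * (1 - false_neg_rate)   # diseased and test positive
    false_pos = (1 - prevalence) * false_pos_rate  # disease free and test positive
    return true_pos / (true_pos + false_pos)


# Illustrative values from the examples: (false positive rate, false negative rate)
tests = {"rapid": (0.10, 0.15), "slower": (0.02, 0.01)}
prevalences = {"low": 0.02, "high": 0.20}

results = {}
for community, prevalence in prevalences.items():
    for test, (fp_rate, fn_rate) in tests.items():
        results[(community, test)] = ppv(prevalence, fp_rate, fn_rate)
        print(f"{community:>4} prevalence, {test:>6} test: "
              f"{results[(community, test)]:.1%}")
```

Without rounding, the four chances come out to roughly 14.8%, 50.3%, 68.0% and 92.5%. These are close to the whole-person figures of 16%, 50%, 68% and 90%, and the small gaps are due entirely to rounding fractions of a person in a population of 100.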

## Final thoughts

You might be interested in controlling different kinds of errors with different tests. If you are screening for COVID-19, you might want to minimize false negative errors and accept potentially higher false positive error rates. A false positive error means a healthy, disease-free person is quarantined unnecessarily. A false negative error means a person with the disease is free to mix in the population and infect others. So, does Governor DeWine have COVID-19? Ultimately, the probability that the governor is disease-free reflects the chance of being disease-free given one positive result on a less accurate test and two negative results from more accurate tests. The probability he is disease-free is very close to one, given no other information about exposure.
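
Here is a rough Python sketch of how one might combine the three results. It rests on assumptions that are mine, not facts about the governor’s tests: a 2% chance of disease before any testing, the illustrative error rates used earlier in this post, and test results that are independent of one another given true disease status.

```python
def update(prior, test_positive, false_pos_rate, false_neg_rate):
    """One Bayes update of Pr(disease) after a single test result."""
    if test_positive:
        like_disease = 1 - false_neg_rate  # Pr(+ | disease)
        like_free = false_pos_rate         # Pr(+ | disease free)
    else:
        like_disease = false_neg_rate      # Pr(- | disease)
        like_free = 1 - false_pos_rate     # Pr(- | disease free)
    numerator = prior * like_disease
    return numerator / (numerator + (1 - prior) * like_free)


RAPID = (0.10, 0.15)   # assumed (false positive rate, false negative rate)
SLOWER = (0.02, 0.01)  # assumed (false positive rate, false negative rate)

p = 0.02  # assumed chance of disease before any test
for positive, (fp_rate, fn_rate) in [(True, RAPID), (False, SLOWER), (False, SLOWER)]:
    p = update(p, positive, fp_rate, fn_rate)
    print(f"Pr(disease) is now {p:.6f}")

pr_disease_free = 1 - p
print(f"Pr(disease free) after all three tests: {pr_disease_free:.5f}")
```

Under these assumptions the single positive rapid result raises the chance of disease to about 15%, and the two negative results from the more accurate test then push it down to a few in 100,000. If the tests share a common source of error, so that the independence assumption fails, the final number would be less extreme.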

## To learn more

To read more about natural frequencies in discussing screening test risks:

- Gerd Gigerenzer (2014) *Risk Savvy: How to Make Good Decisions*. Viking Press.
- *Do doctors understand test results?* by William Kremer, BBC World Service

To read more about natural frequencies / hypothetical populations as part of 7 concepts important for being a statistically literate citizen:

- Jessica Utts (2003) What Educated Citizens Should Know About Statistics and Probability, *The American Statistician*, **57**:2, 74-79, DOI: 10.1198/0003130031630
- *Interpreting a covid-19 test result*: The BMJ site includes a good exposition of test accuracy and what physicians need to know about these tests.

To read more about accuracy of COVID-19 tests:

- *COVID-19 Story Tip: Beware of False Negatives in Diagnostic Testing of COVID-19* (Johns Hopkins press release describing work suggesting false negative rates > 20% for RT-PCR tests and that test accuracy changes over the time course of disease).
- *Study Raises Questions About False Negatives From Quick Covid-19 Test* from NPR Morning Edition on April 21, 2020.

To read more about screening tests in a different context – facial recognition systems:

- *Live Facial Recognition: how good is it really? We need clarity about the statistics* by David Spiegelhalter and Kevin McConway

Bayes’ rule is wonderful for seeing how sensitivity, specificity, and prevalence affect the probability of an infection. However, it is hard to tell what the sensitivity and specificity of the many tests are in practice (http://for-sci-law.blogspot.com/2020/04/he-or-she-tested-negative-or-positive.html), and good estimates of prevalence are difficult to obtain (http://for-sci-law.blogspot.com/2020/04/estimating-prevalence-from-screening.html). This leaves us with considerable uncertainty about predictive value. What can statisticians do to clarify the magnitude of this uncertainty?

If we know the uncertainties involved, we can simulate from them to get a sense of what the uncertainty in the posterior probability of infection is.
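
A minimal sketch of that idea in Python, using only the standard library. The uniform ranges below are placeholders I made up to show the mechanics; they are not estimates for any real test or community, and the function names are hypothetical.

```python
import random


def ppv(prevalence, false_pos_rate, false_neg_rate):
    """Pr(disease GIVEN + test) via Bayes' rule."""
    true_pos = prevalence * (1 - false_neg_rate)
    false_pos = (1 - prevalence) * false_pos_rate
    return true_pos / (true_pos + false_pos)


def simulate_ppv(n_draws=10_000, seed=1):
    """Draw the three inputs from assumed ranges; return sorted PPV draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        prevalence = rng.uniform(0.01, 0.05)      # assumed range
        false_pos_rate = rng.uniform(0.05, 0.15)  # assumed range
        false_neg_rate = rng.uniform(0.10, 0.20)  # assumed range
        draws.append(ppv(prevalence, false_pos_rate, false_neg_rate))
    return sorted(draws)


draws = simulate_ppv()
# Approximate 2.5%, 50% and 97.5% quantiles of the simulated probabilities
low, mid, high = (draws[int(q * (len(draws) - 1))] for q in (0.025, 0.5, 0.975))
print(f"Pr(disease GIVEN + test): median {mid:.2f}, "
      f"95% interval {low:.2f} to {high:.2f}")
```

The width of the resulting interval depends entirely on the ranges assumed for the inputs, which is the point: the simulation turns uncertainty about prevalence and error rates into uncertainty about the answer people care about.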

Thanks for the comment and sharing the links. You raise important issues in your comment and in your post from 2 April 2020. The confusion about ‘tests for what’ continues. Your figure 1 about explanations of sources of errors in tests is useful. There are other challenges – a person may show different levels of antigen response at different times during the course of a disease; thus, the error rates can differ over time. You also mention issues of unknown sensitivity and specificity, or their complements, the false negative and false positive error rates. I would add unknown rates of infection. We need to know how common this disease is in a population to evaluate the value of a positive test. So what can we do? I suggest that doing calculations over a plausible range of infection – sensitivity – specificity combinations and reporting the estimates of Pr(disease GIVEN + test) over these calculations might be a reasonable step to capture some of this uncertainty.
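
As a rough illustration of that suggestion, the Python sketch below computes Pr(disease GIVEN + test) over a small grid of combinations. The grid values are assumptions picked only to show the mechanics, not recommended ranges.

```python
from itertools import product


def ppv(prevalence, false_pos_rate, false_neg_rate):
    """Pr(disease GIVEN + test) via Bayes' rule."""
    true_pos = prevalence * (1 - false_neg_rate)
    false_pos = (1 - prevalence) * false_pos_rate
    return true_pos / (true_pos + false_pos)


# Assumed plausible values for each input (illustrative only)
prevalences = [0.01, 0.02, 0.05, 0.10, 0.20]
false_pos_rates = [0.02, 0.05, 0.10]
false_neg_rates = [0.01, 0.15, 0.30]

# One entry per (prevalence, false positive rate, false negative rate) combination
grid = {
    combo: ppv(*combo)
    for combo in product(prevalences, false_pos_rates, false_neg_rates)
}

worst = min(grid, key=grid.get)
best = max(grid, key=grid.get)
print(f"{len(grid)} combinations")
print(f"lowest  Pr(disease GIVEN + test): {grid[worst]:.2f} at {worst}")
print(f"highest Pr(disease GIVEN + test): {grid[best]:.2f} at {best}")
```

Reporting the lowest and highest values, or the full table, gives a reader a sense of how much the meaning of a positive result depends on inputs that are not well known.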

Thanks again. I also wanted to share a related true story. After a frightening surge in new cases within the staff of the continuing care facility where my parents live, I received the e-mail below from the CEO of that facility. The false alarm represented another type of false positive (generated from a specific testing procedure or result-reporting failure) which is important to consider in our understanding of what a positive or negative test result might mean. As well, it’s a huge reminder of the importance of clear communication on false positives and what testing results really mean. The stress of a positive result is high! Rapid testing is being instituted at many airports – are travelers aware of what their results really mean? And at many colleges – are students aware of what a positive result might really mean? Given that a perfect test is unachievable and that other types of mistakes also occur (as in below), a deeper statistical understanding of the probabilities associated with particular test results can help individuals, e.g. family members, travelers, students, react calmly while also limiting exposure to others and getting more information. Thanks again for the post!

—

“On Monday, August 3, we received the results from the third round of surveillance testing [which] indicated that we had one new Resident who was positive and 18 new Team Members who were positive for COVID-19.

Although these numbers did not seem credible to us, under [State] Department of Public Health (DPH) procedures, we appeared to have no recourse but to accept them. We began immediately to take actions based on the working assumption that we needed to treat these results as correct.

It turned out that several other skilled nursing facilities also showed an unusual spike in positive cases last week, and oddly enough, all these facilities had used the same testing vendor. This caught the attention of the epidemiologists at [State] DPH who intervened and instructed the vendor to re-test the samples. On the second test by the same vendor, all 19 of [our] samples were negative for COVID-19.

For quality control purposes, DPH then sent the swabs to a second lab for testing over the weekend. The results of the third test of the same samples confirmed that the 19 samples were negative. The initial test results reported for these 19 samples were incorrect.” (August 11)

Thanks for your comments. The additional component is the populations being tested and the consequence of disease presence in these populations. For COVID-19, the largest percentage of deaths with COVID-19 as a cause occurred in nursing homes and the largest number of positives occurred in congregate living situations. This is a testing situation where caution seems warranted. Not only does this story highlight communication, it also touches on the critical decisions made in the risk management of a disease. Having a rapid test with higher false + error rates might be desirable for NH workers – better to remove a worker from possible contact with a sensitive population and then await the results of a more accurate test.

Thank you! First thought, could “known exposure” be, effectively and mathematically, the same idea as transitioning from one’s previous community to a much higher risk community? It might be a useful way to think about the shift in odds that you have COVID-19 given an initial positive test. Much higher with known risk! Rapid tests would therefore be particularly valuable in cases of known exposure because the rate of false positives is low and the risk of false negatives (or delays) is high.

Great point. Easy to imagine two distinct populations given exposure status where each of these populations contains a disease present and disease absent subpopulation:

* Known exposure to positive case
  * no disease
  * disease present
* No known exposure
  * no disease
  * disease present

As you suggest, I would expect Pr(COVID-19 present GIVEN known exposure) > Pr(COVID-19 present GIVEN no known exposure).