https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007941
Screening and Diagnostic Testing
Early Diagnosis of Disease
• Prompt attention to the earliest symptoms
• Detection of disease in asymptomatic individuals
Screening and diagnostic tests improve the ability to
estimate the probability of the presence or absence of
a disease
Screening vs. Diagnostic Tests
• Screening Tests
• Tests performed on asymptomatic individuals with the goal of detecting pre-clinical
cases of disease
• Reserved for persons apparently well
• Case-Finding
• Seeking additional illnesses in those with medical problems
• Diagnostic Tests
• Tests performed to increase probability of disease identification and confirmation in
cases of suspected disease
Common Screening Tests
• Pap smear for cervical cancer
• Fasting blood sugar for diabetes
• PSA for prostate cancer
• Ocular pressure for glaucoma
• PKU test for phenylketonuria in newborns
• TSH for thyroid disorders
• BP for hypertension
Screening Success Stories
• Cytology screening for cervical cancer
• Inadequate screening in developing countries accounts for
a large proportion of invasive cervical cancer cases
• Recommended screening interval continues to change
• Screening for hypertension
• Screening for HIV-1, Hep-B and syphilis in pregnant women
The Progress of Disease
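[Figure: timeline of disease progression. Events in order: Exposure, Disease begins, Symptoms begin, Death. The pre-clinical phase lies between "Disease begins" and "Symptoms begin"; during it the disease or precursor is detectable by screening (Screening Test +). The lead time is marked starting from the positive screening test. After symptoms begin, disease is confirmed by diagnostic testing (the "gold standard").]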
Screening and Diagnostic Tests
• Breast Cancer
• Clinical Breast Exam
• Screening Mammogram
• Diagnostic Mammogram
• Fine Needle Aspiration Biopsy
• Core Biopsy
• Excisional Biopsy (gold standard)
Considerations for Screening Programs
1. The disease should be a significant public health problem
2. There should be a recognizable latent or early symptomatic stage
3. There should be a suitable screening test acceptable to the population
4. There should be well-established and available diagnostic tests
5. There should be an accepted treatment for the disease
6. Facilities for diagnosis and treatment should be available
7. The cost of case-finding, diagnosis, and treatment should be anticipated and reasonable to the payer/patient
8. The process should be regular and on-going
9. The test should have a reasonable cut-off level defined and be both valid and reliable
Participation in Screening Programs
• The disease must be known to the individual
• It must be regarded as a serious threat to health
• Each individual must feel vulnerable to the disease
• There must be a firm belief that action will have meaningful results
The Screening 2X2 Table
              | Disease             | No Disease
Test Positive | a (True-Positives)  | b (False-Positives)
Test Negative | c (False-Negatives) | d (True-Negatives)

Prevalence of Disease = (a + c) / (a + b + c + d)
Test Validity
• Ability of the screening test to accurately distinguish diseased from non-diseased individuals
• An ideal test is highly sensitive and specific
• Recall: difference between validity and reliability?
Sensitivity
• The probability that a diseased person (case) in the population tested will be identified as diseased by the test
• Proportion of persons with the disease who have a positive test
• Ability of a test to correctly identify persons with the disease
              | Disease             | No Disease          | Total
Test Positive | a (True-Positives)  | b (False-Positives) | a+b
Test Negative | c (False-Negatives) | d (True-Negatives)  | c+d
Total         | a+c                 | b+d                 | a+b+c+d

Sensitivity = a / (a + c)
Specificity
• The probability that a person without the disease (noncase) will be correctly identified as nondiseased by the test
• Proportion of persons without the disease who have a negative test
• Ability of a test to correctly identify persons without the disease
              | Disease             | No Disease          | Total
Test Positive | a (True-Positives)  | b (False-Positives) | a+b
Test Negative | c (False-Negatives) | d (True-Negatives)  | c+d
Total         | a+c                 | b+d                 | a+b+c+d

Specificity = d / (b + d)
Important!
• Determination of the sensitivity and specificity of a test requires that a
diagnosis of disease be established or ruled out for every person tested by the
screening procedure, regardless of whether that person screens negative or positive
• The diagnosis must be established by techniques independent of the
screening test
Sensitivity and Specificity are
descriptors of the accuracy of a test
• Sensitivity
• The greater the sensitivity, the more likely the test will detect persons with the disease.
• A negative result on a test with excellent sensitivity can virtually rule out disease.
• Specificity
• The greater the specificity, the more likely it is that persons without the disease will be excluded.
• A positive result on a test with excellent specificity will strongly suggest the presence of disease.
Sensitivity and Specificity: Example
                           | Diabetes | No Diabetes
Glucose Tolerance Positive | 34       | 20
Glucose Tolerance Negative | 116      | 9,380

Sensitivity = 34 / (34 + 116) = 22.7%
Specificity = 9,380 / (20 + 9,380) = 99.8%
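The arithmetic in this example can be checked with a short script. This is only a sketch: the function names `sensitivity` and `specificity` and the variable names are illustrative choices, not part of the slides; the counts are the four cells of the glucose tolerance table.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of persons with the disease who test positive: a / (a + c)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of persons without the disease who test negative: d / (b + d)."""
    return true_neg / (true_neg + false_pos)


# Cells of the glucose tolerance 2x2 table (a, b, c, d).
a, b, c, d = 34, 20, 116, 9380

sens = sensitivity(a, c)
spec = specificity(d, b)
print(f"Sensitivity = {sens:.1%}")
print(f"Specificity = {spec:.1%}")
```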
Predictive Values are estimates of the probability of the
presence or absence of disease based on the test result
• Positive Predictive Value (PPV)
• The percentage of persons with positive test results who actually have the disease
• How likely is it that the disease of interest is present if the test is positive?
• Negative Predictive Value (NPV)
• The percentage of persons with negative test results who do not have the disease of interest
• How likely is it that the disease of interest is not present if the test is negative?
Predictive Value
              | Disease             | No Disease          | Total
Test Positive | a (True-Positives)  | b (False-Positives) | a+b
Test Negative | c (False-Negatives) | d (True-Negatives)  | c+d
Total         | a+c                 | b+d                 | a+b+c+d

PPV (PV+) = a / (a + b)
NPV (PV-) = d / (c + d)
Predictive Value: Example
                              | Glaucoma | No Glaucoma
Intraocular Pressure Positive | 140      | 80
Intraocular Pressure Negative | 10       | 910

PPV (PV+) = 140 / (140 + 80) = 64%
NPV (PV-) = 910 / (10 + 910) = 99%
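As a quick check of the glaucoma example, the two predictive values can be computed directly from the table cells. The helper name `predictive_values` is an illustrative assumption; it simply applies the PPV and NPV formulas to the cells a, b, c, d.

```python
def predictive_values(a, b, c, d):
    """Return (PPV, NPV) from the cells of a screening 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    ppv = a / (a + b)  # share of positive tests that are true cases
    npv = d / (c + d)  # share of negative tests that are truly disease-free
    return ppv, npv


# Intraocular pressure example: 140 TP, 80 FP, 10 FN, 910 TN.
ppv, npv = predictive_values(140, 80, 10, 910)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```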
Comparison of Measures
• Sensitivity and Specificity
• Tell us how good the test is at correctly identifying people with and without the
disease in a population
• Important in population setting for public health
• Predictive values
• Tell us the probability of disease (or non-disease) given a positive (or negative)
test result
• Important in the clinical setting
An Example
A manufacturer would like to sell you a new rapid screening test developed to screen for
strep throat. You know the prevalence of strep throat in your pediatric population in
the high peak season is 27%. The manufacturer of the new test describes the
sensitivity as 70% and the specificity as 73%. Assuming that you will use this test
with 1,000 children, what are the positive and negative predictive values of this test
in your population? Would you buy this product?
              | Strep Throat | No Strep Throat | Total
Test Positive | 189          | 197             | 386
Test Negative | 81           | 533             | 614
Total         | 270          | 730             | 1,000
An Example
              | Strep Throat | No Strep Throat | Total
Test Positive | 189          | 197             | 386
Test Negative | 81           | 533             | 614
Total         | 270          | 730             | 1,000

PPV (PV+) = 189 / 386 = 49%
NPV (PV-) = 533 / 614 = 87%
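The strep throat table can be rebuilt from the three quantities given in the problem (prevalence, sensitivity, specificity). The sketch below uses a hypothetical helper, `expected_table`, that returns expected cell counts; it keeps fractional counts, whereas the slide rounds them to whole children, so the cells differ slightly (for example 532.9 rather than 533 true negatives).

```python
def expected_table(n, prevalence, sensitivity, specificity):
    """Expected 2x2 cell counts (a, b, c, d) for n people screened."""
    diseased = n * prevalence
    healthy = n - diseased
    a = diseased * sensitivity   # true positives
    c = diseased - a             # false negatives
    d = healthy * specificity    # true negatives
    b = healthy - d              # false positives
    return a, b, c, d


a, b, c, d = expected_table(1000, prevalence=0.27, sensitivity=0.70, specificity=0.73)
ppv = a / (a + b)
npv = d / (c + d)
print(f"TP={a:.1f} FP={b:.1f} FN={c:.1f} TN={d:.1f}")
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

With a PPV near one half, roughly every second positive result would be a false positive in this population, which is the point of the "Would you buy this product?" question.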
Trade-Offs Between Sensitivity and Specificity
• The ideal test would perfectly discriminate between those with and without disease; the test distributions of the two groups would not overlap
• Yet, in nature, the distributions do overlap
• Hence, the choice of cut-off separating normal from abnormal values determines sensitivity and specificity
Example: a cut-off for abnormal blood glucose at point X produces perfect sensitivity, identifying all with disease, yet poor specificity. Choosing a higher cut-point, e.g. point Z, yields the opposite result: healthy persons are correctly identified as healthy (perfect specificity), but at the cost of missing a proportion of ill persons. Hence, compromise at point Y.
Cut-Points for Screening Tests
• Screening Tests with Categorical Results:
• Mammography:
• BIRADS 1: negative
• BIRADS 2: benign
• BIRADS 3: probably benign
• BIRADS 4: suspicious for cancer
• BIRADS 5: highly suggestive of malignancy
• What is Abnormal?
• The decision about what results to call “abnormal” will affect the sensitivity, specificity, and predictive
values of your screening tests.
Cut-Points for Screening Tests
• Screening Tests with Continuous Results:
• Blood Pressure
• Cholesterol Levels
• Blood sugar
• What is Abnormal?
• There are many options concerning where to set the cut-off point
• Along a continuous scale, different cut-off points will result in differing levels of sensitivity and specificity
• As sensitivity increases, specificity decreases
• Low cut-points are very sensitive, but not specific
• Those with disease are correctly classified, but those without disease are not
• High cut-points are very specific, but not sensitive
• Those without disease are correctly classified, but those with disease are not
• How do you decide the cut-off point?
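The trade-off described above can be illustrated with a small cut-off sweep. Everything in this sketch is an assumption made for illustration: the glucose values are invented, the three cut-offs are arbitrary, and a result is called positive when the value is at or above the cut-off.

```python
# Hypothetical fasting glucose values (mg/dL), invented for illustration only.
diseased = [105, 118, 126, 131, 140, 152, 160]
healthy = [78, 85, 90, 96, 101, 110, 122]


def sens_spec(cutoff):
    """Sensitivity and specificity when values >= cutoff are called positive."""
    true_pos = sum(value >= cutoff for value in diseased)
    true_neg = sum(value < cutoff for value in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)


# A low, a middle, and a high cut-off (arbitrary choices).
results = {cutoff: sens_spec(cutoff) for cutoff in (100, 115, 130)}
for cutoff, (sens, spec) in results.items():
    print(f"cut-off {cutoff}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this toy data the lowest cut-off catches every diseased person but mislabels several healthy ones, the highest does the reverse, and the middle one gives up a little of each.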
General Pattern of Relationship
Predictive Values are Influenced by
Prevalence of Disease
Population 1 (N = 1,000)
       | Disease | No Disease
Test + | 36      | 48
Test - | 4       | 912

Prevalence = 40/1,000 = 4%
Sensitivity = 36/40 = 90%
Specificity = 912/960 = 95%
PPV (PV+) = 36/84 = 43%
NPV (PV-) = 912/916 = 99.6%

Population 2 (N = 1,000)
       | Disease | No Disease
Test + | 9       | 50
Test - | 1       | 940

Prevalence = 10/1,000 = 1%
Sensitivity = 9/10 = 90%
Specificity = 940/990 = 95%
PPV (PV+) = 9/59 = 15.3%
NPV (PV-) = 940/941 = 99.9%
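One way to see why PPV falls with prevalence is to count the expected true and false positives per 1,000 people screened. The sketch below assumes a test with exactly 90% sensitivity and 95% specificity; the prevalence values and the helper name `expected_positives` are illustrative. Because it keeps fractional counts, its results can differ slightly from tables built on whole-person counts.

```python
def expected_positives(n, prevalence, sensitivity, specificity):
    """Expected true-positive and false-positive counts among n people screened."""
    true_pos = n * prevalence * sensitivity
    false_pos = n * (1 - prevalence) * (1 - specificity)
    return true_pos, false_pos


rows = []
for prevalence in (0.01, 0.04, 0.20, 0.40):
    tp, fp = expected_positives(1000, prevalence, sensitivity=0.90, specificity=0.95)
    ppv = tp / (tp + fp)
    rows.append((prevalence, tp, fp, ppv))
    print(f"prevalence {prevalence:.0%}: TP {tp:.1f}, FP {fp:.1f}, PPV {ppv:.1%}")
```

At low prevalence the false positives drawn from the large disease-free group outnumber the true positives, even though the test itself has not changed.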
Adjusted Predictive Measures
• Numbers with and without disease are constrained by the study design
• Prevalence cannot be estimated from the study
• Use background estimates in the calculation
Adjusted PPV = (sensitivity x prevalence) / [(sensitivity x prevalence) + ((1 - specificity) x (1 - prevalence))]
Adjusted NPV = (specificity x (1 - prevalence)) / [(specificity x (1 - prevalence)) + ((1 - sensitivity) x prevalence)]
Adjusted Predictive Values
                              | Glaucoma | No Glaucoma
Intraocular Pressure Positive | 140      | 80
Intraocular Pressure Negative | 10       | 910

Prevalence (from other sources) = 20%
Adjusted PPV = 74.3% (vs. PPV 64%)
Adjusted NPV = 98.2% (vs. NPV 98.9%)

Prevalence (from other sources) = 40%
Adjusted PPV = 88.5% (vs. PPV 64%)
Adjusted NPV = 95.4% (vs. NPV 98.9%)
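The adjusted values for the intraocular pressure example can be reproduced by plugging the study's sensitivity (140/150) and specificity (910/990) into the adjusted PPV and NPV formulas together with an outside prevalence estimate. The function names below are illustrative, not from the slides.

```python
def adjusted_ppv(sensitivity, specificity, prevalence):
    """PPV recomputed for a background prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def adjusted_npv(sensitivity, specificity, prevalence):
    """NPV recomputed for a background prevalence."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Sensitivity and specificity from the intraocular pressure 2x2 table.
sens = 140 / (140 + 10)
spec = 910 / (80 + 910)

adjusted = {}
for prevalence in (0.20, 0.40):
    adjusted[prevalence] = (
        adjusted_ppv(sens, spec, prevalence),
        adjusted_npv(sens, spec, prevalence),
    )
    ppv, npv = adjusted[prevalence]
    print(f"prevalence {prevalence:.0%}: adjusted PPV {ppv:.1%}, adjusted NPV {npv:.1%}")
```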
Likelihood Ratios
Likelihood ratio = (the probability of a particular test result for a person with the disease) / (the probability of that test result for a person without the disease)
• Likelihood ratios do not vary with prevalence
Likelihood Ratios
• Likelihood Ratio for a Positive Test
LR+ = (the probability of a positive test result for a person with the disease) / (the probability of a positive test result for a person without the disease)
• The larger the size of the LR+, the better the diagnostic value of the test
• An LR+ value of 10 or greater is considered a good test
• Likelihood Ratio for a Negative Test
LR- = (the probability of a negative test result for a person with the disease) / (the probability of a negative test result for a person without the disease)
• The smaller the size of the LR-, the better the diagnostic value of the test
• An LR- value of 0.10 or less is considered a good test
Likelihood Ratio
              | Disease             | No Disease
Test Positive | a (True-Positives)  | b (False-Positives)
Test Negative | c (False-Negatives) | d (True-Negatives)

Likelihood ratio for a positive test:
LR+ = Sensitivity / (1 - Specificity) = [a/(a+c)] / [b/(b+d)]

Likelihood ratio for a negative test:
LR- = (1 - Sensitivity) / Specificity = [c/(a+c)] / [d/(b+d)]
Likelihood Ratio is Not Influenced by Prevalence
Population 1 (N = 1,000)
       | Disease | No Disease
Test + | 36      | 48
Test - | 4       | 912

Prevalence = 40/1,000 = 4%
Sensitivity = 36/40 = 90%
Specificity = 912/960 = 95%
LR+ = (36/40) / (48/960) = 18
LR- = (4/40) / (912/960) ≈ 0.11

Population 2 (N = 1,000)
       | Disease | No Disease
Test + | 9       | 50
Test - | 1       | 940

Prevalence = 10/1,000 = 1%
Sensitivity = 9/10 = 90%
Specificity = 940/990 = 95%
LR+ = (9/10) / (50/990) ≈ 18
LR- = (1/10) / (940/990) ≈ 0.11
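The two populations can be compared with a few lines of code. The helper `likelihood_ratios` is an illustrative name; it applies the cell-based forms of LR+ and LR- to each table. The ratios come out nearly, though not exactly, identical because the whole-person counts give a specificity of 940/990 in the second table rather than exactly 95%.

```python
def likelihood_ratios(a, b, c, d):
    """Return (LR+, LR-) from 2x2 cells: a=TP, b=FP, c=FN, d=TN."""
    lr_pos = (a / (a + c)) / (b / (b + d))  # sensitivity / (1 - specificity)
    lr_neg = (c / (a + c)) / (d / (b + d))  # (1 - sensitivity) / specificity
    return lr_pos, lr_neg


lr_pos_4pct, lr_neg_4pct = likelihood_ratios(36, 48, 4, 912)   # prevalence 4%
lr_pos_1pct, lr_neg_1pct = likelihood_ratios(9, 50, 1, 940)    # prevalence 1%

print(f"Prevalence 4%: LR+ = {lr_pos_4pct:.2f}, LR- = {lr_neg_4pct:.3f}")
print(f"Prevalence 1%: LR+ = {lr_pos_1pct:.2f}, LR- = {lr_neg_1pct:.3f}")
```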
Yield
• The yield of a screening test is the amount of previously unrecognized disease that is diagnosed with screening
• Yield is influenced by:
  • The sensitivity of the test
  • The prevalence of unrecognized disease in the population
• In screening tests, a high positive predictive value is desirable.
• However, if the prevalence of a disease is low, even a highly sensitive test will yield a low positive predictive value
• For the most yield, screening should be aimed at populations with a high prevalence of disease
Diagnostic Accuracy
• An attempt to simplify the four indices of test validity into a single term
• Example: proportion of correct results or the sum of correctly identified ill and well
divided by all tested, or (a+d)/(a+b+c+d)
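A short calculation shows what the single accuracy figure can hide. The sketch below reuses the glucose tolerance counts from the diabetes example earlier in this deck (34, 20, 116, 9,380); the function name `diagnostic_accuracy` is an illustrative assumption.

```python
def diagnostic_accuracy(a, b, c, d):
    """Proportion of all tested persons who were classified correctly."""
    return (a + d) / (a + b + c + d)


# Glucose tolerance example: a=34 TP, b=20 FP, c=116 FN, d=9,380 TN.
a, b, c, d = 34, 20, 116, 9380
accuracy = diagnostic_accuracy(a, b, c, d)
sens = a / (a + c)

print(f"Accuracy    = {accuracy:.1%}")
print(f"Sensitivity = {sens:.1%}")
```

Because disease-free persons dominate this table, accuracy comes out near 98.6% even though the test misses most of the diabetic cases, so the summary figure should be read alongside sensitivity and specificity.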
The Evaluation of Screening Programs
• Does early detection of disease:
• Reduce morbidity?
• Reduce mortality?
• Improve quality of life?
• Reduce cost of disease?
Concern of Overdiagnosis in Cancer
(Welch and Black, JNCI (2010))
• Overdiagnosis: a condition is diagnosed that would have
otherwise not gone on to cause symptoms or disease due to one
of the following:
• The cancer never progresses
• The cancer progresses slowly enough that the patient dies of other causes
before the cancer becomes symptomatic
Note: Please read article in Canvas.
The Potential Harms of Overdiagnosis
• The patient cannot hope to benefit from the detection of an indolent
lesion
• Potential harm of early or unnecessary treatment
• Physiological harms and impact on work and productivity
• E.g. study of HTN in an industrial setting found that after screened persons
were diagnosed with HBP, annual absenteeism increased nearly 80%
regardless of severity or treatment plan
Potential Biases
• Observational studies compare the outcomes of persons who are
screened with the outcomes of those who are not screened
• Possible biases:
• Lead time bias
• Length bias
• Volunteer bias
Lead Time Bias
• Occurs when screening detects disease earlier in its course (e.g. during the
latent period) than if screening had not been performed (disease identified via
clinical symptoms)
• Biases comparisons that are not adjusted for timing of diagnosis
• Length of time from diagnosis to death is increased
• Length of life may not be increased if treatment is ineffective
• Occurs in several screenings for cancer
• Can avoid/reduce by measuring mortality not survival. What’s the difference?
Lead Time Bias
Without early detection:
• Actual onset of cancer, age 40
• Diagnosis of cancer, 4 yrs later
• Death 6 years after diagnosis, 10 years after onset, age 50
With early detection:
• Actual onset of cancer, age 40
• Detection of cancer, 2 yrs later
• Death 8 years after detection, 10 years after onset, age 50
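The timeline above reduces to simple arithmetic, sketched here with the ages from the example (variable names are illustrative). Survival is counted from the date of diagnosis, so it grows when diagnosis moves earlier, while the age at death, which is what mortality measures, stays the same.

```python
onset_age = 40
death_age = 50  # identical in both scenarios: screening did not postpone death

clinical_diagnosis_age = onset_age + 4  # diagnosed from symptoms
screen_detection_age = onset_age + 2    # detected earlier by screening

survival_without_screening = death_age - clinical_diagnosis_age
survival_with_screening = death_age - screen_detection_age
lead_time = clinical_diagnosis_age - screen_detection_age

print(f"Survival after diagnosis, no screening:   {survival_without_screening} years")
print(f"Survival after diagnosis, with screening: {survival_with_screening} years")
print(f"Lead time: {lead_time} years; age at death in both cases: {death_age}")
```

The apparent gain in survival equals the lead time, which is why comparing mortality rather than survival avoids this bias.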
Example: Mayo Lung Project
• An RCT comparing chest x-ray plus cytology vs. standard of care for lung cancer screening
• 5-yr survival appeared to strongly support screening
• Lung cancer mortality rates were not different, with even a trend toward a higher death rate in the screened group
Figure source: Croswell et al., Semin Oncol, 2010
Length Bias
• Screening picks up prevalent disease, and prevalence = incidence x duration
• Slowly growing disease has a longer duration in the pre-symptomatic phase,
and therefore a larger prevalence
• People with less aggressive/slowly developing disease survive longer
• People with less aggressive disease are over-represented among cases
identified via screening
• Probability that disease will be detected via screening is directly proportional to length of
its detectable preclinical phase, which is inversely related to rate of progression
• People with less aggressive disease have better survival after screening
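The relation prevalence = incidence x duration makes the over-representation easy to quantify. The numbers in this sketch are hypothetical and chosen only for illustration: two forms of a disease with the same incidence, one with a 1-year and one with a 4-year detectable preclinical phase.

```python
# Hypothetical inputs (not from the slides): incidence per 100,000 per year
# and length of the detectable preclinical phase in years.
incidence = {"aggressive": 10, "slow": 10}
preclinical_years = {"aggressive": 1, "slow": 4}

# Detectable preclinical cases present at any one time, per 100,000.
prevalence = {
    kind: incidence[kind] * preclinical_years[kind] for kind in incidence
}

slow_share_of_new_cases = incidence["slow"] / sum(incidence.values())
slow_share_of_screen_detected = prevalence["slow"] / sum(prevalence.values())

print(f"Slow-growing share of new cases:             {slow_share_of_new_cases:.0%}")
print(f"Slow-growing share of screen-detected cases: {slow_share_of_screen_detected:.0%}")
```

Under these assumptions slow-growing disease is half of all new cases but four fifths of what a one-time screen would find, so screen-detected cases look better on survival regardless of any treatment benefit.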
Volunteer Bias
• People who volunteer for screening differ from those who do not volunteer
• People who are healthier or more health-conscious are more likely to obtain recommended
screening compared to those who do not obtain screening
• Avoid by evaluating screening in an RCT
• Otherwise, control for confounders associated with obtaining screening
and outcome such as family history of disease, level of health concern,
other health behaviors, baseline health/illnesses
• Many examples: the Multiple Risk Factor Intervention Trial, the colon cancer control
study, and the Physicians' Health Study all had substantially lower-than-expected
mortality compared to the overall population