Spectrum bias

In biostatistics, spectrum bias refers to the phenomenon that the performance of a diagnostic test may vary between clinical settings because each setting has a different mix of patients.^[1] Because performance may depend on the mix of patients, performance at one clinic may not predict performance at another.^[2] These differences are interpreted as a kind of bias. Mathematically, spectrum bias is a sampling bias rather than a traditional statistical bias; this has led some authors to refer to the phenomenon as spectrum effects,^[3] whilst others maintain that it is a bias if the true performance of the test differs from that which is 'expected'.^[2] The performance of a diagnostic test is usually measured in terms of its sensitivity and specificity, and it is changes in these that are considered when referring to spectrum bias. However, other performance measures, such as the likelihood ratios, may also be affected by spectrum bias.^[2]

Spectrum bias is generally considered to have three causes.^[2] The first is a change in the case-mix of patients with the target disorder (diseased), which affects the sensitivity. The second is a change in the case-mix of patients without the target disorder (disease-free), which affects the specificity. The third is a change in the prevalence, which affects both the sensitivity and the specificity.^[4] This final cause is not widely appreciated, but there is mounting empirical evidence^[4]^[5] as well as theoretical arguments^[6] suggesting that prevalence does indeed affect a test's performance.
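The first cause can be illustrated with a minimal Python sketch. It assumes that the overall sensitivity observed in a setting is the case-mix-weighted average of the sensitivities within each subgroup of diseased patients. The subgroup names, the sensitivities and the patient counts are hypothetical values chosen for illustration and are not taken from any study.

```python
def pooled_sensitivity(case_mix, sensitivity_by_subgroup):
    """Overall sensitivity as a case-mix-weighted average of subgroup sensitivities.

    case_mix maps each subgroup of diseased patients to its patient count.
    """
    total = sum(case_mix.values())
    return sum(
        case_mix[group] / total * sensitivity_by_subgroup[group]
        for group in case_mix
    )


# Hypothetical sensitivities of one test in two disease stages.
sens = {"early": 0.60, "advanced": 0.95}

# Hypothetical case mixes of diseased patients in two settings.
referral_clinic = {"early": 20, "advanced": 80}
primary_care = {"early": 80, "advanced": 20}

print(round(pooled_sensitivity(referral_clinic, sens), 2))  # 0.88
print(round(pooled_sensitivity(primary_care, sens), 2))     # 0.67
```

The test itself is unchanged between the two settings; only the mix of early and advanced cases differs. The same weighting applied to subgroups of disease-free patients would illustrate the second cause, a shift in specificity.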

Examples where the sensitivity and specificity change between different sub-groups of patients may be found with the carcinoembryonic antigen test^[7] and urinary dipstick tests.^[8]

The diagnostic test performance reported by a study may be artificially overestimated if the study uses a case-control design in which a healthy population ('fittest of the fit') is compared with a population with advanced disease ('sickest of the sick'); that is, two extreme populations are compared rather than typical healthy and diseased populations.^[9]
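The overestimation can be sketched numerically under strong simplifying assumptions: the test is a continuous marker that is called positive above a fixed threshold, and the marker is normally distributed within each population. The means, the standard deviation and the threshold below are hypothetical and serve only to show the direction of the effect.

```python
from statistics import NormalDist

THRESHOLD = 1.5  # hypothetical cut-off: marker values above this are "positive"
SIGMA = 1.0      # hypothetical common standard deviation


def sensitivity(diseased_mean):
    """Probability that a diseased patient's marker exceeds the threshold."""
    return 1 - NormalDist(diseased_mean, SIGMA).cdf(THRESHOLD)


def specificity(disease_free_mean):
    """Probability that a disease-free patient's marker is at or below the threshold."""
    return NormalDist(disease_free_mean, SIGMA).cdf(THRESHOLD)


# Extreme case-control comparison: healthy volunteers versus advanced disease.
extreme = (sensitivity(3.0), specificity(0.0))

# Typical clinical spectrum: milder disease, and disease-free patients whose
# other conditions raise the marker somewhat.
typical = (sensitivity(1.8), specificity(0.8))

print([round(x, 2) for x in extreme])  # [0.93, 0.93]
print([round(x, 2) for x in typical])  # [0.62, 0.76]
```

With the same threshold, the extreme comparison yields higher sensitivity and specificity than the typical spectrum, because the two extreme populations overlap less.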

If properly analyzed, recognition of heterogeneity of subgroups can lead to insights about the test's performance in varying populations.^[3]
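One simple way to examine such heterogeneity is to compute sensitivity and specificity separately within each subgroup. The following Python sketch assumes each patient record is a tuple of subgroup label, true disease status and test result; the function name, the subgroup labels and the counts are hypothetical.

```python
from collections import defaultdict


def stratified_accuracy(records):
    """Sensitivity and specificity per subgroup.

    records: iterable of (subgroup, has_disease, test_positive) tuples.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for subgroup, has_disease, test_positive in records:
        if has_disease:
            key = "tp" if test_positive else "fn"
        else:
            key = "fp" if test_positive else "tn"
        counts[subgroup][key] += 1

    results = {}
    for subgroup, c in counts.items():
        diseased = c["tp"] + c["fn"]
        disease_free = c["tn"] + c["fp"]
        results[subgroup] = {
            # None marks a subgroup with no patients in that disease category.
            "sensitivity": c["tp"] / diseased if diseased else None,
            "specificity": c["tn"] / disease_free if disease_free else None,
        }
    return results


# Hypothetical data for two subgroups.
records = (
    [("symptomatic", True, True)] * 9 + [("symptomatic", True, False)] * 1
    + [("symptomatic", False, False)] * 7 + [("symptomatic", False, True)] * 3
    + [("asymptomatic", True, True)] * 5 + [("asymptomatic", True, False)] * 5
    + [("asymptomatic", False, False)] * 9 + [("asymptomatic", False, True)] * 1
)

for subgroup, measures in stratified_accuracy(records).items():
    print(subgroup, measures)
```

In this made-up data set the test is more sensitive but less specific in symptomatic patients, a pattern that a single pooled estimate would hide.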

References

[1] Ransohoff DF, Feinstein AR (1978). "Problems of spectrum and bias in evaluating the efficacy of diagnostic tests". N. Engl. J. Med. 299 (17): 926–30. doi:10.1056/NEJM197810262991705. PMID 692598.
[2] Willis BH (2008). "Spectrum bias – why clinicians need to be cautious when applying diagnostic test studies". Family Practice. 25 (5): 390–96. doi:10.1093/fampra/cmn051. PMID 18765409.
[3] Mulherin SA, Miller WC (2002). "Spectrum bias or spectrum effect? Subgroup variation in diagnostic test evaluation". Ann. Intern. Med. 137 (7): 598–602. doi:10.7326/0003-4819-137-7-200210010-00011. PMID 12353947. S2CID 35752032.
[4] Willis BH (2012). "Evidence that disease prevalence may affect the performance of diagnostic tests with an implicit threshold: a cross sectional study". BMJ Open. 2 (1): e000746. doi:10.1136/bmjopen-2011-000746. PMC 3274715. PMID 22307105.
[5] Leeflang MM, Bossuyt PM, Irwig L (2009). "Diagnostic test accuracy may vary with prevalence: implications for evidence-based diagnosis". J. Clin. Epidemiol. 62 (1): 5–12. PMID 18778913.
[6] Goehring C, Perrier A, Morabia A (2004). "Spectrum bias: a quantitative and graphical analysis of the variability of medical diagnostic test performance". Statistics in Medicine. 23 (1): 125–35. doi:10.1002/sim.1591. PMID 14695644. S2CID 24636826.
[7] Fletcher RH (1986). "Carcinoembryonic antigen". Ann. Intern. Med. 104 (1): 66–73. doi:10.7326/0003-4819-104-1-66. PMID 3510056.
[8] Lachs MS, Nachamkin I, Edelstein PH, Goldman J, Feinstein AR, Schwartz JS (1992). "Spectrum bias in the evaluation of diagnostic tests: lessons from the rapid dipstick test for urinary tract infection". Ann. Intern. Med. 117 (2): 135–40. doi:10.7326/0003-4819-117-2-135. PMID 1605428. S2CID 25381473.
[9] Rutjes AWS, Reitsma JB, Vandenbroucke JP, Glas AS, Bossuyt PMM (2005). "Case-control and two-gate designs in diagnostic accuracy studies". Clin. Chem. 51 (8): 1335–41.

