Study Design & Probability | April 18, 2026 | 5 min read

Q-Bank Breakdown: Prevalence effect on PPV/NPV — Why Every Answer Choice Matters


Prevalence is the silent “third variable” in every test question: it doesn’t change the test itself (sensitivity/specificity), but it can completely flip what a positive or negative result means for your patient. If you’ve ever missed a PPV/NPV question because the stem felt “too clinical,” this post is your reset.

Tag: Biostatistics > Study Design & Probability


The clinical vignette (Q-bank style)

A hospital introduces a rapid PCR screening test for Disease X.

  • Sensitivity: 90%
  • Specificity: 90%

You are told the test characteristics are stable across populations.

Two groups are screened:

  • Group A (general population): prevalence = 1%
  • Group B (high-risk clinic): prevalence = 20%

Question: Compared with Group A, which of the following is true in Group B?

A. Sensitivity increases
B. Specificity decreases
C. Positive predictive value increases
D. Negative predictive value increases
E. False-positive rate decreases


Step-by-step: Why the correct answer is C. PPV increases

Key principle

  • Sensitivity and specificity do not depend on prevalence.
  • PPV and NPV do depend on prevalence.

The intuition

When prevalence rises, a random positive test is more likely to be a true positive (because there are simply more true cases floating around). That pushes PPV up.
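
The same intuition can be written with Bayes' theorem. As a sketch, let $p$ be prevalence, $Se$ sensitivity, and $Sp$ specificity (the symbol $p$ is introduced here only for convenience):

$$
\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)}, \qquad
\mathrm{NPV} = \frac{Sp \cdot (1 - p)}{Sp \cdot (1 - p) + (1 - Se) \cdot p}
$$

With $Se$ and $Sp$ held fixed, a larger $p$ grows the true-positive term in PPV and the false-negative term in the NPV denominator, so PPV rises while NPV falls.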


Prove it quickly with a 2×2 table (high-yield move)

Assume 10,000 people in each group.

Group A: prevalence 1%

  • Diseased: 100
  • Not diseased: 9,900

Using Se = 90%, Sp = 90%:

| Group A | Disease + | Disease − | Total |
|---|---|---|---|
| Test + | TP = 90 | FP = 990 | 1,080 |
| Test − | FN = 10 | TN = 8,910 | 8,920 |
| Total | 100 | 9,900 | 10,000 |

  • PPV $= \frac{TP}{TP+FP} = \frac{90}{90+990} = \frac{90}{1080} \approx 8.3\%$
  • NPV $= \frac{TN}{TN+FN} = \frac{8910}{8910+10} = \frac{8910}{8920} \approx 99.9\%$

Group B: prevalence 20%

  • Diseased: 2,000
  • Not diseased: 8,000

| Group B | Disease + | Disease − | Total |
|---|---|---|---|
| Test + | TP = 1,800 | FP = 800 | 2,600 |
| Test − | FN = 200 | TN = 7,200 | 7,400 |
| Total | 2,000 | 8,000 | 10,000 |

  • PPV $= \frac{TP}{TP+FP} = \frac{1800}{1800+800} = \frac{1800}{2600} \approx 69.2\%$
  • NPV $= \frac{TN}{TN+FN} = \frac{7200}{7200+200} = \frac{7200}{7400} \approx 97.3\%$

What changed?

  • PPV skyrocketed (8.3% → 69.2%) with higher prevalence.
  • NPV fell (99.9% → 97.3%) with higher prevalence.
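
If you want to check the 2×2 arithmetic yourself, the sketch below rebuilds the expected cell counts for both groups from the vignette's numbers. The function name `two_by_two` and its parameter names are illustrative choices, not anything from a standard library.

```python
def two_by_two(n, prevalence, sensitivity, specificity):
    """Return expected 2x2 cell counts and predictive values for a screened group."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = diseased * sensitivity   # true positives
    fn = diseased - tp            # false negatives
    tn = healthy * specificity    # true negatives
    fp = healthy - tn             # false positives
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Same assumptions as the worked tables: 10,000 people, Se = Sp = 90%.
for label, prev in [("Group A", 0.01), ("Group B", 0.20)]:
    cells = two_by_two(10_000, prev, 0.90, 0.90)
    print(f"{label}: PPV = {cells['PPV']:.1%}, NPV = {cells['NPV']:.1%}")
# Group A: PPV = 8.3%, NPV = 99.9%
# Group B: PPV = 69.2%, NPV = 97.3%
```

Only the prevalence argument differs between the two calls, which is the whole point of the question.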

So the correct answer is C. Positive predictive value increases.


Why every other answer choice is wrong (systematic distractor breakdown)

A. Sensitivity increases — Wrong

  • Sensitivity is $P(\text{Test+} \mid \text{Disease})$.
  • It’s a property of the test among people who have the disease. Changing how common the disease is in the population doesn’t change test performance within diseased individuals.
  • In both groups: sensitivity stays 90%.

USMLE trap: Stems say “high-risk clinic” and students assume the test “works better.” That’s a prevalence change, not a test-tech change.


B. Specificity decreases — Wrong

  • Specificity is $P(\text{Test-} \mid \text{No disease})$.
  • It’s measured among people without the disease; prevalence doesn’t alter it.
  • In both groups: specificity stays 90%.

High-yield: If a question implies Sp changed, they must be changing the test threshold/technology or introducing bias (e.g., verification bias), not merely changing prevalence.


D. Negative predictive value increases — Wrong

  • NPV is $P(\text{No disease} \mid \text{Test-})$.
  • As prevalence increases, negatives are more likely to be false negatives, so NPV tends to decrease.

Rule to memorize:

  • Prevalence ↑ → PPV ↑, NPV ↓
  • Prevalence ↓ → PPV ↓, NPV ↑
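
The rule can be sanity-checked by sweeping prevalence while holding the test fixed. This is a minimal sketch using Bayes' theorem directly. The helper names `ppv` and `npv` are hypothetical, and the 5% and 50% prevalence values are extra illustrative points that do not appear in the vignette.

```python
def ppv(prev, se, sp):
    """P(disease | positive test) via Bayes' theorem."""
    return se * prev / (se * prev + (1 - sp) * (1 - prev))


def npv(prev, se, sp):
    """P(no disease | negative test) via Bayes' theorem."""
    return sp * (1 - prev) / (sp * (1 - prev) + (1 - se) * prev)


SE, SP = 0.90, 0.90  # test characteristics from the vignette
for prev in (0.01, 0.05, 0.20, 0.50):
    print(f"prevalence {prev:.0%}: PPV {ppv(prev, SE, SP):.1%}, NPV {npv(prev, SE, SP):.1%}")
```

Reading down the printed rows, PPV climbs and NPV falls at every step, even though `SE` and `SP` never change.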

E. False-positive rate decreases — Wrong

  • False-positive rate (FPR) is $1 - \text{specificity} = P(\text{Test+} \mid \text{No disease})$.
  • Since specificity is unchanged, FPR is unchanged.

Common confusion: Students mix up:

  • FPR (a test characteristic) vs
  • the number of false positives (which can change with prevalence, because the size of the non-diseased pool changes)

In our example:

  • Group A FP = 990
  • Group B FP = 800

The count of FP changed, but the rate among the non-diseased stayed the same: $\text{FPR} = 1 - 0.90 = 0.10$ (10% in both groups).
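
To see the count-versus-rate distinction in numbers, here is a small sketch that computes both for each group. The function name `false_positive_summary` is made up for this illustration, and the group size of 10,000 matches the worked tables.

```python
def false_positive_summary(n, prevalence, specificity):
    """Return (false-positive count, false-positive rate) for a screened group."""
    non_diseased = n * (1 - prevalence)
    fp_count = non_diseased * (1 - specificity)
    fpr = fp_count / non_diseased  # equals 1 - specificity by construction
    return fp_count, fpr


for label, prev in [("Group A", 0.01), ("Group B", 0.20)]:
    count, rate = false_positive_summary(10_000, prev, 0.90)
    print(f"{label}: FP count = {count:.0f}, FPR = {rate:.0%}")
# Group A: FP count = 990, FPR = 10%
# Group B: FP count = 800, FPR = 10%
```

The count shrinks only because the non-diseased pool shrinks from 9,900 to 8,000; the fraction of that pool testing positive is identical.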

The two “universals” you should always check in stems

1) Are they changing prevalence or threshold?

  • Prevalence change → affects PPV/NPV only.
  • Threshold change (moving cutoff) → trades off sensitivity vs specificity.

Quick threshold tradeoff:

  • Lower cutoff (call more people “positive”) → Se ↑, Sp ↓
  • Higher cutoff (stricter positive definition) → Se ↓, Sp ↑
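
The tradeoff is easy to see on a toy dataset. The sketch below assumes a continuous biomarker where higher values suggest disease. The values in `diseased` and `healthy` are invented purely for illustration and do not come from the vignette.

```python
# Hypothetical biomarker values, made up purely for illustration.
diseased = [4.1, 5.0, 5.8, 6.3, 7.2, 8.0]
healthy = [2.0, 2.9, 3.5, 4.4, 5.2, 6.0]


def se_sp(cutoff):
    """Call a value 'positive' when it is at or above the cutoff."""
    se = sum(x >= cutoff for x in diseased) / len(diseased)
    sp = sum(x < cutoff for x in healthy) / len(healthy)
    return se, sp


for cutoff in (4.0, 5.5, 7.0):
    se, sp = se_sp(cutoff)
    print(f"cutoff {cutoff}: Se = {se:.2f}, Sp = {sp:.2f}")
# cutoff 4.0: Se = 1.00, Sp = 0.50
# cutoff 5.5: Se = 0.67, Sp = 0.83
# cutoff 7.0: Se = 0.33, Sp = 1.00
```

Moving the cutoff changes the test itself, which is why Se and Sp shift here but not when only prevalence changes.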

2) Are they asking for probability “given disease” vs “given test result”?

  • “Given disease status” → sensitivity/specificity
  • “Given test result” → PPV/NPV

High-yield summary table (memorize this)

| Quantity | Definition | Depends on prevalence? | Goes up when prevalence rises? |
|---|---|---|---|
| Sensitivity | $P(T+ \mid D+)$ | No | No |
| Specificity | $P(T- \mid D-)$ | No | No |
| PPV | $P(D+ \mid T+)$ | Yes | Yes |
| NPV | $P(D- \mid T-)$ | Yes | No (usually decreases) |

Test-day heuristics (fast and reliable)

  • If prevalence is low, most positives are false positives → PPV is low.
  • If prevalence is high, a larger share of negatives are false negatives → NPV is lower.
  • Sensitivity/specificity are “inside the lab”; PPV/NPV are “at the bedside.”

One-liner takeaway

Prevalence doesn’t change the test; it changes what the test result means. In higher-prevalence settings, a positive result is more believable → PPV increases.