Study Design & Probability · April 18, 2026 · 5 min read

Q-Bank Breakdown: PPV & NPV — Why Every Answer Choice Matters

A clinical vignette on PPV & NPV: the correct answer explained, then each distractor systematically addressed. Tag: Biostatistics > Study Design & Probability.

You just missed a question on PPV/NPV, and the explanation says “depends on prevalence” — but the distractors all sounded kind of true. That’s exactly why these biostats items are sneaky: they’re not testing whether you can recite definitions, they’re testing whether you know which probability you’re being asked for and which variables actually move it.



The Vignette (Q-bank style)

A new rapid antigen test is used to screen for Disease X in an outpatient clinic.

  • Sensitivity: 90%
  • Specificity: 90%
  • Prevalence of Disease X in this clinic population: 1%

A patient’s test returns positive.

Question: What is the approximate positive predictive value (PPV) of this test in this population?

Answer choices

A. 9%
B. 50%
C. 90%
D. 99%
E. PPV cannot be determined without knowing the sample size


Step 1: Translate the Ask (what are they really asking?)

They gave you:

  • sensitivity and specificity (test characteristics)
  • prevalence (pretest probability in the population)
  • a positive test result

They are asking:

  • PPV = P(Disease | Test+)

This is a post-test probability conditioned on a positive test.


Step 2: Solve It Fast (2×2 table method)

Assume a population of 10,000 (this makes the 1% prevalence a clean 100 patients).

  • Prevalence 1% → 100 truly diseased, 9,900 not diseased

Now apply sens/spec:

  • Sensitivity 90% → True positives (TP) = 90% of 100 = 90

  • False negatives (FN) = 10

  • Specificity 90% → True negatives (TN) = 90% of 9,900 = 8,910

  • False positives (FP) = 990

Now compute PPV:

PPV = TP / (TP + FP) = 90 / (90 + 990) = 90 / 1080 ≈ 0.083 = 8.3%

So the best answer is A, 9% (8.3% rounded to the nearest choice).
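The 2×2 table arithmetic above can be sketched in a few lines of Python (the function name and variable names are mine, purely illustrative):

```python
# Sketch of the 2x2 table method: split the population by prevalence,
# then apply sensitivity and specificity to fill the four cells.
def two_by_two(population, prevalence, sensitivity, specificity):
    diseased = population * prevalence
    healthy = population - diseased
    tp = sensitivity * diseased   # true positives
    fn = diseased - tp            # false negatives
    tn = specificity * healthy    # true negatives
    fp = healthy - tn             # false positives
    return tp, fp, tn, fn

tp, fp, tn, fn = two_by_two(10_000, 0.01, 0.90, 0.90)
ppv = tp / (tp + fp)
print(round(tp), round(fp), round(ppv, 3))  # → 90 990 0.083
```

Any convenient population size works; 10,000 just keeps every cell a whole number.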


Why This Happens: The “False Positive Avalanche” in Low Prevalence

Even with a “pretty good” specificity (90%), if the disease is rare, the non-diseased group is massive, so a small false-positive rate creates many false positives.

High-yield takeaway:

  • Low prevalence → PPV drops, NPV rises (holding sens/spec constant)
  • High prevalence → PPV rises, NPV drops
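You can watch this takeaway play out numerically. A quick sweep (a sketch, using the standard Bayes forms of PPV and NPV; the helper name is mine) holds sensitivity and specificity at 90% and varies only prevalence:

```python
# Hold sens/spec fixed at 90% and vary prevalence:
# PPV climbs from ~8% to 90% while NPV falls.
def ppv_npv(prev, sens=0.90, spec=0.90):
    ppv = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
    npv = (spec * (1 - prev)) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

for prev in (0.01, 0.10, 0.50):
    ppv, npv = ppv_npv(prev)
    print(f"prev={prev:.2f}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

At 1% prevalence PPV is about 8%, at 10% it is 50%, and at 50% it reaches 90%, while NPV moves the opposite direction.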

The High-Yield Formulas (know what moves what)

  • Sensitivity: P(T+ | D) = TP / (TP + FN)
  • Specificity: P(T− | ¬D) = TN / (TN + FP)
  • PPV: P(D | T+) = TP / (TP + FP)
  • NPV: P(¬D | T−) = TN / (TN + FN)

Also useful:

  • Prevalence (pretest probability): (TP + FN) / Total
  • False positive rate: 1 − specificity
  • False negative rate: 1 − sensitivity
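All of these formulas fall out of the same four cell counts. A small helper (a sketch; names are mine) computes every metric from raw TP/FP/TN/FN, using the vignette's numbers as a check:

```python
# Compute all the high-yield metrics from raw 2x2 cell counts.
def metrics(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # conditions on disease status
        "specificity": tn / (tn + fp),   # conditions on disease status
        "ppv": tp / (tp + fp),           # conditions on test result
        "npv": tn / (tn + fn),           # conditions on test result
        "prevalence": (tp + fn) / total,
    }

print(metrics(90, 990, 8910, 10))
```

Note how the denominators encode the conditioning: sensitivity/specificity sum down a disease column, PPV/NPV sum across a test-result row.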

Distractor Autopsy: Why Every Wrong Answer Is Wrong (and tempting)

A. 9% — Correct

This reflects the low prevalence. With many more healthy people than sick people, most positives are false positives, even when specificity is decent.


B. 50% — Tempting if you “average” sensitivity and specificity

Students often see 90%/90% and assume “coin-flip-ish errors don’t happen,” or they mentally anchor to 50% as a generic probability.

Why it’s wrong:

  • PPV is not the average of sensitivity and specificity.
  • PPV depends strongly on prevalence.

Quick check:

  • We found FP (990) massively outweigh TP (90) → PPV can’t be anywhere near 50%.

C. 90% — Classic confusion: mixing up PPV with sensitivity

This choice is attractive because sensitivity is 90%, and people incorrectly think:

  • “If test is positive, 90% chance disease.”

But sensitivity is:

  • P(T+ | D) (probability the test is positive, given disease)

PPV is:

  • P(D | T+) (probability of disease, given a positive test)

High-yield phrase:

  • Sensitivity and specificity condition on disease status.
  • PPV and NPV condition on test result.

D. 99% — Classic confusion: mixing up PPV with specificity (or NPV)

Specificity is 90%, not 99%, but 99% often appears as a “very certain” distractor when prevalence is low.

What 99% does resemble here:

  • With low prevalence, NPV can get very high (often >99%) if sensitivity is decent.

If the question had asked NPV, you’d compute:

NPV = TN / (TN + FN) = 8910 / (8910 + 10) ≈ 99.9%

So 99% is wrong for PPV, but it’s a clue that you might be mixing up which post-test probability they want.


E. PPV cannot be determined without knowing the sample size — Wrong, but reveals a key concept

Sample size is irrelevant to PPV if you know the prevalence, sensitivity, and specificity. You can pick any convenient denominator (like 10,000) because PPV is a ratio.

What you do need:

  • Prevalence (or pretest probability)
  • Sensitivity and specificity

High-yield caveat:

  • If prevalence is not given, then yes — you can’t compute PPV/NPV from sens/spec alone.
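This sample-size independence is easy to demonstrate: PPV written via Bayes' rule contains no population size at all, and any 2×2 denominator gives the same ratio (a sketch; function names are mine):

```python
# Choice E check: PPV depends on sens, spec, and prevalence -- not on N.
def ppv_bayes(sens, spec, prev):
    return (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))

def ppv_from_table(n, sens, spec, prev):
    tp = sens * prev * n              # true positives in a population of n
    fp = (1 - spec) * (1 - prev) * n  # false positives in a population of n
    return tp / (tp + fp)             # n cancels out of the ratio

for n in (1_000, 10_000, 1_000_000):
    print(n, round(ppv_from_table(n, 0.90, 0.90, 0.01), 4))  # same PPV every time
```

The n in numerator and denominator cancels, which is exactly why picking a "fake population of 10,000" is legitimate.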

Study Design & Probability Tie-In (how USMLE likes to frame this)

USMLE often embeds PPV/NPV in real-world decision-making:

Screening vs diagnostic testing

  • Screening tests are frequently used in low-prevalence populations → PPV can be surprisingly low.
  • A positive screening test often requires a confirmatory test with higher specificity (to reduce false positives).

Bayes principle (conceptual)

You’re updating from pretest probability (prevalence) to post-test probability (PPV/NPV). You don’t need full Bayes math on exam day if you can do the 2×2 table quickly.


Rapid-Fire High-Yield Rules (memorize these)

  • Prevalence increases → PPV increases, NPV decreases
  • Prevalence decreases → PPV decreases, NPV increases
  • Sensitivity rules out (SnNout): highly sensitive test, negative result helps rule out disease
  • Specificity rules in (SpPin): highly specific test, positive result helps rule in disease
  • False positives explode when disease is rare, even with “good” specificity

Your Test-Day Playbook (30 seconds)

  1. Identify whether you need P(D | T+) (PPV) or P(¬D | T−) (NPV).
  2. Choose a fake population size (usually 10,000).
  3. Apply prevalence → split diseased vs not diseased.
  4. Apply sensitivity/specificity → fill TP, FP, TN, FN.
  5. Compute the asked ratio.

That’s it — and it immunizes you against almost every distractor.