Power and sample size questions are sneaky because they're not really about memorizing one definition; they're about knowing which knob to turn (effect size, α, power, variability, allocation ratio) and predicting what happens to Type I/II error and confidence intervals. On test day, every answer choice is basically a different "knob," and your job is to recognize what it does.
Tag: Biostatistics > Study Design & Probability
The Vignette (Q-bank style)
A randomized controlled trial tests whether a new antiplatelet drug reduces 30-day mortality after acute MI compared with standard therapy. Investigators expect mortality to decrease from 10% to 8% (absolute risk reduction 2%). They plan a two-sided test with α = 0.05 and want 80% power. However, after enrolling 600 patients total, an interim analysis shows no statistically significant difference. The investigators suspect the trial was underpowered.
Which of the following changes would most increase the power of this study (while keeping α at 0.05)?
A. Decrease the significance level to α = 0.01
B. Increase the sample size
C. Use a two-sided test instead of a one-sided test
D. Increase the variability (standard deviation) of the outcome
E. Decrease the effect size the study is designed to detect
Step 1/2 Mindset: What “Power” Really Means
Power is the probability of detecting a true effect when it exists:
- Power = 1 − β
- β = probability of Type II error (false negative)
Power increases when you:
- Increase sample size (n)
- Increase effect size (bigger difference between groups)
- Increase α (more tolerant of Type I error)
- Decrease variability (less noise)
- Use a one-sided test (if justified and direction is pre-specified)
A classic memory hook: Power goes up with “more signal” or “less noise.”
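To make that concrete, here is a minimal simulation sketch (ours, not from any question bank): it replays the vignette's trial many times under the assumed true effect and counts how often a standard two-proportion z-test rejects. The replication count and seed are arbitrary illustrative choices.

```python
# Minimal Monte Carlo sketch of power: replay the vignette's trial many
# times under the assumed true effect (10% vs 8% mortality) and count how
# often a two-sided two-proportion z-test rejects at alpha = 0.05.
# n_sims and seed are arbitrary illustrative choices.
import numpy as np
from scipy.stats import norm

def simulated_power(p1, p2, n_per_group, alpha=0.05, n_sims=10_000, seed=0):
    rng = np.random.default_rng(seed)
    z_crit = norm.ppf(1 - alpha / 2)          # two-sided critical value
    rejections = 0
    for _ in range(n_sims):
        x1 = rng.binomial(n_per_group, p1)    # deaths on standard therapy
        x2 = rng.binomial(n_per_group, p2)    # deaths on the new drug
        p_pool = (x1 + x2) / (2 * n_per_group)
        se = np.sqrt(p_pool * (1 - p_pool) * 2 / n_per_group)
        if se > 0 and abs(x1 - x2) / n_per_group / se > z_crit:
            rejections += 1
    return rejections / n_sims

# 600 patients total = 300 per group, as in the vignette
print(simulated_power(0.10, 0.08, 300))   # roughly 0.13: badly underpowered
```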
Correct Answer: B. Increase the sample size
If you want more power without changing α, the most straightforward fix is increasing n.
Why increasing n increases power
Bigger sample sizes:
- Reduce standard error (tighter sampling distribution)
- Narrow confidence intervals
- Make it easier to detect a true difference (even if modest)
For many common tests, the standard error scales as SE ∝ σ/√n, so halving the standard error requires quadrupling the sample size. To meaningfully increase power you often need a substantial increase in sample size (doubling n does not double power, but it helps).
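As a hedged back-of-the-envelope check on the vignette's numbers, here is the standard normal-approximation sample-size formula for comparing two proportions (the function name is a made-up helper):

```python
# Hedged closed-form sketch: required n per group for a two-sided
# two-proportion z-test, from the standard normal-approximation formula
#   n = (z_{1-alpha/2} + z_{power})^2 * (p1*q1 + p2*q2) / (p1 - p2)^2
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    z_a = norm.ppf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_b = norm.ppf(power)           # 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_a + z_b) ** 2 * variance / (p1 - p2) ** 2

print(round(n_per_group(0.10, 0.08)))   # ~3210 per group (~6420 total),
                                        # far beyond the 600 actually enrolled
```

Under these standard assumptions, the trial needed roughly ten times the patients it enrolled, which is why the interim analysis came up empty.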
Now Eliminate the Distractors (Why Every Answer Choice Matters)
A. Decrease the significance level to α = 0.01 ❌
Lowering α makes it harder to call a result significant.
- α ↓ → Type I error decreases
- But the rejection threshold becomes more stringent → β increases → power decreases
High-yield: α and β trade off (holding n and effect size constant). If you demand stronger evidence (smaller α), you'll miss more true effects.
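A quick numerical sketch of that tradeoff, using the usual normal approximation to power (ignoring the negligible opposite tail) with the vignette's 300 patients per group:

```python
# Normal-approximation sketch of the alpha knob: approximate two-sided power
# as Phi(|delta|/SE - z_{1-alpha/2}). n = 300 per group = the vignette's 600 total.
from math import sqrt
from scipy.stats import norm

def approx_power(p1, p2, n, alpha):
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)
    return norm.cdf(abs(p1 - p2) / se - norm.ppf(1 - alpha / 2))

print(approx_power(0.10, 0.08, 300, alpha=0.05))   # ~0.13
print(approx_power(0.10, 0.08, 300, alpha=0.01))   # ~0.04: stricter alpha, less power
```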
C. Use a two-sided test instead of a one-sided test ❌
This is backwards: two-sided tests reduce power compared to one-sided tests (all else equal).
- Two-sided splits α across both tails (e.g., 0.025 each)
- Requires a more extreme test statistic to reject
High-yield nuance: A one-sided test can increase power only if:
- The direction is pre-specified before data collection
- Effects in the opposite direction are clinically irrelevant or implausible
Otherwise, one-sided testing is considered methodologically shady.
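The same approximation as above makes the one- vs two-sided gap explicit:

```python
# Same normal approximation, comparing tails at the same alpha
# (n = 300 per group, alpha = 0.05).
from math import sqrt
from scipy.stats import norm

p1, p2, n, alpha = 0.10, 0.08, 300, 0.05
z_effect = abs(p1 - p2) / sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)

print(norm.cdf(z_effect - norm.ppf(1 - alpha / 2)))   # two-sided: ~0.13
print(norm.cdf(z_effect - norm.ppf(1 - alpha)))       # one-sided: ~0.21
```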
D. Increase the variability (standard deviation) of the outcome ❌
More variability = more noise = harder to see signal.
- Variability ↑ → standard error ↑
- Confidence intervals widen
- Test statistic shrinks (on average) → power decreases
USMLE tie-in: Anything that increases “spread” (heterogeneous population, unreliable measurement, poor adherence) generally reduces power.
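For a continuous outcome, the same knob is easy to demonstrate with statsmodels' t-test power calculator (assuming statsmodels is installed); the mean difference and SDs below are invented purely for illustration:

```python
# Variability knob for a continuous outcome via statsmodels' t-test power.
# Cohen's d = (mean difference) / SD, so a larger SD shrinks d and power.
# The mean difference (5 units) and SDs are made-up illustrative numbers.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
diff = 5.0
for sd in (10.0, 15.0, 20.0):
    d = diff / sd
    pw = analysis.power(effect_size=d, nobs1=100, alpha=0.05)
    print(f"SD={sd:4.0f}  d={d:.2f}  power={pw:.2f}")   # power falls as SD rises
```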
E. Decrease the effect size the study is designed to detect ❌
Smaller effect sizes are harder to detect.
- Effect size ↓ → groups become more similar
- You need a larger sample to detect the smaller difference
- If n stays fixed, power decreases
High-yield framing:
- If the true effect is small, you can still get high power, but only with a big n, as the quick calculation below shows.
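Here is that calculation, reusing the closed-form two-proportion sketch from earlier:

```python
# How required n balloons as the detectable difference shrinks, using the
# same closed-form two-proportion formula (alpha = 0.05, 80% power,
# baseline mortality fixed at 10%).
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2

for p2 in (0.05, 0.07, 0.08, 0.09):
    print(f"10% -> {p2:.0%}: ~{n_per_group(0.10, p2):,.0f} per group")
# roughly 432, 1,353, 3,210, and 13,500 per group, respectively
```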
Rapid-Fire High-Yield Table: What Happens to Power?
| Change | Type I error (α) | Type II error (β) | Power (1 − β) |
|---|---|---|---|
| Increase n | — | ↓ | ↑ |
| Increase α | ↑ | ↓ | ↑ |
| Decrease α | ↓ | ↑ | ↓ |
| Increase effect size | — | ↓ | ↑ |
| Decrease effect size | — | ↑ | ↓ |
| Increase variability (SD) | — | ↑ | ↓ |
| Decrease variability (SD) | — | ↓ | ↑ |
| One-sided vs two-sided (same α) | — | ↓ | ↑ (one-sided) |
(“—” = not directly changed by that manipulation)
How This Shows Up on USMLE: The Classic Traps
Trap 1: Confusing power with p-value
- Power is planned before the study (design stage).
- p-value is calculated after data collection (analysis stage).
Low power often leads to:
- “Negative study” even when a true effect exists
- Wider CIs that include clinically important effects
Trap 2: Thinking a nonsignificant result means “no effect”
A nonsignificant result may reflect:
- True lack of effect or
- Underpowered study (small n, small effect size, high variability)
High-yield language: “The study may have failed to detect a difference” ≠ “There is no difference.”
Trap 3: Mixing up absolute vs relative effect size
Sample size needs balloon when:
- Baseline event rate is low
- Absolute risk reduction is small (e.g., 2%)
In the vignette, 10% → 8% is a modest absolute change, often requiring a large n.
A Quick “Knob-Turning” Strategy for Answer Choices
When you see power/sample size answers, translate each option into one of these knobs:
- n (sample size)
- α (significance threshold)
- Effect size (difference between groups)
- Variability (SD / measurement noise)
- One- vs two-sided hypothesis testing
Then ask: does this create more signal, less noise, or easier rejection of H₀?
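If you want to test these intuitions yourself, a short library-based sketch (again assuming statsmodels is available) turns each answer choice into a one-line change:

```python
# Knob-by-knob sketch: treat each answer choice as a one-line change to the
# vignette's trial (10% vs 8% mortality, 300 per group). Cohen's h serves as
# the standardized effect size for two proportions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

solver = NormalIndPower()
h = proportion_effectsize(0.10, 0.08)   # Cohen's h, ~0.07

base = dict(effect_size=h, nobs1=300, alpha=0.05, alternative='two-sided')
print("baseline        :", solver.power(**base))                      # ~0.13
print("B: n -> 3200    :", solver.power(**{**base, 'nobs1': 3200}))   # ~0.80
print("A: alpha -> 0.01:", solver.power(**{**base, 'alpha': 0.01}))   # ~0.04
print("one-sided       :", solver.power(**{**base, 'alternative': 'larger'}))  # ~0.21
```

Only choice B moves power in the right direction by enough to matter, which is the whole point of the question.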
Takeaway Cheat Sheet (What You Should Remember)
- Power = 1 − β
- To increase power (with α fixed): increasing n is the cleanest, most defensible move
- Lower α → lower power
- Two-sided tests → lower power than one-sided (given the same α)
- Higher variability → lower power
- Smaller effect size → lower power (unless you increase n)