Every metric is an estimate; confidence intervals

Advanced Python for Data Science

Created by Best · 24.06.2026 at 14:03 UTC

Every metric is an estimate from a finite sample, and real decisions hinge on whether a difference is signal or noise. The first habit is putting an error bar on a measured rate. For a proportion p from n items, the rough 95% confidence interval is

p  +/-  1.96 * sqrt( p * (1 - p) / n )

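The headline to internalise is that the margin of error shrinks only as 1/sqrt(n), so quadrupling the sample merely halves it. In the worst case (p = 0.5) the margin is about +/-10% at n = 100, +/-5% at n = 400, +/-3% at n = 1000. A metric that "moved from 92% to 88%" on 50 items each carries a margin of roughly +/-8 percentage points on each measurement, so a 4-point drop is well within noise and is not, on its own, evidence of a regression.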
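The rule of thumb above can be checked numerically. The following is a minimal sketch, assuming the normal approximation with z = 1.96; the helper name margin_of_error is illustrative and not part of any library.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    Hypothetical helper for illustration: z * sqrt(p * (1 - p) / n).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Worst case p = 0.5: the margin shrinks only as 1/sqrt(n).
for n in (100, 400, 1000):
    print(f"n={n:5d}  margin=+/-{margin_of_error(0.5, n):.3f}")

# The 92% vs 88% comparison on 50 items each: both margins are
# larger than the 4-point gap, so the two intervals overlap heavily.
for p in (0.92, 0.88):
    m = margin_of_error(p, 50)
    print(f"p={p:.2f}  interval=({p - m:.3f}, {p + m:.3f})")
```

Note that two overlapping intervals are only an informal check; a proper comparison of two rates would test the difference directly.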

The normal-approximation (Wald) interval is a direct calculation: estimate the proportion, its standard error, then go z standard errors either side.

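k, n, z = 90, 100, 1.96                    # example: 90 successes in 100 trials, z for 95%
phat = k / n                               # point estimate of the proportion
se = (phat * (1 - phat) / n) ** 0.5        # standard error of phat
low, high = phat - z * se, phat + z * se   # roughly (0.841, 0.959)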

This is fine for moderate p and n; for very small samples or proportions near 0 or 1, use a Wilson or Clopper-Pearson interval instead, which behave better at the extremes.
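To make the contrast concrete, here is a minimal sketch of the Wilson score interval using the standard textbook formula. The function name wilson_interval is an assumption for illustration; in practice a statistics library would normally be used instead.

```python
import math


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for k successes in n trials (illustrative sketch)."""
    if n <= 0:
        raise ValueError("n must be positive")
    phat = k / n
    denom = 1 + z**2 / n
    # The centre is pulled from phat towards 0.5, more strongly for small n.
    center = (phat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# With 0 successes in 10 trials the Wald interval collapses to (0, 0),
# while the Wilson interval still admits that the true rate may be non-zero.
low, high = wilson_interval(0, 10)
print(f"Wilson: ({low:.3f}, {high:.3f})")
```

Unlike the Wald interval, the Wilson bounds always stay within [0, 1] up to floating-point rounding, which is one reason it behaves better near the extremes.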

Related cards

Builds on Thresholds, accuracy's lie, and computing metrics · Python for Data Science

Next A/B testing, sample ratio mismatch, and multiple comparisons · Python for Data Science

Tasks

Question 1

You measured precision = 90% on a sample of n = 100 labeled items. Using margin ≈ 1/sqrt(n), the rough 95% margin of error is about:

±1%

±10%

±30%

±0.1%

Question 2

Read three numbers from stdin (one per line): successes k, trials n, and z. Print the Wald confidence interval bounds for the proportion phat = k/n as two space-separated values, low then high, each rounded to 3 decimals, where low = phat - z * sqrt(phat * (1 - phat) / n) and high = phat + z * sqrt(phat * (1 - phat) / n).

Example input:

90
100
1.96

Expected output:

0.841 0.959

