Stratified Sampling — Validating System Performance Without Reviewing Everything

💡 THE ANALOGY

A food safety inspector doesn't test every item in a 10,000-item shipment. They take a statistically valid sample from each category (fresh, frozen, canned) and test that. If a category's sample fails, the whole category is flagged. Stratified sampling gives confidence in every category without reviewing everything.

⚠️ EXAM TRAP — The Wrong Answer People Choose

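Choosing simple random sampling instead of stratified sampling. A simple random sample mirrors the population, so rare categories (like low-confidence extractions or unusual document types) contribute only a handful of items, or none at all. The overall result then looks good while saying almost nothing about those categories, which gives false confidence. Stratified sampling ensures every category is represented.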
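
To make the trap concrete, here is a minimal sketch that computes how a rare category fares under simple random sampling without replacement. The numbers (10,000 documents, 200 of them in the rare category, a 300-item review sample) and the function names are illustrative assumptions, not figures from this module.

```python
from math import comb

def expected_count(population, rare, sample_size):
    """Expected number of rare items in a simple random sample."""
    return sample_size * rare / population

def prob_fewer_than(population, rare, sample_size, k):
    """P(sample drawn without replacement holds fewer than k rare items).

    Sums the hypergeometric probabilities for 0 .. k-1 rare items.
    """
    total = comb(population, sample_size)
    hits = sum(
        comb(rare, i) * comb(population - rare, sample_size - i)
        for i in range(min(k, sample_size + 1))
    )
    return hits / total

# Hypothetical numbers, chosen only for illustration.
POPULATION, RARE, SAMPLE = 10_000, 200, 300

print("expected rare items:", expected_count(POPULATION, RARE, SAMPLE))
print("P(no rare items):   ", prob_fewer_than(POPULATION, RARE, SAMPLE, 1))
print("P(fewer than 30):   ", prob_fewer_than(POPULATION, RARE, SAMPLE, 30))
```

With these assumed numbers the sample is expected to contain only 6 rare items, far too few to judge that category on its own.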

KEY POINTS
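1 Stratified sampling: split the population into strata (confidence tier, document type, error category) and sample from each stratum, proportionally to its size but never below a per-stratum minimum.
2 Sample size per stratum: as a rule of thumb, at least 30-50 examples, so that each stratum's error-rate estimate is reasonably stable.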
3 Strata for extraction systems: confidence tier, document type, amount range, vendor category.
4 Review frequency: high-error strata need more frequent sampling than low-error strata.
5 Sampling drives calibration: stratified sample results feed back to confidence routing thresholds.
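
The key points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the record fields (`tier`, `correct`), the 5% sampling fraction, and the function names are hypothetical, and the floor of 30 is the lower end of the rule of thumb in point 2.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, fraction=0.05,
                      min_per_stratum=30, seed=0):
    """Draw a sample from every stratum.

    Each stratum gets a proportional share (fraction of its size), raised
    to min_per_stratum when the share is smaller, and capped at the number
    of records the stratum actually has.
    """
    rng = random.Random(seed)  # fixed seed so a review batch is reproducible
    by_stratum = defaultdict(list)
    for rec in records:
        by_stratum[stratum_of(rec)].append(rec)

    sample = {}
    for stratum, items in by_stratum.items():
        target = max(min_per_stratum, round(fraction * len(items)))
        sample[stratum] = rng.sample(items, min(target, len(items)))
    return sample

def error_rates(sample, is_error):
    """Observed error rate per stratum among the reviewed items."""
    return {
        stratum: sum(1 for rec in items if is_error(rec)) / len(items)
        for stratum, items in sample.items()
    }

# Hypothetical population: many high-confidence records, few low-confidence.
records = (
    [{"tier": "high", "correct": True} for _ in range(1000)]
    + [{"tier": "low", "correct": i % 2 == 0} for i in range(40)]
)

sample = stratified_sample(records, stratum_of=lambda r: r["tier"])
rates = error_rates(sample, is_error=lambda r: not r["correct"])
for tier, items in sample.items():
    print(tier, "reviewed:", len(items), "error rate:", rates[tier])
```

The per-stratum error rates are what would feed back into calibration (point 5): a stratum whose observed rate exceeds the accepted level is a candidate for a stricter routing threshold and more frequent sampling (point 4).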