The Adoption Curve
False positive rate → Developer response
0–15%: Developers read every finding carefully
15–30%: Developers focus on Critical/High, skim Medium
30–60%: Developers scan briefly, mostly dismiss
60%+: Developers stop opening review comments entirely
At a 70% false positive rate, developers dismiss nearly everything, so the 30% of findings that are real get lost in the noise.
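The bands above can be written as a small lookup, for example to annotate a weekly report. This is a minimal sketch: the function name `expected_response` is hypothetical, and because the listed ranges share endpoints (15%, 30%), it assumes each upper bound is exclusive.

```python
def expected_response(fp_rate_pct: float) -> str:
    """Map a false positive rate (in percent) to the adoption-curve band."""
    if not 0 <= fp_rate_pct <= 100:
        raise ValueError("fp_rate_pct must be between 0 and 100")
    # Assumption: upper bounds are exclusive, so exactly 15% falls in the 15-30% band.
    if fp_rate_pct < 15:
        return "read every finding carefully"
    if fp_rate_pct < 30:
        return "focus on Critical/High, skim Medium"
    if fp_rate_pct < 60:
        return "scan briefly, mostly dismiss"
    return "stop opening review comments entirely"
```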
Measuring False Positives
# Track every developer action on findings.
# `db`, `current_user`, and `utcnow` are assumed to be provided by the application.
from typing import Optional

def record_finding_action(finding_id: str, action: str, reason: Optional[str] = None):
    """
    action: 'fixed' | 'dismissed' | 'deferred'
    reason: free text if dismissed ('false positive', 'known issue', 'intentional', etc.)
    """
    db.insert("finding_actions", {
        "finding_id": finding_id,
        "action": action,
        "reason": reason,
        "reviewer_id": current_user(),
        "timestamp": utcnow(),
    })
# Weekly report: dismissal rate per rule category (PostgreSQL syntax).
# rule_category is read from findings, since finding_actions does not store it.
def fp_report():
    return db.query("""
        SELECT f.rule_category,
               COUNT(*) FILTER (WHERE fa.action = 'fixed') AS fixed,
               COUNT(*) FILTER (WHERE fa.action = 'dismissed') AS dismissed,
               COUNT(*) FILTER (WHERE fa.action = 'deferred') AS deferred,
               ROUND(
                   COUNT(*) FILTER (WHERE fa.action = 'dismissed') * 100.0 / COUNT(*), 1
               ) AS fp_rate_pct
        FROM finding_actions fa
        JOIN findings f ON f.id = fa.finding_id
        WHERE fa.timestamp > NOW() - INTERVAL '7 days'
        GROUP BY f.rule_category
        ORDER BY fp_rate_pct DESC
    """)
Diagnosing and Fixing by Category
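Before changing a category's criteria, it can help to see why developers dismissed its findings, since the free-text `reason` may say 'known issue' or 'intentional' rather than 'false positive'. The sketch below groups dismissal reasons per category from in-memory rows. The function name `dismissal_reasons`, the tuple layout, and the sample rows are all hypothetical, not part of the schema above.

```python
from collections import Counter, defaultdict

def dismissal_reasons(rows):
    """Count dismissal reasons per rule category.

    rows: iterable of (rule_category, action, reason) tuples (assumed layout).
    """
    breakdown = defaultdict(Counter)
    for rule_category, action, reason in rows:
        if action != "dismissed":
            continue
        # Normalize free text so 'False positive ' and 'false positive' group together.
        key = (reason or "unspecified").strip().lower()
        breakdown[rule_category][key] += 1
    return {category: dict(counts) for category, counts in breakdown.items()}

# Illustrative rows only, not real data.
sample = [
    ("missing_error_handling", "dismissed", "false positive"),
    ("missing_error_handling", "dismissed", "False positive "),
    ("missing_error_handling", "dismissed", "known issue"),
    ("missing_error_handling", "fixed", None),
    ("sql_injection", "dismissed", None),
]
reasons = dismissal_reasons(sample)
```

A category dominated by 'false positive' points at the criteria, while one dominated by 'known issue' suggests the findings are real but already tracked elsewhere.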
# Example: "missing_error_handling" has a 68% dismissal rate
# Root cause analysis:
# - FastAPI middleware catches all uncaught exceptions
# - Claude doesn't know this, so it flags every route without an explicit try/except
# Fix: add codebase context to the criteria
additional_context = """
Important context about this codebase:
- FastAPI middleware at src/middleware/error_handler.py catches ALL uncaught exceptions
- You do NOT need to flag missing try/except in route handlers that delegate to services
- Only flag missing error handling where: (1) the function makes a network call directly
  AND (2) no error boundary is visible in the same file or its imports
"""
Key Takeaways
- Signal-to-noise ratio matters more than total findings
- Track dismissals as a proxy for false positives
- >30% dismissal in any category → that category’s criteria need work
- Root cause: vague criteria, missing context, or wrong scope
- Iterate systematically: find the pattern, fix the criteria, measure again
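The find, fix, measure loop can be sketched as three small helpers. This is one possible shape under stated assumptions, not a definitive implementation: the function names are hypothetical, rates are per-category dismissal percentages such as the `fp_rate_pct` column from the weekly report, and the 30% default comes from the threshold in the takeaways.

```python
from typing import Dict, List

def categories_needing_work(rates: Dict[str, float], threshold_pct: float = 30.0) -> List[str]:
    """Find the pattern: categories whose dismissal rate exceeds the threshold, worst first."""
    flagged = [category for category, rate in rates.items() if rate > threshold_pct]
    return sorted(flagged, key=lambda category: rates[category], reverse=True)

def build_review_criteria(base_criteria: str, additional_context: str = "") -> str:
    """Fix the criteria: append codebase context, skipping it when empty."""
    context = additional_context.strip()
    if not context:
        return base_criteria.strip()
    return f"{base_criteria.strip()}\n\n{context}"

def rate_change(before: Dict[str, float], after: Dict[str, float]) -> Dict[str, float]:
    """Measure again: percentage-point change per category (negative means fewer dismissals)."""
    return {c: round(after[c] - before[c], 1) for c in before if c in after}
```

Comparing only categories present in both reports avoids reading a newly added or retired rule as an improvement or a regression.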