Clinic 08
Threshold Under Asymmetric Cost
The default 0.5 threshold looks fine on accuracy. But missing a positive costs 100 times more than a false alarm. The right threshold is not the obvious one.
Situation
Accuracy Lies When Costs Differ
The default threshold maximizes accuracy but hides the expensive misses. The cost-aware threshold looks worse on paper but saves money.
Your Job
Choose The Operating Point
Pick the threshold, compute the cost, and explain why the accuracy-optimal point is the wrong choice.
Bad Habit To Avoid
Best Accuracy = Best Threshold
If the decision ignores the cost ratio, the threshold is optimizing the wrong thing.
Situation
You are building a fraud detection system. The business rule is clear:
- missing a fraud case (false negative) costs $10,000 in chargebacks
- investigating a legitimate transaction (false positive) costs $100 in review time
- the cost ratio is 100:1
The model is calibrated. You need to choose the operating threshold.
Artifact Packet
Read this packet before you decide:
| threshold | precision | recall | FPR | accuracy | false negatives (per 1000) | false positives (per 1000) | total cost (per 1000) |
|---|---|---|---|---|---|---|---|
| 0.50 | 0.82 | 0.61 | 0.013 | 0.971 | 19.5 | 12.7 | $196,270 |
| 0.30 | 0.64 | 0.79 | 0.044 | 0.948 | 10.5 | 43.2 | $109,320 |
| 0.15 | 0.41 | 0.91 | 0.129 | 0.867 | 4.5 | 126.3 | $57,630 |
| 0.10 | 0.29 | 0.95 | 0.231 | 0.769 | 2.5 | 228.1 | $47,810 |
| 0.05 | 0.16 | 0.98 | 0.508 | 0.498 | 1.0 | 500.4 | $60,040 |
Base rate: 5% fraud (50 cases per 1000 transactions).
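The total cost column follows directly from the two error counts and the business rule. A minimal sketch that recomputes it, assuming the per-1000 counts in the packet are expected values; the names `ROWS`, `COST_FN`, `COST_FP`, and `total_cost` are illustrative and not part of the clinic:

```python
# Cost assumptions taken from the Situation section.
COST_FN = 10_000  # dollars per missed fraud case (false negative)
COST_FP = 100     # dollars per reviewed legitimate transaction (false positive)

# (threshold, false negatives per 1000, false positives per 1000) from the packet.
ROWS = [
    (0.50, 19.5, 12.7),
    (0.30, 10.5, 43.2),
    (0.15, 4.5, 126.3),
    (0.10, 2.5, 228.1),
    (0.05, 1.0, 500.4),
]


def total_cost(fn, fp, cost_fn=COST_FN, cost_fp=COST_FP):
    """Expected cost per 1000 transactions for the given error counts."""
    return fn * cost_fn + fp * cost_fp


# Round to whole dollars to avoid floating-point noise in the printout.
costs = {t: round(total_cost(fn, fp)) for t, fn, fp in ROWS}

for threshold, cost in costs.items():
    print(f"threshold {threshold:.2f}: ${cost:,}")
```

Each printed value should match the last column of the table, which makes it easy to see how much of every total comes from misses and how much from reviews.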
Decision Prompt
Write the note before you open the reveal.
Your note should answer:
- Which threshold minimizes total cost?
- Why is the accuracy-maximizing threshold (0.50) the worst choice here?
- What happens at threshold 0.05 — why does the cost go back up?
- What business change would shift your answer toward a higher threshold?
Keep the note short. Four to six sentences is enough.
Strong Reasoning Looks Like
- it picks 0.10 as the cost-minimizing threshold
- it explains that at 0.50, 19.5 missed frauds at $10,000 each dominate the total cost
- it notices the cost curve is U-shaped: going below 0.10 increases false positives enough to raise total cost again
- it names a scenario where the cost ratio changes (e.g., cheaper chargebacks, more expensive review) and connects it to threshold movement
- it separates accuracy from cost-effectiveness clearly
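To connect a changed cost ratio to threshold movement, it can help to re-score the same packet under different prices. The sketch below is one way to do that; the scenario dollar amounts other than the baseline are illustrative assumptions, `best_threshold` is a hypothetical helper, and the search is limited to the five thresholds in the packet:

```python
# (threshold, false negatives per 1000, false positives per 1000) from the packet.
ROWS = [
    (0.50, 19.5, 12.7),
    (0.30, 10.5, 43.2),
    (0.15, 4.5, 126.3),
    (0.10, 2.5, 228.1),
    (0.05, 1.0, 500.4),
]


def best_threshold(cost_fn, cost_fp, rows=ROWS):
    """Return the packet threshold with the lowest total cost per 1000."""
    return min(rows, key=lambda r: r[1] * cost_fn + r[2] * cost_fp)[0]


# Scenario prices are assumptions for illustration, except the baseline.
scenarios = {
    "baseline: $10,000 miss, $100 review": (10_000, 100),
    "cheaper chargebacks: $1,000 miss": (1_000, 100),
    "more expensive review: $1,000 review": (10_000, 1_000),
    "cheaper review: $10 review": (10_000, 10),
}

for name, (cost_fn, cost_fp) in scenarios.items():
    print(f"{name} -> threshold {best_threshold(cost_fn, cost_fp):.2f}")
```

Under these assumed prices, a smaller cost ratio pushes the choice toward a higher threshold and a larger ratio pushes it lower, which is the direction a strong note should name.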
Common Wrong Moves
- choosing 0.50 because it has the best accuracy
- choosing 0.05 because "catch everything" sounds safe without checking the cost
- choosing 0.15 as a compromise without computing the actual cost difference
- ignoring the false positive cost entirely
- not noticing the U-shaped cost curve
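Several of these mistakes come from skipping the step-by-step arithmetic. A rough sketch that compares each move to the next lower threshold, using the packet's counts and the stated costs; `step_deltas` is a hypothetical helper name:

```python
COST_FN = 10_000  # dollars per missed fraud case
COST_FP = 100     # dollars per reviewed legitimate transaction

# (threshold, false negatives per 1000, false positives per 1000) from the packet.
ROWS = [
    (0.50, 19.5, 12.7),
    (0.30, 10.5, 43.2),
    (0.15, 4.5, 126.3),
    (0.10, 2.5, 228.1),
    (0.05, 1.0, 500.4),
]


def step_deltas(rows, cost_fn=COST_FN, cost_fp=COST_FP):
    """For each move to the next lower threshold, return
    (from, to, dollars saved on misses, dollars added in reviews, net change)."""
    out = []
    for (t0, fn0, fp0), (t1, fn1, fp1) in zip(rows, rows[1:]):
        saved = round((fn0 - fn1) * cost_fn)  # fewer false negatives
        added = round((fp1 - fp0) * cost_fp)  # more false positives
        out.append((t0, t1, saved, added, added - saved))
    return out


deltas = step_deltas(ROWS)
for t0, t1, saved, added, net in deltas:
    print(f"{t0:.2f} -> {t1:.2f}: saves ${saved:,}, adds ${added:,}, net {net:+,}")
```

The net change stays negative down to 0.10 and turns positive on the step to 0.05, which is the U-shape; the 0.15 to 0.10 step alone is worth $9,820 per 1000 transactions.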
Reference Reveal
Open only after you write the note
The reference choice is:
- `selected_threshold = 0.10`
- `reasoning = minimum total cost at $47,810 per 1000 transactions`
Why:
- the 0.50 threshold has the best accuracy (0.971) but the worst total cost ($196,270) because each missed fraud costs $10,000
- the 0.10 threshold catches 95% of fraud while keeping false positive costs manageable
- at 0.05, the false positive volume (500+ per 1000) pushes the cost back up despite catching 98% of fraud
- the cost-optimal point is where the marginal cost of one more false positive equals the marginal savings from one fewer false negative
If the cost ratio drops (e.g., chargebacks cost $1,000 instead of $10,000), the optimal threshold moves higher. If review becomes cheaper (e.g., automated), it moves lower.
The practical lesson: when error costs are asymmetric, accuracy is the wrong optimization target. Threshold selection must be driven by the cost structure.
What To Do Next
After this clinic:
- open Calibration and Thresholds
- run the matching threshold demo example
- use Imbalanced Triage and Review Budgets for the full cost-aware workflow