Track 02
scikit-learn Validation and Tuning
This track turns evaluation discipline into code: fix the split, choose the metric, compare against a baseline, tune inside the boundary, and reject flattering improvements that do not survive honest validation.
Primary Goal
Trust The Experiment
The point is not to make a score rise. The point is to know whether the score is worth believing.
Best For
Baseline To Selection Discipline
Use this track when the next bottleneck is no longer data handling but model comparison, cross-validation, and leakage control.
Exit Rule
One Honest Validation Story
You are done when you can defend the split, metric, baseline, and tuning path in one short note.
Use This Track When
- the first tabular workflow is already stable
- you are ready to compare models honestly
- you need cross-validation, tuning, calibration, and leakage checks to feel mechanical
What This Track Is Training
This track trains one practical rule:
- tune only inside a boundary you would still trust after the result looks good
That means the learner should be able to keep these explicit:
- the prediction unit
- the split rule
- the primary metric
- the baseline
- the leakage risk
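The boundary rule above can be sketched in code: keep every fitted preprocessing step inside a pipeline, so that tuning only ever sees training folds. This is a minimal sketch on synthetic data, not the track's actual example script.

```python
# Sketch: tune only inside a boundary you would still trust afterward.
# The scaler lives inside the pipeline, so each CV fold fits its own
# scaler on that fold's training rows only. Data here is synthetic.
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),              # refit per fold, never on held-out rows
    ("clf", LogisticRegression(max_iter=1000)),
])

# The grid search only ever sees training folds; any final test set
# (not shown here) stays untouched until selection is finished.
grid = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
grid.fit(X, y)
print(grid.best_params_)
```

The same shape works for any fitted transform (imputation, encoding, feature selection): if it learns from data, it belongs inside the pipeline, inside the boundary.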
First Session
Use this order:
- Honest Splits and Baselines
- Leakage Patterns
- run academy/.venv/bin/python academy/examples/validation-baseline-comparison/baseline_comparison.py
- run academy/.venv/bin/python academy/examples/classical-ml-recipes/cross_validation_demo.py
- write one note on what would invalidate the experiment
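The baseline comparison in the first session has a simple shape, sketched here on synthetic data rather than the track's example dataset:

```python
# Sketch of a baseline-versus-model comparison: a DummyClassifier sets
# the floor, and the learned model must clear it under the same folds.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, random_state=0)

baseline = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5)
model = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

print(f"baseline: {baseline.mean():.3f}")
print(f"model:    {model.mean():.3f}")
# A model that cannot beat the dummy baseline is not worth tuning.
```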
Full Track Loop
For the complete workflow:
- review the validation topics in order
- run the baseline, cross-validation, tuning, and calibration examples
- run
academy/.venv/bin/python academy/labs/sklearn-validation-and-tuning/src/validation_tuning_workflow.py - finish the matching exercises in
academy/exercises/sklearn-validation-and-tuning/ - keep one short experiment note with the split, metric, baseline, and selected model
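The overall shape of that workflow can be sketched as: fix one split up front, confine all selection to the training side, and read the test set exactly once. This is an illustrative sketch, not the lab script itself.

```python
# Sketch of the full loop: split once, tune on the training side only,
# then take a single final read on the held-out test set.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"max_depth": [2, 4, None]},
    cv=5,
)
search.fit(X_tr, y_tr)  # selection happens on training folds only

print(f"selected: {search.best_params_}")
print(f"test:     {search.score(X_te, y_te):.3f}")  # read exactly once
```

The split rule, metric, baseline, and selected model from a run like this are exactly what the experiment note should record.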
What To Inspect
By the end of the track, the learner should have inspected:
- baseline versus learned model
- fold mean and spread
- one tuning table
- one calibration or threshold view
- one leakage suspicion that was tested directly
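Two of those artifacts, the fold mean with its spread and a small tuning table, can be produced directly from scikit-learn's outputs. Data here is synthetic and the grid is illustrative.

```python
# Sketch: report fold mean and spread, then a tuning table from cv_results_.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = make_classification(n_samples=300, random_state=0)

# Fold mean and spread: a mean without its std hides instability.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(f"folds: mean={scores.mean():.3f} std={scores.std():.3f}")

# One tuning table: each candidate with its mean and spread, not just the winner.
grid = GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}, cv=5)
grid.fit(X, y)
for c, m, s in zip(grid.cv_results_["param_C"],
                   grid.cv_results_["mean_test_score"],
                   grid.cv_results_["std_test_score"]):
    print(f"C={c}  mean={m:.3f}  std={s:.3f}")
```

A tuning "gain" smaller than the fold spread is noise, which is why the table shows both columns.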
Common Failure Modes
- peeking at the test set during selection
- changing the split and the model at the same time
- comparing metrics that do not match the task cost
- claiming a tuning gain without showing baseline and fold spread
- hiding preprocessing inside the wrong boundary
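The last failure mode is easy to demonstrate: fit a label-using preprocessing step on all rows before cross-validation and pure noise starts to look predictive. This sketch uses feature selection as the leaky step; the dataset is random by construction.

```python
# Sketch: preprocessing hidden outside the CV boundary leaks labels.
# X is pure noise, so honest accuracy should sit near chance (0.5).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2000))
y = rng.integers(0, 2, size=100)

# Wrong boundary: selection sees every label before the folds are cut.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky = cross_val_score(LogisticRegression(max_iter=1000), X_leaky, y, cv=5)

# Right boundary: selection is refit inside each training fold.
pipe = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000))
honest = cross_val_score(pipe, X, y, cv=5)

print(f"leaky:  {leaky.mean():.3f}")   # flattering and false
print(f"honest: {honest.mean():.3f}")  # near chance, as it should be
```

The gap between the two numbers is the size of the lie; testing a leakage suspicion this directly is what the inspection list above asks for.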
Exit Standard
Before leaving this track, the learner should be able to:
- defend the split rule
- explain why the chosen metric matches the task
- compare a baseline against the selected model honestly
- name one leakage pattern that was avoided
- say what result would still count as untrustworthy
That is enough to move into SVM and Advanced Clustering or the first deep-learning track.
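One concrete way to back the last exit point, naming what would still count as untrustworthy, is a permutation test: a score that survives label shuffling was never measuring the signal. This is an optional sketch on synthetic data, not part of the track's scripts.

```python
# Sketch: compare the real score against scores on shuffled labels.
# If the real score sits inside the shuffled distribution, distrust it.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import permutation_test_score

X, y = make_classification(n_samples=200, random_state=0)

score, perm_scores, pvalue = permutation_test_score(
    LogisticRegression(max_iter=1000), X, y, cv=5,
    n_permutations=30, random_state=0,
)
print(f"score={score:.3f} shuffled={perm_scores.mean():.3f} p={pvalue:.3f}")
```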