Track 02

scikit-learn Validation and Tuning

This track turns evaluation discipline into code: fix the split, choose the metric, compare against a baseline, tune inside the boundary, and reject flattering improvements that do not survive honest validation.

Primary Goal

Trust The Experiment

The point is not to make a score rise. The point is to know whether the score is worth believing.

Best For

Baseline To Selection Discipline

Use this track when the next bottleneck is no longer data handling but model comparison, cross-validation, and leakage control.

Exit Rule

One Honest Validation Story

You are done when you can defend the split, metric, baseline, and tuning path in one short note.

Use This Track When

  • the first tabular workflow is already stable
  • you are ready to compare models honestly
  • you need cross-validation, tuning, calibration, and leakage checks to feel mechanical

What This Track Is Training

This track trains one practical rule:

  • tune only inside a boundary you would still trust after the result looks good

That means the learner should be able to keep these explicit:

  • the prediction unit
  • the split rule
  • the primary metric
  • the baseline
  • the leakage risk
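The boundary rule above can be sketched in scikit-learn, using synthetic data and a hypothetical grid for illustration: preprocessing lives inside a Pipeline, so every cross-validation fold refits the scaler on its own training portion, and the test set is read exactly once, after selection is finished.

```python
# Sketch of "tuning inside the boundary" (synthetic data, illustrative grid).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=400, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Scaling sits inside the pipeline, so it is refit per CV fold.
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression(max_iter=1000))])
search = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1, 10]},
                      scoring="roc_auc", cv=5)
search.fit(X_train, y_train)        # tuning touches only the training split
print(search.best_params_)
print(search.score(X_test, y_test))  # the test set is read exactly once
```

The prediction unit, split rule, metric, and leakage boundary are all visible in the code itself, which is what makes the result defensible later.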

First Session

Use this order:

  1. Honest Splits and Baselines
  2. Leakage Patterns
  3. run academy/.venv/bin/python academy/examples/validation-baseline-comparison/baseline_comparison.py
  4. run academy/.venv/bin/python academy/examples/classical-ml-recipes/cross_validation_demo.py
  5. write one note on what would invalidate the experiment
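A minimal stand-in for the baseline-comparison step, assuming synthetic imbalanced data rather than the example script's actual dataset: a DummyClassifier sets the floor that any learned model must clear under the same splits and metric.

```python
# Baseline vs. learned model under identical folds and metric (illustrative).
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, weights=[0.8, 0.2], random_state=0)

baseline = cross_val_score(DummyClassifier(strategy="most_frequent"),
                           X, y, cv=5, scoring="balanced_accuracy")
model = cross_val_score(LogisticRegression(max_iter=1000),
                        X, y, cv=5, scoring="balanced_accuracy")
print(f"baseline {baseline.mean():.3f}  model {model.mean():.3f}")
```

Balanced accuracy is chosen here because the classes are imbalanced; a most-frequent baseline scores 0.5 on it, which makes the floor easy to reason about.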

Full Track Loop

For the complete workflow:

  1. review the validation topics in order
  2. run the baseline, cross-validation, tuning, and calibration examples
  3. run academy/.venv/bin/python academy/labs/sklearn-validation-and-tuning/src/validation_tuning_workflow.py
  4. finish the matching exercises in academy/exercises/sklearn-validation-and-tuning/
  5. keep one short experiment note with the split, metric, baseline, and selected model
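The "one tuning table" from step 5 can be pulled straight out of a fitted search's `cv_results_`. This sketch uses synthetic data and an illustrative SVC grid, not the lab's actual workflow: the point is to report mean and spread per candidate, not a single best number.

```python
# One tuning table: mean and spread per candidate (illustrative grid).
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5).fit(X, y)

table = pd.DataFrame(search.cv_results_)[
    ["param_C", "mean_test_score", "std_test_score", "rank_test_score"]]
print(table)
```

A candidate whose mean edge over the runner-up is smaller than its fold spread has not earned a selection claim; the table makes that check mechanical.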

What To Inspect

By the end of the track, the learner should have inspected:

  • baseline versus learned model
  • fold mean and spread
  • one tuning table
  • one calibration or threshold view
  • one leakage suspicion that was tested directly
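The "calibration or threshold view" in the list above can be produced without plotting. A minimal sketch, assuming a binary task on synthetic data: `calibration_curve` bins predicted probabilities and compares each bin's mean prediction against the observed positive rate.

```python
# Calibration view: predicted probability vs. observed frequency (synthetic).
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

proba = GaussianNB().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
prob_true, prob_pred = calibration_curve(y_te, proba, n_bins=10)
for p, f in zip(prob_pred, prob_true):
    print(f"predicted {p:.2f} -> observed {f:.2f}")
```

A well-calibrated model keeps the two columns close; large gaps mean the scores should not be read as probabilities when choosing a threshold.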

Common Failure Modes

  • peeking at the test set during selection
  • changing the split and the model at the same time
  • comparing metrics that do not match the task cost
  • claiming a tuning gain without showing baseline and fold spread
  • hiding preprocessing inside the wrong boundary
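The last failure mode, preprocessing on the wrong side of the boundary, can be demonstrated directly. In this sketch (synthetic noise features, random labels, so the honest score should hover near chance), selecting features on the full dataset before cross-validation lets every fold peek at its own held-out rows through the selector:

```python
# Leakage demo: feature selection outside vs. inside the CV boundary.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))       # pure noise features
y = rng.integers(0, 2, size=200)      # random labels: nothing to learn

# Leaky: selector sees all labels, including future test folds.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky = cross_val_score(LogisticRegression(max_iter=1000), X_leaky, y, cv=5)

# Honest: selection is refit inside each fold via the pipeline.
honest = cross_val_score(
    make_pipeline(SelectKBest(f_classif, k=20),
                  LogisticRegression(max_iter=1000)),
    X, y, cv=5)
print(f"leaky {leaky.mean():.3f}  honest {honest.mean():.3f}")
```

The leaky score looks flatteringly above chance on data with no signal at all, which is exactly the kind of improvement this track trains the learner to reject.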

Exit Standard

Before leaving this track, the learner should be able to:

  • defend the split rule
  • explain why the chosen metric matches the task
  • compare a baseline against the selected model honestly
  • name one leakage pattern that was avoided
  • say what result would still count as untrustworthy

That is enough to move into SVM and Advanced Clustering or the first deep-learning track.