Track 03

SVM and Advanced Clustering

This track sharpens geometry judgment: when a margin-based classifier is cleaner than a probability model, when a kernel is worth the added flexibility, and when a cluster method matches the shape of the data instead of forcing the wrong assumption.

Primary Goal

Choose By Geometry

The point is to match the method to the boundary shape, density structure, and noise behavior in the data instead of reaching for algorithms by habit.

Best For

Classical Model Judgment

Use this track when your evaluation discipline is stable and the next question is what shape the data is really exposing.

Exit Rule

One Defensible Geometry Story

You are done when you can explain why the chosen method fits the shape and why the alternatives are weaker.

Use This Track When

  • validation and baselines already feel under control
  • you need clearer judgment about linear versus nonlinear boundaries
  • you want to compare centroid, density, and hierarchical clustering honestly

What This Track Is Training

This track trains one practical rule:

  • let the geometry decide the method

That means the learner should be able to answer:

  • is the boundary mostly linear
  • do you need a calibrated probability or a margin
  • are the clusters compact, irregular, or noisy
  • is the plot only for inspection, or is it being overclaimed
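The first of these questions can be settled empirically rather than by eye. A minimal sketch, assuming scikit-learn and a toy dataset (the names here are illustrative, not taken from the track's own demos): cross-validate a linear SVM against an RBF SVM and only accept the kernel if it wins by a clear margin.

```python
# Hypothetical sketch: decide linear vs. kernel by cross-validated accuracy.
# Dataset and model choices are illustrative, not the track's demo code.
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

linear = make_pipeline(StandardScaler(), SVC(kernel="linear"))
rbf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

lin_acc = cross_val_score(linear, X, y, cv=5).mean()
rbf_acc = cross_val_score(rbf, X, y, cv=5).mean()
print(f"linear: {lin_acc:.3f}  rbf: {rbf_acc:.3f}")
# A clear RBF win signals a genuinely nonlinear boundary;
# a tie says the simpler linear model should stand.
```

If the two scores tie, the geometry is telling you the boundary is effectively linear, and the kernel's extra flexibility buys nothing but variance.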

First Session

Use this order:

  1. SVM Margins and Kernels
  2. Clustering and Low-Dimensional Views
  3. run academy/examples/classical-ml-recipes/svm_margin_demo.py
  4. run academy/examples/classical-ml-recipes/clustering_views_demo.py
  5. write one note on what the data shape suggests and what it does not prove
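The margin demo listed above is the track's own; as a stand-alone sketch of the kind of inspection it encourages, here is a linear SVM whose support vectors and geometric margin (1 / ||w||) can be read off directly. This code is illustrative and assumes scikit-learn; it is not the demo file itself.

```python
# Hypothetical sketch: inspect a linear SVM's margin geometry directly.
# Not the contents of svm_margin_demo.py; a stand-alone illustration.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.0, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

w = clf.coef_[0]
margin = 1.0 / np.linalg.norm(w)  # half-width of the margin band
print(f"support vectors: {len(clf.support_)}")
print(f"geometric margin: {margin:.3f}")
# Few support vectors and a wide margin suggest a cleanly linear boundary;
# many support vectors suggest overlap or the wrong boundary shape.
```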

Full Track Loop

For the complete workflow:

  1. review the SVM and clustering topics
  2. run the linear-versus-kernel and clustering examples
  3. complete the full lab in academy/labs/svm-and-advanced-clustering/
  4. finish the matching exercises in academy/exercises/svm-and-advanced-clustering/
  5. keep one note naming the chosen method, the shape it matches, and the main failure mode
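The centroid-versus-density comparison in step 2 can be sketched on a shape where the assumptions visibly diverge. Assuming scikit-learn (the lab's actual code may differ): KMeans assumes compact, roughly spherical clusters, while DBSCAN follows density, so interleaved moons separate them cleanly.

```python
# Hypothetical sketch: centroid vs. density assumptions on the same shape.
# Parameters (eps, min_samples) are illustrative, not from the track's lab.
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

X, y = make_moons(n_samples=400, noise=0.05, random_state=0)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
db = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

km_ari = adjusted_rand_score(y, km)
db_ari = adjusted_rand_score(y, db)
print(f"KMeans ARI: {km_ari:.2f}")
print(f"DBSCAN ARI: {db_ari:.2f}")
# On this geometry the density assumption matches and the centroid one fails;
# on compact blobs the comparison would flip.
```

The note from step 5 writes itself from output like this: name the winning assumption, the shape it matched, and the shape that would break it.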

What To Inspect

By the end of the track, the learner should have inspected:

  • linear versus nonlinear boundary behavior
  • whether probability or margin matters more for the task
  • cluster behavior under centroid and density assumptions
  • one low-dimensional view used as inspection rather than proof
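One way to keep the last item honest is to report how much structure the low-dimensional view actually retains. A sketch assuming scikit-learn and a stock dataset (both illustrative): a PCA view annotated with its explained variance is harder to overclaim.

```python
# Hypothetical sketch: a 2-D PCA view kept honest by its variance budget.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2).fit(scaled)
view = pca.transform(scaled)
explained = pca.explained_variance_ratio_.sum()
print(f"2-D view retains {explained:.0%} of the variance")
# If the retained variance is low, distances and gaps in the plot
# are artifacts of the projection, not properties of the data.
```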

Common Failure Modes

  • treating SVM margins like calibrated probabilities
  • picking a kernel before testing a simpler linear boundary
  • forcing KMeans on irregular or noisy geometry
  • treating t-SNE or PCA as ground truth
  • choosing the number of clusters from plot aesthetics alone
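The first failure mode is worth making concrete. An SVM's decision function returns a signed margin distance, not a probability; if the task needs probabilities, calibrate rather than rescaling margins by hand. A sketch assuming scikit-learn (dataset and parameters are illustrative):

```python
# Hypothetical sketch: margins are not probabilities.
# decision_function is unbounded; CalibratedClassifierCV yields values in [0, 1].
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, random_state=0)

svm = LinearSVC().fit(X, y)
margins = svm.decision_function(X)  # signed distances, unbounded

calibrated = CalibratedClassifierCV(LinearSVC(), cv=5).fit(X, y)
probs = calibrated.predict_proba(X)[:, 1]  # proper probabilities

print(f"margin range: [{margins.min():.2f}, {margins.max():.2f}]")
print(f"probability range: [{probs.min():.2f}, {probs.max():.2f}]")
# Margins go negative and exceed 1; feeding them downstream as
# probabilities is the failure mode named above.
```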

Exit Standard

Before leaving this track, the learner should be able to:

  • explain why a margin method or kernel was worth using
  • describe which clustering assumption matched the data
  • say what the low-dimensional plot helped inspect
  • name one reason the most attractive plot could still be misleading

That is enough to move into PyTorch Training Recipes or a more advanced unsupervised workflow later.