Track 12

Advanced Unsupervised and Manifold Workflows

This track turns unsupervised geometry into a repeatable route: first inspect whether the structure is compact, curved, noisy, or graph-like, then choose the method that matches that shape instead of trusting the prettiest plot.

Primary Goal

Match Geometry To Method

The real skill is deciding when centroid, density, graph, or manifold thinking is the right lens for the data.

Best For

After Basic Clustering

Use this when a simple KMeans-versus-DBSCAN comparison is no longer enough and you need stronger judgment about non-convex shapes, noise, and embedding claims.

Exit Rule

Leave With One Defensible Method Call

You are done when you can justify a geometry-matching method with stability and non-visual evidence, not only with one nice view.

Use This Track When

  • compact centroid assumptions already feel too blunt
  • the data may contain curved groups, local connectivity, or meaningful noise
  • you need to compare KMeans, spectral clustering, DBSCAN, and agglomerative clustering honestly
  • you want to use Isomap, MDS, or t-SNE as inspection tools without overclaiming
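
One way to run the four-way comparison mentioned above is sketched here. It is a minimal sketch, not the track's own demo script: the two-moons toy data, every hyperparameter, and the use of ARI against the generator labels are assumptions chosen for illustration. On real data there are usually no reference labels, so the score would have to come from a stability or domain check instead.

```python
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans, SpectralClustering
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Toy data with two curved, non-convex groups (an assumption, not the track's data).
X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)

# Hyperparameters are illustrative guesses, not tuned values.
models = {
    "kmeans": KMeans(n_clusters=2, n_init=10, random_state=0),
    "spectral": SpectralClustering(
        n_clusters=2, affinity="nearest_neighbors", n_neighbors=10, random_state=0
    ),
    "dbscan": DBSCAN(eps=0.2, min_samples=5),
    "agglomerative_single": AgglomerativeClustering(n_clusters=2, linkage="single"),
}

scores = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    # ARI is invariant to label permutation; DBSCAN noise (-1) counts as its own group.
    scores[name] = adjusted_rand_score(y_true, labels)
    print(f"{name:>22}: ARI = {scores[name]:.3f}")
```

The same seed and the same data go to every method, so a difference in score reflects the geometric assumption rather than a lucky draw. Changing `noise` or `eps` is a quick way to see how fragile each result is.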

What This Track Is Training

This track trains one route:

  1. inspect the geometry first
  2. decide whether the data wants centroid, density, or graph structure
  3. check whether the result is stable across seeds
  4. use low-dimensional views as inspection aids, not as proof
  5. keep one non-visual check that would still matter if the plot looked worse
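
Step 3 of the route can be sketched as a pairwise ARI check across seeds. This is a hedged example under stated assumptions: the helper name `pairwise_ari`, the blob data, and the choice of KMeans with `n_init=1` (so that each seed produces a single, possibly different, solution) are all hypothetical and not taken from the track's lab code.

```python
from itertools import combinations

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score


def pairwise_ari(label_runs):
    """Return the ARI for every unordered pair of labelings (hypothetical helper)."""
    return np.array(
        [adjusted_rand_score(a, b) for a, b in combinations(label_runs, 2)]
    )


# Toy data (an assumption): four blobs with some overlap.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=1.5, random_state=0)

# One single-initialisation run per seed, so seed sensitivity is not averaged away.
runs = [
    KMeans(n_clusters=4, n_init=1, random_state=seed).fit_predict(X)
    for seed in range(5)
]

ari_scores = pairwise_ari(runs)
print(f"mean pairwise ARI: {ari_scores.mean():.3f}")
print(f"min pairwise ARI:  {ari_scores.min():.3f}")
```

A mean close to 1 with a high minimum suggests the partition does not depend on the seed. A low minimum means at least two runs disagree substantially, which is worth investigating before any single run is reported.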

First Session

Use this order:

  1. read Clustering and Low-Dimensional Views
  2. run academy/.venv/bin/python academy/examples/unsupervised-manifold-recipes/spectral_clustering_demo.py
  3. run academy/.venv/bin/python academy/examples/unsupervised-manifold-recipes/manifold_inspection_demo.py
  4. write one short note on where local connectivity beats the centroid assumption

Full Track Loop

For the complete workflow:

  1. read Clustering and Low-Dimensional Views
  2. run academy/.venv/bin/python academy/examples/unsupervised-manifold-recipes/spectral_clustering_demo.py
  3. run academy/.venv/bin/python academy/examples/unsupervised-manifold-recipes/manifold_inspection_demo.py
  4. run academy/.venv/bin/python academy/labs/advanced-unsupervised-and-manifold-workflows/src/advanced_unsupervised_workflow.py
  5. finish the matching exercises in academy/exercises/advanced-unsupervised-and-manifold-workflows/
  6. keep one note with the chosen method, one stability check, and one claim you refuse to make from the embedding alone

What To Inspect

By the end of the track, the learner should have inspected:

  • whether the geometry looks compact, curved, fragmented, or noisy
  • how spectral clustering changes the answer relative to KMeans
  • how many points DBSCAN leaves unassigned as noise, and whether that refusal is useful
  • whether labels remain stable across seeds, using a check such as the pairwise adjusted Rand index (ARI)
  • whether the embedding preserves enough neighborhood structure to support inspection
  • one non-visual summary that is stronger than the prettiest scatter plot
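
The neighborhood-preservation item above can be checked with a number rather than a plot. The sketch below uses scikit-learn's `trustworthiness` score as one possible non-visual summary. The S-curve data, the neighbor counts, and the PCA baseline are assumptions for illustration, and a high score supports local inspection only, not claims about global structure.

```python
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, trustworthiness

# Toy 3-D manifold (an assumption): a noisy S-shaped sheet.
X, _ = make_s_curve(n_samples=500, noise=0.05, random_state=0)

# Two 2-D views of the same data: a linear baseline and a graph-based embedding.
embeddings = {
    "pca": PCA(n_components=2).fit_transform(X),
    "isomap": Isomap(n_neighbors=10, n_components=2).fit_transform(X),
}

# Trustworthiness lies in [0, 1]; it penalises points that are neighbours in the
# embedding but were not neighbours in the original space.
trust = {
    name: trustworthiness(X, emb, n_neighbors=10)
    for name, emb in embeddings.items()
}

for name, score in trust.items():
    print(f"{name:>7}: trustworthiness = {score:.3f}")
```

Comparing against a plain PCA baseline keeps the check honest: if the simpler view scores about as well, the more elaborate embedding has not earned extra trust.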

Common Failure Modes

  • forcing KMeans onto data that is obviously not centroid-shaped
  • treating a good-looking embedding as proof of global structure
  • choosing a method from one lucky seed without checking stability
  • reading DBSCAN noise as failure when the refusal to assign is the honest answer
  • forgetting to ask whether a simpler geometric assumption already fits well enough
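
To read DBSCAN noise as information rather than failure, it helps to report how much of the data was left unassigned at each setting. The following is a small sketch under assumptions: the helper `noise_summary`, the injected uniform outliers, and the `eps` grid are all hypothetical. The only library convention relied on is that scikit-learn's DBSCAN labels noise points as `-1`.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons


def noise_summary(labels):
    """Return (number of clusters, fraction of points labelled noise)."""
    labels = np.asarray(labels)
    noise_mask = labels == -1  # scikit-learn's DBSCAN marks noise with -1
    n_clusters = len(set(labels[~noise_mask].tolist()))
    return n_clusters, float(noise_mask.mean())


# Toy data (an assumption): two moons plus uniformly scattered outliers.
rng = np.random.default_rng(0)
X_moons, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
X_outliers = rng.uniform(low=-1.5, high=2.5, size=(30, 2))
X = np.vstack([X_moons, X_outliers])

# Sweep eps to see how the cluster count and noise share move together.
for eps in (0.1, 0.2, 0.3):
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    n_clusters, noise_fraction = noise_summary(labels)
    print(f"eps={eps:.1f}: clusters={n_clusters}, noise={noise_fraction:.1%}")
```

If the noise share tracks the known or suspected outlier rate while the cluster count stays put, the refusal to assign is probably doing useful work. If the cluster count swings with small changes in `eps`, the density assumption itself deserves a second look.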

Exit Standard

Before leaving this track, the learner should be able to:

  • explain why the chosen method matches the data geometry better than the alternatives
  • name one stability check that matters more than one polished run
  • use an embedding for inspection without claiming it proves the full structure
  • say when density, graph, or centroid thinking is the most honest first move

That is enough to move into harder multimodal or representation-heavy workflows without treating every plot as truth.