Track 04
PyTorch Training Recipes
This track turns deep-learning mechanics into a stable workflow: clean loops, honest validation, readable curves, checkpoint discipline, and recipe changes that happen before architecture changes.
Primary Goal
Stabilize Training First
Use this track when the model itself is not yet the bottleneck and the training workflow still needs to become trustworthy.
You Will Practice
Loops, Curves, Checkpoints
`train()` and `eval()`, `no_grad()`, optimizer and regularization choices, and validation-based checkpoint selection.
Exit Rule
One Defensible Recipe Note
You are done when you can say which recipe you trust, why the curves support it, and what the next smallest change should be.
Use This Track When¶
- classical evaluation discipline already feels stable
- you are ready to debug deep-learning workflows without hiding behind architecture changes
- you want checkpointing, regularization, and curve reading to feel mechanical
What This Track Is Training¶
This track trains one practical rule:
- change the recipe cleanly before you change the model creatively
That means the learner should be able to keep these explicit:
- where training stops and validation begins
- which checkpoint was selected and why
- whether the curves show underfitting, overfitting, or unstable optimization
- whether a regularization or optimizer change improved the validation story
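The third point above, reading curves, can be made concrete with a small, framework-agnostic sketch. The heuristic below classifies a pair of per-epoch loss curves; the function name `diagnose_curves` and the thresholds are illustrative assumptions, not part of the track's code, and real curve reading should still be done by eye.

```python
def diagnose_curves(train_loss, val_loss, gap_ratio=0.25, plateau=0.9):
    """Classify a pair of per-epoch loss curves with illustrative heuristics.

    Thresholds (gap_ratio, plateau, the 1.05 jitter factor) are assumptions
    chosen for this sketch, not standard values.
    """
    # Unstable optimization: validation loss keeps jumping up instead of trending down.
    rises = sum(1 for a, b in zip(val_loss, val_loss[1:]) if b > a * 1.05)
    if rises > len(val_loss) // 2:
        return "unstable"
    # Overfitting: final validation loss sits well above training loss and past its own best.
    if val_loss[-1] > train_loss[-1] * (1 + gap_ratio) and val_loss[-1] > min(val_loss):
        return "overfitting"
    # Underfitting: training loss barely moved from where it started.
    if train_loss[-1] > train_loss[0] * plateau:
        return "underfitting"
    return "healthy"


print(diagnose_curves([1.0, 0.6, 0.3, 0.1], [1.0, 0.7, 0.8, 1.1]))    # overfitting
print(diagnose_curves([1.0, 0.98, 0.97, 0.96], [1.0, 0.99, 0.98, 0.97]))  # underfitting
```

The point of the sketch is the order of the checks: instability is ruled out before the train/validation gap is interpreted, because a noisy curve makes the gap meaningless.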
First Session¶
Use this order:
- PyTorch Training Loops
- Optimizers and Regularization
- run `academy/.venv/bin/python academy/examples/deep-learning-recipes/pytorch_training_loop_demo.py`
- run `academy/.venv/bin/python academy/examples/deep-learning-recipes/optimizer_regularization_demo.py`
- write one sentence on whether the main risk is loop correctness, optimization, or regularization
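Before running the optimizer demo, it may help to see what a regularization change actually does to an update rule. The sketch below is plain gradient descent on a one-parameter quadratic with an optional L2 penalty (weight decay); the loss, learning rate, and decay value are invented for illustration and are not taken from the demo scripts.

```python
def gradient_descent(steps=100, lr=0.1, weight_decay=0.0):
    """Minimize loss(w) = (w - 3)^2 plus an optional L2 penalty weight_decay * w^2."""
    w = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)            # gradient of the data-fit term
        grad += 2 * weight_decay * w  # gradient of the L2 penalty
        w -= lr * grad
    return w


plain = gradient_descent(weight_decay=0.0)    # converges to w = 3
decayed = gradient_descent(weight_decay=0.5)  # pulled toward zero: w = 3 / (1 + 0.5) = 2
print(round(plain, 3), round(decayed, 3))
```

The analytic minimum of `(w - 3)^2 + λ w^2` is `w = 3 / (1 + λ)`, so the decayed run settling at 2 instead of 3 is exactly the regularizer trading data fit for smaller weights, which is the same trade-off to look for in the demo's validation curves.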
Full Track Loop¶
For the complete workflow:
- review the two deep-learning topics in order
- run the loop and optimizer examples from repo root
- run `academy/.venv/bin/python academy/labs/pytorch-training-recipes/src/training_recipe_workflow.py`
- inspect the outputs before reading the final test result
- finish the matching exercises in `academy/exercises/pytorch-training-recipes/`
- keep one short recipe note with the checkpoint rule, curve read, selected recipe, and next change
What To Inspect¶
By the end of the track, the learner should have inspected:
- one clean separation between `train()` and `eval()`
- one validation pass that uses `torch.no_grad()`
- one pair of training and validation curves
- one best-validation checkpoint versus final-epoch comparison
- one optimizer or regularization change that improved or worsened the validation story
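The best-validation-versus-final-epoch comparison in the list above can be sketched without any framework: record per-epoch validation losses, select the epoch with the lowest one, and compare it with the last epoch. The losses below are invented for illustration.

```python
def select_checkpoint(val_losses):
    """Return (best_epoch, best_loss) by lowest validation loss; epochs start at 0."""
    best_epoch = min(range(len(val_losses)), key=lambda e: val_losses[e])
    return best_epoch, val_losses[best_epoch]


# Invented curve: validation improves, then degrades as the model overfits.
val_losses = [0.92, 0.61, 0.48, 0.45, 0.52, 0.67]
best_epoch, best_loss = select_checkpoint(val_losses)
final_epoch, final_loss = len(val_losses) - 1, val_losses[-1]
print(f"best epoch {best_epoch} (val {best_loss}) vs final epoch {final_epoch} (val {final_loss})")
```

In a real loop the same rule means saving a checkpoint whenever validation loss improves and loading that checkpoint at the end, rather than keeping whatever the last epoch produced.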
Common Failure Modes¶
- validating without `eval()` or `torch.no_grad()`
- trusting the final epoch because it is the last thing printed
- reading the test result before the curves
- changing architecture before the current recipe has been understood
- treating lower training loss as proof of a better workflow
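The first failure mode can be felt even without PyTorch: simulate a dropout layer that is mistakenly left in train mode during validation and compare the measured losses. The toy model, dropout rate, and data below are assumptions for illustration only; in real PyTorch code the fix is calling `model.eval()` (and wrapping the pass in `torch.no_grad()`) before validating.

```python
import random


def toy_val_loss(dropout_active, p=0.5, trials=2000, seed=0):
    """Mean squared error of a toy 'model' that outputs the sum of two features.

    If dropout_active is True, train-mode inverted dropout is (wrongly) applied
    during validation, so the measured loss is inflated and noisy.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x1, x2 = rng.random(), rng.random()
        target = x1 + x2
        if dropout_active:
            # Inverted dropout: zero each input with probability p, scale survivors.
            x1 = 0.0 if rng.random() < p else x1 / (1 - p)
            x2 = 0.0 if rng.random() < p else x2 / (1 - p)
        pred = x1 + x2
        total += (pred - target) ** 2
    return total / trials


print(f"eval-mode loss:  {toy_val_loss(False):.4f}")  # exactly 0: the toy model is perfect
print(f"train-mode loss: {toy_val_loss(True):.4f}")   # inflated by dropout left active
```

A validation number produced this way says more about the dropout mask than about the model, which is why this failure mode quietly corrupts every downstream decision, including checkpoint selection.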
Exit Standard¶
Before leaving this track, the learner should be able to:
- explain which recipe was selected and why
- justify the checkpoint choice from validation rather than habit
- name the clearest sign of underfitting or overfitting in the curves
- say what the next smallest recipe change should be
That is enough to move into ResNet, BERT, and Fine-Tuning without carrying weak training habits forward.