Build an LLM from scratch using Modular's MAX platform. This hands-on tutorial teaches transformer architecture through 12 progressive steps, from basic embeddings to text generation.
## What you'll learn

- Transformer architecture: Understand every component of GPT-2
- MAX Python API: Learn MAX's `nn.module_v3` for building neural networks
- Test-driven learning: Validate your implementation at each step
- Production patterns: HuggingFace-compatible architecture design
## Prerequisites

- Modular MAX installed
- Pixi package manager
- Python 3.9+
- Basic understanding of neural networks
## Quick start

```bash
# Clone or navigate to this directory
cd max-llm-book

# Install dependencies with pixi
pixi install
```

Each step has a skeleton file to implement and a test to verify:
```bash
# Run tests for a specific step
pixi run s01   # Step 1: Model configuration
pixi run s05   # Step 5: Token embeddings
pixi run s12   # Step 12: Text generation

# View the tutorial book
pixi run book
```

The tutorial follows a progressive learning path:
| Steps | Focus | What you build |
|---|---|---|
| 01-04 | Foundations | Configuration, layer norm, MLP, causal masking |
| 05-06 | Embeddings | Token and position embeddings |
| 07 | Attention | Multi-head attention |
| 08-09 | Composition | Residual connections, transformer blocks |
| 10-12 | Complete model | Stacking blocks, language model head, text generation |
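To make the causal masking idea from Steps 01-04 concrete before you implement it, here is a minimal, framework-free sketch in plain Python. This is illustrative only, not the MAX implementation you will write in the steps; the function names are hypothetical:

```python
import math

def causal_mask(seq_len):
    """Lower-triangular boolean mask: position i may attend only to positions <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def apply_causal_mask(scores, mask):
    """Replace disallowed attention scores with -inf so softmax gives them zero weight."""
    return [
        [s if allowed else -math.inf for s, allowed in zip(row, mask_row)]
        for row, mask_row in zip(scores, mask)
    ]
```

The key property: after masking and softmax, a token's output depends only on itself and earlier tokens, which is what lets GPT-2 generate text left to right.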
Each step includes:
- Conceptual explanation: What and why
- Implementation tasks: Skeleton code with TODO markers
- Validation tests: 5-phase verification (imports, structure, implementation, placeholders, functionality)
- Reference solution: Complete working implementation
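As a preview of the kind of concept a step covers, the token-plus-position embedding built in Steps 05-06 can be sketched in plain Python. This is a toy illustration with hypothetical lookup tables, independent of MAX's `Embedding`:

```python
def embed(token_ids, token_table, position_table):
    """GPT-2 style input embedding: token vector plus position vector, elementwise."""
    return [
        [t + p for t, p in zip(token_table[tok], position_table[pos])]
        for pos, tok in enumerate(token_ids)
    ]

# Tiny demo tables: 2 tokens, 2 positions, embedding dimension 2
token_table = {0: [1.0, 0.0], 1: [0.0, 1.0]}
position_table = [[0.5, 0.5], [0.25, 0.25]]
vectors = embed([1, 0], token_table, position_table)
```

Because the same token gets a different position vector at each index, the model can distinguish "dog bites man" from "man bites dog".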
## Project structure

```
max-llm-book/
├── book/                 # mdBook tutorial documentation
│   └── src/
│       ├── introduction.md
│       ├── step_01.md ... step_12.md
│       └── SUMMARY.md
├── steps/                # Skeleton files for learners
│   ├── step_01.py
│   └── ... step_12.py
├── solutions/            # Complete reference implementations
│   ├── solution_01.py
│   └── ... solution_12.py
├── tests/                # Validation tests for each step
│   ├── test.step_01.py
│   └── ... test.step_12.py
├── main.py               # Complete working GPT-2 implementation
├── pixi.toml             # Project dependencies and tasks
└── README.md             # This file
```
## How to use this tutorial

- Read the introduction: Run `pixi run book` and start with the introduction
- Work sequentially: Start with Step 01 and work through in order
- Implement each step: Fill in the TODOs in `steps/step_XX.py`
- Validate with tests: Run `pixi run sXX` to verify your implementation
- Compare with solutions: Check `solutions/solution_XX.py` if stuck
If you're already familiar with transformers:

- Jump to specific topics: Each step is self-contained
- Use as a reference: Check the solutions for MAX API patterns
- Explore `main.py`: See the complete implementation
## Running tests

```bash
# Test a single step
pixi run s01

# Test multiple steps
pixi run s05 && pixi run s06 && pixi run s07

# Run all tests
pixi run test-all
```

Failed test (skeleton code):
```
❌ Embedding is not imported from max.nn.module_v3
   Hint: Add 'from max.nn.module_v3 import Embedding, Module'
```
Passed test (completed implementation):
```
✅ Embedding is correctly imported from max.nn.module_v3
✅ GPT2Embeddings class exists
✅ All placeholder 'None' values have been replaced
🎉 All checks passed! Your implementation is complete.
```
## The complete implementation

The `main.py` file contains a complete, working GPT-2 implementation that you can run:

```bash
# Run the complete model (requires HuggingFace weights)
pixi run huggingface
```

This demonstrates how all the components fit together in production.
## Troubleshooting

```
ModuleNotFoundError: No module named 'max'
```

Solution: Run `pixi install` to install MAX and dependencies.

If tests fail unexpectedly, ensure you're in the correct directory and have completed the step's TODOs.
The examples use CPU for simplicity. For GPU acceleration, change `device=CPU()` to `device=GPU()` where appropriate.
## Resources

- MAX Documentation: docs.modular.com/
- Tutorial Book: Run `pixi run book` for the full interactive guide
- HuggingFace GPT-2: huggingface.co/gpt2
- "Attention Is All You Need": The original transformer paper
## Contributing

Found an issue or want to improve the tutorial? Contributions welcome:
- File issues for bugs or unclear explanations
- Suggest improvements to test coverage
- Add helpful examples or visualizations
## Next steps

Once you've completed all 12 steps:
- Experiment with generation: Modify temperature, sampling strategies in Step 12
- Analyze attention: Visualize attention weights from your model
- Optimize performance: Profile and optimize with MAX's compilation tools
- Build something new: Apply these patterns to custom architectures
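As a starting point for the temperature experiment, here is a plain-Python sketch of temperature-scaled sampling. It is only an illustration of the idea; Step 12's actual sampler in MAX may be structured differently:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, seed=None):
    """Sample a token id from a temperature-scaled softmax over logits.

    Lower temperature sharpens the distribution (more greedy);
    higher temperature flattens it (more random).
    """
    scaled = [l / temperature for l in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Inverse-CDF sampling: walk the cumulative distribution
    r = random.Random(seed).random()
    cumulative = 0.0
    for token_id, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return token_id
    return len(probs) - 1
```

Try sweeping `temperature` from 0.2 to 1.5 on the same prompt and compare how repetitive or creative the generations become.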
Ready to start? Run `pixi run book` to open the interactive tutorial, or jump straight to `pixi run s01` to begin!