Build an LLM from scratch using Modular's MAX platform. This hands-on tutorial teaches transformer architecture through 12 progressive steps, from basic embeddings to text generation.
## What you'll learn

- Transformer architecture: Understand every component of GPT-2
- MAX Python API: Learn MAX's `nn.module_v3` for building neural networks
- Test-driven learning: Validate your implementation at each step
- Production patterns: HuggingFace-compatible architecture design
## Prerequisites

- Modular MAX installed
- Pixi package manager
- Python 3.9+
- Basic understanding of neural networks
## Quick start

```bash
# Clone or navigate to this directory
cd max-llm-book

# Install dependencies with pixi
pixi install
```

Each step has a skeleton file to implement and a test to verify:
```bash
# Run tests for a specific step
pixi run s01   # Step 1: Model configuration
pixi run s05   # Step 5: Token embeddings
pixi run s12   # Step 12: Text generation

# View the tutorial book
pixi run book
```

The tutorial follows a progressive learning path:
| Steps | Focus | What you build |
|---|---|---|
| 01-04 | Foundations | Configuration, layer norm, MLP, causal masking |
| 05-06 | Embeddings | Token and position embeddings |
| 07 | Attention | Multi-head attention |
| 08-09 | Composition | Residual connections, transformer blocks |
| 10-12 | Complete model | Stacking blocks, language model head, text generation |
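To make the causal masking idea from Steps 01-04 concrete before you implement it, here is a minimal, framework-free sketch in plain Python. This is illustrative only, not the MAX implementation you will write in the steps; the function names are hypothetical:

```python
import math

def causal_mask(seq_len):
    """Lower-triangular boolean mask: position i may attend only to positions <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def apply_causal_mask(scores, mask):
    """Replace disallowed attention scores with -inf so softmax gives them zero weight."""
    return [
        [s if allowed else -math.inf for s, allowed in zip(row, mask_row)]
        for row, mask_row in zip(scores, mask)
    ]
```

The key property: after masking and softmax, a token's output depends only on itself and earlier tokens, which is what lets GPT-2 generate text left to right.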
Each step includes:
- Conceptual explanation: What and why
- Implementation tasks: Skeleton code with TODO markers
- Validation tests: 5-phase verification (imports, structure, implementation, placeholders, functionality)
- Reference solution: Complete working implementation
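As a preview of the kind of concept a step covers, the token-plus-position embedding built in Steps 05-06 can be sketched in plain Python. This is a toy illustration with hypothetical lookup tables, independent of MAX's `Embedding`:

```python
def embed(token_ids, token_table, position_table):
    """GPT-2 style input embedding: token vector plus position vector, elementwise."""
    return [
        [t + p for t, p in zip(token_table[tok], position_table[pos])]
        for pos, tok in enumerate(token_ids)
    ]

# Tiny demo tables: 2 tokens, 2 positions, embedding dimension 2
token_table = {0: [1.0, 0.0], 1: [0.0, 1.0]}
position_table = [[0.5, 0.5], [0.25, 0.25]]
vectors = embed([1, 0], token_table, position_table)
```

Because the same token gets a different position vector at each index, the model can distinguish "dog bites man" from "man bites dog".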
## Project structure

```
max-llm-book/
├── book/                 # mdBook tutorial documentation
│   └── src/
│       ├── introduction.md
│       ├── step_01.md ... step_12.md
│       └── SUMMARY.md
├── steps/                # Skeleton files for learners
│   ├── step_01.py
│   └── ... step_12.py
├── solutions/            # Complete reference implementations
│   ├── solution_01.py
│   └── ... solution_12.py
├── tests/                # Validation tests for each step
│   ├── test.step_01.py
│   └── ... test.step_12.py
├── main.py               # Complete working GPT-2 implementation
├── pixi.toml             # Project dependencies and tasks
└── README.md             # This file
```
## How to use this tutorial

- Read the introduction: Run `pixi run book` and start with the introduction
- Work sequentially: Start with Step 01 and work through in order
- Implement each step: Fill in the TODOs in `steps/step_XX.py`
- Validate with tests: Run `pixi run sXX` to verify your implementation
- Compare with solutions: Check `solutions/solution_XX.py` if stuck
If you're already familiar with transformers:

- Jump to specific topics: Each step is self-contained
- Use as a reference: Check the solutions for MAX API patterns
- Explore `main.py`: See the complete implementation
## Running tests

```bash
# Test a single step
pixi run s01

# Test multiple steps
pixi run s05 && pixi run s06 && pixi run s07

# Run all tests
pixi run test-all
```

Failed test (skeleton code):
```
❌ Embedding is not imported from max.nn.module_v3
   Hint: Add 'from max.nn.module_v3 import Embedding, Module'
```
Passed test (completed implementation):
```
✅ Embedding is correctly imported from max.nn.module_v3
✅ GPT2Embeddings class exists
✅ All placeholder 'None' values have been replaced
🎉 All checks passed! Your implementation is complete.
```
## The complete implementation

The `main.py` file contains a complete, working GPT-2 implementation that you can run:

```bash
# Run the complete model (requires HuggingFace weights)
pixi run huggingface
```

This demonstrates how all the components fit together in production.
## Troubleshooting

```
ModuleNotFoundError: No module named 'max'
```

Solution: Run `pixi install` to install MAX and dependencies.

If tests fail unexpectedly, ensure you're in the correct directory and have completed the step's TODOs.
The examples use CPU for simplicity. For GPU acceleration, change `device=CPU()` to `device=GPU()` where appropriate.
## Resources

- MAX Documentation: docs.modular.com/
- Tutorial Book: Run `pixi run book` for the full interactive guide
- HuggingFace GPT-2: huggingface.co/gpt2
- "Attention Is All You Need": The original transformer paper
## Contributing

Found an issue or want to improve the tutorial? Contributions welcome:
- File issues for bugs or unclear explanations
- Suggest improvements to test coverage
- Add helpful examples or visualizations
## Next steps

Once you've completed all 12 steps:
- Experiment with generation: Modify temperature, sampling strategies in Step 12
- Analyze attention: Visualize attention weights from your model
- Optimize performance: Profile and optimize with MAX's compilation tools
- Build something new: Apply these patterns to custom architectures
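As a starting point for the temperature experiment, here is a plain-Python sketch of temperature-scaled sampling. It is only an illustration of the idea; Step 12's actual sampler in MAX may be structured differently:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, seed=None):
    """Sample a token id from a temperature-scaled softmax over logits.

    Lower temperature sharpens the distribution (more greedy);
    higher temperature flattens it (more random).
    """
    scaled = [l / temperature for l in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Inverse-CDF sampling: walk the cumulative distribution
    r = random.Random(seed).random()
    cumulative = 0.0
    for token_id, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return token_id
    return len(probs) - 1
```

Try sweeping `temperature` from 0.2 to 1.5 on the same prompt and compare how repetitive or creative the generations become.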
Ready to start? Run `pixi run book` to open the interactive tutorial, or jump straight to `pixi run s01` to begin!