Add blog plot generation with real compilation results #44

Merged: twiecki merged 3 commits into main from claude/blog-plots-o8eEW on Mar 13, 2026

twiecki (Collaborator) commented Mar 12, 2026

Summary

  • Add examples/generate_blog_plots.py that runs the full compile → optimize → benchmark pipeline on real models
  • Fix bug in plot_optimization_progress where annotation loop used float values instead of _BenchmarkRecord objects
  • Include generated plots and optimization results (TSV)
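
The annotation bug follows a common pattern: the loop iterates over a list of extracted floats while the body still needs attributes of the original records. Below is a minimal sketch of the failure and the fix, not the actual code from `plot_optimization_progress`. Only `code_hash` and `keep_us` are named in the commit message; the `BenchmarkRecord` stand-in, its `us_per_eval` field, the `annotation_labels` helper, and the sample hashes and timings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRecord:
    """Simplified stand-in for _BenchmarkRecord; us_per_eval is an assumed field."""
    code_hash: str
    us_per_eval: float


def annotation_labels(records):
    """Build one annotation label per record (hypothetical helper)."""
    keep_us = [rec.us_per_eval for rec in records]

    # Buggy pattern: `for rec in keep_us:` binds rec to a float, so
    # rec.code_hash raises AttributeError.
    # Fixed pattern: zip each float with the record it came from.
    return [f"{rec.code_hash[:7]}: {us:.2f} us" for us, rec in zip(keep_us, records)]


# Illustrative data only, not measured results.
records = [BenchmarkRecord("a1b2c3d4e5", 2.0), BenchmarkRecord("f6e7d8c9b0", 0.5)]
labels = annotation_labels(records)
```

Zipping keeps the plotted value and its source record aligned by position, which only holds while `keep_us` is built from the same sequence in the same order.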

Benchmark Results (real runs)

| Model | PyTensor (Numba) | AI-compiled Rust | Speedup |
|---|---|---|---|
| Normal (2 params) | 2.46 us/eval | 0.06 us/eval | 38.5x |
| LinReg (3 params) | 3.50 us/eval | 0.49 us/eval | 7.1x |
| Hierarchical (12 params) | 5.04 us/eval | 0.57 us/eval | 8.8x |
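
The speedup column appears to be the PyTensor time divided by the compiled Rust time. That is an assumption, but it matches the LinReg and Hierarchical rows. The sketch below recomputes it from the rounded timings shown here. For the Normal model the rounded inputs give about 41.0x rather than 38.5x, presumably because the reported figure was computed from unrounded timings: a displayed 0.06 us covers roughly 0.055 to 0.065 us, which corresponds to about 38x to 45x.

```python
def speedup(baseline_us: float, compiled_us: float) -> float:
    """Ratio of baseline time to compiled time; larger means faster compiled code."""
    return baseline_us / compiled_us


# (PyTensor us/eval, Rust us/eval) as displayed in the results, i.e. already rounded.
rows = {
    "Normal": (2.46, 0.06),
    "LinReg": (3.50, 0.49),
    "Hierarchical": (5.04, 0.57),
}

recomputed = {name: round(speedup(base, rust), 1) for name, (base, rust) in rows.items()}
```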

Generated Plots

  • benchmark_comparison.png — side-by-side speed + speedup bars
  • opt_progress_normal.png / opt_progress_hierarchical.png — autoresearcher-style optimization progress
  • opt_waterfall_*.png — contribution of each optimization step
  • opt_timeline_*.png — wall-clock event timeline
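
One plausible way to derive the per-step contributions that a waterfall plot shows is to take the difference between consecutive best timings. This is a sketch under that assumption and is not taken from `generate_blog_plots.py`. The `step_contributions` helper and the timings are hypothetical.

```python
def step_contributions(timings_us):
    """Return the time saved by each step relative to the step before it."""
    return [prev - cur for prev, cur in zip(timings_us, timings_us[1:])]


# Illustrative us/eval after each accepted optimization step, not measured data.
timings = [4.0, 2.5, 1.0, 0.5]
contributions = step_contributions(timings)
```

Under this definition the contributions sum to the total improvement from the first timing to the last, so the waterfall bars stack from the baseline down to the final result.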

Test plan

  • Script runs end-to-end: uv run python examples/generate_blog_plots.py
  • All 7 plots generated successfully
  • Bug fix verified: plot_optimization_progress works from TSV
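
To illustrate what "works from TSV" involves, here is a minimal sketch of loading optimization results from a tab-separated file with the standard library. The actual column layout of the results TSV is not shown in this PR, so the column names (`iteration`, `code_hash`, `us_per_eval`, `kept`), the sample rows, and the `load_kept_records` helper are all hypothetical.

```python
import csv
import io

# Hypothetical TSV layout; the real file's columns may differ.
SAMPLE_TSV = (
    "iteration\tcode_hash\tus_per_eval\tkept\n"
    "0\tabc1234\t2.00\t1\n"
    "1\tdef5678\t2.50\t0\n"
    "2\t0a1b2c3\t0.50\t1\n"
)


def load_kept_records(handle):
    """Parse a TSV stream and return only the rows marked as kept."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {"code_hash": row["code_hash"], "us_per_eval": float(row["us_per_eval"])}
        for row in reader
        if row["kept"] == "1"
    ]


kept = load_kept_records(io.StringIO(SAMPLE_TSV))
```

Returning whole records rather than a bare list of timings keeps fields such as `code_hash` available to downstream plotting code.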

https://claude.ai/code/session_01CZ2Vu6Y6vjtqEZLAsjcMF3

claude added 3 commits March 12, 2026 09:22

The loop variable `rec` was a float (from keep_us list) but the code
tried to access rec.code_hash. Fixed by zipping with the actual
_BenchmarkRecord objects.

https://claude.ai/code/session_01CZ2Vu6Y6vjtqEZLAsjcMF3

- generate_blog_plots.py: runs compile + optimize + benchmark pipeline
- Normal model: 38.5x faster than PyTensor (0.06 vs 2.46 us/eval)
- LinReg model: 7.1x faster (0.49 vs 3.50 us/eval)
- Hierarchical model: 8.8x faster (0.57 vs 5.04 us/eval)
- Optimization progress plots (autoresearcher-style)
- Waterfall and timeline plots for both models
- Benchmark comparison bar chart

https://claude.ai/code/session_01CZ2Vu6Y6vjtqEZLAsjcMF3

Adds make_gp_model() factory using pm.gp.Latent with ExpQuad kernel,
includes GP in the compile+optimize pipeline and benchmark comparison.

https://claude.ai/code/session_01CZ2Vu6Y6vjtqEZLAsjcMF3

twiecki merged commit 2c491df into main on Mar 13, 2026
2 of 3 checks passed
twiecki deleted the claude/blog-plots-o8eEW branch on March 13, 2026 00:27