print experiment hyperparams in final summary #360

Open
mvanhorn wants to merge 1 commit into karpathy:master from mvanhorn:osc/feat-config-in-summary

@mvanhorn
The final summary block prints output metrics (val_bpb, peak_vram_mb, etc.) but only one input parameter (depth). Agents running overnight sessions re-read train.py after every experiment just to recall what learning rates or batch size they used.

This adds the key hyperparameters to the existing summary format: seed, batch sizes, learning rates, weight decay, and window pattern. It keeps the same key: value layout and needs no new files or imports.
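For illustration only, here is a minimal sketch of how lines in this layout could be emitted. The helper name `format_summary` and the `config` dict are hypothetical and not taken from train.py, which may well use plain print statements. The 18-character label column is inferred from the sample output in this description.

```python
def format_summary(config: dict) -> str:
    """Render one 'key: value' line per entry, with values aligned in a fixed column."""
    lines = []
    for key, value in config.items():
        # Left-justify "key:" in an 18-character field so the values line up.
        lines.append(f"{key + ':':<18}{value}")
    return "\n".join(lines)


# Hypothetical values for illustration; a real script would pass its own settings.
config = {
    "depth": 8,
    "seed": 42,
    "matrix_lr": 0.04,
    "window_pattern": "SSSL",
}
print(format_summary(config))
```

Keeping the formatting in one place means a new hyperparameter only needs a new dict entry, and the alignment stays consistent with the existing lines.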

Complementary to #331, which covers structured output metrics via results.json; this PR covers input config via stdout.

Before:

num_params_M:     50.3
depth:            8

After:

num_params_M:     50.3
depth:            8
seed:             42
total_batch:      524288
device_batch:     128
matrix_lr:        0.04
embedding_lr:     0.6
weight_decay:     0.2
window_pattern:   SSSL
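As a rough sketch of how an agent could read these values back from captured stdout instead of re-reading train.py, the snippet below parses key: value lines into a dict. The function name `parse_summary` is hypothetical, and it assumes the input is only the summary block; other log lines containing a colon would also be picked up. Values are kept as strings, with no type conversion attempted.

```python
def parse_summary(text: str) -> dict:
    """Parse 'key: value' lines into a dict of strings, skipping lines that do not fit."""
    result = {}
    for line in text.splitlines():
        # Split on the first colon only, so a value may itself contain colons.
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            continue  # blank line, or a line without a usable key/value pair
        result[key] = value
    return result


# Sample input copied from the "After" output above.
summary = """\
depth:            8
seed:             42
matrix_lr:        0.04
window_pattern:   SSSL
"""
params = parse_summary(summary)
print(params["matrix_lr"], params["window_pattern"])
```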

This contribution was developed with AI assistance (Claude Code).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>