60% more efficient autoresearch via better training analysis #353

Open
ottogin wants to merge 1 commit into karpathy:master from ottogin:patch-2

ottogin commented Mar 20, 2026

Hi! While experimenting with autoresearch, I noticed that the agent has very limited observability into the training process and rarely looks beyond the final validation loss.

I updated train.py to log more training statistics and added an analysis step where the agent uses Python to inspect training dynamics. In my runs, this consistently lowers BPB (bits per byte).
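To make the idea concrete, here is a minimal sketch of what such an analysis step could look like. It is an illustration only, not the code from the fork: the JSON-lines log format, the field names (`step`, `train_loss`, `grad_norm`), and the helpers `parse_log` and `summarize_run` are all hypothetical, and the spike and slope heuristics are arbitrary choices.

```python
import json
import math
from statistics import mean


def parse_log(lines):
    """Parse JSON-lines text into a list of per-step records, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize_run(records, window=3, spike_factor=1.5):
    """Summarize training dynamics from per-step records.

    Assumes each record has 'step', 'train_loss', and 'grad_norm'
    (hypothetical field names, not taken from the fork).
    """
    losses = [r["train_loss"] for r in records]
    grad_norms = [r["grad_norm"] for r in records]

    # A spike: loss exceeds spike_factor times the mean of the previous `window` steps.
    spike_steps = [
        records[i]["step"]
        for i in range(window, len(losses))
        if losses[i] > spike_factor * mean(losses[i - window:i])
    ]

    # Late-training slope: average loss change per step over the last `window` intervals.
    tail = losses[-(window + 1):]
    late_slope = (tail[-1] - tail[0]) / (len(tail) - 1) if len(tail) > 1 else 0.0

    return {
        "final_loss": losses[-1],
        "min_loss": min(losses),
        "late_slope": late_slope,
        "spike_steps": spike_steps,
        "max_grad_norm": max(grad_norms),
        "diverged": any(math.isnan(x) or math.isinf(x) for x in losses),
    }


if __name__ == "__main__":
    # Toy log standing in for what a training script might write, one JSON object per step.
    sample = [
        '{"step": 0, "train_loss": 4.0, "grad_norm": 1.0}',
        '{"step": 1, "train_loss": 3.0, "grad_norm": 0.9}',
        '{"step": 2, "train_loss": 2.5, "grad_norm": 0.8}',
        '{"step": 3, "train_loss": 6.0, "grad_norm": 5.0}',
        '{"step": 4, "train_loss": 2.0, "grad_norm": 0.7}',
        '{"step": 5, "train_loss": 1.5, "grad_norm": 0.6}',
    ]
    print(summarize_run(parse_log(sample)))
```

A summary like this gives the agent more to reason about than the final validation number alone, for example whether the loss was still falling at the end of the run or whether a spike coincided with a large gradient norm.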

[Figure: Comparison of autoresearch vs auto-log-research]

I ran this comparison multiple times. There is some noise, but extended logging plus analysis consistently leads to lower BPB. Experiments were run on an H100 with Claude Opus 4.6 via Claude Code.

I think this could be helpful for others working with autoresearch, so in this PR I’m adding a link to my code as a notable fork. I’m also happy to submit a PR with all the changes to the main repo if you think that makes sense.

Details: https://github.com/ottogin/auto-log-research
