Store checkpoints on wandb during offline training #54
Felhof wants to merge 1 commit into jbloomAus:main from
Thanks Felix! I'm wondering if you might have time to check that checkpoints can be easily downloaded and loaded into the Streamlit app/calibration workflow, and maybe also write some tests to check things like:
I feel bad for not adding more detail originally, so feel free to let me get to this eventually, or to ping me in future when you start a card and I'll make sure there are details there.
Now the fun stuff: this PR opens the door to some cool work. If you make sure we can check the calibration of each checkpoint, then a calibration curve variation that shows calibration across checkpoints would be cool. A visualization for this might be interesting.
I'm not sure if you've read Neel's grokking work/the progress measures stuff, but we are now theoretically closer to asking questions about training dynamics. This will be a little blocked at the moment by the lack of fine-grained circuit analysis, and I'm happy to talk about what I'm building soon that would help with this; possibly you might also want to help. Some open questions I have off the top of my head:
Can't write more now, but there's more. PM me if you are keen, no fuss either way!
Closes #39
Similar to how storing of PPO checkpoints works, the number of checkpoints can be set using a command line argument.
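As a rough illustration of the idea (not the code in this PR), the sketch below shows one way a checkpoint count taken from the command line could be turned into evenly spaced save steps, with each saved file logged as a wandb artifact. The flag name `--num_checkpoints`, the helper names, and the artifact naming scheme are all assumptions made for this example; only `wandb.Artifact`, `Artifact.add_file`, and `Run.log_artifact` are real wandb calls.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical flag for the number of checkpoints."""
    parser = argparse.ArgumentParser()
    # Flag name is an assumption for this sketch, not taken from the PR.
    parser.add_argument("--num_checkpoints", type=int, default=0)
    return parser


def checkpoint_steps(total_steps: int, num_checkpoints: int) -> list[int]:
    """Return the 1-indexed steps after which to save, evenly spaced.

    The last checkpoint always lands on the final training step.
    """
    if num_checkpoints <= 0 or total_steps <= 0:
        return []
    # Never request more checkpoints than there are steps.
    num_checkpoints = min(num_checkpoints, total_steps)
    return [(i * total_steps) // num_checkpoints
            for i in range(1, num_checkpoints + 1)]


def upload_checkpoint(run, path: str, step: int) -> None:
    """Log a saved checkpoint file to wandb as a model artifact.

    `run` is the object returned by `wandb.init(...)`; the artifact
    name below is a hypothetical naming scheme.
    """
    import wandb  # imported lazily so the scheduling logic runs without wandb

    artifact = wandb.Artifact(name=f"checkpoint-step-{step}", type="model")
    artifact.add_file(path)
    run.log_artifact(artifact)
```

In a training loop, one might compute `save_at = set(checkpoint_steps(total_steps, args.num_checkpoints))` once up front and, whenever the current step is in `save_at`, write the model to disk and call `upload_checkpoint(run, path, step)`.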