
Store checkpoints on wandb during offline training #54

Open
Felhof wants to merge 1 commit into jbloomAus:main from Felhof:add_checkpoints_during_offline_training

Conversation

@Felhof
Contributor

@Felhof Felhof commented Apr 27, 2023

Closes #39

As with PPO checkpoints, the number of checkpoints to store can be set with a command-line argument.
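For readers who have not seen the diff, here is a minimal sketch of how a checkpoint count passed on the command line could be turned into a save schedule. The flag names `--train_epochs` and `--num_checkpoints` and the helper `checkpoint_epochs` are hypothetical and chosen for illustration only; they are not taken from this PR.

```python
import argparse
from typing import List, Optional


def checkpoint_epochs(total_epochs: int, num_checkpoints: int) -> List[int]:
    """Return the 1-based epochs after which a checkpoint is saved.

    Checkpoints are spread evenly and the last one always falls on the
    final epoch. Hypothetical helper, not the PR's implementation.
    """
    if total_epochs <= 0 or num_checkpoints <= 0:
        return []
    # Never schedule more checkpoints than there are epochs.
    n = min(num_checkpoints, total_epochs)
    # Integer arithmetic avoids floating-point rounding surprises.
    return [(i * total_epochs) // n for i in range(1, n + 1)]


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # Flag names are assumptions made for this sketch.
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_epochs", type=int, default=10)
    parser.add_argument("--num_checkpoints", type=int, default=0)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(checkpoint_epochs(args.train_epochs, args.num_checkpoints))
```

Ending the schedule on the final epoch is a design choice in this sketch: it makes the last checkpoint coincide with the fully trained model.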

@jbloomAus
Owner

Thanks Felix!

I'm wondering if you might have time to check that the checkpoints can be easily downloaded and loaded into the Streamlit app/calibration workflow. Maybe you could also write some tests to check things like:

  • do you get exactly the right number of checkpoints?
  • is the final model the final checkpoint?

And anything else you can think of.
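The two checks above could look roughly like the following sketch. `train_with_checkpoints` is a toy stand-in for the offline training loop and the `save` callback stands in for whatever uploads a checkpoint to wandb; both are assumptions made for illustration, not code from this PR.

```python
from typing import Callable, Dict


def train_with_checkpoints(
    total_epochs: int,
    num_checkpoints: int,
    save: Callable[[Dict[str, float]], None],
) -> Dict[str, float]:
    """Toy training loop (hypothetical) that calls `save` at evenly spaced epochs."""
    weights = {"w": 0.0}
    n = min(num_checkpoints, total_epochs)
    save_at = {(i * total_epochs) // n for i in range(1, n + 1)} if n > 0 else set()
    for epoch in range(1, total_epochs + 1):
        weights["w"] += 1.0  # pretend parameter update
        if epoch in save_at:
            save(dict(weights))  # save a copy, not a live reference
    return weights


def test_exact_number_of_checkpoints() -> None:
    saved = []
    train_with_checkpoints(total_epochs=10, num_checkpoints=3, save=saved.append)
    assert len(saved) == 3


def test_final_checkpoint_matches_final_model() -> None:
    saved = []
    final = train_with_checkpoints(total_epochs=10, num_checkpoints=3, save=saved.append)
    assert saved[-1] == final
```

Tests of this shape run under pytest without any network access. A real test would replace the `save` callback with a mock of the upload step.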

I feel bad for not adding more detail originally, so feel free to let me get to this eventually, or ping me in future when you start a card and I'll make sure the details are there.

Now the fun stuff: this PR opens the door to some cool work. If you make sure we can check the calibration of each checkpoint, then a variation of the calibration curve that shows calibration at each checkpoint would be cool. A visualization for this might be interesting.

I'm not sure if you've read Neel's grokking work/the progress measures stuff, but we are now theoretically closer to asking questions about training dynamics. This will be a little blocked ATM by the lack of fine-grained circuit analysis, and I'm happy to talk about what I'm building soon that would help with this; possibly you might also want to help.

Some open questions I have off the top of my head:

  • how much do the output vector directions change during training?
  • how much do component contributions change over training?
  • how do we measure excluded loss, and which situations in the game world should we focus on when looking at circuit evolution?
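One possible way to put a number on the first question is the cosine similarity between the same output vector at each checkpoint and at the final checkpoint. This is only a sketch under that assumption: the function names are hypothetical, the vectors are plain Python lists, and loading the vectors from stored checkpoints is left out.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_to_final(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Compare one output vector, taken at each checkpoint, to its final value.

    `vectors[i]` is the vector at checkpoint i; the last entry is the
    final model. Values near 1.0 mean the direction has already settled.
    """
    final = vectors[-1]
    return [cosine_similarity(v, final) for v in vectors]


if __name__ == "__main__":
    # Made-up vectors purely to show the call shape, not real results.
    history = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    print(similarity_to_final(history))
```

Cosine similarity ignores changes in vector norm, so it isolates direction change; norm growth would need to be tracked separately.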

Can't write more now, but there's more. PM me if you are keen, no fuss either way!

Development

Successfully merging this pull request may close these issues.

Add checkpoints during Offline Training
