Add an RL model and associated files #73

Open
rahuls-cerebras wants to merge 6 commits into Cerebras:main from rahuls-cerebras:rahuls/modelzoo_changes

rahuls-cerebras commented Oct 9, 2025

Add a modelzoo RL model wrapper that contains a configurable policy_model.
Add a dataloader that reads npz files.
Add a hack to save and load checkpoints for the policy model only, not for the wrapper RL model.
Add a policy loss function to compute the loss.
Add an RL model registry.
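
To illustrate the checkpoint item above, here is a minimal sketch of saving and restoring only the policy model's entries from a wrapper's state dictionary. It is not the code in this PR: it assumes the wrapper exposes the policy under an attribute named `policy_model` (so keys carry a `policy_model.` prefix, following the usual PyTorch state-dict naming), and the helper names `extract_policy_state` and `wrap_policy_state` are hypothetical.

```python
from typing import Any, Dict

# Assumption: the wrapper stores the policy under the attribute `policy_model`,
# so its state-dict keys look like "policy_model.<param name>".
POLICY_PREFIX = "policy_model."


def extract_policy_state(
    wrapper_state: Dict[str, Any], prefix: str = POLICY_PREFIX
) -> Dict[str, Any]:
    """Keep only the policy model's entries and strip the wrapper prefix."""
    return {
        key[len(prefix):]: value
        for key, value in wrapper_state.items()
        if key.startswith(prefix)
    }


def wrap_policy_state(
    policy_state: Dict[str, Any], prefix: str = POLICY_PREFIX
) -> Dict[str, Any]:
    """Re-add the wrapper prefix so a policy-only checkpoint loads into the wrapper."""
    return {prefix + key: value for key, value in policy_state.items()}
```

A checkpoint written from `extract_policy_state(...)` has the same key layout as a standalone policy checkpoint, which is one reason a prefix filter like this is a common stopgap; a permanent solution would more likely live in the checkpoint save/load hooks themselves.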

TODOs

  • Think about how to communicate the MSL (maximum sequence length).
  • This was not required in the existing flows because the data preprocessing logic adds padding tokens so that every tensor has size MSL.
  • With verl, however, the MSL is 128K while the tensors actually passed around are only 4K.
  • Think about a permanent solution for saving and loading checkpoints.
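
For context on the MSL item above, the following sketch shows the kind of padding the existing preprocessing is described as doing: every sequence is extended to exactly MSL tokens, so the tensor shape itself communicates the MSL. The function name `pad_to_msl`, the pad id of 0, and right-side padding are assumptions for illustration, not details taken from this PR.

```python
from typing import List, Tuple


def pad_to_msl(
    token_ids: List[int], msl: int, pad_id: int = 0
) -> Tuple[List[int], List[int]]:
    """Right-pad a sequence to exactly `msl` tokens and build a matching mask.

    Returns (padded_ids, attention_mask), where the mask is 1 for real tokens
    and 0 for padding. The pad id and padding side are assumed here.
    """
    if len(token_ids) > msl:
        raise ValueError(f"sequence length {len(token_ids)} exceeds MSL {msl}")
    n_pad = msl - len(token_ids)
    padded_ids = token_ids + [pad_id] * n_pad
    attention_mask = [1] * len(token_ids) + [0] * n_pad
    return padded_ids, attention_mask
```

When the incoming tensors are much shorter than the MSL, as in the verl case, padding every batch to the full MSL would be wasteful, so the MSL has to reach the model through some other channel such as a config value.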

rahul shrivastava and others added 3 commits October 9, 2025 10:59
Signed-off-by: Rahul Shrivastava <rahul.shrivastava@cerebras.net>
Signed-off-by: Rahul Shrivastava <rahul.shrivastava@cerebras.net>
Add policy model subclass as a config param.

Signed-off-by: rahul shrivastava <rahul.shrivastava@cerebras.net>
rahuls-cerebras changed the title from "Rahuls/modelzoo changes" to "Add an RL model and associated files" on Oct 17, 2025
rahuls-cerebras marked this pull request as ready for review on October 22, 2025 02:54
Signed-off-by: Thomas Kidd <thomas@cerebras.net>
…account for different padding for different prompts

Signed-off-by: Thomas Kidd <thomas@cerebras.net>

2 participants