[bug][train] Fix max_seq_len calculation #1303
```diff
@@ -276,6 +276,14 @@ def validate_cfg(cfg: SkyRLTrainConfig):
             f"invalid loss_reduction: {cfg.trainer.algorithm.loss_reduction}. "
             f"Must be one of `['token_mean', 'sequence_mean', 'seq_mean_token_sum_norm']`"
         )
+    if cfg.trainer.algorithm.loss_reduction == "seq_mean_token_sum_norm":
+        if cfg.trainer.algorithm.max_seq_len is None:
+            raise ValueError(
+                "`trainer.algorithm.max_seq_len` must be set explicitly when "
+                "`trainer.algorithm.loss_reduction='seq_mean_token_sum_norm'`. "
+                "Choose the total sequence-length normalization constant for your setup; "
+                "this often matches the model context window / vLLM `max_model_len` when appropriate."
+            )
     # TODO (erictang000): remove this after deprecation period
     if cfg.trainer.algorithm.use_tis:
```

**Comment on lines +279 to +285** (Contributor)

🔴 Breaking change: the Dr. GRPO example script fails because the auto-calculated `max_seq_len` fallback was removed. The PR deletes the fallback that previously computed `max_seq_len` as `cfg.generator.max_input_length + cfg.generator.sampling_params.max_generate_length`, so example scripts that relied on it now fail validation. Same issue in the skyrl-agent example.
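For context, `seq_mean_token_sum_norm` is the Dr. GRPO-style loss reduction: token losses are summed per sequence and divided by a constant `max_seq_len` rather than by each sequence's own length, which is why the constant must now be chosen explicitly. A minimal pure-Python sketch of the reduction (illustrative names and shapes, not the repo's actual `reduce_loss` implementation):

```python
def seq_mean_token_sum_norm(token_losses, loss_masks, max_seq_len):
    """Sketch of a Dr. GRPO-style reduction: per-sequence token-loss sums
    are normalized by a constant max_seq_len, then averaged over the batch.

    token_losses: list of per-token loss lists, one list per sequence
    loss_masks:   matching 0/1 masks marking which tokens count
    max_seq_len:  the constant normalizer this PR now requires explicitly
    """
    per_seq = [
        # sum only the masked tokens, divide by the *constant*, not len(seq)
        sum(l * m for l, m in zip(losses, mask)) / max_seq_len
        for losses, mask in zip(token_losses, loss_masks)
    ]
    # mean over sequences in the batch
    return sum(per_seq) / len(per_seq)


# Toy example: two sequences, max_seq_len = 4
losses = [[1.0, 1.0, 1.0, 0.0], [2.0, 2.0, 0.0, 0.0]]
masks = [[1, 1, 1, 0], [1, 1, 0, 0]]
print(seq_mean_token_sum_norm(losses, masks, 4))  # (3/4 + 4/4) / 2 = 0.875
```

Because the normalizer is a constant rather than the sequence length, picking it wrong silently rescales the loss (too small inflates it, too large shrinks it), which is why the new error message points at the model context window as a reference value rather than guessing a default.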
**Contributor comment**

🟡 Stale comment in `reduce_loss` claims `max_seq_len` has a default fallback that no longer exists.

At `skyrl/backends/skyrl_train/utils/ppo_utils.py:999-1000`, the comment says "NOTE: max_seq_len can be set explicitly via `algorithm.max_seq_len`, otherwise defaults to `cfg.generator.max_input_length + cfg.generator.sampling_params.max_generate_length`". That auto-calculated default was removed by this PR (deleted from `skyrl/train/config/config.py:713-722`), and `max_seq_len` must now always be set explicitly when using `seq_mean_token_sum_norm`. The stale comment will mislead developers into thinking a fallback still exists.