feat: Add SkyRL backend for unified trainer #407
jeewoo-lee wants to merge 66 commits into rllm-org:main
…fig now added. need to verify this in runtime
…into unified-trainer
…fied-trainer Getting changes Aaron made to fix bugs.
Successfully integrated skyrl to run training. Remaining tasks:
- Make rollout engine accept validate=True
- Use rllm advantage in rollout engine
…perimental/skyrl/
…ved skyrl_engine into experimental folder
upstream/main to keep the PR focused on SkyRL integration only.
@listar2000 I have made a PR for SkyRL. Thank you!
I've lately changed this pattern a bit -- now put this package-dependent searchpath into your skyrl.yaml -- maybe go to verl.yaml for reference. This prevents import warning for non-skyrl users.
Also make sure the script still runs okay with this change.
listar2000 left a comment
Thanks for the great work @jeewoo-lee -- please see my comments (there are quite a few -- but overall I'm very positive about these contributions. This PR is mergeable once these comments are addressed).
Have you confirmed that we have to do the below in order to eliminate import error?
I've installed a fresh rllm on a new machine and want to train with Tinker -- turns out there's no issue with the existing code. So I think we can revert these changes (unless the error happened to you during a fresh installation).
from pathlib import Path

# Add skyrl-train to Python path
skyrl_train_path = Path(__file__).parent.parent.parent.parent / "skyrl" / "skyrl-train"
Any reason that we might need these hard-coded paths (instead of doing normal import)? I'm worried that the users might not install the skyrl as source code and put them in the location you specify.
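A minimal sketch of the "normal import" alternative raised here, assuming the package is installed into the environment rather than checked out at a fixed relative path. The helper names `is_installed` and `require_package` are hypothetical and are not part of rLLM or SkyRL; the module name `skyrl_train` is taken from the `pkg://skyrl_train.config` searchpath mentioned later in this thread.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module is visible to the normal import system."""
    return importlib.util.find_spec(module_name) is not None


def require_package(module_name: str, install_hint: str) -> None:
    """Fail with a readable error instead of patching sys.path with a hard-coded path."""
    if not is_installed(module_name):
        raise ImportError(f"'{module_name}' is not installed. {install_hint}")
```

A launcher could then call something like `require_package("skyrl_train", "Install the optional skyrl dependency group.")` before importing from the package, so the check works regardless of where the user installed SkyRL.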
        else:
            return None

    def _convert_rllm_dataset_to_skyrl_file(self, rllm_dataset: Dataset | None) -> str | None:
Although it seems to work, this function is a bit strange in the sense that rLLM actually creates two .parquet files for each dataset: one regular and the other one for Verl. Are both incompatible for SkyRL to use? Also, isn't the construction of messages handled by the workflow side?
Also these functions (utilities) should not be in the launcher file.
Do you have any recent wandb training log? Just want to verify that the logged metrics match the other backends (especially Tinker).
rllm/experimental/skyrl/transform.py
Outdated
from typing import TYPE_CHECKING

from rllm.agents.agent import TrajectoryGroup
from rllm.engine.rollout import ModelOutput
Use the ModelOutput in experimental.rollout.
- trainer.algorithm.advantage_estimator: Advantage estimator (grpo, gae, rloo, reinforce++)
- algorithm.use_rllm: Whether to use rLLM-native advantage computation (default: false)
- rllm.stepwise_advantage.enable: Enable stepwise advantage computation
- rllm.stepwise_advantage.mode: Stepwise mode (broadcast or per_step)
per_step is now deprecated -- maybe also check your code for related parts (there is no need for this logic anymore).
Btw make sure you pull from main first as I've merged a few other PRs lately @jeewoo-lee
…t tracking.py to upstream
- Move hydra searchpath (verl + skyrl) into unified.yaml for centralized config. Hydra searchpath entries are resolved lazily, so pkg://skyrl_train.config won't error for non-skyrl users; it only matters when rllm/backend=skyrl is selected.
- Add logger config bridging in skyrl.yaml
- Add Modal launcher (modal_run.py) and training script for solver-judge workflow
- Add dataset fallback loading in test_skyrl_solver_judge.py
- Revert tracking.py to upstream (restore batch uploads, session URLs, eval logging)
# Conflicts:
#   .gitignore
Hi @listar2000, I made the changes:
Summary
don't crash on import
New files
- experimental/skyrl/skyrl_backend.py: BackendProtocol implementation for SkyRL
- experimental/skyrl/skyrl_launcher.py
- experimental/skyrl/data_adapter.py: TrainingInputBatch
- experimental/skyrl/transform.py
- experimental/skyrl/skyrl_metrics_utils.py
- experimental/rollout/skyrl_engine.py: InferenceEngineClient to rLLM's RolloutEngine
- experimental/config/rllm/backend/skyrl.yaml
- experimental/test/test_skyrl_*.py/.sh
Modified files
- pyproject.toml: adds [skyrl] optional dependency group
- rollout/__init__.py: adds SkyRLEngine to lazy imports, wraps all engine/type imports in try/except guards
- .gitignore: adds /skyrl and claude/*
- config/unified.yaml: adds skyrl config reference
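The "lazy imports" and "try/except guards" mentioned for rollout/__init__.py can be illustrated with a short sketch. This is not the PR's actual code: the registry `_LAZY_ATTRS`, the helper `load_optional`, and the placeholder module names are all hypothetical, and returning None for a missing backend is an assumption made for illustration.

```python
import importlib

# Hypothetical registry: public name -> (module path, attribute name).
# A standard-library entry and a deliberately missing one stand in for
# real engine classes so the sketch runs anywhere.
_LAZY_ATTRS = {
    "OrderedDict": ("collections", "OrderedDict"),
    "MissingEngine": ("not_an_installed_backend", "MissingEngine"),
}


def load_optional(name: str):
    """Import the object registered under `name` on first use.

    Returns None when the backing package is not installed, so importing
    the enclosing package never fails for users without the optional extra.
    """
    module_path, attr = _LAZY_ATTRS[name]
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        return None
    return getattr(module, attr)
```

In a package `__init__.py`, a helper like this could be called from a module-level `__getattr__` (PEP 562), so that an optional backend is imported only when its name is first accessed.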