Skip to content

[train] Make vLLMRouter the default router in the new inference codepath#1394

Merged
SumanthRH merged 17 commits intomainfrom
make-vllm-router-default
Mar 27, 2026
Merged

[train] Make vLLMRouter the default router in the new inference codepath#1394
SumanthRH merged 17 commits intomainfrom
make-vllm-router-default

Conversation

@SumanthRH
Copy link
Copy Markdown
Member

@SumanthRH SumanthRH commented Mar 26, 2026

What does this PR do?

Makes vLLMRouter the default router in the new inference codepath, following the migration plan in #1385

Test Plan

I ran the DAPO recipe on Qwen 1.7B https://github.com/NovaSky-AI/SkyRL/blob/e3dc00d43430f5d26a6af086d9f1a0f2dd2faef4/examples/train/algorithms/dapo/run_dapo_qwen3_1.7b_aime.sh

as well as the SkyRLSQL recipe https://github.com/NovaSky-AI/SkyRL/blob/e3dc00d43430f5d26a6af086d9f1a0f2dd2faef4/examples/train/text_to_sql/run_skyrl_sql.sh

Training progresses in a stable way without generation issues

DAPO recipe: Results

image image

SkyRLSQL Recipe: Results

image image

TODO:

  • DAPO sync
  • Multi-turn text2sql sync

Open with Devin

SumanthRH and others added 14 commits March 25, 2026 04:39
Add VLLMRouter class that spawns the vllm-router Rust binary as a
subprocess, providing a drop-in alternative to the Python InferenceRouter.
Uses consistent_hash policy by default which respects X-Session-ID headers
for session-aware routing.

Configurable via `generator.inference_engine.router_type="vllm-router"`
(default: "internal"). Optional dependency: `pip install skyrl[vllm-router]`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Uses setdefault so users can still override with RUST_LOG=info/debug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both "default" and "vllm-router" are internally managed by SkyRL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
- Remove InferenceRouter and its tests
- Remove router_type config — VLLMRouter is now the only router
- Add vllm-router to skyrl-train, fsdp, and megatron deps
- Remove vllm-router optional extra (no longer needed)
- Update all references: main_base, benchmarks, GPU CI, test utils

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All conflicts resolved in favor of VLLMRouter-only (HEAD).
New content from main (render endpoint, session_id extraction) preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH changed the title Make vllm router default [train] Make vLLMRouter the default router in the new inference codepath Mar 26, 2026
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH marked this pull request as ready for review March 27, 2026 17:13
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request removes the Python-based InferenceRouter implementation and standardizes the system on the Rust-based vllm-router via the VLLMRouter wrapper. Key changes include the deletion of router.py and its associated tests, updates to pyproject.toml to include vllm-router as a standard Linux dependency for training and FSDP extras, and the refactoring of CI scripts, benchmarks, and core entry points to use VLLMRouter exclusively. I have no feedback to provide.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

⚠️ 1 issue in files not directly in the diff

⚠️ Config field router_type is orphaned dead code referencing deleted InferenceRouter (skyrl/train/config/config.py:481-485)

The router_type field at skyrl/train/config/config.py:481-485 is still present with its default "default" and docstring referencing InferenceRouter, but the only consumer—_create_router() in skyrl/train/entrypoints/main_base.py—was deleted in this PR. The field is now completely dead: setting router_type to any value (including "default" or "vllm-router") has no effect whatsoever, and the docstring misleadingly tells users they can choose between InferenceRouter (which no longer exists) and vllm-router. This is an incomplete migration—the config field should have been removed or updated alongside the code that consumed it.

View 4 additional findings in Devin Review.

Open in Devin Review

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH
Copy link
Copy Markdown
Member Author

SumanthRH commented Mar 27, 2026

Config field router_type is orphaned dead code referencing deleted InferenceRouter

router_type was removed in 32a7113, but got reintroduced by the merge commit (default strategy). Removed.

@SumanthRH SumanthRH merged commit 7e2832b into main Mar 27, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant