[codex] Add experimental Dynamo backend #2472

Draft
samsja wants to merge 1 commit into main from feat/dynamo-backend-only

@samsja (Member) commented May 11, 2026

Summary

  • Add an optional prime-rl[dynamo] dependency extra and inference.backend = "dynamo" config surface.
  • Add a self-contained prime_rl.experimental.dynamo launcher and worker patch: generation goes through Dynamo's /v1/chat/completions frontend/router, while the worker only exposes /engine/* admin routes for liveness, NCCL setup, and weight updates.
  • Auto-configure RL runs to use MITO with Dynamo, route admin calls to the worker system port, and include a checked-in Hendrycks Math/Qwen4B/AIME25 batch-64 config.
  • Keep Dynamo distributed-mode compatibility isolated by avoiding unsupported cache_salt, return_token_ids, and generation logprob request parameters on Dynamo runs.
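
For reviewers, the parameter handling in the last bullet can be pictured roughly as below. This is a minimal sketch and not the code in this PR: `strip_unsupported_params` and `DYNAMO_UNSUPPORTED_PARAMS` are hypothetical names, and using `logprobs` as the request key for the "generation logprob" parameter is an assumption.

```python
from typing import Any

# Parameters the summary lists as unsupported on Dynamo runs.
# Assumption: "generation logprob" maps to a request key named "logprobs".
DYNAMO_UNSUPPORTED_PARAMS = frozenset({"cache_salt", "return_token_ids", "logprobs"})


def strip_unsupported_params(backend: str, params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params with Dynamo-unsupported keys removed (hypothetical helper)."""
    if backend != "dynamo":
        # Other backends keep every parameter; copy so the caller's dict is untouched.
        return dict(params)
    return {key: value for key, value in params.items() if key not in DYNAMO_UNSUPPORTED_PARAMS}


if __name__ == "__main__":
    sampling = {
        "temperature": 1.0,
        "cache_salt": "step-12",
        "return_token_ids": True,
        "logprobs": True,
    }
    print(strip_unsupported_params("dynamo", sampling))
    print(strip_unsupported_params("vllm", sampling))
```

Filtering at the point where the request is built keeps the Dynamo-specific behavior in one place, so non-Dynamo runs would send the same parameters as before.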

Validation

  • uv run ruff check packages/prime-rl-configs/src/prime_rl/configs/inference.py packages/prime-rl-configs/src/prime_rl/configs/rl.py src/prime_rl/experimental/dynamo/server.py src/prime_rl/experimental/dynamo/worker.py src/prime_rl/utils/client.py src/prime_rl/entrypoints/rl.py tests/unit/test_configs.py
  • uv run ruff check packages/prime-rl-configs/src/prime_rl/configs/orchestrator.py src/prime_rl/orchestrator/envs.py src/prime_rl/orchestrator/orchestrator.py src/prime_rl/orchestrator/scheduler.py src/prime_rl/utils/client.py
  • uv run pytest tests/unit/test_configs.py -q
  • uv run --extra dynamo rl @ configs/experimental/dynamo/hendrycks_math_qwen4b_aime25.toml --dry-run --output-dir /tmp/prime-rl-dynamo-router-dryrun
  • uv run --extra dynamo inference @ /tmp/prime-rl-dynamo-router-dryrun/configs/inference.toml --dry-run
  • CUDA_VISIBLE_DEVICES=0,1 uv run --extra dynamo rl @ examples/reverse_text/rl.toml --inference.backend dynamo --max-steps 1 --orchestrator.batch-size 8 --orchestrator.rollouts-per-example 2 --orchestrator.max-inflight-rollouts 8 --wandb.name reverse-text-dynamo-smoke-no-logprobs --output-dir /tmp/prime-rl-reverse-dynamo-smoke-no-logprobs
    W&B: https://wandb.ai/primeintellect/reverse-text/runs/5f46beef5ff941abbd65a16b1c412417
  • 500-step Qwen4B/Hendrycks-Math + AIME25 validation started and passed initial training steps:

@samsja samsja force-pushed the feat/dynamo-backend-only branch from ca27140 to d552750 Compare May 11, 2026 02:57