[rollout] feat: Fix partial load problem, Add vlm support for trtllm rollout #5149

Open
SchumiDing wants to merge 12 commits into verl-project:main from SchumiDing:vlm_trtllm_support

Conversation

@SchumiDing (Contributor) commented on Jan 31, 2026:

What does this PR do?

Some models do not support partial loading; when this happens, the rollout manager falls back to the original full parameter update.
Add VLM support for the trtllm rollout.

Add concise overview of what this PR aims to achieve or accomplish. Reference related GitHub issues and PRs that help with the review.

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, veomni, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

Succeeded with the llama-3-11b-vision model using the trtllm rollout.

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this
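A hedged usage sketch: it assumes the trtllm rollout exposes an OpenAI-compatible chat completions endpoint, as verl's other async rollout servers do. The base URL, API key, and image URL are illustrative, and the model name is the one reported in the Test section below.

# Hedged sketch: multimodal request against an OpenAI-compatible
# rollout server; the endpoint details are assumptions, not from this PR.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
response = client.chat.completions.create(
    model="llama-3-11b-vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.png"}},
        ],
    }],
)
print(response.choices[0].message.content)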

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

@SchumiDing (Contributor, Author) commented:

See #5042 for the roadmap of the trtllm rollout.

from tensorrt_llm.logger import logger


class WorkerExtension:
@Superjomn (Collaborator) commented on Feb 2, 2026:


Here is the latest WorkerExtension in the TensorRT-LLM repo. Are there any motivations for implementing a new one in the verl repo? I am thinking about how to unify both. Ideally, we may update the one in the TensorRT-LLM codebase, but if we need a minor change on it before the next trtllm version bump, @hchings do you have a suggestion?

@SchumiDing (Contributor, Author) replied:

Yeah, ideally we should still use the worker extension from the TensorRT-LLM repo. But to support models that do not allow partial loading, I suppose self.engine.model_engine.model_loader.reload should be usable with the parameter allow_partial_loading=False.

@SchumiDing (Contributor, Author) commented on Feb 2, 2026:

I added this new worker extension to support allow_partial_loading=False, because TensorRT-LLM always sets this parameter to True, but some models do not support partial loading.
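To make the fallback concrete, here is a minimal sketch; the attribute chain self.engine.model_engine.model_loader.reload and its allow_partial_loading parameter come from this thread, while the method name, exception handling, and surrounding structure are assumptions rather than the PR's actual code.

# Hedged sketch of the partial-loading fallback described above.
# update_weights and the broad except clause are assumptions; only the
# reload call and its allow_partial_loading flag come from this thread.
class WorkerExtension:
    def update_weights(self, weights):
        try:
            # TensorRT-LLM defaults to a partial, in-place weight update.
            self.engine.model_engine.model_loader.reload(
                weights, allow_partial_loading=True
            )
        except Exception:
            # Models that reject partial loading fall back to a full
            # parameter reload, as described in the PR summary.
            self.engine.model_engine.model_loader.reload(
                weights, allow_partial_loading=False
            )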

A collaborator replied:

I'd prefer that we keep this in the TensorRT-LLM repo instead and make it generic for other RL frameworks to reuse in the future.

if self.is_vlm_model:
    from tensorrt_llm.inputs.multimodal import MultimodalServerConfig

    multimodal_config = MultimodalServerConfig(
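The quoted diff is truncated here. For orientation, a hedged reconstruction of how the call might be completed; the media_io_kwargs field is an assumption based on how trtllm-serve configures multimodal IO, and the actual arguments in this PR may differ.

# Hedged reconstruction; media_io_kwargs is an assumed field name,
# not verbatim from this PR's diff.
if self.is_vlm_model:
    from tensorrt_llm.inputs.multimodal import MultimodalServerConfig

    multimodal_config = MultimodalServerConfig(
        media_io_kwargs=None,  # per-modality IO options, e.g. video frame count
    )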
A collaborator commented:

Can you add a unittest for this new feature? There is a test_trtllm_async_server.py.

@SchumiDing (Contributor, Author) replied:

Sure, I'm adding one.

@SchumiDing (Contributor, Author) replied:

The test script and the related test workflow have been added.

@SchumiDing (Contributor, Author) replied:

I didn't find test_trtllm_async_server.py in the verl repo, so I wrote a test script that covers both the LLM rollout and the VLM rollout of the TensorRT-LLM rollout worker.

@hchings (Collaborator) commented on Feb 3, 2026:

> I didn't find the test_trtllm_async_server.py in verl repo

We have a unittest MR that should be merged shortly; it contains the test_trtllm_async_server.py.

@SchumiDing (Contributor, Author) commented:

Sorry, there may be some errors when using the latest tensorrt-llm; I'm fixing it.


@SchumiDing (Contributor, Author) commented on Feb 3, 2026:

Yeah, it's sensible to keep verl/workers/rollout/trtllm_rollout/trtllm_worker_extension.py in the TensorRT-LLM repo. Shall I open a PR against the TensorRT-LLM repo first?

@SchumiDing requested a review from hchings on February 4, 2026.
