Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

fix: fix cuda graph padding for spec decoding
#4853 opened Jun 3, 2025 by lfr-0531
enh: Enable trtllm-bench to run LoRA PyT flow
#4848 opened Jun 3, 2025 by venkywonka
4 of 5 tasks
[DO NOT MERGE] test PR to test action
#4847 opened Jun 2, 2025 by poweiw Draft
[Doc] Fix readme for disaggregated serving
#4846 opened Jun 2, 2025 by arekay
Update code owner list
#4839 opened Jun 2, 2025 by juney-nvidia
Chore: refine prepre inputs method of model engine
#4837 opened Jun 2, 2025 by QiJune
Draft: feat: trtllm sampler log probs
#4836 opened Jun 2, 2025 by dcampora
Fix: NVBug 5302895
#4835 opened Jun 2, 2025 by Shixiaowei02
test
#4831 opened Jun 2, 2025 by yunruis
fix: make scale_bmm2_d optional in epilogue too
#4823 opened Jun 1, 2025 by hypdeb
[TRTLLM-4923][feat] Paged mamba cache
#4822 opened Jun 1, 2025 by tomeras91
Draft: test: [CI] remove closed bugs
#4821 opened Jun 1, 2025 by xinhe-nv Draft
[Arch] Freeze model_config
#4814 opened May 31, 2025 by hlu1
feat: enable streaming weight updates
#4813 opened May 30, 2025 by hchings Draft
3 tasks
[enhanchment] Add beam width to low latency.
#4812 opened May 30, 2025 by FrankD412