Pull requests: NVIDIA/TensorRT-LLM
[None][fix] Add TestServePrefixAwareScheduling base on LMBenchmark/synthetic-multi-round-qa
#13578
opened Apr 28, 2026 by
SimengLiu-nv
Collaborator
1 task done
[None][fix] Reject KV connector + host offloading at construction time
#13577
opened Apr 28, 2026 by
jthomson04
Collaborator
2 tasks done
[fix] add kv_scales/inv_kv_scales to FP8BlockScalesLinearMethod and FP8RowwiseLinearMethod
Community want to contribute
PRs initiated from Community
#13576
opened Apr 28, 2026 by
jjia-droid
1 task
[TRTLLM-12316][feat] Support FP4 indexer
#13575
opened Apr 28, 2026 by
mikeiovine
Collaborator
•
Draft
1 task
[https://nvbugs/5615248][fix] Broader capture of piecewise cudagraph
#13574
opened Apr 28, 2026 by
brb-nv
Collaborator
1 task done
[https://nvbugs/6104831][fix] Detach pruned trie children
#13572
opened Apr 28, 2026 by
chienchunhung
Collaborator
•
Draft
1 task done
[https://nvbugs/6104831][test] Add reproducers for KV-cache-block cascade prune assertion
#13571
opened Apr 28, 2026 by
chienchunhung
Collaborator
•
Draft
1 task done
[TRTLLM-12128][feat] enable SageAttention for Wan/FLUX (new commits)
#13570
opened Apr 28, 2026 by
xrq-phys
Collaborator
1 task done
[None][feat] Support fractional synthetic acceptance rates
#13569
opened Apr 28, 2026 by
mikeiovine
Collaborator
•
Draft
1 task
[TRTLLM-12338][feat] Lift TOKENIZER_ALIASES to module level in llmapi.llm_args
#13568
opened Apr 28, 2026 by
nv-yna
Collaborator
3 tasks done
[None][feat] Use LPIPS for FLUX regression test
#13567
opened Apr 28, 2026 by
yibinl-nvidia
Collaborator
•
Draft
1 task
[https://nvbugs/6120981][fix] Switch to cu_seqlens_to_chunk_indices_offsets_triton with total_seqlens/extra_ch
#13566
opened Apr 28, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6117814][fix] Lower Eagle3 one-model acceptance rate threshold for H20 GPU
#13565
opened Apr 28, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][fix] Always sync local ranks after prefetch in HfWeightLoader
#13564
opened Apr 28, 2026 by
lancelly
Collaborator
[None][feat] Required changes for DeepSeek‑V4 Disaggregated Serving
#13563
opened Apr 28, 2026 by
Shixiaowei02
Collaborator
•
Draft
1 task
[None][fix] Always sync local ranks after prefetch in HfWeightLoader
#13556
opened Apr 28, 2026 by
lancelly
Collaborator
[https://nvbugs/6087632][fix] fix test def to use local model
#13555
opened Apr 28, 2026 by
bo-nv
Collaborator
1 task
[None][Refactor] Minor refactor SSM page table for extensibility
#13554
opened Apr 28, 2026 by
Shixiaowei02
Collaborator
•
Draft
1 task
[None][test] Test coverage and repro for #13320
#13553
opened Apr 28, 2026 by
eopXD
Collaborator
1 task done
[https://nvbugs/6114821][fix] Remove torch.compile from spec dec sampling to prevent NCCL deadlock
#13552
opened Apr 28, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][test] Unwaive DSR1 V32 Agg TEP tests
#13550
opened Apr 28, 2026 by
chenfeiz0326
Collaborator
1 task done
[None][feat] Improve memory calculation for mamba hybrid models when block reuse is off
#13549
opened Apr 28, 2026 by
VALLIS-NERIA
Collaborator
1 task done