Pull requests: NVIDIA/TensorRT-LLM
Remove unnecessary duplicated tests from pre-merge H100 DGX
#4852
opened Jun 3, 2025 by
litaotju
Replace memset with data initialization within kernels
#4851
opened Jun 3, 2025 by
ChristinaZ
enh: Enable trtllm-bench to run LoRA PyT flow
#4848
opened Jun 3, 2025 by
venkywonka
4 of 5 tasks
[nvbug/5314469][feat] Include the executor's max batch size in CUDA g…
#4843
opened Jun 2, 2025 by
mikeiovine
fix: [nvbugs/5298600] fix illegal memory access on mrope_position_deltas
#4830
opened Jun 2, 2025 by
yechank-nvidia
feat: port MakeDecodingBatchInputOutput to python in TRTLLMSampler
#4828
opened Jun 2, 2025 by
dcampora
Fix trtllm-bench iter_stats and cuda_graph_batch_sizes errors.
#4827
opened Jun 2, 2025 by
qiaoxj07
chore: memoize weight shuffle index to speed up weight preproc in moe_backend=TRTLLM
#4826
opened Jun 2, 2025 by
rosenrodt
[TRTLLM-4987][feat] Support generation logits in TRTLLMSampler
#4819
opened Jun 1, 2025 by
amitz-nv
feat: large-scale EP(part 6: Online EP load balancer integration for GB200 nvfp4)
#4818
opened Jun 1, 2025 by
dongxuy04