Pull requests: NVIDIA/TransformerEngine
[JAX] WAR for CuDNN MXFP8 norm incorrect result
#1700
opened Apr 18, 2025 by
jberchtold-nvidia
8 of 13 tasks
[JAX] Distributed Current Scaling
#1699
opened Apr 18, 2025 by
jberchtold-nvidia
8 of 13 tasks
[PyTorch] Bunch of memory management fixes
#1686
opened Apr 15, 2025 by
pggPL
8 of 13 tasks
[JAX] Add collective GEMM without compute/communication overlap
#1675
opened Apr 11, 2025 by
philipphack
1 of 6 tasks
[JAX] GroupedQuantizer and GroupedScaledTensor
#1666
opened Apr 10, 2025 by
phu0ngng
7 of 13 tasks
Improved performance of mxfp8 cast kernels
Label: performance (Performance issues)
#1628
opened Mar 31, 2025 by
Oleg-Goncharov
5 of 13 tasks
[Pytorch] NVIDIA-DL-Framework-Inspect support – part 2 – features
#1613
opened Mar 25, 2025 by
pggPL
7 tasks done
[Pytorch] NVIDIA-DL-Framework-Inspect support – part 3 – tests
#1612
opened Mar 25, 2025 by
pggPL
7 of 13 tasks
[Pytorch] NVIDIA-DL-Framework-Inspect support – part 4 – documentation
#1611
opened Mar 25, 2025 by
pggPL
7 tasks done
[JAX] Unbalanced Context Parallelism with THD format
#1565
opened Mar 12, 2025 by
zlsh80826
8 of 13 tasks
Split wgrad&dgrad from backward() to support a2a overlap
#1564
opened Mar 12, 2025 by
lhb8125
1 of 13 tasks
Enable AttnFuncWithCPAndKVP2P to support mla
#1561
opened Mar 12, 2025 by
SuperCB
3 of 13 tasks
change softmax_lse correction of CP to FP32
#1546
opened Mar 7, 2025 by
xrennvidia
6 of 13 tasks