
Conversation

alexsamardzic

No description provided.

@alexsamardzic (Author)

Currently, `torch_compile_grouped_gemm` and `preprocessed_pt2_triton_grouped_mm` are the same.

I think there is no point in benchmarking `torch_compile_grouped_gemm`, as its timing includes both the auto-tuning overhead and the "preprocessing" of the arguments. On the other hand, with the change in this PR, `preprocessed_pt2_triton_grouped_mm` is on par with `preprocessed_aten_grouped_mm`, which is expected.

(I believe the point about auto-tuning holds for `triton_grouped_gemm` too.)
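
To illustrate the auto-tuning point, here is a minimal sketch (not the tritonbench harness itself): a `torch.compile`'d function is called once as a warm-up so that compilation and auto-tuning happen outside the timed region, and only the steady-state call is measured. The function `f`, the tensor shapes, and the dtype are placeholders for illustration, standing in for the actual grouped GEMM operator:

```python
import torch

# Stand-in for the grouped GEMM being benchmarked (illustrative only).
def f(a, b):
    return a @ b

compiled = torch.compile(f, mode="max-autotune")

a = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
b = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)

# Warm-up: triggers compilation and auto-tuning once, outside the timed region.
compiled(a, b)
torch.cuda.synchronize()

# Steady-state timing: measures only the kernel execution.
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()
compiled(a, b)
end.record()
torch.cuda.synchronize()
print(f"steady-state time: {start.elapsed_time(end):.3f} ms")
```

If the first (warm-up) call were included in the measurement, the auto-tuning cost would dominate and the comparison against the `preprocessed_*` variants would be skewed.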

@xuzhao9 (Contributor) commented Oct 3, 2025

cc @NikhilAPatel, can you help take a look?
