fix(bench): Update GEMM benchmark for compatibility with the latest codebase changes. #59

Merged: 8 commits, Feb 20, 2025

KuangjuX
Collaborator

@KuangjuX KuangjuX commented Feb 20, 2025

Update the GEMM benchmarks to ensure compatibility with the latest changes made in the codebase.

This PR fixes several bugs and adds support for different `WarpLayout` configurations in the GEMM benchmarks:
- Fixed **offset recalculation** for the different `WarpReuse` modes.
- Fixed bugs in several corner cases.

Tips:
- GMEM -> SMEM: uses the `kCont` `WarpReuse` mode to load data.
- SMEM -> RMEM: uses `kRowReuse` for matrix A and `kColReuse` for matrix B.

For a problem shape `[M, N, K]`:
- `M` must be a multiple of `16 * kWarpRow`.
- `K` must be a multiple of both `64 * kWarpCol` and `64 * kWarpRow`.
- `N` must be a multiple of `16 * kWarpCol`.
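
The shape constraints above can be checked up front, before launching a benchmark. The following is only a minimal sketch: `is_valid_gemm_shape` and its parameter names are hypothetical and not part of this PR or the library. It assumes `warp_rows` and `warp_cols` correspond to `kWarpRow` and `kWarpCol` of the chosen `WarpLayout`.

```cpp
#include <cstdint>

// Hypothetical helper (not from the codebase): returns true when [M, N, K]
// satisfies the divisibility constraints listed in this PR description.
// warp_rows / warp_cols are assumed to mirror kWarpRow / kWarpCol.
constexpr bool is_valid_gemm_shape(std::int64_t m, std::int64_t n,
                                   std::int64_t k, std::int64_t warp_rows,
                                   std::int64_t warp_cols) {
    return warp_rows > 0 && warp_cols > 0 &&
           m > 0 && n > 0 && k > 0 &&
           m % (16 * warp_rows) == 0 &&   // M: multiple of 16 * kWarpRow
           n % (16 * warp_cols) == 0 &&   // N: multiple of 16 * kWarpCol
           k % (64 * warp_cols) == 0 &&   // K: multiple of 64 * kWarpCol
           k % (64 * warp_rows) == 0;     // K: and of 64 * kWarpRow
}

// Example: a 2 x 2 warp layout needs M, N divisible by 32 and K by 128.
static_assert(is_valid_gemm_shape(4096, 4096, 2048, 2, 2),
              "4096 x 4096 x 2048 fits a 2 x 2 warp layout");
static_assert(!is_valid_gemm_shape(4096, 4096, 64, 2, 2),
              "K = 64 is not a multiple of 64 * 2");
```

Because the check is `constexpr`, it could also guard a template instantiation with `static_assert` when the shape is known at compile time.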
@KuangjuX KuangjuX marked this pull request as draft February 20, 2025 02:03
@KuangjuX KuangjuX marked this pull request as ready for review February 20, 2025 11:06
@KuangjuX KuangjuX requested a review from haruhi55 February 20, 2025 11:07
@lcy-seso lcy-seso self-requested a review February 20, 2025 11:56
@lcy-seso lcy-seso changed the title from "feat(bench): Add GEMM Benchmark with CUTLASS and CuBLAS Performance Comparison." to "feat(bench): Fix GEMM benchmark to enable performance comparisons between CUTLASS and CuBLAS" Feb 20, 2025
Contributor

@lcy-seso lcy-seso left a comment

LGTM.

@lcy-seso lcy-seso changed the title from "feat(bench): Fix GEMM benchmark to enable performance comparisons between CUTLASS and CuBLAS" to "fix(bench): Update GEMM benchmark for compatibility with the latest codebase changes." Feb 20, 2025
@lcy-seso lcy-seso merged commit 0e6c557 into microsoft:master Feb 20, 2025
3 checks passed
2 participants