fix(cell): Support different WarpLayout in GEMMs. #55

Merged (3 commits) on Feb 18, 2025

@KuangjuX KuangjuX commented Feb 17, 2025

This PR fixes several bugs and adds support for different WarpLayout configurations in GEMMs:

  • Fixes offset recalculation so that it follows the WarpReuse mode in use.
  • Fixes several corner cases.

Tips:

  • GMEM -> SMEM: uses the kCont WarpReuse mode for loading data.
  • SMEM -> RMEM: uses kRowReuse for MatrixA and kColReuse for MatrixB.

For a tensor shape [M, N, K]:

  • M must be a multiple of 16 * kWarpRow.
  • K must be a multiple of both 64 * kWarpCol and 64 * kWarpRow.
  • N must be a multiple of 16 * kWarpCol.
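
The shape constraints above can be checked before launching a kernel. The following is only a sketch: `is_multiple_of` and `is_supported_gemm_shape` are hypothetical helpers that are not part of this PR or the library, and it assumes that kWarpRow and kWarpCol are the number of warps along the row and column dimensions of the WarpLayout.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the library): true when `value` is a
// positive multiple of a positive `base`.
constexpr bool is_multiple_of(std::int64_t value, std::int64_t base) {
    return base > 0 && value > 0 && value % base == 0;
}

// Hypothetical helper (not part of the library): checks the three shape
// constraints listed in this PR for a GEMM of shape [M, N, K].
// Assumption: warp_rows and warp_cols stand for kWarpRow and kWarpCol,
// read here as the warp counts along rows and columns.
constexpr bool is_supported_gemm_shape(std::int64_t m, std::int64_t n,
                                       std::int64_t k, std::int64_t warp_rows,
                                       std::int64_t warp_cols) {
    return is_multiple_of(m, 16 * warp_rows)      // M % (16 * kWarpRow) == 0
           && is_multiple_of(k, 64 * warp_cols)   // K % (64 * kWarpCol) == 0
           && is_multiple_of(k, 64 * warp_rows)   // K % (64 * kWarpRow) == 0
           && is_multiple_of(n, 16 * warp_cols);  // N % (16 * kWarpCol) == 0
}

// With a 2 x 2 warp layout, M and N must be multiples of 32 and K of 128.
static_assert(is_supported_gemm_shape(32, 32, 128, 2, 2),
              "shape satisfies all three constraints");
static_assert(!is_supported_gemm_shape(32, 32, 64, 2, 2),
              "K = 64 is not a multiple of 64 * 2");
```

Because the helpers are `constexpr`, the same check works in a `static_assert` when the shape is a compile-time constant, or as an ordinary runtime check otherwise.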

@KuangjuX KuangjuX marked this pull request as draft February 17, 2025 10:32
@haruhi55 haruhi55 changed the title from "fix(cell): Support different WarpLayout GEMMs." to "fix(cell): Support different WarpLayout in GEMMs." Feb 18, 2025
@KuangjuX KuangjuX marked this pull request as ready for review February 18, 2025 11:32
@KuangjuX KuangjuX requested a review from haruhi55 February 18, 2025 11:33

@haruhi55 haruhi55 left a comment

LGTM.

@haruhi55 haruhi55 merged commit 55dae51 into microsoft:master Feb 18, 2025
3 checks passed
@KuangjuX KuangjuX deleted the fix_gemm branch February 19, 2025 01:44
KuangjuX added a commit to KuangjuX/TileFusion that referenced this pull request Feb 19, 2025