[AMD] Improve shared layout for Wmma's operands #7319

Merged (4 commits) on Jun 27, 2025

leeliu103
Contributor

Swizzling is currently always disabled for Wmma's B operand; it should be disabled only when the k dimension is not contiguous.

vectorSize, perPhase, and maxPhase are now all determined using a heuristic approach.
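
For readers unfamiliar with the three parameters, here is a minimal sketch of how a swizzled shared layout typically uses them. It assumes the usual XOR-based scheme, where rows are grouped into phases and the vector index within a row is XORed with the phase. The struct and function names are hypothetical and are not taken from this PR, and the heuristic that picks the values is not reproduced here.

```cpp
#include <cassert>

// Hypothetical container for the three parameters named in the description.
struct SwizzleParams {
  unsigned vec;      // number of contiguous elements moved together
  unsigned perPhase; // number of consecutive rows that share one phase
  unsigned maxPhase; // number of distinct phases before the pattern repeats
};

// Returns the swizzled column for a logical (row, col) position.
// Assumption: each row holds at least maxPhase vectors and maxPhase is a
// power of two, so the XOR result stays inside the row.
unsigned swizzledColumn(const SwizzleParams &p, unsigned row, unsigned col) {
  assert(p.vec > 0 && p.perPhase > 0 && p.maxPhase > 0);
  unsigned phase = (row / p.perPhase) % p.maxPhase; // phase of this row
  unsigned vecIdx = col / p.vec;                    // which vector in the row
  unsigned inVec = col % p.vec;                     // offset inside the vector
  return (vecIdx ^ phase) * p.vec + inVec;
}
```

With maxPhase equal to 1 the phase is always 0 and the mapping is the identity, which corresponds to swizzling being disabled.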

New contributor declaration

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these
    rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • This PR does not need a test because this PR only updates the shared layout for Wmma's operand, and it's applicable across various cases.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

} else {
// Do not swizzle in case k dimension is not innermost.
// In this case accesses will go in different banks even without swizzling.
int kDimIndex = dotOpEnc.getOpIdx() == 0 ? 1 : 0;
Collaborator

While we are here, can we use a helper function here, as the MFMA layout code above does?

Contributor Author

You mean putting the logic into a helper function like composeSharedLayoutForOperand?

Collaborator

Yes

Contributor Author

Refactored; also added @zhanglx13 and @joviliast as reviewers.
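
As an illustration of the check discussed in this thread, the following is a rough sketch of deciding whether the k dimension is innermost for a given dot operand. It assumes rank-2 operands (A is MxK, B is KxN) and the convention that order[0] names the fastest-varying dimension. Both function names are hypothetical; this is not the helper that was merged.

```cpp
#include <cassert>
#include <vector>

// Hypothetical: index of the k dimension for a rank-2 dot operand.
// opIdx 0 is operand A (M x K), opIdx 1 is operand B (K x N).
unsigned kDimIndexForOperand(unsigned opIdx) {
  assert(opIdx < 2);
  return opIdx == 0 ? 1u : 0u;
}

// Hypothetical: true when k is the innermost (contiguous) dimension.
// Assumption: order[0] is the fastest-varying dimension.
bool isKContiguous(unsigned opIdx, const std::vector<unsigned> &order) {
  assert(order.size() == 2);
  return order[0] == kDimIndexForOperand(opIdx);
}
```

Under these assumptions, swizzling would be considered only when isKContiguous returns true, for either operand, rather than being skipped unconditionally for operand B.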

@antiagainst antiagainst marked this pull request as ready for review June 26, 2025 18:52
@antiagainst antiagainst requested a review from ptillet as a code owner June 26, 2025 18:52
@zhanglx13 zhanglx13 merged commit 21d2ef2 into triton-lang:main Jun 27, 2025
15 of 18 checks passed
4 participants