[QST]when will DelayTmaStore be important? #2106

Open
ziyuhuang123 opened this issue Feb 13, 2025 · 2 comments

Comments

@ziyuhuang123

I see DelayTmaStore in the code but I do not understand when we need it. Could anyone tell me? Thanks!

@hwu36
Collaborator

hwu36 commented Feb 19, 2025

@Junkai-Wu

@Junkai-Wu
Contributor

It optimizes the dependency between the store to shared memory (ss) and the TMA store to global memory (sg) in the epilogue. Normally, one sg is issued after each ss. If DelayTmaStore is specified, an sg is issued only after two ss. This is purely a performance consideration: in some cases the additional ss interleaves better with the ALU work.
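To make the difference in issue order concrete, here is a toy model in Python. It is not CUTLASS code and does not reflect the actual epilogue implementation. The function `issue_order` and its parameters are hypothetical, and it assumes that "one sg after two ss" means the TMA store for subtile i is issued after the shared-memory store for subtile i+1.

```python
def issue_order(num_subtiles, delay_tma_store=False):
    """Toy model of epilogue issue order (hypothetical, not the CUTLASS implementation).

    'ss<i>' = store of subtile i to shared memory
    'sg<i>' = TMA store of subtile i from shared memory to global memory
    """
    ops = []
    for i in range(num_subtiles):
        ops.append(f"ss{i}")
        if not delay_tma_store:
            # Default: each sg follows its own ss immediately.
            ops.append(f"sg{i}")
        elif i > 0:
            # Delayed (assumed): the previous subtile's sg is issued
            # only after the current subtile's ss.
            ops.append(f"sg{i - 1}")
    if delay_tma_store and num_subtiles > 0:
        # Drain the last pending TMA store.
        ops.append(f"sg{num_subtiles - 1}")
    return ops


print(issue_order(3))
# ['ss0', 'sg0', 'ss1', 'sg1', 'ss2', 'sg2']
print(issue_order(3, delay_tma_store=True))
# ['ss0', 'ss1', 'sg0', 'ss2', 'sg1', 'sg2']
```

In the delayed ordering, the first sg is preceded by two ss, so the work that produces the next subtile can sit between an ss and the sg that depends on it. Whether that helps depends on the kernel, so it is worth benchmarking both settings.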
