[QST] Utilizing both Tensor Cores and Cuda Cores, Possible to overlay GEMM calls? #2117

Open
zzhou292 opened this issue Feb 17, 2025 · 3 comments

Comments

@zzhou292

What is your question?
Short but dumb questions:

If I understand things correctly, when launching a GEMM with OpClassTensorOp, the CUDA cores are idle; when launching a GEMM with OpClassSimt, the Tensor Cores are idle. So, is it possible to overlap GEMM launches, as with CUDA streams in ordinary CUDA programming, to utilize both kinds of compute unit?

In simple words, can we launch a GEMM with OpClassSimt and another GEMM with OpClassTensorOp concurrently (they work on different matrices), and execute both at the same time using Tensor Cores and CUDA cores?
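To make the question concrete, here is a minimal sketch of what "two GEMMs on two streams" could look like in plain CUDA, outside CUTLASS. The kernel names `simt_gemm` and `wmma_gemm`, the matrix size, and the launch shapes are all assumptions made for illustration. Putting the kernels on separate streams only tells the runtime they are independent; whether they actually run at the same time is up to the hardware scheduler and the resources each kernel needs.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <mma.h>

using namespace nvcuda;

// Hypothetical naive SIMT GEMM: one thread per output element, C = A * B
// (row-major, n x n), accumulated with scalar FMAs on the CUDA cores.
__global__ void simt_gemm(const float* A, const float* B, float* C, int n) {
  int row = blockIdx.y * blockDim.y + threadIdx.y;
  int col = blockIdx.x * blockDim.x + threadIdx.x;
  if (row >= n || col >= n) return;
  float acc = 0.0f;
  for (int k = 0; k < n; ++k) acc = fmaf(A[row * n + k], B[k * n + col], acc);
  C[row * n + col] = acc;
}

// Hypothetical Tensor Core GEMM via WMMA: one warp (one block of 32 threads)
// per 16x16 output tile. Assumes n is a multiple of 16.
__global__ void wmma_gemm(const half* A, const half* B, float* C, int n) {
  int tile_row = blockIdx.y * 16;
  int tile_col = blockIdx.x * 16;
  wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a;
  wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b;
  wmma::fragment<wmma::accumulator, 16, 16, 16, float> c;
  wmma::fill_fragment(c, 0.0f);
  for (int k = 0; k < n; k += 16) {
    wmma::load_matrix_sync(a, A + tile_row * n + k, n);
    wmma::load_matrix_sync(b, B + k * n + tile_col, n);
    wmma::mma_sync(c, a, b, c);
  }
  wmma::store_matrix_sync(C + tile_row * n + tile_col, c, n, wmma::mem_row_major);
}

int main() {
  const int n = 256;  // multiple of 16 so the WMMA tiles divide evenly
  float *A1, *B1, *C1, *C2;
  half *A2, *B2;
  cudaMallocManaged(&A1, n * n * sizeof(float));
  cudaMallocManaged(&B1, n * n * sizeof(float));
  cudaMallocManaged(&C1, n * n * sizeof(float));
  cudaMallocManaged(&A2, n * n * sizeof(half));
  cudaMallocManaged(&B2, n * n * sizeof(half));
  cudaMallocManaged(&C2, n * n * sizeof(float));
  for (int i = 0; i < n * n; ++i) {
    A1[i] = 1.0f;
    B1[i] = 1.0f;
    A2[i] = __float2half(1.0f);
    B2[i] = __float2half(1.0f);
  }

  // Two independent streams: the GEMMs touch different matrices, so the
  // runtime is free (but not obliged) to overlap them.
  cudaStream_t s0, s1;
  cudaStreamCreate(&s0);
  cudaStreamCreate(&s1);

  dim3 simt_block(16, 16), grid(n / 16, n / 16);
  simt_gemm<<<grid, simt_block, 0, s0>>>(A1, B1, C1, n);
  wmma_gemm<<<grid, 32, 0, s1>>>(A2, B2, C2, n);
  cudaDeviceSynchronize();

  // With all-ones inputs every output element should equal n.
  bool ok = (C1[0] == float(n)) && (C2[0] == float(n));
  printf("%s\n", ok ? "results match" : "mismatch");

  cudaStreamDestroy(s0);
  cudaStreamDestroy(s1);
  cudaFree(A1); cudaFree(B1); cudaFree(C1);
  cudaFree(A2); cudaFree(B2); cudaFree(C2);
  return ok ? 0 : 1;
}
```

WMMA needs compute capability 7.0 or newer, so this would be built with something like `nvcc -arch=sm_70`. A profiler timeline is the way to see whether the two kernels really overlapped; the code alone does not guarantee it.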

Thanks in advance for your reply!

@thakkarV
Collaborator

For many complex architectural reasons, no, not really.

@zzhou292
Author

Thanks for the reply @thakkarV .

One quick follow-up: is this concurrent execution on CUDA cores and Tensor Cores not possible in CUTLASS for now, or is it not possible in general? (Can we explicitly program a kernel that uses WMMA instructions to achieve this?)

Thanks again!!!

@thakkarV
Collaborator

This is not a CUTLASS limitation. You can in theory write a CUTLASS kernel that does both. It just does not make sense to issue SIMT FMAs while also issuing Tensor Core MMAs. We of course issue other types of SIMT instructions and interleave them with Tensor Core instructions, but for the purposes of your question about using FMA and MMA at the same time, there is not much point in pursuing that, within or outside CUTLASS.
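As a rough illustration of SIMT instructions and Tensor Core MMAs living in the same kernel, the sketch below uses the CUDA WMMA API rather than CUTLASS. The kernel name `mma_with_simt_epilogue`, the single 16x16 tile, and the `alpha` scaling step are assumptions chosen to keep the example small; this is not how CUTLASS structures its kernels.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <mma.h>

using namespace nvcuda;

// Hypothetical kernel: one warp computes a single 16x16 tile, D = alpha * (A * B).
// The product is a Tensor Core MMA; the scaling is ordinary per-thread SIMT math.
__global__ void mma_with_simt_epilogue(const half* A, const half* B, float* D,
                                       float alpha) {
  wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a;
  wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b;
  wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

  wmma::fill_fragment(acc, 0.0f);
  wmma::load_matrix_sync(a, A, 16);
  wmma::load_matrix_sync(b, B, 16);
  wmma::mma_sync(acc, a, b, acc);  // Tensor Core matrix multiply-accumulate

  // SIMT epilogue: each thread scales the accumulator elements it holds.
  // Applying the same operation to every fragment element is permitted.
  for (int i = 0; i < acc.num_elements; ++i) acc.x[i] *= alpha;

  wmma::store_matrix_sync(D, acc, 16, wmma::mem_row_major);
}

int main() {
  half *A, *B;
  float* D;
  cudaMallocManaged(&A, 16 * 16 * sizeof(half));
  cudaMallocManaged(&B, 16 * 16 * sizeof(half));
  cudaMallocManaged(&D, 16 * 16 * sizeof(float));
  for (int i = 0; i < 16 * 16; ++i) {
    A[i] = __float2half(1.0f);
    B[i] = __float2half(1.0f);
  }

  const float alpha = 0.5f;
  mma_with_simt_epilogue<<<1, 32>>>(A, B, D, alpha);  // exactly one warp
  cudaDeviceSynchronize();

  // With all-ones inputs each dot product is 16, so each output is 16 * alpha.
  bool ok = true;
  for (int i = 0; i < 16 * 16; ++i) ok = ok && (D[i] == 16.0f * alpha);
  printf("%s\n", ok ? "results match" : "mismatch");

  cudaFree(A);
  cudaFree(B);
  cudaFree(D);
  return ok ? 0 : 1;
}
```

Here the SIMT work is a handful of multiplies in the epilogue, not a second GEMM running on the CUDA cores, which is the combination the reply above says is not worth pursuing.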
