
cudaOccupancyMaxActiveBlocksPerMultiprocessor #1424

Closed
jinz2014 opened this issue Mar 8, 2024 · 3 comments
Labels
cuda CUDA adapter specific issues

Comments

jinz2014 commented Mar 8, 2024

Is there a SYCL function for cudaOccupancyMaxActiveBlocksPerMultiprocessor? Some use cases are listed below. Thanks.

AITemplate/3rdparty/cutlass/include/cutlass/gemm/device/gemm_universal_adapter.h: result = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
AITemplate/3rdparty/cutlass/include/cutlass/gemm/device/gemm_universal_base.h: cudart_result = cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(
AITemplate/3rdparty/cutlass/include/cutlass/gemm/device/gemm_universal_base.h: CUTLASS_TRACE_HOST(" cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags() returned error " << cudaGetErrorString(cudart_result));
AITemplate/3rdparty/cutlass/include/cutlass/gemm/device/base_grouped.h: result =
AITemplate/3rdparty/cub/cub/device/dispatch/dispatch_radix_sort.cuh: if (CubDebug(error = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
AITemplate/3rdparty/cub/cub/util_device.cuh: return CubDebug(cudaOccupancyMaxActiveBlocksPerMultiprocessor(
AITemplate/python/aitemplate/backend/cuda/groupnorm/layer_norm.cuh: cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
AITemplate/python/aitemplate/backend/cuda/layernorm_sigmoid_mul/layer_norm.cuh: cudaError_t err =
AITemplate/python/aitemplate/backend/cuda/softmax/softmax.cuh: cudaOccupancyMaxActiveBlocksPerMultiprocessor(
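
For context, a minimal sketch of how this occupancy query is typically used on the CUDA side; the kernel and variable names here are illustrative and are not taken from the AITemplate sources listed above:

```cpp
// Illustrative only: query how many blocks of `my_kernel` can be resident
// per SM for a given block size and dynamic shared-memory budget.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void my_kernel(float* data) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  data[i] += 1.0f;
}

int main() {
  int num_blocks = 0;
  int block_size = 256;          // threads per block
  size_t dyn_smem_bytes = 0;     // dynamic shared memory per block

  cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
      &num_blocks, my_kernel, block_size, dyn_smem_bytes);
  if (err != cudaSuccess) {
    std::printf("error: %s\n", cudaGetErrorString(err));
    return 1;
  }
  std::printf("max active blocks per SM: %d\n", num_blocks);
  return 0;
}
```

The WithFlags variant seen above takes the same arguments plus a flags parameter (e.g. cudaOccupancyDisableCachingOverride).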

kbenzie added the needs-discussion (This needs further discussion) and cuda (CUDA adapter specific issues) labels on Apr 9, 2024
npmiller assigned GeorgeWeb and unassigned npmiller on May 15, 2024
GeorgeWeb (Contributor) commented

Hi @jinz2014. Working on this; I'll ping you when a PR is up. Thank you!

jinz2014 (Author) commented

Thanks. This will enable the migration of the function's variants in the SYCL compiler.

kbenzie removed the needs-discussion (This needs further discussion) label on May 21, 2024
npmiller (Contributor) commented

This has been implemented in:

That should make it possible to get the same information. We're also investigating some SYCL compat changes to make it easier to port from CUDA.
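
For anyone landing here later, a rough sketch of what such a query might look like on the SYCL side is below. The ext_oneapi_get_info call and the kernel_queue_specific::max_num_work_groups descriptor are assumptions based on the experimental DPC++ launch-queries extension, not something confirmed in this thread, so treat it as illustrative only and check the extension documentation for the exact names and semantics:

```cpp
// Illustrative sketch only; the descriptor name and query signature below
// are assumptions, not a confirmed API.
#include <sycl/sycl.hpp>

namespace syclex = sycl::ext::oneapi::experimental;

class OccupancyProbe;  // kernel name used below

int main() {
  sycl::queue q;

  // Instantiate the kernel so it is present in the executable bundle.
  q.parallel_for<OccupancyProbe>(sycl::range<1>{256}, [=](sycl::id<1>) {}).wait();

  auto bundle =
      sycl::get_kernel_bundle<sycl::bundle_state::executable>(q.get_context());
  sycl::kernel krn = bundle.get_kernel(sycl::get_kernel_id<OccupancyProbe>());

  // Assumed analogue of cudaOccupancyMaxActiveBlocksPerMultiprocessor:
  // how many work-groups of the given size, with the given amount of
  // dynamically allocated local memory, can run concurrently.
  size_t max_wgs = krn.ext_oneapi_get_info<
      syclex::info::kernel_queue_specific::max_num_work_groups>(
      q, sycl::range<3>{1, 1, 256}, /*dynamic local memory in bytes*/ 0);

  (void)max_wgs;
  return 0;
}
```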
