[FEA] Make block, thread, and warp indices unsigned #109

Open
gmarkall opened this issue Jan 9, 2025 · 2 comments
Labels
feature request New feature or request

Comments

@gmarkall
Collaborator

gmarkall commented Jan 9, 2025

Block, thread, and warp indices would ideally be unsigned, as in CUDA C/C++. This would:

  • Reduce computation and register usage (by avoiding the extra work of handling signs)
  • Align better with CUDA C/C++

However, due to Numba's typing rules, this can result in float indices being generated rather than int ones (e.g. unsigned + signed = float, or something like that), or in 32-bit values becoming 64-bit ones (as observed in numba/numba#6112 (comment)).
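
As a rough illustration of the promotion problem, here is a toy model of NumPy-style integer promotion in plain Python. It is not Numba's typing code, and the `promote` helper and its type table are hypothetical names introduced only for this sketch. It picks the narrowest integer type that covers both operand ranges and falls back to float64 when none exists. NumPy's `np.promote_types` gives the same results for the cases shown.

```python
# Toy model of NumPy-style integer promotion (an assumption-laden sketch,
# not Numba's actual typing implementation).

# (name, min value, max value), ordered from narrowest to widest.
INT_TYPES = [
    ("int32", -2**31, 2**31 - 1),
    ("uint32", 0, 2**32 - 1),
    ("int64", -2**63, 2**63 - 1),
    ("uint64", 0, 2**64 - 1),
]
RANGES = {name: (lo, hi) for name, lo, hi in INT_TYPES}


def promote(a, b):
    """Return the narrowest type able to hold every value of both a and b."""
    lo = min(RANGES[a][0], RANGES[b][0])
    hi = max(RANGES[a][1], RANGES[b][1])
    for name, tlo, thi in INT_TYPES:
        if tlo <= lo and hi <= thi:
            return name
    # No integer type covers both ranges, so fall back to floating point.
    return "float64"


print(promote("uint32", "int32"))   # int64: a 32-bit index becomes 64-bit
print(promote("uint64", "int64"))   # float64: an integer index becomes a float
print(promote("uint32", "uint32"))  # uint32: no promotion when types match
```

Under this model, an unsigned 32-bit thread index stays 32-bit only while every other operand is also unsigned 32-bit, which is why effects can show up in larger programs that mix types.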

This work was started in numba/numba#6112. Completing that PR would be sufficient to implement this feature request, but it may not be as simple as getting the test suite to behave identically: effects may be observed in more complex programs where types change.

An alternative path may be to define a separate type for thread indices that is more resistant to upcasting when used in computations, but the exact solution is unclear.

@gmarkall
Collaborator Author

gmarkall commented Jan 9, 2025

I just started going over the original PR and noticed that:

Perhaps the solution is to remove the possible promotion of the tid type to 64 bits, which would be the next thing to investigate.

gmarkall added a commit to gmarkall/numba-cuda that referenced this issue Jan 9, 2025
This ports numba/numba#6112 to numba-cuda, as outlined in NVIDIA#109.

Note that for this patch, we don't change the type of `grid()` and
`gridsize()` because these need to be 64 bit (as discovered in
numba/numba#9229 and fixed in numba/numba#9235).

We need to patch the `as_dtype()` function, which is a little
unfortunate, but there's no API for extending its behaviour at present.
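
One plausible reading of why `grid()` and `gridsize()` stay 64-bit is that a global index, computed as block index times block size plus thread index, can exceed the 32-bit range for large launches. The plain-Python sketch below models that arithmetic only. `global_index_wrapped` is a hypothetical helper, not part of Numba, and the linked issues remain the reference for the failure that was actually observed.

```python
# Sketch of 1D global index arithmetic at two bit widths (plain Python,
# not Numba code). Masking emulates unsigned wraparound at a given width.

def global_index_wrapped(block_idx, block_dim, thread_idx, bits):
    """Compute block_idx * block_dim + thread_idx, wrapped to `bits` bits."""
    mask = (1 << bits) - 1
    return (block_idx * block_dim + thread_idx) & mask


# A launch with 1024 threads per block and more than 2**22 blocks.
block_idx, block_dim, thread_idx = 2**22, 1024, 5

print(global_index_wrapped(block_idx, block_dim, thread_idx, 64))  # 4294967301
print(global_index_wrapped(block_idx, block_dim, thread_idx, 32))  # 5
```

At 32 bits, the thread in block 2**22 would alias the index of a thread in block 0, so keeping the combined index 64-bit avoids that collision even if the individual components are narrower.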
@gmarkall
Collaborator Author

gmarkall commented Jan 9, 2025

A draft PR that ports the changes over to numba-cuda is in #110.
