Block, thread, and warp indices would ideally be unsigned, as in CUDA C/C++. This would:
- Reduce computation and register usage (no sign handling would be needed)
- Align better with CUDA C/C++
However, Numba's typing rules can then produce float indices rather than integer ones (e.g. adding an unsigned and a signed 64-bit integer yields a float under NumPy-style promotion), or turn 32-bit values into 64-bit ones (as observed in numba/numba#6112 (comment)).
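To make the upcasting concern concrete, here is a minimal pure-Python sketch of NumPy-style promotion for mixed-sign integers. The `promote` helper is hypothetical and written only for illustration; it is not Numba's typing code, and Numba's own integer rules may differ in detail.

```python
# Simplified model of NumPy-style integer promotion.
# Illustration only: `promote` is a hypothetical helper, not a Numba API.

def promote(a, b):
    """Return the promoted type name for two integer type names, e.g. 'uint32'."""
    def parse(name):
        signed = not name.startswith("u")
        bits = int(name.split("int")[1])
        return signed, bits

    (sa, ba), (sb, bb) = parse(a), parse(b)

    # Same signedness: the wider type wins.
    if sa == sb:
        return ("int" if sa else "uint") + str(max(ba, bb))

    signed_bits, unsigned_bits = (ba, bb) if sa else (bb, ba)

    # The signed type is already wide enough to hold every unsigned value.
    if signed_bits > unsigned_bits:
        return "int" + str(signed_bits)

    # Otherwise a signed type twice as wide as the unsigned one is needed;
    # past 64 bits no integer type is available, so the result is a float.
    needed = unsigned_bits * 2
    return "int" + str(needed) if needed <= 64 else "float64"


print(promote("uint32", "int32"))  # 32-bit operands widen to a 64-bit result
print(promote("uint64", "int64"))  # no integer type fits, so a float results
```

The same pairs can be cross-checked against NumPy itself with `numpy.promote_types`, for instance `numpy.promote_types(numpy.uint64, numpy.int64)`.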
This was started in numba/numba#6112 - completion of this PR would be sufficient to implement this feature request, but it may not be as simple as getting the test suite to behave identically - effects may be observed in more complex programs where types change.
An alternative path may be to define a separate type for thread indices that is more resistant to upcasting when used in computations, but the exact solution is unclear.
This ports numba/numba#6112 to numba-cuda, as outlined in NVIDIA#109.
Note that for this patch, we don't change the type of `grid()` and
`gridsize()` because these need to be 64-bit (as discovered in
numba/numba#9229 and fixed in numba/numba#9235).
We need to patch the `as_dtype()` function, which is a little
unfortunate, but there's no API for extending its behaviour at present.