@nnethercote
Collaborator

@FractalFir identified that rust-cuda implements some intrinsics that are present in core. I also fixed some nearby rough edges.

It has two versions: one with an upper bound, and one with both a lower
and an upper bound. This commit removes the first one and changes the
second one to take a range, because that is more concise, more flexible,
and clearer.

Also, rename it as `in_range`, which makes sense given that the bounds
are specified via a Rust `Range`.
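To illustrate why a single range argument can replace separate upper-bound and lower-and-upper-bound forms, here is a minimal host-side sketch. It is not the code from this PR: the function-style `in_range` helper, its signature, and the `raw_thread_idx_x` stub are all hypothetical, and `core::hint::unreachable_unchecked` is just one way to communicate a range to the optimizer.

```rust
use core::ops::RangeBounds;

/// Hypothetical stand-in for a hardware intrinsic; returns a fixed value on the host.
fn raw_thread_idx_x() -> u32 {
    7
}

/// Hypothetical helper: returns `val` after promising the optimizer that it
/// lies inside `range`.
///
/// # Safety
/// The caller must guarantee that `val` really is inside `range`.
#[inline(always)]
unsafe fn in_range<R: RangeBounds<u32>>(val: u32, range: R) -> u32 {
    if !range.contains(&val) {
        // SAFETY: the caller guarantees this branch is never taken.
        unsafe { core::hint::unreachable_unchecked() }
    }
    val
}

/// One range argument expresses what previously needed two separate forms.
fn thread_idx_x() -> u32 {
    // SAFETY: the stub always returns 7, which lies in `0..1024`.
    unsafe { in_range(raw_thread_idx_x(), 0..1024) }
}
```

Because the helper accepts any `RangeBounds<u32>`, the same call site can take `..10`, `0..1024`, or `1..=1024` without extra variants.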

Note: some of the ranges are incorrect, and will be fixed in the next
commit.

Every single range has an upper bound that is one higher than it should
be.

- For `thread_idx_[xyz]`: indices are 0-indexed, so the maximum index is
  the `block_dim_[xyz]` maximum minus one. Changing `..=` to `..` fixes
  it.

- For `block_idx_[xyz]`: likewise, but relative to `grid_dim_[xyz]`.

- For `block_dim_[xyz]`: these were all one too big. Not sure why,
  perhaps a `..`/`..=` mix-up?

- For `grid_dim_[xyz]`: likewise. (Yes, these grid maximum dimensions
  are all of the form 2^N-1 even though the block maximum dimensions are
  all of the form 2^N. I don't know why, but it's what the CUDA docs
  say.)
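The off-by-one reasoning in the list above can be sketched with plain Rust ranges. The constants and function names here are illustrative only, and the x-dimension limits (1024 for blocks, 2^31-1 for grids) are assumed for the example; the CUDA documentation is the authority for a given target.

```rust
use core::ops::{Range, RangeInclusive};

// Illustrative x-dimension limits (assumed here; check the CUDA docs for your target).
const MAX_BLOCK_DIM_X: u32 = 1 << 10; // 1024, of the form 2^N
const MAX_GRID_DIM_X: u32 = (1 << 31) - 1; // 2147483647, of the form 2^N - 1

/// A dimension is a count: at least 1 and at most the maximum, so `..=` is right.
fn block_dim_x_range() -> RangeInclusive<u32> {
    1..=MAX_BLOCK_DIM_X
}

fn grid_dim_x_range() -> RangeInclusive<u32> {
    1..=MAX_GRID_DIM_X
}

/// An index is 0-based, so it stops one short of the dimension maximum: `..`, not `..=`.
fn thread_idx_x_range() -> Range<u32> {
    0..MAX_BLOCK_DIM_X
}

fn block_idx_x_range() -> Range<u32> {
    0..MAX_GRID_DIM_X
}
```

With these ranges, 1023 is the largest valid `thread_idx_x` and 1024 is the largest valid `block_dim_x`.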

Instead call the Rust functions that have the range constraints. That
way the 3d versions get the same range constraints as the 1d versions.
It also avoids the need for some `unsafe` blocks.
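A rough sketch of that shape, using host-side stubs: the `UVec3` type and the fixed return values are hypothetical and only stand in for the range-constrained 1d functions.

```rust
/// Hypothetical 3-component vector; the name and fields are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct UVec3 {
    pub x: u32,
    pub y: u32,
    pub z: u32,
}

// Host-side stubs standing in for the safe, range-constrained 1d functions.
pub fn thread_idx_x() -> u32 {
    3
}

pub fn thread_idx_y() -> u32 {
    2
}

pub fn thread_idx_z() -> u32 {
    1
}

/// The 3d version is built from the safe 1d functions, so it needs no
/// `unsafe` block and inherits whatever constraints those functions carry.
pub fn thread_idx() -> UVec3 {
    UVec3 {
        x: thread_idx_x(),
        y: thread_idx_y(),
        z: thread_idx_z(),
    }
}
```

The design point is that the range knowledge lives in one place (the 1d functions) instead of being repeated, or forgotten, in the 3d wrappers.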

`core` has equivalents, so we might as well use them instead.