Investigate performance when using OpSubgroupShuffleUpINTEL for SYCL shuffles #5364

The SPIR-V `OpSubgroupShuffleUpINTEL` and `OpSubgroupShuffleDownINTEL` operations have more functionality than is required to implement the SYCL shuffles, which leads to unnecessary complexity.
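To illustrate the extra functionality, here is a rough host-side model (plain C++, not device code). It assumes the two-source semantics as I understand them from the Intel subgroups extension: lanes whose source index falls below zero read from a second "previous" value instead. The SYCL shuffle only has one source. The function names are made up for this sketch, and keeping the lane's own value for out-of-range lanes is an assumption, since as far as I know SYCL leaves that result unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host model of a two-source shuffle-up: each vector holds one
// value per lane. Lane i reads current[i - delta] when that index is in
// range, otherwise it wraps into `previous` (index i - delta + n).
std::vector<int> shuffle_up_two_source(const std::vector<int>& previous,
                                       const std::vector<int>& current,
                                       std::size_t delta) {
    const std::size_t n = current.size();
    assert(previous.size() == n);
    assert(delta <= n);
    std::vector<int> out(n);
    for (std::size_t lane = 0; lane < n; ++lane) {
        if (lane >= delta) {
            out[lane] = current[lane - delta];
        } else {
            out[lane] = previous[lane + n - delta];
        }
    }
    return out;
}

// Hypothetical host model of the single-source shift the SYCL shuffle needs.
// Lanes with no in-range source keep their own value here (an assumption).
std::vector<int> shift_right_single_source(const std::vector<int>& x,
                                           std::size_t delta) {
    const std::size_t n = x.size();
    assert(delta <= n);
    std::vector<int> out(x);
    for (std::size_t lane = delta; lane < n; ++lane) {
        out[lane] = x[lane - delta];
    }
    return out;
}
```

Passing the same vector as both `previous` and `current` gives the same result as the single-source version on every lane at or above `delta`, so the second operand and the wrap-around path are work that the SYCL shuffle does not need.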

It would be worth measuring the potential performance hit and seeing how it could be optimized.

At the moment this is used for the HIP target. For NVIDIA, the NVIDIA built-ins are used directly instead of the SPIR-V operation; doing the same thing for AMD may also be beneficial for performance.

It is unclear whether this would have a significant impact, but it should be investigated.

This was discussed on:
* https://github.com/intel/llvm/pull/5359#discussion_r789775554
