@jiafuzha commented Sep 18, 2025

This PR adds XPU support for the following fbgemm ops:

fbgemm::asynchronous_complete_cumsum
fbgemm::jagged_to_padded_dense_forward
fbgemm::jagged_to_padded_dense
fbgemm::dense_to_jagged_forward
fbgemm::jagged_dense_elementwise_add_jagged_output
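
For reviewers less familiar with these ops, here is a minimal pure-Python sketch of what they are expected to compute. It is not the XPU implementation from this PR: it assumes the usual fbgemm jagged-tensor conventions (offsets with a leading zero, one scalar per jagged element, every row fitting within `max_length`), and the function names below are illustrative stand-ins rather than the registered op signatures.

```python
def complete_cumsum(lengths):
    """Assumed semantics of asynchronous_complete_cumsum:
    a cumulative sum with a leading 0, so the result has len(lengths) + 1 entries."""
    offsets = [0]
    for n in lengths:
        offsets.append(offsets[-1] + n)
    return offsets


def jagged_to_padded_dense(values, offsets, max_length, padding_value=0):
    """Expand a jagged list (values + offsets) into fixed-width rows, padding on the right.
    Rows longer than max_length are truncated in this sketch."""
    dense = []
    for b in range(len(offsets) - 1):
        row = values[offsets[b]:offsets[b + 1]][:max_length]
        dense.append(row + [padding_value] * (max_length - len(row)))
    return dense


def dense_to_jagged(dense, offsets):
    """Inverse direction: keep only the first (offsets[b+1] - offsets[b]) entries of row b."""
    values = []
    for b, row in enumerate(dense):
        values.extend(row[:offsets[b + 1] - offsets[b]])
    return values


def jagged_dense_elementwise_add(x_values, offsets, dense):
    """Assumed semantics of jagged_dense_elementwise_add_jagged_output:
    add the dense tensor to the jagged one at the valid positions only,
    returning jagged values."""
    y_values = dense_to_jagged(dense, offsets)
    return [x + y for x, y in zip(x_values, y_values)]


if __name__ == "__main__":
    offsets = complete_cumsum([2, 0, 3])
    dense = jagged_to_padded_dense([1, 2, 3, 4, 5], offsets, max_length=3)
    print(offsets)
    print(dense)
    print(dense_to_jagged(dense, offsets))
```

A reference like this can also serve as a CPU-side oracle when comparing XPU outputs in the unit tests, provided the assumptions above match the actual op contracts.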

Please make sure the environment variables below are set correctly before running the unit tests.

# Make sure ONEAPI_ROOT is set, since umf's vars.sh references it. Otherwise, you may not be able to see any device.
export ONEAPI_ROOT=.../intel/oneapi/
# DPCPP 2025.3
source .../DPCPP/env/vars.sh
source ~/intel/oneapi/mkl/latest/env/vars.sh
source .../pti_0.12/env/vars.sh
source .../umf/1.0.2/env/vars.sh
export BUILD_SEPARATE_OPS=ON
export BUILD_WITH_CPU=ON
export TORCH_XPU_ARCH_LIST='pvc'
export USE_PTI=ON
export USE_KINETO=ON
export USE_XETLA=OFF
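
Since a missing variable can surface only as "no device found", a quick pre-flight check may save some debugging. The following is a rough sketch, not part of this PR: `check_build_env` and `EXPECTED` are hypothetical names, and the expected values are simply copied from the exports above. It cannot verify that the `vars.sh` scripts were sourced, only that the exported variables are present.

```python
import os

# Expected values copied from the export lines above.
EXPECTED = {
    "BUILD_SEPARATE_OPS": "ON",
    "BUILD_WITH_CPU": "ON",
    "TORCH_XPU_ARCH_LIST": "pvc",
    "USE_PTI": "ON",
    "USE_KINETO": "ON",
    "USE_XETLA": "OFF",
}


def check_build_env(env):
    """Return a list of human-readable problems; an empty list means the env looks right."""
    problems = []
    # ONEAPI_ROOT only needs to be set and non-empty; its path is machine-specific.
    if not env.get("ONEAPI_ROOT"):
        problems.append("ONEAPI_ROOT is not set")
    for name, want in EXPECTED.items():
        got = env.get(name)
        if got != want:
            problems.append(f"{name}={got!r}, expected {want!r}")
    return problems


if __name__ == "__main__":
    for problem in check_build_env(os.environ):
        print(problem)
```

Running the script in the shell that will launch the tests prints one line per mismatch and nothing when every variable matches.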

@jiafuzha changed the title from "fbgemm async complete cumsum op, jagged and dense conversion ops" to "fbgemm async complete cumsum op, jagged and dense conversion ops, jagged_dense_elementwise_add_jagged_output op" Sep 22, 2025
@jiafuzha changed the title from "fbgemm async complete cumsum op, jagged and dense conversion ops, jagged_dense_elementwise_add_jagged_output op" to "fbgemm async complete cumsum op, jagged and dense conversion ops, jagged_dense_elementwise_add_jagged_output, reorder batched lengths and indices op" Sep 23, 2025