Skip to content

[UR][HIP] Enable usm pools #17972

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 14, 2025
Merged

[UR][HIP] Enable usm pools #17972

merged 3 commits into from
Apr 14, 2025

Conversation

npmiller
Copy link
Contributor

@npmiller npmiller commented Apr 11, 2025

This patch fixes up and enable memory pools for the HIP adapter, it is based on oneapi-src/unified-runtime#1689 and on the CUDA adapter implementation.

The initial patch had segmentation faults in the CI that we couldn't reproduce locally. That happened as well in this patch and I couldn't reproduce the segfaults locally either.

However I noticed that it failed in urUSMHostAlloc, and that entry point was different from the CUDA adapter version, in that the HIP adapter was using a "helper" function. It turns out that the helper function was using a device pool instead of a host pool to do the allocation, which seemed obviously wrong. Replacing the helper by similar code used in the CUDA adapter fixes the crash in the CI.

The pool helper works on device pool and isn't suitable for host
allocations.
@npmiller npmiller marked this pull request as ready for review April 11, 2025 14:44
@npmiller npmiller requested review from a team as code owners April 11, 2025 14:44
Copy link
Contributor

@aarongreig aarongreig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UR CTS change LGTM

The helper was using the device pool for all of the allocations
resulting in crashes.
Copy link
Contributor

@frasercrmck frasercrmck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks familiar! Thanks for finding that pesky segfault after all this time.

@npmiller
Copy link
Contributor Author

@intel/llvm-gatekeepers this is ready to merge

@kbenzie kbenzie merged commit 010c310 into intel:sycl Apr 14, 2025
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants