[SYCL][UR][CUDA] wrong order of the ~ur_device_handle_t_() destructor and a user-app static buffer destructor
#17450
See: #17411 (comment)
See the reproduction in CI: https://github.com/intel/llvm/actions/runs/13855853921/job/38773587493?pr=17468
DPC++ version: on the PR: #17468
aarongreig added a commit to aarongreig/intel-llvm that referenced this issue Mar 21, 2025
In the CUDA adapter, the adapter struct itself is currently an extern global defined in adapter.cpp. This means that fully tearing down the adapter is subject to the same destructor ordering as all other static and global variables: first in, last out. This presents a problem because an application can declare a static SYCL object such as a buffer right at the top, before doing anything else, which results in the SYCL object being destroyed after the CUDA adapter struct.

The UR spec doesn't put the onus on users to keep their parent object lifetimes in order, i.e. there is no statement like "the context you use to create a ur_mem_handle_t must not be released until after the mem_handle". It's assumed (by omission rather than explicitly) that adapters will have their objects keep a reference to any parent objects alive for the duration of their own lifetime.

This change moves the CUDA adapter struct's ownership into a global shared_ptr, which allows child objects of the adapter to keep their own references to it alive past the point where its initial definition goes out of scope. It also adjusts how some other objects track parent object references so that the destructors correctly cascade back to the top: the mem handle releases its context, which releases its adapter, which releases the platform + devices, etc.

Fixes intel#17450
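The ownership scheme described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the actual adapter code: `Adapter`, `Context`, `Mem`, `Log` and `teardown_order` are hypothetical stand-ins invented here, and the only assumption carried over from the commit is that each child holds a `std::shared_ptr` to its parent. The sketch drops the "global" references first (as happens when the adapter's static is destroyed before a user's static buffer) and shows that nothing is torn down until the last child goes away.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not the real UR types. Each destructor appends
// its name to a shared log so the teardown order can be inspected.
using Log = std::vector<std::string>;

struct Adapter {
  explicit Adapter(Log &log) : log(log) {}
  ~Adapter() { log.push_back("adapter"); }
  Log &log;
};

struct Context {
  Context(std::shared_ptr<Adapter> a, Log &log)
      : adapter(std::move(a)), log(log) {}
  ~Context() { log.push_back("context"); }
  std::shared_ptr<Adapter> adapter; // keeps the parent adapter alive
  Log &log;
};

struct Mem {
  Mem(std::shared_ptr<Context> c, Log &log)
      : context(std::move(c)), log(log) {}
  ~Mem() { log.push_back("mem"); }
  std::shared_ptr<Context> context; // keeps the parent context alive
  Log &log;
};

// Returns the order in which the three objects were destroyed.
Log teardown_order() {
  Log log;
  auto adapter = std::make_shared<Adapter>(log);
  auto context = std::make_shared<Context>(adapter, log);
  auto mem = std::make_unique<Mem>(context, log);

  // Drop the "global" references first, mimicking the adapter's own
  // static being destroyed before the user's static buffer.
  adapter.reset();
  context.reset();
  assert(log.empty()); // parents are still alive through the child

  // Destroying the last child cascades back up the ownership chain.
  mem.reset();
  return log;
}
```

Calling `teardown_order()` yields the entries `mem`, `context`, `adapter` in that order: each destructor body runs before its `shared_ptr` member releases the parent, so the cascade always runs from child to parent.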
Describe the bug
Wrong order of the ~ur_device_handle_t_() destructor and a user-app static buffer destructor.
The ur_device_handle_t_::~ur_device_handle_t_() destructor of a CUDA device is called too early, before the sycl::buffer destructor of a user-app static buffer. As a result, the CUDA device is destroyed before memory allocated from this device is freed, and the test segfaults. See part of a log from the sycl/test-e2e/Regression/static-buffer-dtor.cpp test:
See: #17411 (comment)
Ref: #17411
To reproduce
Include a code snippet that is as short as possible: the sycl/test-e2e/Regression/static-buffer-dtor.cpp SYCL test.
Specify the command which should be used to compile the program:
The sycl/test-e2e/Regression/static-buffer-dtor.cpp test segfaults on the PR #17468.
See the reproduction in CI: https://github.com/intel/llvm/actions/runs/13855853921/job/38773587493?pr=17468
Environment
sycl-ls --verbose
Additional context
This bug can happen if SYCL calls urAdapterRelease() before this static sycl::buffer is destroyed.