[SYCL][UR][L0 v2] Fix issues with event lifetime #18962
Event pools are released to the context when the queue is released. So this would mean that the command buffer outlives the context itself, not just the queue.
lgtm, but please add a test that releases a command buffer and queue before using the command buffer for the last time.
Yes, and we could actually solve this problem by retaining the context in the command buffer rather than the queue. However, there is a potential problem with the execution event: if the queue associated with that event is destroyed and we later want to synchronize on the execution event, we might crash, because (as far as I know) counter-based events rely on the associated command list to keep some of their state, and the command list is potentially destroyed or re-used as soon as the queue is released. I solved this problem by retaining the queue in the command buffer, but now that I think about it, it is a more general problem. What if someone wants to pass an event as a dependency to some operation after the associated queue is destroyed? I think we might actually need to start retaining the queue when allocating events (which we have tried to avoid until now). One alternative I'm considering is to do some logic in the event pool. For example, we could create another RAII wrapper that would internally store
:(
I think this might be the best option. I'll think on it. But let's do that as a separate PR.
Yes, for now, I think we can just merge the fix with context retain/release. This is enough for all the current tests to pass (with an out-of-order queue).
@intel/sycl-graphs-reviewers could you please take a look?
The commandBuffer releases executionEvent in its destructor, so it also needs to retain the queue. Otherwise event->release() in commandBuffer's destructor might attempt to return the event to an event pool that has already been destroyed.
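A minimal sketch of that lifetime rule, not the adapter's actual code: all type and function names below are hypothetical, and `std::shared_ptr` stands in for the retain/release calls. It only illustrates that an object which returns an event to a pool in its destructor has to keep the pool's owner alive until then.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for an event pool; counts events returned to it.
struct EventPoolSketch {
  int *returned; // external counter so the result is observable
  void giveBack() { ++*returned; }
};

// Hypothetical stand-in for the object that owns the pool.
struct PoolOwnerSketch {
  EventPoolSketch pool;
  explicit PoolOwnerSketch(int *counter) : pool{counter} {}
};

// Hypothetical command buffer: holding the shared_ptr plays the role of
// "retain", dropping it in the destructor plays the role of "release".
struct CommandBufferWithOwner {
  std::shared_ptr<PoolOwnerSketch> owner;
  explicit CommandBufferWithOwner(std::shared_ptr<PoolOwnerSketch> o)
      : owner(std::move(o)) {}
  // The pool is guaranteed to be alive here because owner is still held.
  ~CommandBufferWithOwner() { owner->pool.giveBack(); }
};

int eventsReturnedAfterOwnerHandleDropped() {
  int returned = 0;
  auto owner = std::make_shared<PoolOwnerSketch>(&returned);
  {
    CommandBufferWithOwner cb(owner);
    owner.reset(); // the user drops their handle first
  }                // cb's destructor still sees a live pool
  return returned;
}
```

Without the retained owner, the destructor would touch a pool that was already destroyed when the user dropped their handle.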
Also, make sure eventPool is destroyed after commandListManager to avoid any issues with events owned by the commandListManager, and move context retain/release from commandListManager to queue/command_buffer.
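For reference, C++ destroys non-static data members in reverse order of declaration, so one way to get this ordering is to declare the pool before the manager. The sketch below uses hypothetical stand-in types (`Tracer`, `MemberOrderSketch`), not the real classes, and only records the order in which members are destroyed.

```cpp
#include <string>
#include <vector>

// Hypothetical helper that logs its name when destroyed.
struct Tracer {
  std::vector<std::string> *log;
  std::string name;
  Tracer(std::vector<std::string> *l, std::string n)
      : log(l), name(std::move(n)) {}
  ~Tracer() { log->push_back(name); }
};

// Hypothetical owner of both members.
struct MemberOrderSketch {
  // Declared first, therefore destroyed last.
  Tracer eventPool;
  // Declared second, therefore destroyed first, while eventPool is still alive.
  Tracer commandListManager;

  explicit MemberOrderSketch(std::vector<std::string> *log)
      : eventPool(log, "eventPool"),
        commandListManager(log, "commandListManager") {}
};

std::vector<std::string> destructionOrder() {
  std::vector<std::string> log;
  { MemberOrderSketch sketch(&log); } // members are destroyed at this brace
  return log;
}
```

Calling `destructionOrder()` yields `commandListManager` followed by `eventPool`, so any events the manager releases during its own destruction still have a pool to go back to.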
The implementation of commandListManager marked the move ctor and move assignment operator as defaulted, which was wrong: a correct implementation would have to account for the context and device retain/release calls. To avoid this, move the logic to queue/commandBuffer, which can have their move ctors/assignment operators removed.
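To illustrate why a defaulted move is a problem for a class that pairs a manual retain with a release, here is a small sketch with a hypothetical `FakeContext` and plain integer reference count (assumptions for illustration, not the adapter's types). The defaulted move copies the raw pointer without a matching retain, so two destructors release the same reference.

```cpp
#include <utility>

// Hypothetical context with a manual reference count.
struct FakeContext {
  int refCount = 1; // the creator's own reference
};

// Retains in the ctor, releases in the dtor, but defaults the move ctor.
struct DefaultedMoveHolder {
  FakeContext *ctx;
  explicit DefaultedMoveHolder(FakeContext *c) : ctx(c) { ++ctx->refCount; }
  // Copies ctx without retaining; the moved-from object still releases it.
  DefaultedMoveHolder(DefaultedMoveHolder &&) = default;
  ~DefaultedMoveHolder() { --ctx->refCount; }
};

// Same retain/release pairing, but copy and move are deleted.
struct NonMovableHolder {
  FakeContext *ctx;
  explicit NonMovableHolder(FakeContext *c) : ctx(c) { ++ctx->refCount; }
  NonMovableHolder(const NonMovableHolder &) = delete;
  NonMovableHolder &operator=(const NonMovableHolder &) = delete;
  NonMovableHolder(NonMovableHolder &&) = delete;
  NonMovableHolder &operator=(NonMovableHolder &&) = delete;
  ~NonMovableHolder() { --ctx->refCount; }
};

int refCountAfterDefaultedMove() {
  FakeContext ctx;
  {
    DefaultedMoveHolder a(&ctx);         // one retain
    DefaultedMoveHolder b(std::move(a)); // no retain, but a second owner
  }                                      // two releases for one retain
  return ctx.refCount;
}

int refCountAfterNonMovable() {
  FakeContext ctx;
  { NonMovableHolder a(&ctx); } // one retain, one release
  return ctx.refCount;
}
```

In this sketch the defaulted-move variant drops the creator's reference as well, while the non-movable variant leaves the count balanced. In a real reference-counted handle the extra release could destroy the object while it is still in use.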