-
Notifications
You must be signed in to change notification settings - Fork 95
fix: delay processing of localTrackPublished event because of race condition #546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please write what the root cause is, why this is only an issue only with bun and why setImmediate
solved it?
} else if (ev.case == 'localTrackPublished') { | ||
const publication = this.localParticipant!.trackPublications.get(ev.value.trackSid!); | ||
this.emit(RoomEvent.LocalTrackPublished, publication!, this.localParticipant!); | ||
setImmediate(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add a comment on why we use setImmediate
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the root cause is, that the localTrackPublished
event is processed before this.localParticipant.trackPublications
has the track in the map.
This happens, because publishTrack
in participant.ts
sends the publishTrack
request to the FfiClient. Then it waits for the publishTrack
event from the FfiClient with the result. The result is used to set the track in the trackPublications
map.
Right after the publishTrack
event is emitted by the FfiClient, the localTrackPublished
event is also emitted. Since the publishTrack
function in participant.ts
is async and awaits the arrival of the publishTrack
event, it is not guaranteed that this code will resume before the event loop has finished with other things, like calling the localTrackPublished
handler.
By using setImmediate() in the localTrackPublished
handler, we ensure that the event loop has processed the awaited promises before we try to access the trackPublications
map.
Since I don't see the whole picture, this might be solved by not relying on execution order in an asynchronous scenario like it is the case, and refactoring might be a better solution. But this is maybe a bigger thing and using setImmediate
is a quick fix for the current version.
This is a problem with Bun currently, because it seems to process the event loop differently. But since there is no guarantee, this could break in a future version of NodeJS, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using setImmediate is a quick fix for the current version
this is the only thing I'm a bit concerned about. It doesn't really address the underlying issue that what we'd want to be doing is consecutively process each incoming event so that execution order is guaranteed.
By setImmediate
ing one of them, we just potentially kick the can down the road
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a general problem the way events are handled with waitFor
. The event handler should directly perform what is needed, not just resolve a promise that causes another function to continue, because the execution order is not guaranteed for that. One way to handle this maybe is to change waitFor
in publishTrack
to an event handler that adds the track
to trackPublications
.
would love to get this in, even in a partial fix state because we run into this error all the time and would be amazing to get it cleaned up while working out how to properly do events one after the other. |
I opened #547 to address this more generally, could you try with these changes and report if it works as expected for you? |
if this was directed at me then I can't really test that because it happens in production usually and not locally. |
Maybe fixes #462 and #472.
Without this, I cannot start an agent using Bun runtime.