Running update concurrently #39
These are very good points @fcollonval, thanks for pointing that out.
I would go even further than what @fcollonval is describing. I don't think that we even need to gather at all between updating the peers and saving to the backend. We could do:

```python
async for message in websocket:
    await self._process_queue.put(message)
    await self._broadcast_queue.put(message)
```

and then have two completely independent asyncio tasks.
In doing so, we would prevent either of the two tasks from stalling when the other one is (momentarily) choking on updates and accumulating a longer queue. Another thing we could experiment with is to synchronously use …
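The two-queue idea above can be sketched as follows. The queue names come from the snippet in the comment; everything else (the fake message source, the worker functions, the delays) is illustrative, not the actual ypy-websocket implementation.

```python
import asyncio

async def fan_in(messages, process_queue, broadcast_queue):
    # Feed every incoming message to both queues, so that neither
    # consumer can stall the other; each just grows its own backlog.
    for message in messages:
        await process_queue.put(message)
        await broadcast_queue.put(message)
    await process_queue.put(None)    # sentinel: no more messages
    await broadcast_queue.put(None)

async def worker(queue, sink, delay):
    # Independent consumer: drains its own queue at its own pace.
    while (message := await queue.get()) is not None:
        await asyncio.sleep(delay)  # simulate processing / a slow peer
        sink.append(message)

async def main():
    process_queue, broadcast_queue = asyncio.Queue(), asyncio.Queue()
    processed, broadcast = [], []
    await asyncio.gather(
        fan_in([b"u1", b"u2", b"u3"], process_queue, broadcast_queue),
        worker(process_queue, processed, delay=0.0),    # fast consumer
        worker(broadcast_queue, broadcast, delay=0.01),  # slow consumer
    )
    return processed, broadcast

processed, broadcast = asyncio.run(main())
```

The slow broadcast worker never delays the fast processing worker: each only waits on its own queue.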
Great discussion! I like @SylvainCorlay's idea of having two async queues for processing and broadcasting the messages. Another concern I have is the presence of blocking calls in the server or server extensions; those would quickly slow everything down. We should probably spend some time profiling the server to understand where blocking calls are happening.
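One standard way to keep a blocking call from stalling the event loop is to off-load it with `asyncio.to_thread` (Python 3.9+). A minimal sketch, where `save_snapshot` is a hypothetical stand-in for any blocking server-side call:

```python
import asyncio
import time

def save_snapshot(data: bytes) -> int:
    # Stand-in for a blocking call (disk write, sync HTTP request, ...).
    time.sleep(0.05)
    return len(data)

async def main():
    ticks = 0

    async def heartbeat():
        # This keeps ticking while the blocking save runs in a thread,
        # showing that the event loop stays responsive.
        nonlocal ticks
        for _ in range(5):
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    written = await asyncio.to_thread(save_snapshot, b"ydoc state")
    await hb
    return written, ticks

written, ticks = asyncio.run(main())
```

Had `save_snapshot` been called directly in the coroutine, the heartbeat would have frozen for the duration of the `time.sleep`.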
It could be interesting to compare with Jupyverse. In particular, the file ID service will be fully async, while …
Yet another alternative to @SylvainCorlay's proposal is to just create background tasks for processing messages, updating our internal state, and broadcasting to clients. They would be "fire and forget" tasks.
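A sketch of the "fire and forget" pattern, using `asyncio.create_task` and keeping strong references to the tasks so they are not garbage-collected mid-flight (as the asyncio documentation recommends). The `handle` coroutine and message payloads are illustrative:

```python
import asyncio

background_tasks = set()

def fire_and_forget(coro):
    # Schedule the coroutine without awaiting it; hold a strong
    # reference so the task is not garbage-collected before it runs.
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return task

results = []

async def handle(message):
    await asyncio.sleep(0)   # simulate async processing work
    results.append(message)

async def main():
    for msg in ("u1", "u2"):
        fire_and_forget(handle(msg))  # the receive loop never blocks here
    # On shutdown, drain whatever is still in flight.
    await asyncio.gather(*background_tasks)

asyncio.run(main())
```

The trade-off is that errors in fire-and-forget tasks surface only through the task's exception, so production code would want a done-callback that logs failures.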
I made these changes part of #38.
In @SylvainCorlay's proposal, a task consumes the …
Would …
OK, I did not know that …
The publication of updates is done one task at a time:
ypy-websocket/ypy_websocket/websocket_server.py
Lines 113 to 123 in 357d2a6
We should do those updates concurrently in an `asyncio.gather` to get them out as quickly as possible and not be stuck behind a slow client. Moreover, we should update the document with higher priority, because it is the reference to be uploaded to any new client before it starts receiving deltas.
I think that part of the code is partly responsible for a data-loss case seen when 20 people were collaborating simultaneously. What happened is that some clients (we cannot know if all of them) were still receiving the deltas, but the file on disk (which is a regular dump of the in-memory ydoc) stopped being updated. And when a new client connected, the document at the latest dumped version was the one pushed to it.
My best guess (unfortunately we did not see any error) is that one client was blocking, so the code after the client loop was never executed.
cc: @davidbrochart @hbcarlos