
✨Collaboration long polling fallback #517

Closed

wants to merge 10 commits into main from feature/collab-long-polling
Conversation

AntoLC
Collaborator

@AntoLC AntoLC commented Dec 18, 2024

Purpose

Some users have WebSockets blocked, so they cannot collaborate.
If they are connected at the same time as other collaborators, this creates constant conflicts in the document.

Proposal

We have managed to provide an experience almost as good as with WebSockets.

  • We use an HTTP fallback when the WebSocket is not able to connect (see the client-side sketch after this list).
  • We still use the Hocus Pocus mechanism, so push and pull are triggered by the Hocus Pocus provider and server.
  • By using the Hocus Pocus mechanism, we still rely on y-protocols/sync, which keeps our requests very light (a few bytes).
  • We use SSE (https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events)
    to pull data, to minimize the number of requests and keep the documents in sync with each other as much as possible.
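A minimal client-side sketch of this fallback, assuming hypothetical endpoint paths and a plain base64-encoded Yjs update as payload; the real implementation goes through the Hocus Pocus provider and the y-protocols/sync messages rather than raw document updates:

```ts
import * as Y from 'yjs';

// Hypothetical helpers; the PR moves a similar toBase64 into "doc-management".
const toBase64 = (data: Uint8Array) => btoa(String.fromCharCode(...data));
const fromBase64 = (data: string) =>
  Uint8Array.from(atob(data), (c) => c.charCodeAt(0));

export function startHttpFallback(doc: Y.Doc, room: string) {
  // Push: POST every local update to the collaboration server.
  doc.on('update', (update: Uint8Array, origin: unknown) => {
    if (origin === 'sse-fallback') return; // do not echo remote updates back
    void fetch(`/collaboration/api/${room}/message/`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message: toBase64(update) }),
    });
  });

  // Pull: an EventSource (SSE) receives updates dispatched by the server.
  const source = new EventSource(`/collaboration/api/${room}/sse/`);
  source.onmessage = (event) => {
    const { message } = JSON.parse(event.data) as { message: string };
    Y.applyUpdate(doc, fromBase64(message), 'sse-fallback');
  };

  return () => source.close();
}
```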

Cases we solved:

  • connect users together even when WebSockets are blocked
  • keep rights (can edit / can view) by using the same mechanism as with the WS
  • keep the awareness (cursor), sync and doc updates
  • keep our requests light
  • add an nginx auth cache system - the backend is queried only once every 30 seconds
  • test what I could

Architecture

flowchart TD
    title1[WebSocket Success]-->Client1(Client)<--->|WebSocket Success|WS1(WebSocket) --> Nginx1(Nginx) <--> Auth1("Auth Sub Request (Django)") --->|With the proper rights|YServer1("Hocus Pocus Server")
  YServer1 --> WS1
  YServer1 <--> clients(Dispatch to clients)
  title2[WebSocket Fails - Push data]-->Client2(Client)---|WebSocket fails|HTTP2(HTTP) --> Nginx2(Nginx) <--> Auth2("Auth Sub Request (Django)")--->|With the proper rights|Express2(Express) --> YServer2("Hocus Pocus Server") --> clients(Dispatch to clients)
  title3[WebSocket Fails - Pull data]-->Client3(Client)<--->|WebSocket fails|SSE(SSE) --> Nginx3(Nginx) <--> Auth3("Auth Sub Request (Django)") --->|With the proper rights|Express3(Express) --> YServer3("Listen Hocus Pocus Server")
  YServer3("Listen Hocus Pocus Server") --> SSE
  YServer3("Listen Hocus Pocus Server") <--> clients(Data from clients)
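On the server side, the two fallback branches of the flowchart roughly translate to the following hypothetical Express sketch (paths, payload shape and the in-memory document store are assumptions; in the PR the Express layer sits behind the nginx auth sub-request and talks to the Hocus Pocus server instead of holding documents itself):

```ts
import express, { Request, Response } from 'express';
import * as Y from 'yjs';

const app = express();
app.use(express.json());

const docs = new Map<string, Y.Doc>();               // one shared doc per room
const listeners = new Map<string, Set<Response>>();  // open SSE connections

const getDoc = (room: string): Y.Doc => {
  if (!docs.has(room)) docs.set(room, new Y.Doc());
  return docs.get(room)!;
};

// Push endpoint: apply the client update, then fan it out to SSE listeners.
app.post('/collaboration/api/:room/message/', (req: Request, res: Response) => {
  const { room } = req.params;
  const update = Buffer.from(req.body.message as string, 'base64');
  Y.applyUpdate(getDoc(room), update);
  for (const client of listeners.get(room) ?? []) {
    client.write(`data: ${JSON.stringify({ message: req.body.message })}\n\n`);
  }
  res.status(200).json({ ok: true });
});

// Pull endpoint: keep the response open and stream Server-Sent Events.
app.get('/collaboration/api/:room/sse/', (req: Request, res: Response) => {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
  const { room } = req.params;
  if (!listeners.has(room)) listeners.set(room, new Set());
  listeners.get(room)!.add(res);
  req.on('close', () => listeners.get(room)!.delete(res));
});

app.listen(4444);
```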

@AntoLC AntoLC self-assigned this Dec 18, 2024
@AntoLC AntoLC changed the title ✨Collab long polling ✨Collaboration long polling fallback Dec 18, 2024
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 3 times, most recently from 1360973 to 55238a7 Compare December 23, 2024 16:18
@AntoLC AntoLC changed the base branch from main to refacto/collaboration-process December 23, 2024 16:19
@AntoLC AntoLC mentioned this pull request Dec 23, 2024
4 tasks
Base automatically changed from refacto/collaboration-process to main December 24, 2024 11:29
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 4 times, most recently from 137c5b1 to ff343ca Compare December 24, 2024 15:21
@AntoLC AntoLC marked this pull request as ready for review December 24, 2024 15:21
@AntoLC AntoLC requested a review from YousefED December 24, 2024 15:25
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch from ff343ca to d95e892 Compare December 24, 2024 15:32
@virgile-dev
Collaborator

Great job @AntoLC !

Collaborator

@YousefED YousefED left a comment


@AntoLC Nice you got this working. If I see it correctly, you created a new endpoint over which you're always syncing the entire Y.Doc to and from the server.

If I'm not mistaken, normally the Y.js sync protocol is more efficient than this and syncs the exact updates required. What's the reason you went for this approach (new endpoint, syncing entire doc) instead of the proxy approach? I think the proxy approach has some potential advantages:

  • We can keep the same sync protocol, but just switch to a different transport (more efficient and awareness would still work)
  • The HocusPocus side can stay the same; our "fix" would be isolated to a separate layer (less code complexity and a smaller chance of bugs or security issues)

I might be missing some advantages of your current approach, but my concern is mainly that it adds more "custom code" that's another surface we need to test, maintain and secure. The proxy approach would isolate / limit this more, I think.
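To make the suggestion concrete, here is a rough, hypothetical sketch of "same sync protocol, different transport": the standard y-protocols/sync messages are produced as usual but carried over a POST request instead of a WebSocket frame (the endpoint and the server behaviour are assumptions, not what this PR implements):

```ts
import * as Y from 'yjs';
import * as syncProtocol from 'y-protocols/sync';
import * as encoding from 'lib0/encoding';
import * as decoding from 'lib0/decoding';

export async function syncOnceOverHttp(doc: Y.Doc, room: string) {
  // Sync step 1: describe what we already have (only a few bytes).
  const encoder = encoding.createEncoder();
  syncProtocol.writeSyncStep1(encoder, doc);

  const response = await fetch(`/collaboration/api/${room}/sync/`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/octet-stream' },
    body: encoding.toUint8Array(encoder),
  });

  // The server answers with sync step 2: only the updates we are missing.
  const reply = new Uint8Array(await response.arrayBuffer());
  const decoder = decoding.createDecoder(reply);
  const replyEncoder = encoding.createEncoder();
  syncProtocol.readSyncMessage(decoder, replyEncoder, doc, 'http-fallback');
}
```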

@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 4 times, most recently from b24b01c to 3eb9f69 Compare January 21, 2025 14:37
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 4 times, most recently from b8ff4ad to c64f1f2 Compare February 14, 2025 15:58
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch from c64f1f2 to d26da26 Compare February 14, 2025 16:08
@AntoLC
Collaborator Author

AntoLC commented Feb 14, 2025

You can test this PR before it is merged on https://docs-ia.beta.numerique.gouv.fr/.
To deactivate the WebSocket, add the query param withoutWS=true.

Example public doc: https://docs-ia.beta.numerique.gouv.fr/docs/481a9933-3514-4aeb-9877-c21be1388877/?withoutWS=true
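A minimal sketch of how the client can read that flag, assuming it is taken straight from the current URL (the actual wiring into the provider may differ):

```ts
// Read the query parameter that disables the WebSocket transport,
// so the HTTP/SSE fallback can be exercised during testing.
export function isWebSocketDisabled(): boolean {
  const params = new URLSearchParams(window.location.search);
  return params.get('withoutWS') === 'true';
}
```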

@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 4 times, most recently from c096f35 to 56f9a00 Compare February 14, 2025 19:45
@AntoLC AntoLC requested review from lunika and YousefED February 14, 2025 19:46
@@ -34,6 +41,10 @@ server {
}

location /collaboration-auth {
proxy_cache auth_cache;
proxy_cache_key "$http_authorization";
Member


Maybe add something more specific to avoid sharing the same cache key later with another location.

Suggested change
proxy_cache_key "$http_authorization";
proxy_cache_key "$http_authorization$request_uri";

Collaborator Author


Yes, you're totally right.

@@ -82,7 +82,11 @@ ingressCollaborationWS:
## @param ingressCollaborationWS.annotations.nginx.ingress.kubernetes.io/proxy-send-timeout
## @param ingressCollaborationWS.annotations.nginx.ingress.kubernetes.io/upstream-hash-by
annotations:
nginx.ingress.kubernetes.io/auth-cache-key: "$http_authorization"
Member


Same here? "$http_authorization$request_uri";

@AntoLC AntoLC force-pushed the feature/collab-long-polling branch from 56f9a00 to 8a27a29 Compare February 20, 2025 10:18
The environment was missing in the Sentry configuration.
This commit adds the environment to the Sentry configuration.

We can now interact with the collaboration server using HTTP requests.
It will be used as a fallback when the WebSocket is not working.
Two kinds of requests:
 - to send messages to the server we use POST requests
 - to get messages from the server we use a GET request using SSE (Server-Sent Events)

We will need toBase64 in different features, better to move it to "doc-management".

Create the CollaborationProvider class.
This class inherits from the HocuspocusProvider class.
It integrates a fallback mechanism to handle the case where the user cannot connect with WebSockets.
It uses POST requests to send data to the collaboration server and an EventSource to receive data from the collaboration server.

We adapt the nginx configuration to work with HTTP requests on the collaboration routes.
Requests are light but quite network intensive, so we add a cache system above "collaboration-auth".
It means the backend will be called only once every 30 seconds after a 200 response.

Firefox with websocket
Other without

Documentation to describe the collaboration architecture in the project.
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 2 times, most recently from 4730321 to f716c49 Compare February 20, 2025 10:21
} as MessageEvent);
}

if (updatedDoc64) {
Collaborator


Why do we have our own logic to apply the message? Isn't it enough (and easier) to let the onMessage function handle this for all cases? Or is there a reason that doesn't work?


Zooming out, I'm a little concerned by all the manual Yjs operations you have to do. I was hoping you could just use the existing sync protocol (and the code that handles it), but only over a different transport layer. My guess is you ran into issues with this and came up with some workarounds? That does make the code a little more difficult to review (especially when I don't have the context of why which workarounds are necessary).

/**
* Sync the document with the server.
*
* In some rare cases, the document may be out of sync.
Collaborator


Similar to above, could you also call the existing HP forceSync and let the Yjs sync protocol handle this? Passing Yjs documents and updates around seems a little dangerous to me (at least it's difficult for me to verify whether this is correct or not).

* Sent to the server the message to
* be sent to the other users
*/
public async onPollOutgoingMessage({ message }: onOutgoingMessageParameters) {
Collaborator


I got confused at some parts when checking the code; I think it might be helpful to explain or improve the naming a little bit.

Technically this method is not "polling" anymore, right? When you're polling, you retrieve an update from somewhere, but here we're just sending a message, like a regular POST request.


Also, you're not using Long-Polling at all, because long-polling means (afaik) that you send a request to the server, which keeps the connection open until the server has something interesting to send to you. I don't think this is the case, as you're using SSE for events from server -> client.

Which is fine, but maybe better to avoid the term long-polling then, to avoid confusion down the line.

@AntoLC
Collaborator Author

AntoLC commented Apr 10, 2025

We will close it for now as it didn't work for the target users.

@AntoLC AntoLC closed this Apr 10, 2025