Fix sliding sync performance slow down for long lived connections. #19206

erikjohnston · 2025-11-20T13:40:33Z

Fixes #19175

This PR moves the tracking of which lazy-loaded memberships we've sent for each room out of the required state table. This stops that table from continuously growing, which massively helps performance, as we pull out all matching rows for the connection whenever we receive a request.

The new table is only read when we have data to send in a room, so we end up reading far fewer rows from the DB. The trade-off is that we now read from that table once for every room we have events to return in, rather than once at the start of the request.

For an explanation of how the new table works, see the comment on the table schema.

The table is designed so that we can later prune old entries if we wish, but that is not implemented in this PR.
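
To illustrate the access pattern described above, here is a minimal in-memory sketch. It is not the actual schema or storage code from this PR: `LazyMemberStore`, `members_to_send`, and the `connection_key` parameter are hypothetical names, and a plain dict stands in for the new table. The point it shows is that previously sent members are looked up per room, and only for rooms that have events to return.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


class LazyMemberStore:
    """Hypothetical in-memory stand-in for the new per-connection table."""

    def __init__(self) -> None:
        # (connection_key, room_id) -> user IDs whose membership was already sent.
        self._sent: Dict[Tuple[int, str], Set[str]] = defaultdict(set)
        # Counts lookups so the per-room read pattern is visible.
        self.reads = 0

    def record_sent(
        self, connection_key: int, room_id: str, user_ids: Iterable[str]
    ) -> None:
        self._sent[(connection_key, room_id)].update(user_ids)

    def get_previously_sent(self, connection_key: int, room_id: str) -> Set[str]:
        self.reads += 1
        return set(self._sent.get((connection_key, room_id), ()))


def members_to_send(
    store: LazyMemberStore,
    connection_key: int,
    rooms_with_events: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return, per room, the timeline senders whose membership is still unsent.

    Only rooms that actually have events are looked up in the store.
    """
    result: Dict[str, Set[str]] = {}
    for room_id, senders in rooms_with_events.items():
        already_sent = store.get_previously_sent(connection_key, room_id)
        new_members = senders - already_sent
        store.record_sent(connection_key, room_id, new_members)
        result[room_id] = new_members
    return result
```

Keying the lookup by both connection and room also means that a member sent in one room is not treated as already sent in another.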

Reviewable commit-by-commit.

MadLittleMods

I haven't fully onboarded onto the concept and details to be confident in the approach.

synapse/storage/schema/main/delta/93/02_sliding_sync_members.sql

scripts-dev/check_schema_delta.py

synapse/handlers/sliding_sync/__init__.py

MadLittleMods · 2025-11-21T19:15:45Z

synapse/handlers/sliding_sync/__init__.py

+    Attributes:
+        required_state_map_change: The updated required state map to store in
+            the room config, or None if there is no change.
+        added_state_filter: The state filter to use to fetch any additional
+            current state that needs to be returned to the client.
+        lazy_members_previously_returned: The set of user IDs we should add to
+            the lazy members cache that we had previously returned.
+        lazy_members_invalidated: The set of user IDs whose membership has
+            changed but we didn't send down, so we need to invalidate them from
+            the cache.


I know this is the standard way for the docstring but some of these attributes are a bit tricky and I'd rather see the docstring when I hover the attributes.

Potential to convert to the """ variant below the property itself

It's really annoying that VSCode doesn't pick these up, it does for functions.

I'm not sure about changing style, especially since it's a bit confusing that the docstring for an attribute goes after the attribute.

I think it does for functions because the spec says it should do that.

For attributes, it feels like unfortunate Python decisions but we get this from https://peps.python.org/pep-0257/ (2001)

String literals occurring immediately after a simple assignment at the top level of a module, class, or __init__ method are called “attribute docstrings”.

And expanded upon in https://peps.python.org/pep-0258/#attribute-docstrings (2001)

A string literal immediately following an assignment statement is interpreted by the docstring extraction machinery as the docstring of the target of the assignment statement, under the following conditions:

Discussed in the backend team lobby.

There seems to be a strong preference for using """ documentation for attributes:

Able to write large docstrings for an attribute without the whole function docstring getting unreasonably large

LSP support > weird Python decisions on placement
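
For reference, a minimal sketch of the attribute docstring style being discussed. The class name `ExampleRequiredStateChanges` is hypothetical and uses a plain `dataclass` rather than whatever the real struct uses; the field names echo the ones in the diff above. Each string literal placed directly after a field is what editors and documentation tools pick up as that attribute's docstring.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ExampleRequiredStateChanges:
    """Hypothetical return struct; field names echo the ones discussed above."""

    added_state_filter: Optional[str] = None
    """The state filter to use to fetch any additional current state that
    needs to be returned to the client (a plain string in this sketch)."""

    lazy_members_invalidated: FrozenSet[str] = frozenset()
    """The user IDs whose membership has changed but was not sent down, so
    they need to be invalidated from the cache."""
```

These string literals are not stored on the class at runtime; tools read them from the source, which is why placement immediately after the assignment matters.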

synapse/types/handlers/sliding_sync.py

synapse/handlers/sliding_sync/__init__.py

MadLittleMods · 2025-11-21T20:58:46Z

synapse/handlers/sliding_sync/__init__.py

+                            else:
+                                # For non-limited timelines we always return all
+                                # membership changes. This is so that clients
+                                # who have fetched the full membership list
+                                # already can continue to maintain it for
+                                # non-limited syncs.
+                                #
+                                # This assumes that for non-limited syncs there
+                                # won't be many membership changes that wouldn't
+                                # have been included already (this can only
+                                # happen if membership state was rolled back due
+                                # to state resolution anyway).
+                                required_state_types.append((EventTypes.Member, None))


This seems like a bigger behavioral change.

I think this fixes #18782 🤔 - If so, we should add a test.

Ah, did mean to factor that out but it sneaked in as it needs to be accounted for in the lazy loading stuff.

Added with test_lazy_load_state_reset ✅

Actually, this only fixes it for non-limited syncs. I think we should also return state reset membership in limited timeline scenarios as well.

We should at least leave a FIXME with a link to the issue in the if-block above.

I don't think we want to return all membership changes when it is limited? Only the ones for users that appear in the timeline / required_state?

I'm not sure (we should better clarify these semantics in the MSC once we decide and have reasoning).

~~If we've previously sent down the membership, feels like we should give them an update.~~

But to compare with a normal membership update in the limited scenario, it would only be relevant if it was part of the timeline.

So I guess the same would apply: if the state reset/rollback happened in the timeline range, we should give an update. Instead of trying to figure out that intricacy (although we do have `delta.stream_id` if we wanted to), we could just always assume that state rollbacks are relevant and check for `delta.event_type == EventTypes.Member and delta.event_id is None`.
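
A minimal sketch of the check suggested above, under stated assumptions: `ExampleStateDelta` is a hypothetical stand-in for a current-state delta row, and a missing `event_id` is taken to mean the state entry was removed (for example by a state reset). `MEMBER_EVENT_TYPE` is the `m.room.member` string that `EventTypes.Member` refers to.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# The Matrix event type for room membership events.
MEMBER_EVENT_TYPE = "m.room.member"


@dataclass(frozen=True)
class ExampleStateDelta:
    """Hypothetical stand-in for a current-state delta row."""

    event_type: str
    state_key: str
    # Assumed semantics: None means the state entry was removed.
    event_id: Optional[str]


def rolled_back_member_user_ids(deltas: Iterable[ExampleStateDelta]) -> List[str]:
    """Return user IDs whose membership was removed from the current state."""
    return [
        delta.state_key
        for delta in deltas
        if delta.event_type == MEMBER_EVENT_TYPE and delta.event_id is None
    ]
```

This treats every membership rollback as relevant, rather than comparing a delta's stream position against the timeline range.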

scripts-dev/check_schema_delta.py

synapse/storage/databases/main/sliding_sync.py

synapse/storage/schema/main/delta/93/02_sliding_sync_members.sql

MadLittleMods · 2025-11-26T17:09:31Z

synapse/handlers/sliding_sync/__init__.py

synapse/storage/databases/main/sliding_sync.py

tests/handlers/test_sliding_sync.py

synapse/handlers/sliding_sync/__init__.py

tests/handlers/test_sliding_sync.py

MadLittleMods · 2025-12-03T19:42:33Z

synapse/handlers/sliding_sync/__init__.py

            if prev_room_sync_config is not None:
+                # Define `required_user_state` as all user state we want, which
+                # is the explicitly requested members, any needed for lazy
+                # loading, and users whose membership has changed.s


Suggested change

-                # loading, and users whose membership has changed.s
+                # loading, and users whose membership has changed.

MadLittleMods · 2025-12-03T19:51:40Z

synapse/handlers/sliding_sync/__init__.py

MadLittleMods · 2025-12-03T20:44:01Z

synapse/handlers/sliding_sync/__init__.py

                    state_filter=StateFilter.from_types(hero_room_state),
                    to_token=to_token,
                )
-                room_state.update(hero_membership_state)


I think this is good as-is ⏩

This was just redundant data that the client already had.

synapse/storage/databases/main/sliding_sync.py

MadLittleMods · 2025-12-03T20:49:41Z

synapse/handlers/sliding_sync/__init__.py

+                # Normalize to proper user ID
+                state_key = user_id
+
+            # We remember the user if either they haven't been invalidated


Suggested change

-            # We remember the user if either they haven't been invalidated
+            # We remember the user if they haven't been invalidated

No longer an "either a) or b)" scenario since #19206 (comment)

MadLittleMods · 2025-12-03T23:22:32Z

tests/rest/client/sliding_sync/test_rooms_required_state.py

        # down.
        self.assertIsNone(response_body["rooms"][room_id1].get("required_state"))
+
+    def test_lazy_loaded_last_seen_ts(self) -> None:


Suggested change

-    def test_lazy_loaded_last_seen_ts(self) -> None:
+    def test_lazy_loading_room_members_last_seen_ts(self) -> None:

MadLittleMods · 2025-12-03T23:22:46Z

tests/rest/client/sliding_sync/test_rooms_required_state.py

+            exact=True,
+        )
+
+    def test_lazy_members_forked_position(self) -> None:


Suggested change

-    def test_lazy_members_forked_position(self) -> None:
+    def test_lazy_loading_room_members_forked_position(self) -> None:

MadLittleMods · 2025-12-03T23:22:54Z

tests/rest/client/sliding_sync/test_rooms_required_state.py

+            exact=True,
+        )
+
+    def test_lazy_members_across_multiple_connections(self) -> None:


Suggested change

-    def test_lazy_members_across_multiple_connections(self) -> None:
+    def test_lazy_loading_room_members_across_multiple_connections(self) -> None:

MadLittleMods · 2025-12-03T23:25:37Z

tests/rest/client/sliding_sync/test_rooms_required_state.py

+            exact=True,
+        )
+
+    def test_lazy_members_across_multiple_rooms(self) -> None:


Suggested change

-    def test_lazy_members_across_multiple_rooms(self) -> None:
+    def test_lazy_loading_room_members_across_multiple_rooms(self) -> None:

MadLittleMods · 2025-12-03T23:25:45Z

tests/rest/client/sliding_sync/test_rooms_required_state.py

            exact=True,
        )

+    def test_lazy_members_limited_sync(self) -> None:


Suggested change

-    def test_lazy_members_limited_sync(self) -> None:
+    def test_lazy_loading_room_members_limited_sync(self) -> None:

erikjohnston added 3 commits November 20, 2025 09:53

Refactor heroes to not be added to room state

fc6000c

We then filter them out before sending to the client, but it is unnecessary to do so and interferes with later changes.

Always return all memberships for non-limited syncs

087f6eb

This is so that clients know if they can use a cached `/members` response or not.

Make _required_state_changes return struct

49fa7eb

erikjohnston force-pushed the erikj/sss_better_membership_storage2 branch from f67e114 to 0d6ccbe Compare November 20, 2025 13:43

erikjohnston added 4 commits November 20, 2025 13:47

Track lazy loaded members in SSS separately.

8cba313

This ensures that the set of required state doesn't keep growing as we add and remove member state. We then only load them from the DB when needed, rather than all state for all rooms when we get a request.

Update tests

6303bb1

Newsfile

5c48983

Fix check delta script

4984858

It was thinking the table name was `IN`, as it matched `connection_positi(on IS) NULL`.
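
A minimal sketch of how this kind of false match can arise. The exact pattern used by the check script is not shown here, so `NAIVE_ON_PATTERN`, `ANCHORED_ON_PATTERN`, and the example SQL are assumptions for illustration only: a case-insensitive `on <name>` pattern with no word boundary also matches the tail of `connection_position` followed by the next token.

```python
import re
from typing import List

# Hypothetical pattern: "ON <name>" with no word boundary before "ON".
NAIVE_ON_PATTERN = re.compile(r"on\s+(\w+)", re.IGNORECASE)

# Same pattern, but "ON" must start at a word boundary.
ANCHORED_ON_PATTERN = re.compile(r"\bon\s+(\w+)", re.IGNORECASE)

# Illustrative statement; the index and table names are made up.
EXAMPLE_SQL = (
    "CREATE INDEX example_idx ON example_table (room_id) "
    "WHERE connection_position IS NULL"
)


def names_after_on(pattern: "re.Pattern[str]", sql: str) -> List[str]:
    """Return every token the pattern captures after an ON keyword."""
    return pattern.findall(sql)
```

With the naive pattern, the trailing `on` of `connection_position` is treated as the keyword and the token after it is captured as if it were a table name; the word boundary in the second pattern avoids that.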

erikjohnston force-pushed the erikj/sss_better_membership_storage2 branch from 0d6ccbe to 4984858 Compare November 20, 2025 13:52

erikjohnston marked this pull request as ready for review November 20, 2025 15:52

erikjohnston requested a review from a team as a code owner November 20, 2025 15:52

MadLittleMods added A-Sync A-Database A-Performance labels Nov 21, 2025

MadLittleMods reviewed Nov 21, 2025

erikjohnston and others added 11 commits November 24, 2025 14:12

Rename required_user_state

7a0a8a2

Reword the cache comments on the schema

8a3ec20

Rename RoomLazyMembershipChanges fields

fc01740

Add RoomLazyMembershipChanges last_seen_ts comment

ae3f569

Clean up comments

027b422

Always include lazy_members_previously_returned and lazy_members_prev…

abee4db

…iously_returned in tests

Fixup comment

99855ba

Use duration constants

0b1ecf1

Update tests/handlers/test_sliding_sync.py

2090d14

Co-authored-by: Eric Eastwood <[email protected]>

Rename previously_returned_user_state param

113f6ce

Fix bug where we didn't correctly filter lazy members by room

ec45e00

When fetching previously sent lazy members we didn't filter by room, which meant that we didn't send down member events in a room if we'd previously sent that user's member event in another room.

erikjohnston force-pushed the erikj/sss_better_membership_storage2 branch from fe94608 to ec45e00 Compare November 25, 2025 11:12

erikjohnston added 3 commits November 25, 2025 11:21

Lint

f8f6dc9

Add test for forked position

5604d3a

Ensure that the last_seen_ts is correctly updated

815b852

erikjohnston force-pushed the erikj/sss_better_membership_storage2 branch from 6c2cf0d to 2e844aa Compare November 25, 2025 14:28

erikjohnston added 3 commits November 25, 2025 15:21

Add tests for state reset and lazy loading

deaf995

Merge remote-tracking branch 'origin/develop' into erikj/sss_better_m…

cdeebc8

…embership_storage2

Fix limited sync lazy members

65aebf4

erikjohnston requested a review from MadLittleMods November 26, 2025 15:14

ara4n mentioned this pull request Nov 26, 2025

Major regression in sync time element-hq/element-x-ios#4728

Open

MadLittleMods reviewed Nov 27, 2025

erikjohnston and others added 20 commits December 2, 2025 13:16

Merge remote-tracking branch 'origin/develop' into erikj/sss_better_m…

4d4c1b8

…embership_storage2

Use Duration

e6939e7

Fix bug where lazy members were shared between connections

69fc61d

Fixup lazy_members_previously_returned

56ead16

Update synapse/storage/databases/main/sliding_sync.py

2d2047d

Co-authored-by: Eric Eastwood <[email protected]>

Update synapse/storage/databases/main/sliding_sync.py

da08203

Co-authored-by: Eric Eastwood <[email protected]>

Split state_key_expand_lazy_keep_previous_memberships

2546ca6

Update LAZY_MEMBERS_UPDATE_INTERVAL comment

ba59391

Fixup RoomLazyMembershipChanges comment

0a68e12

Fixup returned_user_id_to_last_seen_ts_map docs

45d1bfa

Update synapse/handlers/sliding_sync/__init__.py

4070326

Co-authored-by: Eric Eastwood <[email protected]>

Update tests/rest/client/sliding_sync/test_rooms_required_state.py

e2b4fe8

Co-authored-by: Eric Eastwood <[email protected]>

Update tests/rest/client/sliding_sync/test_rooms_required_state.py

b75b3cb

Co-authored-by: Eric Eastwood <[email protected]>

Move 'lazy_members_previously_returned' definition

6caacd1

Remove spurious lazy load user in test

b1bc509

Don't add to lazy_members_previously_returned what we're lazy loading

7ff3d2f

Add context to state reset comment

008cb58

Note it is a regression test

d3f3f98

Update comment on update ts test

0ffb32a

Only persist lazy members if we need to

17bf341

Currently we always persist a new position when using lazy loading, which is needless.

MadLittleMods added and then removed the A-3PID (Issues affecting third-party identifiers and invites) label Dec 3, 2025

MadLittleMods reviewed Dec 3, 2025
