Improve documentation around streams, particularly ID generators and adding new streams. (#18943)
This arises mostly from my recent experience adding a stream for Thread Subscriptions, and trying to help others add their own streams.
---------
Signed-off-by: Olivier 'reivilibre' <[email protected]>
A stream is an append-only log `T1, T2, ..., Tn, ...` of facts[^1] which grows over time.
Only "writers" can add facts to a stream, and there may be multiple writers.
But unhappy cases (e.g. transaction rollback due to an error) also count as completion.
Once completed, the rows written with that stream ID are fixed, and no new rows
will be inserted with that ID.

## Current stream ID

For any given stream reader (including writers themselves), we may define a per-writer current stream ID:

Consider a single-writer stream which is initially at ID 1.

| Complete 6 | 6 ||
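The progression in this table can be sketched as a toy model. This is illustrative only (`SingleWriterPosition` is an invented name, not one of Synapse's real ID generators), assuming a single writer that allocates IDs in order and only advances its current position across a contiguous run of completed facts:

```python
class SingleWriterPosition:
    """Toy model of a single writer's current stream ID.

    The current position only advances once every fact up to and
    including that ID has been completed, so readers never observe
    a gap in the sequence.
    """

    def __init__(self, start: int = 1) -> None:
        self.current = start             # highest contiguously-completed ID
        self._pending: set = set()       # allocated but not yet completed
        self._next_id = start + 1

    def begin_write(self) -> int:
        """Allocate the next stream ID for a new fact."""
        stream_id = self._next_id
        self._next_id += 1
        self._pending.add(stream_id)
        return stream_id

    def complete(self, stream_id: int) -> None:
        """Mark a fact completed (commit or rollback both count)."""
        self._pending.discard(stream_id)
        # Advance past every contiguously-completed ID.
        while self.current + 1 < self._next_id and (self.current + 1) not in self._pending:
            self.current += 1
```

Completing fact 6 does nothing for the current position until every earlier fact has also completed, which is exactly why readers never see gaps.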

## Multi-writer streams

There are two ways to view a multi-writer stream.
The facts this stream holds are instructions to "you should now invalidate these cache entries".
We only ever treat this as multiple single-writer streams, as there is no important ordering between cache invalidations.
(Invalidations are self-contained facts; and the invalidations commute/are idempotent).
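One way to model this "multiple single-writer streams" view is as a per-writer map of positions, each advancing independently. A minimal sketch (invented names, not Synapse's API):

```python
from typing import Dict


class MultiWriterView:
    """Toy reader-side view of a multi-writer stream: just a vector of
    per-writer current positions, with no ordering between writers."""

    def __init__(self) -> None:
        self.positions: Dict[str, int] = {}  # writer instance name -> current ID

    def advance(self, writer: str, new_id: int) -> None:
        # Positions are monotonic per writer; ignore stale updates.
        self.positions[writer] = max(self.positions.get(writer, 0), new_id)
```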

## Writing to streams

Writers need to track:

- their current position (i.e. their own per-writer stream ID).
To complete a fact, first remove it from your map of facts currently awaiting completion.
Then, if no earlier fact is awaiting completion, the writer can advance its current position in that stream.
Upon doing so it should emit an `RDATA` message[^3], once for every fact between the old and the new stream ID.
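The completion step above can be sketched as a small function. This is a hypothetical sketch, not the real Synapse code (which lives in the ID generators); `emit_rdata` stands in for sending a replication message:

```python
from typing import Callable, Dict


def complete_fact(
    position: int,                       # writer's current stream ID
    allocated_up_to: int,                # highest stream ID handed out so far
    awaiting: Dict[int, object],         # stream ID -> fact awaiting completion
    stream_id: int,                      # the fact being completed
    emit_rdata: Callable[[int], None],   # stand-in for the replication send
) -> int:
    """Complete `stream_id`; if no earlier fact is still pending, advance
    the position and emit one RDATA per newly-covered stream ID."""
    awaiting.pop(stream_id, None)
    while position < allocated_up_to and (position + 1) not in awaiting:
        position += 1
        emit_rdata(position)  # one message per fact now visible to readers
    return position
```

Note that completing a fact that is *not* the earliest pending one emits nothing; the RDATAs for it are only sent later, once the earlier fact completes and the position sweeps past both.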

## Subscribing to streams

Readers need to track the current position of every writer.
The `RDATA` itself is not a self-contained representation of the fact;
readers will have to query the stream tables for the full details.
Readers must also advance their record of the writer's current position for that stream.
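A reader's handling of a single `RDATA`, per the paragraph above, might look roughly like this (an illustrative sketch with invented names, not Synapse's actual `on_rdata` signature):

```python
from typing import Callable, Dict, List


def on_rdata(
    positions: Dict[str, int],    # writer name -> last-known current position
    writer: str,
    new_id: int,
    fetch_rows: Callable[[int, int], List[object]],  # query the stream table
) -> List[object]:
    """Handle an RDATA: fetch the full rows it refers to (the RDATA itself
    is not self-contained), then advance our record of the writer's
    position, ignoring stale/out-of-order updates."""
    old_id = positions.get(writer, 0)
    rows = fetch_rows(old_id, new_id)  # rows with old_id < stream ID <= new_id
    positions[writer] = max(old_id, new_id)
    return rows
```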

## Summary

In a nutshell: we have an append-only log with a "buffer/scratchpad" at the end where we have to wait for the sequence to be linear and contiguous.

---

## Cheatsheet for creating a new stream

These rough notes and links may help you to create a new stream and add all the necessary registration and event handling.

**Create your stream:**
- [create a stream class and stream row class](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/replication/tcp/streams/_base.py#L728)
  - will need an [ID generator](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/storage/databases/main/thread_subscriptions.py#L75)
  - may need [writer configuration](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/config/workers.py#L177), if there isn't already an obvious source of configuration for which workers should be designated as writers to your new stream.
    - if adding new writer configuration, add Docker-worker configuration, which lets us configure the writer worker in Complement tests: [[1]](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/docker/configure_workers_and_start.py#L331), [[2]](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/docker/configure_workers_and_start.py#L440)
- most of the time, you will likely introduce a new datastore class for the concept represented by the new stream, unless there is already an obvious datastore that covers it.
- consider whether it may make sense to introduce a handler
- [`process_replication_position` of your appropriate datastore](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/storage/databases/main/thread_subscriptions.py#L111)
  - don't forget the super call

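For orientation, here is a deliberately simplified shape of a stream row and stream class. These are NOT Synapse's real base classes (follow the `_base.py` link above for the actual interface); every name below is invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ExampleStreamRow:
    """Hypothetical row type: the details of one fact in the stream."""
    user_id: str
    thread_root_id: str
    subscribed: bool


class ExampleStream:
    """Hypothetical stream wrapper: knows each writer's current token and
    can list (stream ID, row) pairs between two tokens."""

    NAME = "example"

    def __init__(
        self,
        current_token: Callable[[str], int],
        rows_between: Callable[[int, int], List[Tuple[int, ExampleStreamRow]]],
    ) -> None:
        self.current_token = current_token  # per-writer current position
        self.rows_between = rows_between    # backed by the stream's table
```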
**If you're going to do any caching that needs invalidation from new rows:**
- add invalidations to [`process_replication_rows` of your appropriate datastore](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/storage/databases/main/thread_subscriptions.py#L91)
  - don't forget the super call
- add local-only [invalidations to your writer transactions](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/storage/databases/main/thread_subscriptions.py#L201)

**For streams to be used in sync:**
- add a new field to [`StreamToken`](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/types/__init__.py#L1003)
- add a new [`StreamKeyType`](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/types/__init__.py#L999)
- add appropriate wake-up rules
  - in [`on_rdata`](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/replication/tcp/client.py#L260)
  - locally on the same worker when completing a write, [e.g. in your handler](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/handlers/thread_subscriptions.py#L139)
- add the stream in [`bound_future_token`](https://github.com/element-hq/synapse/blob/4367fb2d078c52959aeca0fe6874539c53e8360d/synapse/streams/events.py#L127)