|
2 | 2 |
|
3 | 3 | <h1 align="center">Event-Reduce</h1>
|
4 | 4 | <p align="center">
|
5 |
| - <strong>An optimisation algorithm to speed up database queries that run multiple times</strong> |
| 5 | + <strong>An algorithm to optimize database queries that run multiple times</strong> |
6 | 6 | </p>
|
7 | 7 |
|
8 | 8 | <br/>
|
|
13 | 13 |
|
14 | 14 |
|
15 | 15 | <ul>
|
16 |
| - <li>You make a query to the database which returns the result in X milliseconds</li> |
17 |
| - <li>A write event happens to the database and changes some data</li> |
18 |
| - <li>To get the new version of the query-results you now have three options:</li> |
| 16 | + <li>You make a query to the database which returns the result in 100 milliseconds</li> |
| 17 | + <li>A write event occurs on the database and changes some data</li> |
| 18 | + <li>To get the new version of the query's results you now have three options:</li> |
19 | 19 | <ul>
|
20 |
| - <li>Run the query over the database again which takes another X milliseconds</li> |
| 20 | + <li>Run the query over the database again, which takes another 100 milliseconds</li> |
21 | 21 | <li>
|
22 | 22 | Write complex code that calculates the new result depending on many different states and conditions
|
23 | 23 | </li>
|
24 | 24 | <li>
|
25 |
| - Use <b>Event-Reduce</b> to calculate the new results without disc-IO <b>nearly instant</b> |
| 25 | + Use <b>Event-Reduce</b> to calculate the new results on the CPU without disk I/O, <b>nearly instantly</b> |
26 | 26 | </li>
|
27 | 27 | </ul>
|
28 | 28 | </ul>
|
|
37 | 37 |
|
38 | 38 | <br/>
|
39 | 39 |
|
40 |
| -* * * |
| 40 | +## Efficiency |
| 41 | + |
| 42 | +In the [browser demo](https://pubkey.github.io/event-reduce) you can see that about **94%** of randomly generated events could be optimized by EventReduce. In real-world usage, with non-random events, this share can be even higher. For the different implementations in common browser databases, new query results are displayed up to **12 times** faster after a write occurs. |
| 43 | + |
| 44 | +## How it works |
| 45 | + |
| 46 | +EventReduce uses 17 different `state functions` to 'describe' an event+previousResults combination. A state function is a function that returns a boolean value like `isInsert()`, `wasResultsEmpty()`, `sortParamsChanged()` and so on. |
41 | 47 |
|
42 |
| -### Efficiency |
| 48 | +There are also 14 different `action functions`. An action function gets the event+previousResults and modifies the results array in a given way, like `insertFirst()`, `replaceExisting()`, `insertAtSortPosition()`, `doNothing()` and so on. |
43 | 49 |
|
44 |
| -### When to use this |
| 50 | +For each of our `2^17` state combinations, we calculate which action function gives the same results that the database would return when the full query is executed again. |
45 | 51 |
|
46 |
| -### How it works |
| 52 | +From these state-action combinations we create a big truth table that is used to create a [binary decision diagram](https://github.com/pubkey/binary-decision-diagram). The BDD is then optimized to call as few `state functions` as possible to determine the correct action for an incoming event-results combination. |
47 | 53 |
|
48 |
| -### Implementations |
| 54 | +The resulting optimized BDD is then shipped as the EventReduce algorithm and can be used in different programming languages and implementations. |
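The flow described above can be illustrated with a heavily simplified sketch. It is not the EventReduce API: it assumes only three state functions, three actions, a hypothetical `input` shape (`event.operation`, `event.id`, `event.doc`, `previousResults`), and that every document matches the query. It also looks the state set up in a plain `Map` instead of walking an optimized BDD, so it always evaluates every state function.

```javascript
// Simplified, hypothetical sketch: not the EventReduce API.
// Assumes every document matches the query and only three state functions exist.

// State functions: each answers one yes/no question about the input.
const stateFunctions = [
  function isInsert(input) { return input.event.operation === 'INSERT'; },
  function isDelete(input) { return input.event.operation === 'DELETE'; },
  function wasResultsEmpty(input) { return input.previousResults.length === 0; }
];

// Action functions: each changes a results array in one fixed way.
const actionFunctions = {
  doNothing(results, event) {},
  insertFirst(results, event) { results.unshift(event.doc); },
  removeExisting(results, event) {
    const index = results.findIndex(doc => doc.id === event.id);
    if (index !== -1) results.splice(index, 1);
  }
};

// Truth table: state set (one bit per state function, in order) -> action name.
// Any state set that is not listed falls back to running the full query again.
const truthTable = new Map([
  ['101', 'insertFirst'],    // insert into empty results
  ['010', 'removeExisting'], // delete while results are not empty
  ['011', 'doNothing']       // delete while results are empty
]);

// Encodes the answers of all state functions as a string such as '101'.
function getStateSet(input) {
  return stateFunctions.map(fn => (fn(input) ? '1' : '0')).join('');
}

// Returns the chosen action name and the new results (null on fallback).
function applyEvent(input) {
  const actionName = truthTable.get(getStateSet(input)) || 'runFullQueryAgain';
  if (actionName === 'runFullQueryAgain') {
    return { actionName, results: null };
  }
  const results = input.previousResults.slice(); // keep the old array untouched
  actionFunctions[actionName](results, input.event);
  return { actionName, results };
}
```

Calling `applyEvent()` with an insert event and empty `previousResults` selects `insertFirst`; an insert into non-empty results is not covered by this tiny table and falls back to `runFullQueryAgain`.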
49 | 55 |
|
50 |
| -### When not to use this |
| 56 | +## When to use this |
51 | 57 |
|
52 |
| -### Limitations |
| 58 | +You can use this to.. |
53 | 59 |
|
54 |
| -- EventReduce only works with queries that have a predictable sort-order for any given documents https://stackoverflow.com/a/11599283 |
| 60 | +* ..reduce the latency until a change to the database updates your application |
| 61 | +* ..make observing query results more scalable by doing less disk I/O |
| 62 | +* ..reduce the bandwidth when streaming realtime query results from the backend to the client |
55 | 63 |
|
56 |
| -So if you sort by `gender` and `age` and two documents have the same `gender` and `age` the sorting is not predictable. Therefore you could add the primary key as third sort parameter. |
| 64 | +## Limitations |
57 | 65 |
|
| 66 | +- EventReduce only works with queries that have a [predictable](https://stackoverflow.com/a/11599283) sort order for any given set of documents. (You can make any query predictable by adding the primary key as the last sort parameter.) |
58 | 67 |
|
59 |
| -### Previous Work |
| 68 | +- EventReduce can be used with relational databases, but not with relational queries that run over multiple tables/collections. (You can use views as a workaround.) |
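The first limitation can be illustrated with a small in-memory sketch. It assumes documents with the hypothetical fields `gender`, `age` and a primary key `id`, and a helper `compareByFields()` written only for this example. Two documents share the same `gender` and `age`, so only the primary key as the last sort parameter fixes their relative order regardless of how the input was ordered.

```javascript
// Hypothetical helper: builds a comparator that sorts by the given fields in order.
function compareByFields(fields) {
  return (a, b) => {
    for (const field of fields) {
      if (a[field] < b[field]) return -1;
      if (a[field] > b[field]) return 1;
    }
    return 0; // all listed fields are equal: order is not determined by the sort
  };
}

const people = [
  { id: 'c', gender: 'f', age: 30 },
  { id: 'a', gender: 'f', age: 30 }, // same gender and age as 'c'
  { id: 'b', gender: 'm', age: 25 }
];

// With the primary key as the last sort parameter, ties are always broken the same way.
const predictableSort = compareByFields(['gender', 'age', 'id']);
const sortedPeople = people.slice().sort(predictableSort);
const sortedFromReversed = people.slice().reverse().sort(predictableSort);

// Both arrays now contain the documents in the same order.
console.log(sortedPeople.map(doc => doc.id));
console.log(sortedFromReversed.map(doc => doc.id));
```

Sorting by `['gender', 'age']` alone would leave the order of `'a'` and `'c'` dependent on the order in which the documents arrive.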
60 | 69 |
|
| 70 | +## Implementations |
61 | 71 |
|
| 72 | +At the moment there is only the [javascript implementation](./javascript/), which you can install via npm. Pull requests for other languages are welcome. |
62 | 73 |
|
| 74 | +## Previous Work |
63 | 75 |
|
64 |
| -states: |
65 |
| -- wasInResult |
66 |
| -- wasMatching |
67 |
| -- doesMatchNow |
68 |
| -- hasSkip |
69 |
| -- hasLimit |
70 |
| -- wasLimitReached |
71 |
| -- wasSortedBeforeFirst |
72 |
| -- wasSortedAfterLast |
73 |
| -- isSortedBeforeFirst |
74 |
| -- isSortedAfterLast |
75 |
| -- isSortedBeforeFirst |
76 |
| -- isSortedAfterFirst |
77 |
| -- sortParamsChanged |
78 |
| -- previousStateUnknown |
79 |
| -- isDelete |
80 |
| -- isInsert |
81 |
| -- isUpdate |
| 76 | +- Meteor uses a feature called [OplogDriver](https://github.com/meteor/docs/blob/master/long-form/oplog-observe-driver.md) that is limited to queries that do not use `skip` or `sort`. Also watch [this video](https://www.youtube.com/watch?v=_dzX_LEbZyI&t=2047s) to learn how OplogDriver works. |
82 | 77 |
|
83 |
| -actions: |
| 78 | +- RxDB used [QueryChangeDetection](https://github.com/pubkey/rxdb/blob/a7202ac7e2985ff088d53d6a0c86d90d0b438467/docs-src/query-change-detection.md), which works through many handwritten if-else comparisons. RxDB will switch to EventReduce in its next major release. |
84 | 79 |
|
85 |
| -- doNothing |
86 |
| -- insertFirst |
87 |
| -- insertLast |
88 |
| -- insertAtSortPosition |
89 |
| -- replaceExisting |
90 |
| -- removeExisting |
91 |
| -- removeExistingAndInsertAtSortPosition |
92 |
| -- removeLastItem |
93 |
| -- removeFirstItem |
94 |
| -- Fallback: runFullQueryAgain |
| 80 | +- Baqend is [creating a database](https://vsis-www.informatik.uni-hamburg.de/getDoc.php/publications/620/invalidb_4-pages.pdf) that optimizes for realtime queries. Watch the video [Real-Time Databases Explained: Why Meteor, RethinkDB, Parse & Firebase Don't Scale](https://www.youtube.com/watch?v=HiQgQ88AdYo&t=1703s) to learn more. |