Incremental topK with fractional index #72

kevin-dp · 2025-06-23T11:58:37Z

This PR reworks the topK operator because the previous implementation was not incremental. It provides two implementations: one backed by a sorted array and one backed by a B+ tree. The array implementation keeps a sorted array of elements, so it can efficiently find the position at which to insert or delete, but the insertion or deletion itself still takes linear time. This is fine for small to medium collections. For large collections, we want to use the B+ tree implementation so that insertions and deletions take logarithmic time.
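To illustrate the cost profile described for the array variant, here is a minimal sketch, not the PR's actual code. The helper names `lowerBound`, `insertSorted` and `deleteSorted` are hypothetical, and the use of binary search to find the position is an assumption. Finding the slot is logarithmic, while `splice` still has to shift the tail of the array, which is the linear part.

```typescript
// Hypothetical helpers for illustration only; not taken from the PR.
type Comparator<T> = (a: T, b: T) => number;

// Binary search (lower bound): first index whose element is >= value.
// Runs in O(log n) comparisons.
function lowerBound<T>(sorted: readonly T[], value: T, compare: Comparator<T>): number {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compare(sorted[mid], value) < 0) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Finds the slot in O(log n), but splice shifts every later element: O(n).
function insertSorted<T>(sorted: T[], value: T, compare: Comparator<T>): number {
  const index = lowerBound(sorted, value, compare);
  sorted.splice(index, 0, value);
  return index;
}

// Removes one element equal to value; returns its former index, or -1 if absent.
function deleteSorted<T>(sorted: T[], value: T, compare: Comparator<T>): number {
  const index = lowerBound(sorted, value, compare);
  if (index < sorted.length && compare(sorted[index], value) === 0) {
    sorted.splice(index, 1);
    return index;
  }
  return -1;
}
```

A tree-backed structure avoids the shifting step, which is why it is expected to win on large collections.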

TODO:

Benchmark these 2 implementations to confirm their theoretical time complexity holds in practice

KyleAMathews

great stuff! Looking forward to seeing the benchmarks

KyleAMathews · 2025-06-23T13:21:34Z

packages/d2mini/package.json

@@ -50,6 +50,7 @@
  },
  "dependencies": {
    "fractional-indexing": "^3.2.0",
-    "murmurhash-js": "^1.0.0"
+    "murmurhash-js": "^1.0.0",
+    "sorted-btree": "^1.8.1"


5.8kb gzipped https://bundlephobia.com/package/sorted-btree@1.8.1

This is enough extra code weight (~24% increase to tanstack/db) that, depending on where the crossover point ends up being, this could be an opt-in thing, i.e. only use it if you have 50k+ items in a collection.

Yes, that's the idea. We want to do some initial benchmarking to see where the crossover point lies between the array version and the tree version. We could automatically switch between them based on the size of the collection.

Ok perfect, yeah that'd be easy with an async import 🚀
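A rough sketch of the size-based switch with lazy loading discussed above. Everything here is an assumption for illustration: the names `chooseImpl`, `lazy` and `resolveImpl` are hypothetical, and the 50,000 default only echoes the "50k+" example from this thread, since the real crossover point is still to be benchmarked. The loader is passed in as a function so the sketch stays self-contained; in a real bundle it could be a dynamic `import()` of the tree-backed module.

```typescript
// Hypothetical names; the threshold is a placeholder until benchmarks exist.
type ImplKind = "array" | "tree";

const TREE_THRESHOLD = 50_000;

// Pure decision: which variant to use for a collection of a given size.
function chooseImpl(size: number, threshold: number = TREE_THRESHOLD): ImplKind {
  return size >= threshold ? "tree" : "array";
}

// Wraps an async loader so the expensive module is fetched at most once.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load();
    }
    return cached;
  };
}

// Returns the eager array variant, or awaits the lazily loaded tree variant.
// In a real bundle, loadTreeImpl could wrap a dynamic import() of the tree module.
async function resolveImpl<A, B>(
  size: number,
  arrayImpl: A,
  loadTreeImpl: () => Promise<B>,
  threshold: number = TREE_THRESHOLD,
): Promise<A | B> {
  return chooseImpl(size, threshold) === "tree" ? loadTreeImpl() : arrayImpl;
}
```

Because the tree module is only reached through the loader, a bundler can place it in a separate chunk that small collections never download.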

samwillis

Thanks @kevin-dp, all looks really good!

My one suggestion is we split the BTree version into a separate operator, in a separate file.

So the array version is topKWithFractionIndex and orderByWithFractionIndex, and then we have a separate topKWithFractionIndexBTree and orderByWithFractionIndexBTree. That way, when the BTree isn't used, it won't be bundled. At the moment, the condition on which implementation to use will cause the BTree to be pulled in all the time. It should be possible to do this without duplication if you subclass TopKWithFractionalIndexOperator as TopKWithFractionalIndexBtreeOperator.
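One way the subclassing idea could look, as a sketch under stated assumptions rather than the PR's code. All class names below (`TopKSketch`, `TopKTreeSketch`, `ArrayStore`, `StandInTreeStore`) and the `createStore` hook are hypothetical. The base class owns the shared logic and creates its storage through an overridable method; the subclass overrides only that method. To keep the example self-contained, the tree store is a labeled stand-in that reuses the array store. In a real split, the subclass's file would be the only one importing the B+ tree package.

```typescript
// Hypothetical sketch; names and structure are assumptions, not the PR's code.
type Comparator<T> = (a: T, b: T) => number;

interface SortedStore<T> {
  insert(value: T): void;
  delete(value: T): boolean;
  toArray(): T[];
}

// Simple array-backed store (linear scans, kept short for the example).
class ArrayStore<T> implements SortedStore<T> {
  protected items: T[] = [];
  protected compare: Comparator<T>;
  constructor(compare: Comparator<T>) {
    this.compare = compare;
  }
  insert(value: T): void {
    const i = this.items.findIndex((x) => this.compare(x, value) > 0);
    this.items.splice(i === -1 ? this.items.length : i, 0, value);
  }
  delete(value: T): boolean {
    const i = this.items.findIndex((x) => this.compare(x, value) === 0);
    if (i === -1) return false;
    this.items.splice(i, 1);
    return true;
  }
  toArray(): T[] {
    return [...this.items];
  }
}

// Stand-in only: marks where a B+ tree-backed store would be plugged in.
class StandInTreeStore<T> extends ArrayStore<T> {}

// Base operator: all shared logic lives here.
class TopKSketch<T> {
  protected store: SortedStore<T>;
  protected limit: number;
  constructor(compare: Comparator<T>, limit: number) {
    this.limit = limit;
    this.store = this.createStore(compare);
  }
  // The only extension point; subclasses swap the storage.
  protected createStore(compare: Comparator<T>): SortedStore<T> {
    return new ArrayStore<T>(compare);
  }
  insert(value: T): void {
    this.store.insert(value);
  }
  delete(value: T): boolean {
    return this.store.delete(value);
  }
  topK(): T[] {
    return this.store.toArray().slice(0, this.limit);
  }
}

// Tree variant: overrides storage creation and nothing else.
class TopKTreeSketch<T> extends TopKSketch<T> {
  protected createStore(compare: Comparator<T>): SortedStore<T> {
    return new StandInTreeStore<T>(compare);
  }
}
```

With this shape, code that only imports the base operator never references the tree store, so a bundler can leave it out.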

samwillis

One note, and it needs a changeset, but other than that

packages/d2mini/src/operators/orderBy.ts

kevin-dp · 2025-06-24T09:17:10Z

One note, and it needs a changeset, but other than that

I'd like to benchmark it before we ship it, to make sure the two versions perform as expected.

kevin-dp added 8 commits June 18, 2025 16:59

WIP incremental topKWithFractionalIndex

72085eb

Incremental topKWithFractionalIndex

93f71ed

Fix tests to not assume particular fractional indices as those are implementation details.

d4633d4

Introduce a TopK data structure

936c762

B+ tree variant of topKWithFractionalIndex

cf12510

Extend unit tests to test all insertion and deletion cases

06f54fc

Formatting

f52f2d7

Unit test for duplicate values

2ae5758

KyleAMathews reviewed Jun 23, 2025

samwillis requested changes Jun 24, 2025

kevin-dp added 5 commits June 24, 2025 10:11

Expose useTree option also on sortBy operator

5d7f39d

Split array and B+ tree variants in separate operators

71e8483

Add missing imports

28684b4

Add a orderBy operator that uses topK with B+ tree variant

1581792

Formatting

6426eaf

kevin-dp requested a review from samwillis June 24, 2025 08:45

samwillis approved these changes Jun 24, 2025

packages/d2mini/src/operators/orderBy.ts (outdated, resolved)

kevin-dp added 2 commits June 24, 2025 11:13

Trigger CI

03aa36e

Remove useTree option

9d890e2

Changeset

db524d7

kevin-dp closed this Jun 24, 2025

kevin-dp reopened this Jun 24, 2025
