README.md: 0 additions & 50 deletions
@@ -366,56 +366,6 @@ tail_preserved=20

 ---

-## squeez vs MemPalace
-
-MemPalace ([milla-jovovich/mempalace](https://github.com/milla-jovovich/mempalace)) is the closest comparable project: Python, ChromaDB, semantic memory, MCP server, self-described "highest-scoring AI memory system ever benchmarked." We audited its code and benchmarks. Here is what we found.
-
-### Architecture comparison
-
-| Dimension | squeez | MemPalace |
-|-----------|--------|-----------|
-| **Language / runtime** | Rust, single static binary (703 KB) | Python 3.10+, ~80 MB ML stack |
-| **Dependencies** | `libc` only (Unix signal fwd) | `chromadb`, `sentence-transformers`, `pyyaml`, `sqlite3`, … |
-| **Test coverage** | 35 integration test files, 287 tests | 4 test files, 0 AAAK/compression tests |
-| **Latency** | < 0.3 ms p50 filter mode | Not reported (embedding inference ~50–200 ms) |
-| **Token reduction** | Up to 95%, 92.8% aggregate | AAAK: ~30–40% (ad-hoc, no systematic measurement) |
-
-### Benchmark audit
-
-MemPalace claims 96.6% R@5 on LongMemEval-S. We replicated the benchmark methodology from `benchmarks/`:
-
-| Claim | Reality |
-|-------|---------|
-| **96.6% R@5** | Achieved with **raw ChromaDB** — no palace structure, no AAAK encoding |
-| **AAAK encoding** | Degrades to **84.2% R@5** (−12.4 pp vs. raw) |
-| **Rooms mode** | Degrades to **89.4% R@5** (−7.2 pp vs. raw) |
-| **+34% from palace structure** | Numbers (60.9 / 73.1 / 94.8 with 22,000 memories) **do not appear in benchmarks/**; no reproducer |
-| **100% R@5** (claimed) | Teaching-to-test: 3 hand-crafted questions whose answers are in training. Honest held-out = 98.4%. |
-| **LoCoMo 100%** | Gamed: top-k=50 exceeds 19–32 sessions per conversation, so ground truth is always in the candidate pool. |
-
-**Summary:** the ChromaDB embedding layer genuinely scores well on long-context retrieval benchmarks. The palace metaphor (wings/rooms/halls/closets/drawers/tunnels) is ChromaDB metadata, and it degrades accuracy. AAAK compression degrades accuracy further.
-
-### What we adopted from the audit
-
-The gap analysis informed squeez 0.3+:
-
-| MemPalace technique | Adopted in squeez | Notes |
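
The "LoCoMo 100%" row in the removed audit table rests on a general property of recall@k: once k is at least the size of the candidate pool, every candidate is returned and recall is 1.0 regardless of ranking quality. The sketch below illustrates that property only. The function name `recall_at_k`, the session IDs, the pool size of 25 (chosen from the 19–32 range quoted in the table), and the single relevant session are all hypothetical; this is not MemPalace's or LoCoMo's actual evaluation harness.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant items that appear in the first k ranked items."""
    if not relevant_ids:
        raise ValueError("relevant_ids must be non-empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


# Hypothetical conversation with 25 sessions (assumed, within the quoted 19-32 range).
sessions = [f"session-{i}" for i in range(25)]
relevant = {"session-7"}  # assumed ground-truth session for one question

# A deliberately bad ranker: the relevant session is ranked last.
worst_ranking = [s for s in sessions if s not in relevant] + sorted(relevant)

# With a small k the bad ranking is penalized.
recall_small_k = recall_at_k(worst_ranking, relevant, k=5)

# With k=50 the whole 25-session pool is returned, so ranking no longer matters.
recall_large_k = recall_at_k(worst_ranking, relevant, k=50)

print(f"recall@5  = {recall_small_k}")
print(f"recall@50 = {recall_large_k}")
```

Under these assumptions, a metric computed with k larger than the pool cannot distinguish a good retriever from a bad one, which is the concern the audit row describes.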