[Hackathon] openai-llm: semantic memory plugin (recall + TTL + LRU)#4
mariagorskikh wants to merge 2 commits into
The Memory layer only shipped the `blackboard` plugin (a shared dict), which is the wrong shape for LLM agents that need to recall the most relevant past interaction given a prompt.

This adds `memory:semantic`: a drop-in `Memory` implementation that satisfies the existing read/write/subscribe/cas protocol but additionally exposes:

- `recall(query, k, min_score)`: top-k similarity search over stored values, ranked by cosine similarity on a deterministic hashed bag-of-(tokens + char-trigrams) embedder. No external service, no API key, byte-identical results across runs, which preserves NEST's Tier 1 determinism guarantee.
- `forget(key)` and `stats()` for observability.
- Optional `capacity` with LRU eviction; recalled entries refresh their LRU position so useful memories survive eviction.
- Optional `ttl` with logical-clock expiration; pass `now_fn` to share a clock with the simulator.

Registered as `memory:semantic` in the built-in plugin table so scenarios can opt in by editing one YAML line. Verified end-to-end with `nest run marketplace.yaml`: all three marketplace validators pass and traces remain byte-identical across runs with the same seed.

Tested: 20 new unit tests covering protocol conformance, recall ranking, determinism across instances, min-score filtering, LRU eviction, TTL expiration (including overwrite-resets-TTL), recall refreshing LRU position, binary payloads, input validation, and plugin-registry resolution. Full suite: 279 passed.

https://claude.ai/code/session_01C5j2D4MgCkPgsjSCqBVpWW
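A minimal sketch of the recall idea described above: hash tokens and character trigrams into a fixed-size vector with FNV-1a 64-bit, then rank stored values by cosine similarity. This is not the plugin's actual code. The vector size `DIM`, the tokenizer regex, the space-padding of trigrams, the tie-breaking rule, and the free function `recall` over a plain `dict` are all assumptions made for illustration.

```python
import math
import re

FNV_OFFSET = 0xCBF29CE484222325  # standard FNV-1a 64-bit offset basis
FNV_PRIME = 0x100000001B3        # standard FNV-1a 64-bit prime
DIM = 1024                       # hypothetical vector size


def fnv1a_64(data: bytes) -> int:
    """FNV-1a 64-bit: stable across processes, unlike the built-in hash()."""
    h = FNV_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def features(text: str) -> list[str]:
    """Word tokens plus character trigrams of each space-padded token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    feats = [f"tok:{t}" for t in tokens]
    for t in tokens:
        padded = f" {t} "
        feats.extend(f"tri:{padded[i:i + 3]}" for i in range(len(padded) - 2))
    return feats


def embed(text: str, dim: int = DIM) -> list[float]:
    """Hashed bag-of-features vector, L2-normalised."""
    vec = [0.0] * dim
    for feat in features(text):
        vec[fnv1a_64(feat.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity because inputs are normalised."""
    return sum(x * y for x, y in zip(a, b))


def recall(store: dict[str, str], query: str, k: int = 1, min_score: float = 0.0):
    """Return up to k (key, value, score) tuples, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(value)), key) for key, value in store.items()]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by key (assumption)
    return [(key, store[key], score) for score, key in kept[:k]]


memories = {
    "m1": "I want to buy apples",
    "m2": "the weather is rainy today",
}
best = recall(memories, "apple buyer", k=1)
```

Because "apple" and "apples" share most of their trigrams, the query overlaps with `m1` even though no whole token matches, which is the morphological signal the description refers to.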
Reviewer's Guide

Adds a new deterministic semantic memory plugin implementing the full Memory protocol, with similarity-based recall, TTL expiration, and LRU eviction, wires it into the plugin registry and docs, and provides a focused test suite to validate behavior and determinism.
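The interaction between LRU eviction and TTL expiration can be sketched roughly as follows. This is an illustrative stand-in, not the plugin's implementation: the class name `BoundedStore`, the `touch` method (standing in for what a recall hit does), the `keys` helper, the rule that an entry expires once `now - written_at >= ttl`, and the internal clock advancing once per operation are all assumptions.

```python
from collections import OrderedDict
from typing import Callable, Optional


class BoundedStore:
    """Hypothetical key/value store with LRU capacity and logical-clock TTL."""

    def __init__(
        self,
        capacity: Optional[int] = None,
        ttl: Optional[int] = None,
        now_fn: Optional[Callable[[], int]] = None,
    ) -> None:
        # Maps key -> (value, written_at); insertion order doubles as LRU order.
        self._entries: OrderedDict[str, tuple[bytes, int]] = OrderedDict()
        self._capacity = capacity
        self._ttl = ttl
        self._tick = 0
        self._now_fn = now_fn or self._internal_clock

    def _internal_clock(self) -> int:
        # Assumption: without now_fn, the clock advances once per operation.
        self._tick += 1
        return self._tick

    def _purge(self, now: int) -> None:
        """Drop every entry whose age has reached the TTL."""
        if self._ttl is None:
            return
        expired = [k for k, (_, written) in self._entries.items()
                   if now - written >= self._ttl]
        for k in expired:
            del self._entries[k]

    def write(self, key: str, value: bytes) -> None:
        now = self._now_fn()
        self._purge(now)
        self._entries[key] = (value, now)   # overwriting resets the TTL
        self._entries.move_to_end(key)      # and marks the key most recently used
        while self._capacity is not None and len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used

    def touch(self, key: str) -> Optional[bytes]:
        """Refresh LRU position without resetting the TTL, as a recall hit would."""
        self._purge(self._now_fn())
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)
        return self._entries[key][0]

    def keys(self) -> list[str]:
        """Keys from least to most recently used."""
        return list(self._entries)
```

Passing `now_fn` lets a simulator own the clock, so expiry depends only on simulated time and stays reproducible across runs.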
Hey - I've found 1 issue, and left some high level feedback:
- In `SemanticMemory.subscribe`, consider removing the key from `_subscribers` when its list becomes empty in the `finally` block to avoid unbounded growth of empty subscriber lists over long-running simulations with many distinct keys.
- The `_Entry` dataclass includes fields like `text` and `forgotten` that are never read; consider removing or using them to reduce cognitive overhead and keep the internal model minimal.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `SemanticMemory.subscribe`, consider removing the key from `_subscribers` when its list becomes empty in the `finally` block to avoid unbounded growth of empty subscriber lists over long-running simulations with many distinct keys.
- The `_Entry` dataclass includes fields like `text` and `forgotten` that are never read; consider removing or using them to reduce cognitive overhead and keep the internal model minimal.
## Individual Comments
### Comment 1
<location path="packages/nest-plugins-reference/nest_plugins_reference/memory/semantic.py" line_range="274-278" />
<code_context>
+            while True:
+                yield await q.get()
+        finally:
+            self._subscribers[key].remove(q)
+
+    async def cas(self, key: str, expected: bytes, new: bytes) -> bool:
</code_context>
<issue_to_address>
**suggestion (performance):** Subscriber cleanup leaves empty lists in `_subscribers`, which can accumulate keys over time.
After removing `q`, empty lists remain in `self._subscribers`, so keys accumulate over time. Consider cleaning up empty entries:
```python
self._subscribers[key].remove(q)
if not self._subscribers[key]:
    del self._subscribers[key]
```
This avoids unbounded growth while preserving behavior for active subscriptions.
```suggestion
        try:
            while True:
                yield await q.get()
        finally:
            subscribers = self._subscribers.get(key)
            if subscribers is not None:
                subscribers.remove(q)
                if not subscribers:
                    del self._subscribers[key]
```
</issue_to_address>
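To see the suggested cleanup in isolation, here is a small runnable sketch. The `Hub` class, its `publish` method, and the use of a `defaultdict` for `_subscribers` are assumptions for illustration; only the body of the `finally` block mirrors the suggestion above.

```python
import asyncio
from collections import defaultdict


class Hub:
    """Hypothetical stand-in for the subscriber bookkeeping in SemanticMemory."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[asyncio.Queue]] = defaultdict(list)

    async def subscribe(self, key: str):
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers[key].append(q)
        try:
            while True:
                yield await q.get()
        finally:
            # Remove this queue, then drop the key once no subscriber is left.
            subscribers = self._subscribers.get(key)
            if subscribers is not None:
                subscribers.remove(q)
                if not subscribers:
                    del self._subscribers[key]

    def publish(self, key: str, value: bytes) -> None:
        # .get() avoids creating an empty list for keys nobody subscribed to.
        for q in self._subscribers.get(key, ()):
            q.put_nowait(value)


async def demo():
    hub = Hub()
    stream = hub.subscribe("k")

    async def next_item():
        return await stream.__anext__()

    task = asyncio.create_task(next_item())
    await asyncio.sleep(0)      # let the generator start and register its queue
    hub.publish("k", b"v")
    value = await task
    await stream.aclose()       # triggers the finally block
    return value, dict(hub._subscribers)
```

After `aclose()`, the key `"k"` no longer appears in `_subscribers`, so a long-running simulation with many short-lived subscriptions does not accumulate empty lists.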
Which piece + why
Layer 10: Memory. The default `blackboard` plugin is a shared dict: perfect for state-machine agents that already know the key they want, but the wrong shape for the thing LLM agents actually do, "recall the most relevant past interaction given this prompt." The Memory layer had exactly one reference plugin; this PR adds a second that is genuinely useful for the retrieval-augmented swarm coordination scenarios NEST is meant to stress-test.

Core idea
A new `memory:semantic` plugin that satisfies the full `Memory` protocol (read / write / subscribe / cas) so it is a drop-in replacement for `blackboard`, but layers three LLM-agent-relevant capabilities on top:

- `recall(query, k, min_score)` returns the top-k most similar stored values by cosine similarity over a deterministic hashed bag-of-(tokens + character trigrams) embedder. No external service, no API key, no GPU, and crucially byte-identical across runs, so NEST's "same seed → identical trace" guarantee survives. Character trigrams give morphological signal, so `recall("apple buyer", k=1)` actually finds a memory that says "I want to buy apples".
- Optional `capacity` with LRU eviction. Recalled entries refresh their LRU position, so useful memories survive eviction.
- Optional `ttl` with logical-clock expiration. Pass `now_fn` to drive the clock from the simulator, or let the plugin tick its own counter internally. Overwriting a key resets its TTL; recall does not (popular-but-stale entries still expire).

Plus `forget(key)` and `stats()` for observability.

Registered under the built-in plugin table as `memory:semantic`, so scenarios opt in by changing one YAML line. The default stays `blackboard`, so there is no behavior change for anyone who doesn't ask for it.

How to test
Highlights of the test suite:
- `isinstance(mem, Memory)` and registry resolution.
- Separate `SemanticMemory()` instances, same writes → identical recall results (key, value, score).
- Recalling `a` protects it from eviction when capacity overflows.

Key assumptions
- Python's built-in `hash` is process-salted, so the embedder uses FNV-1a 64-bit explicitly. Same input → same vector, on every machine, every Python version.
- An embedding-API backend would ship as `memory:openai_embeddings`, a separate plugin behind the same surface; that one would be Tier-2-only because its outputs aren't reproducible. Keeping that out of this PR keeps the contribution focused and the determinism guarantee intact.

Persona
OpenAI researcher building LLM agent orchestration; deeply interested in agent memory architectures and what gets remembered vs. evicted when many LLM agents talk to each other.
Future work
- `memory:openai_embeddings` and `memory:anthropic_embeddings` plugins behind the same `recall` surface (Tier 2 only).
- A scenario (`memory_swarm.yaml`) where N agents share one bounded `SemanticMemory` and have to coordinate via recall under message drop + Byzantine fractions. The natural validator: did the swarm converge on the right memory, or did the relevant fact get evicted?
- Validators built on `mem.stats()` (e.g. eviction rate below threshold, no expiration storms).
- `memory:*` plugin names.

https://claude.ai/code/session_01C5j2D4MgCkPgsjSCqBVpWW
Generated by Claude Code
Summary by Sourcery
Add a new semantic memory plugin that extends the Memory layer with deterministic similarity-based recall, capacity limits, TTL expiration, and observability, and wire it into the built-in plugin registry and docs.
Enhancements:
- Register a `memory:semantic` option in the plugin registry so it can be selected from YAML scenarios.