recall


Embedded, atomic, microsecond-fast key/value + entity store with git-backed time travel, exposed as a CLI and an MCP server. Single static Go binary. No service to run, but transparent multi-client safety via an on-demand local daemon.

Have I seen this thing? Have I acted on every sub-event? What did my state look like an hour ago? recall answers all three.

Why

LLM agents and shell pipelines often need persistent, atomic, very-very-fast memory of "things I've already done" — but spinning up Redis or a database service for that is overkill. recall is:

  • Embedded — single file at ~/.recall/recall.db, no daemon to install.
  • Concurrent-safe — first invocation in $RECALL_HOME becomes a tiny on-demand local daemon; every other recall and recall-mcp process dials in over a unix socket. One bbolt handle, one writer, every op still ACID. Daemon shuts itself down after 5 min idle. Set RECALL_DIRECT=1 to bypass entirely and open bbolt in-process (the legacy single-process path).
  • Atomic — every op is an ACID transaction inside bbolt (the same engine etcd and consul use).
  • Fast — µs-class point lookups, fsync-bound writes (~17ms, or ~322µs batched, or ~30µs with NoSync).
  • Hierarchical — path-shaped keys give you cheap prefix scans (the HTTP-router trick).
  • Time-travelling: recall snapshot commits a consistent copy of the whole store to a git repo. recall at <commit> ... reads any historical state.
  • Dual frontend — same core powers a CLI (recall) and an MCP server (recall-mcp) for LLMs.
  • Zero CGO — pure Go, builds anywhere go builds.

Install

Pre-built binaries

Grab the right archive for your platform from the releases page and drop recall + recall-mcp somewhere on your $PATH.

From source

go install github.com/dreamware-nz/recall/cmd/recall@latest
go install github.com/dreamware-nz/recall/cmd/recall-mcp@latest

From a clone

git clone https://github.com/dreamware-nz/recall.git
cd recall
make install        # -> $(go env GOPATH)/bin/{recall,recall-mcp}

Set RECALL_HOME to override the default location (~/.recall).

Data model

recall exposes five primitives. Pick the one that fits the shape of your question.

| Primitive | Use when |
|-----------|----------|
| kv        | Generic string → bytes |
| set       | "Have I seen X?" (membership) |
| ctr       | Atomic counters |
| ent       | Named records with attributes (a PR, an email, a file) |
| child     | Status-bearing items under an entity (a PR's comments, a PR's CI checks) |

Everything else (predicates, completion checks) is composition over these.

Quick start: the PR-review use case

# 1. Cheap "have I seen it" set
recall set add seen-prs "dreamware-nz/recall#42"
recall set has seen-prs "dreamware-nz/recall#42"   # -> true

# 2. Rich tracking with sub-events
recall ent put pr "dreamware-nz/recall#42" head_sha=abc state=open

recall child put pr "dreamware-nz/recall#42" comments c-9981 --status=pending body="rename foo"
recall child put pr "dreamware-nz/recall#42" comments c-9982 --status=pending
recall child put pr "dreamware-nz/recall#42" checks   ci/test --status=running
recall child put pr "dreamware-nz/recall#42" checks   ci/lint --status=success

# 3. The completion predicate (composition, no DSL needed)
recall child count pr "dreamware-nz/recall#42" comments --status=pending   # 2
recall child put   pr "dreamware-nz/recall#42" comments c-9981 --status=acted
recall child count pr "dreamware-nz/recall#42" comments --status=pending   # 1

# 4. Snapshot for time travel
recall snapshot -m "after first review pass"
recall log
# dc62d1f1  2026-05-17 08:49:41  after first review pass

# 5. Read historical state
recall at dc62d1f1 child count pr "dreamware-nz/recall#42" comments --status=pending

"Have I acted on the whole PR?"

A done predicate is just two counts:

[ "$(recall child count pr "$PR" comments --status-not=acted 2>/dev/null || \
     recall child count pr "$PR" comments --status=pending)" = "0" ] && \
[ "$(recall child count pr "$PR" checks   --status=failing)" = "0" ] && echo "done"

(For now --status matches exactly; combine multiple counts client-side. A rule DSL may come later if the same predicate gets written twice.)
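The same predicate can be composed from Go rather than shell. The sketch below is one possible shape, not part of recall itself: countFn, cliCount, isDone, and fakeCount are hypothetical names, and cliCount assumes that recall child count prints a bare integer on stdout, as the quick-start comments suggest. The demo uses an in-memory fake so it runs without recall installed.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// countFn reports how many children in coll currently have the given status.
type countFn func(coll, status string) (int, error)

// cliCount shells out to `recall child count` for one PR.
// Assumption: the command prints a bare integer on stdout.
func cliCount(pr string) countFn {
	return func(coll, status string) (int, error) {
		out, err := exec.Command("recall", "child", "count", "pr", pr, coll, "--status="+status).Output()
		if err != nil {
			return 0, err
		}
		return strconv.Atoi(strings.TrimSpace(string(out)))
	}
}

// isDone is true when no comment is pending and no check is failing.
func isDone(count countFn) (bool, error) {
	pending, err := count("comments", "pending")
	if err != nil {
		return false, err
	}
	failing, err := count("checks", "failing")
	if err != nil {
		return false, err
	}
	return pending == 0 && failing == 0, nil
}

// fakeCount serves counts from a map keyed by "<coll>/<status>".
func fakeCount(counts map[string]int) countFn {
	return func(coll, status string) (int, error) {
		return counts[coll+"/"+status], nil
	}
}

func main() {
	open, _ := isDone(fakeCount(map[string]int{"comments/pending": 1}))
	clean, _ := isDone(fakeCount(map[string]int{}))
	fmt.Println(open, clean) // false true
}
```

Swapping fakeCount(...) for cliCount("dreamware-nz/recall#42") would run the same predicate against a live store.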

Head SHA rolled forward

recall child supersede pr "$PR" comments    # mark all old comments superseded
recall ent put pr "$PR" head_sha=newsha     # merge: only head_sha changes

More example use cases

The entity + child pattern fits anywhere a parent thing has many sub-signals you want to track individually:

| Use case | Entity | Children |
|----------|--------|----------|
| Threaded conversations | thread/<channel>/<ts> | messages/<id>, mentions/<user> |
| Document review | doc/<id> | sections/<name>, reviewers/<who> |
| Multi-stage pipelines | file/<sha> | stages/{download,extract,embed,index} |
| Notification dedup | (just a set) | set:notified membership |
| Webhook idempotency | (just a kv) | kv:webhook/<provider>/<event_id> |
| Agent task state | task/<id> | subtasks/<n> with status |
| RSS / feed reading | feed/<source> | items/<id> |
| Code/doc reviews (any) | review/<id> | comments/<id>, signoffs/<who> |
| GitHub PR tracking | pr/<owner>/<repo>#<n> | comments/, reviews/, checks/ |

A scripts/gh-to-recall shim ships in the repo to demonstrate the GitHub-PR variant end-to-end, including child_supersede on head-SHA rollover. Use it as a template — the same shape works for email triage, Slack thread followups, document signoff workflows, you name it.

CLI reference

recall kv      set|get|del|has|list
recall set     add|rem|has|members|card
recall ctr     incr|get|set
recall ent     put|get|del|list
recall child   put|get|del|list|count|supersede
recall snapshot [-m msg]
recall log     [-n N]
recall at <ref> <read-cmd...>
recall batch              < ops.jsonl    # write ops, single tx
recall daemon  start|stop|status        # manage the local daemon
recall where

Run recall help for full usage. Attributes are written as key=value pairs after the positional args, e.g. recall ent put pr 42 head_sha=abc state=open.
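For scripts that build or validate attribute lists before calling the CLI, the key=value convention is easy to mirror. This is a minimal sketch, not recall's own parser: parseAttrs is a hypothetical helper, and the choice to split on the first "=" only (so values may contain "=") is an assumption about edge cases the README does not specify. Flags such as --status=... are out of scope here.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAttrs turns ["head_sha=abc", "state=open"] into a map.
// Only the first "=" splits, so values may themselves contain "=".
func parseAttrs(args []string) (map[string]string, error) {
	attrs := make(map[string]string, len(args))
	for _, a := range args {
		k, v, ok := strings.Cut(a, "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("attribute %q is not key=value", a)
		}
		attrs[k] = v
	}
	return attrs, nil
}

func main() {
	attrs, err := parseAttrs([]string{"head_sha=abc", "state=open"})
	fmt.Println(attrs, err) // map[head_sha:abc state:open] <nil>
}
```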

MCP server

Run recall-mcp over stdio. Tools mirror the CLI 1:1 — every primitive is available to the LLM.

Example configuration

For Claude Desktop / Crush / any MCP client:

{
  "mcpServers": {
    "recall": {
      "command": "/usr/local/bin/recall-mcp",
      "env": { "RECALL_HOME": "/Users/you/.recall" }
    }
  }
}

Tool surface

| Tool | Purpose |
|------|---------|
| kv_set / kv_get / kv_has / kv_del / kv_list | Plain KV |
| set_add / set_has / set_rem / set_members / set_card | Membership |
| counter_incr / counter_get | Atomic counters |
| entity_put / entity_get / entity_del / entity_list | Named records |
| child_put / child_get / child_del / child_list / child_count / child_supersede | Status-bearing children |
| snapshot / history_log / read_at | Time travel |

read_at takes a ref, an op (one of kv_get, kv_has, set_has, set_card, counter_get, entity_get, child_count), and op-specific args. The whole DB at that commit is opened read-only — no need to copy state around.

Architecture

   recall CLI ----\                            +-----------------+
                    >---- unix socket ------->  |  recall daemon  |
   recall-mcp -----/        (auto-spawned)      |  (this process  |
   external script--                            |   owns bbolt)   |
                                                |   internal/     |
                                                |     store       |  <-- single bbolt file ~/.recall/recall.db
                                                |     history     |  <-- go-git repo at ~/.recall/history/
                                                |     daemon      |
                                                +-----------------+

bbolt buckets (path-shaped keys for cheap prefix scans):

kv                          string -> bytes
set:<name>                  member -> empty
ctr                         name   -> int64 (big-endian)
ent:<kind>                  id     -> msgpack(attrs)
child                       <kind>/<id>/<coll>/<child_id> -> msgpack(record)
idx:status                  <kind>/<id>/<coll>/<status>/<child_id> -> empty

The idx:status bucket is the trick that makes child_count --status=pending O(matches), not O(all_children). Status changes update the index atomically inside the same transaction.
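To make the index idea concrete, here is an in-memory sketch of the same key layout. It is an illustration under stated assumptions, not recall's code: a sorted string slice stands in for a bbolt bucket (both keep keys in byte order), statusIndex and its methods are hypothetical names, and path components are assumed not to contain "/". Counting seeks to the first key with the prefix and then walks only the matching keys.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// statusIndex models idx:status: sorted keys shaped
// <kind>/<id>/<coll>/<status>/<child_id>, plus each child's current status.
type statusIndex struct {
	keys   []string          // sorted index keys
	status map[string]string // <kind>/<id>/<coll>/<child_id> -> status
}

func newStatusIndex() *statusIndex {
	return &statusIndex{status: map[string]string{}}
}

// insert adds key at its sorted position if it is not already present.
func (ix *statusIndex) insert(key string) {
	i := sort.SearchStrings(ix.keys, key)
	if i < len(ix.keys) && ix.keys[i] == key {
		return
	}
	ix.keys = append(ix.keys, "")
	copy(ix.keys[i+1:], ix.keys[i:])
	ix.keys[i] = key
}

// remove deletes key if present.
func (ix *statusIndex) remove(key string) {
	i := sort.SearchStrings(ix.keys, key)
	if i < len(ix.keys) && ix.keys[i] == key {
		ix.keys = append(ix.keys[:i], ix.keys[i+1:]...)
	}
}

// put sets a child's status and swaps its index entry in the same step,
// mirroring "status changes update the index in the same transaction".
func (ix *statusIndex) put(kind, id, coll, childID, status string) {
	child := strings.Join([]string{kind, id, coll, childID}, "/")
	if old, ok := ix.status[child]; ok {
		ix.remove(strings.Join([]string{kind, id, coll, old, childID}, "/"))
	}
	ix.status[child] = status
	ix.insert(strings.Join([]string{kind, id, coll, status, childID}, "/"))
}

// count binary-searches to the prefix, then visits only matching keys.
func (ix *statusIndex) count(kind, id, coll, status string) int {
	prefix := strings.Join([]string{kind, id, coll, status}, "/") + "/"
	n := 0
	for i := sort.SearchStrings(ix.keys, prefix); i < len(ix.keys) && strings.HasPrefix(ix.keys[i], prefix); i++ {
		n++
	}
	return n
}

func main() {
	ix := newStatusIndex()
	ix.put("pr", "42", "comments", "c-9981", "pending")
	ix.put("pr", "42", "comments", "c-9982", "pending")
	fmt.Println(ix.count("pr", "42", "comments", "pending")) // 2
	ix.put("pr", "42", "comments", "c-9981", "acted")
	fmt.Println(ix.count("pr", "42", "comments", "pending")) // 1
}
```

The slice insert here is linear, which a B+tree avoids; the point of the sketch is only the read path, where work is proportional to the number of matches plus one seek.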

Concurrency model

The first recall invocation in a given $RECALL_HOME forks a tiny background daemon that opens the bbolt file once and serves it over a local unix socket ($RECALL_HOME/recall.sock). Every subsequent recall, recall-mcp, script, and cron dials in. bbolt's own OS file lock is the leader-election primitive — at most one daemon per $RECALL_HOME by construction. If two clients race to spawn, the loser's bolt.Open fails on the file lock and the process exits silently; the winner stays.

  • Atomicity: every op runs in a single bbolt transaction inside the one process holding the handle. Batch ships the whole op list in one RPC and commits as one transaction (one fsync amortised across all ops).
  • Reads: bbolt allows many concurrent View transactions, so reads from multiple clients run in parallel inside the daemon.
  • Writes: serialised by bbolt as today, but fan-in across clients gets the batching win for free.
  • Lifecycle: daemon exits on recall daemon stop, on SIGTERM, or after 5 minutes idle (RECALL_DAEMON_IDLE to tune). Next client transparently respawns it.
  • Recovery: if the daemon crashes, the stale socket file is harmless — the next client to spawn one wins the bbolt lock cleanly.
  • No network: unix socket only (loopback TCP on Windows), perms 0600, user-scoped. There is no auth and no remote story by design; if you want a multi-machine graph service, see Loveliness.
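The "file lock as leader election" idea can be tried in isolation. The sketch below takes a non-blocking exclusive flock(2) directly with Go's syscall package as a stand-in for the lock bbolt acquires on open; it is Unix-only, and tryLead and race are hypothetical names, not recall functions. Two attempts on the same path model two clients racing to spawn, and a third models a respawn after the leader exits.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// tryLead takes an exclusive, non-blocking advisory lock on path.
// The winner gets (file, true); a loser gets (nil, false) and no error.
func tryLead(path string) (*os.File, bool, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0o600)
	if err != nil {
		return nil, false, err
	}
	err = syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB)
	if errors.Is(err, syscall.EWOULDBLOCK) {
		f.Close()
		return nil, false, nil // another holder already owns the lock
	}
	if err != nil {
		f.Close()
		return nil, false, err
	}
	return f, true, nil
}

// race simulates two would-be daemons, then a respawn once the leader is gone.
func race() (first, second, respawn bool, err error) {
	dir, err := os.MkdirTemp("", "lock-demo")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "recall.db")

	leader, first, err := tryLead(path)
	if err != nil {
		return
	}
	_, second, err = tryLead(path) // loses while the leader holds the lock
	if err != nil {
		leader.Close()
		return
	}
	leader.Close() // closing the descriptor releases the lock

	next, respawn, err := tryLead(path)
	if next != nil {
		next.Close()
	}
	return
}

func main() {
	fmt.Println(race()) // true false true <nil>
}
```

Because the kernel drops the lock when the holding process dies, a crashed leader cannot leave the lock stuck, which matches the recovery behaviour described above.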

Management commands:

recall daemon start     # ensure a daemon is running; prints listen addr
recall daemon status    # ping the live daemon
recall daemon stop      # graceful shutdown
RECALL_DIRECT=1 recall …   # bypass the daemon entirely (open bbolt in-process)

Time-travel semantics

  • A snapshot is bbolt.Tx.WriteTo, which produces a consistent copy of the entire database even under concurrent writes.
  • Snapshots are committed as a single file (recall.db) into a git repo at ~/.recall/history/. Git handles dedup, packing, branching.
  • recall at <ref> <read-cmd> checks out the snapshot blob from that commit into a temp file, opens it read-only, runs the read, closes it. No reflog gymnastics needed.

Snapshots are explicit and cheap: call recall snapshot whenever a checkpoint makes sense (after a logical milestone, on a cron, on shutdown). They are not taken automatically on every write, because that would destroy the microsecond-scale performance budget.

Performance

Measured on Apple M1, go test -bench, default options (durable, fsync on every commit) unless noted. Numbers are per-op.

| Operation | Time | Notes |
|-----------|------|-------|
| KVGet (hot) | 0.86 µs | mmap-backed B+tree, sub-microsecond |
| KVHas (hit) | 1.0 µs | |
| KVHas (miss) | 1.1 µs | misses just as fast; no Bloom filter needed |
| SHas in 100k-member set | 0.94 µs | |
| SHas miss in 100k set | 0.90 µs | |
| CtrGet | 0.76 µs | |
| ChildCount(status=pending) over 1k children | 11.9 µs | uses idx:status |
| ChildList(status=pending) returning 500 | 751 µs | ~1.5 µs per result |
| KVSet (one tx) | 18.9 ms | fsync-bound |
| SAdd (one tx) | 18.0 ms | fsync-bound |
| ChildPut (one tx) | 16.7 ms | fsync-bound |
| Batch of 50 ChildPut ops | 16.1 ms total → 322 µs/op | one fsync amortised |
| KVSet with NoSync | 29.5 µs | |
| ChildPut with NoSync | 39.9 µs | |
| SAdd with NoSync | 33.5 µs | |

What this means

Reads are genuinely Redis-class. Writes are fsync-bound on default settings — about 60 writes/sec sequentially. This is fine for the interactive use case (one PR sync is dozens of writes, ~1 second), but if you need bulk throughput, use one of the two knobs below.
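The writes-per-second figure follows directly from per-op latency. As a quick sanity check, the sketch below redoes the arithmetic using the ChildPut rows of the table (16.7 ms durable, 322 µs batched, 39.9 µs NoSync); opsPerSec and speedup are illustrative helpers, and other rows give somewhat different ratios.

```go
package main

import (
	"fmt"
	"time"
)

// Per-op ChildPut latencies copied from the benchmark table.
const (
	durable = 16700 * time.Microsecond // one tx, fsync on commit
	batched = 322 * time.Microsecond   // 50 ops sharing one fsync
	noSync  = 39900 * time.Nanosecond  // NoSync enabled
)

// opsPerSec converts a per-op latency into sequential throughput.
func opsPerSec(perOp time.Duration) float64 {
	return float64(time.Second) / float64(perOp)
}

// speedup reports how many times faster `fast` is than `slow`.
func speedup(slow, fast time.Duration) float64 {
	return float64(slow) / float64(fast)
}

func main() {
	fmt.Printf("durable: %.0f writes/s\n", opsPerSec(durable))
	fmt.Printf("batched: %.0f writes/s (%.0fx)\n", opsPerSec(batched), speedup(durable, batched))
	fmt.Printf("nosync:  %.0f writes/s (%.0fx)\n", opsPerSec(noSync), speedup(durable, noSync))
}
```

With these inputs the durable path comes out at roughly 60 writes/s, batching at roughly 3,100 writes/s, and NoSync at roughly 25,000 writes/s.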

Knob 1: Batch — many ops, one fsync

err := s.Batch(func(b *store.WriteTx) error {
    _ = b.SAdd("seen-prs", id)
    _, _ = b.EntPut("pr", id, attrs, true)
    for _, c := range comments {
        _, _ = b.ChildPut("pr", id, "comments", c.ID, "pending", nil, false)
    }
    return nil
})

All ops commit together (or none do — full rollback on error). Throughput jumps ~50× because the single fsync at the end is the only durability cost.

The CLI exposes this as recall batch, reading JSONL from stdin:

{
  echo '["set","add","seen-prs","pr#42"]'
  echo '["ent","put","pr","pr#42","head_sha=abc"]'
  echo '["child","put","pr","pr#42","comments","c1","--status=pending"]'
  # ...
} | recall batch

scripts/gh-to-recall sync uses this — one PR sync is one fsync.
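When the op list is built programmatically, generating the JSONL with a JSON encoder avoids quoting mistakes that hand-written echo lines invite. This sketch assumes only what the shell example shows, namely that each line is a JSON array of the CLI's argument strings; encodeOps is a hypothetical helper, and its output is meant to be piped into recall batch.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encodeOps renders each op as one compact JSON array per line (JSONL).
func encodeOps(ops [][]string) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false) // keep characters such as < and > readable
	for _, op := range ops {
		// Encode writes the value followed by a newline.
		if err := enc.Encode(op); err != nil {
			return "", err
		}
	}
	return buf.String(), nil
}

func main() {
	out, err := encodeOps([][]string{
		{"set", "add", "seen-prs", "pr#42"},
		{"ent", "put", "pr", "pr#42", "head_sha=abc"},
		{"child", "put", "pr", "pr#42", "comments", "c1", "--status=pending"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out) // e.g. `go run . | recall batch`
}
```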

Knob 2: NoSync — trade durability for ~500× throughput

s, err := store.OpenWith(path, store.Options{NoSync: true})

Or set RECALL_NOSYNC=1 in the environment. Writes drop to ~30–40 µs. Use this for cache-grade data where losing the last few writes after a crash is acceptable. Call s.Sync() to force a manual fsync at logical checkpoints.

Don't mix NoSync with primary-state data unless you've thought hard about the crash window.
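One way to bound the crash window under NoSync is to force a sync every N writes. The sketch below shows that pattern against a stand-in interface rather than the real store: syncer, checkpointer, fakeStore, and runDemo are hypothetical, and the Sync() error signature is an assumption (the README names s.Sync() but not its return type).

```go
package main

import "fmt"

// syncer is the single method this sketch needs from a store.
// Assumption: Sync returns an error.
type syncer interface{ Sync() error }

// checkpointer calls Sync after every `every` recorded writes, so at most
// `every` writes are unsynced at any moment.
type checkpointer struct {
	s       syncer
	every   int
	pending int
}

// wrote records one completed write and syncs when the window is full.
func (c *checkpointer) wrote() error {
	c.pending++
	if c.pending < c.every {
		return nil
	}
	c.pending = 0
	return c.s.Sync()
}

// fakeStore counts Sync calls so the example runs without recall.
type fakeStore struct{ syncs int }

func (f *fakeStore) Sync() error { f.syncs++; return nil }

// runDemo performs `writes` pretend writes and returns how many syncs ran.
func runDemo(writes, every int) int {
	fs := &fakeStore{}
	cp := &checkpointer{s: fs, every: every}
	for i := 0; i < writes; i++ {
		// A NoSync write would happen here.
		if err := cp.wrote(); err != nil {
			panic(err)
		}
	}
	return fs.syncs
}

func main() {
	fmt.Println(runDemo(250, 100)) // 2
}
```

A final explicit sync on shutdown would cover the trailing writes that never filled a window (50 of them in the demo above).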

Why not Redis / SQLite / Loveliness / Dolt?

  • Redis: needs a service to install and run, not embedded, no time travel. recall's daemon is local-only, auto-spawned, auto-shutdown — it's not a service you manage.
  • SQLite: great alternative; we picked bbolt for purer KV ergonomics and zero-CGO build.
  • Loveliness (sibling project): a clustered graph DB with Raft and Bolt protocol — exactly the opposite of what recall is. Use Loveliness when you need a graph service across machines; use recall when one machine needs fast, durable, atomic memory shared across many processes.
  • Dolt: "Git for data" with SQL — heavier dep, single-writer-ish, but worth considering if branching/merging data (not just snapshots) becomes a hard requirement.

Roadmap

  • TTL / expiry on KV and sets.
  • Optional declarative rule files for completion predicates (only if the same predicate keeps appearing).
  • HTTP frontend (for non-MCP clients) — probably never; the CLI already covers it.
  • Optional Bloom-filter cache in front of set_has for huge sets (only if profiling demands it).

Contributing

Issues and PRs welcome. The bar:

  1. go test ./... passes.
  2. go vet ./... clean.
  3. New behaviour gets a test.
  4. Public API changes update the README.

CI runs build + test + vet on Linux, macOS, and Windows on every push and PR. A tag matching v*.*.* triggers a goreleaser run that publishes cross-platform binaries and checksums.

make test       # go test ./...
make build      # produce ./bin/{recall,recall-mcp}
make install    # to $(go env GOPATH)/bin

Status

Early but functional. Core primitives, CLI, MCP, time travel, batched writes, no-sync mode, and tests all work. API may change before 1.0.

License

MIT — see LICENSE.
