A lossless long-term memory store for AI agents, built on SQLite with FTS5 full-text search and a DAG-based summarization layer.

- Lossless storage: every message is stored verbatim; nothing is discarded
- FTS5 full-text search: fast, relevance-ranked retrieval across all memories
- DAG summaries: a hierarchical summarization tree compresses history without data loss
- Incremental summarization: a per-session checkpoint keeps repeated `summarize` runs fast and idempotent
- Context assembler: token-budget-aware context window builder with de-duplication
- Config file: JSON config via `--config` for all default settings
- Bulk ingest: `ingest --json` accepts a JSON object or array from stdin
- CLI: `alm` command for all operations
- TypeScript: fully typed, CommonJS, Node 18+
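
To make the DAG summaries idea concrete, here is a minimal sketch of how leaf summaries could be grouped into parent nodes, level by level, until a single root remains. The `SummaryNode` shape, the `buildLevels` helper, and the placeholder `joinSummaries` function are hypothetical and are not taken from the project's `src/dag/summarizer.ts`; the fan-in of 4 matches the documented `fanIn` default, and the real tool may keep several roots rather than one.

```typescript
// Hypothetical node shape; not the project's actual schema.
interface SummaryNode {
  id: string;
  level: number; // 0 = leaf summary
  text: string;
  childIds: string[]; // ids of the nodes this summary covers
}

// Placeholder summarizer: a real one would call a model or a heuristic.
function joinSummaries(children: SummaryNode[]): string {
  return children.map((c) => c.text).join(" | ");
}

// Group nodes into parents of at most `fanIn` children, one level at a time.
function buildLevels(leaves: SummaryNode[], fanIn: number): SummaryNode[][] {
  if (fanIn < 2) throw new Error("fanIn must be at least 2");
  const levels: SummaryNode[][] = [leaves];
  let current = leaves;
  while (current.length > 1) {
    const parents: SummaryNode[] = [];
    for (let i = 0; i < current.length; i += fanIn) {
      const group = current.slice(i, i + fanIn);
      parents.push({
        id: `L${levels.length}-${parents.length}`,
        level: levels.length,
        text: joinSummaries(group),
        childIds: group.map((g) => g.id),
      });
    }
    levels.push(parents);
    current = parents;
  }
  return levels;
}

// Ten leaves with a fan-in of 4 produce levels of size 10, 3, and 1.
const demoLeaves: SummaryNode[] = Array.from({ length: 10 }, (_, i) => ({
  id: `leaf-${i}`,
  level: 0,
  text: `chunk ${i}`,
  childIds: [],
}));
const demoLevels = buildLevels(demoLeaves, 4);
```

Because every parent records its `childIds`, any summary can be traced back to the raw messages it covers, which is what keeps the compression lossless in this sketch.
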

Install dependencies and build:

    npm ci
    npm run build

Basic usage:

    # Add memories
    node dist/cli.js add my-session user "Hello, I am working on a TypeScript project"
    node dist/cli.js add my-session assistant "Great! What kind of project?"
    # Bulk ingest from JSON
    echo '[{"role":"user","content":"msg1"},{"role":"assistant","content":"reply1"}]' \
      | node dist/cli.js ingest my-session --json
    # Search
    node dist/cli.js search "TypeScript" --session my-session
    # Build DAG summaries (incremental; safe to re-run)
    node dist/cli.js summarize my-session
    # Assemble a context window
    node dist/cli.js context my-session --query "TypeScript" --max-tokens 2000
    # Assemble with debug token budget info
    node dist/cli.js context my-session --debug

Create a JSON file (e.g. `alma.config.json`) and pass it via `--config`:

    {
      "dbPath": "./my-project.db",
      "keepRecentRaw": 30,
      "leafChunkSize": 600,
      "fanIn": 4,
      "tokenBudget": 8000,
      "retrievalLimit": 50
    }

For example:

    node dist/cli.js --config alma.config.json context my-session

Config fields (all optional; omitted fields fall back to their defaults):

| Field | Default | Description |
|---|---|---|
| `dbPath` | `./alma.db` | Path to SQLite database (overridden by `ALMA_DB` env var) |
| `keepRecentRaw` | `20` | Recent raw messages kept outside summaries |
| `leafChunkSize` | `800` | Max tokens per leaf summary chunk |
| `fanIn` | `4` | Summary nodes merged per parent node |
| `tokenBudget` | `4000` | Default token budget for context assembly |
| `retrievalLimit` | `30` | Default FTS retrieval limit |
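
The precedence described in the table (built-in defaults, then the config file, then `ALMA_DB` for the database path) can be sketched as a small merge function. The field names and default values come from the table; the `AlmaConfig` interface and the `resolveConfig` helper are hypothetical and may differ from the project's `src/config.ts`. Treating an empty `ALMA_DB` as unset is also an assumption.

```typescript
// Field names and defaults mirror the config table; the helper itself is hypothetical.
interface AlmaConfig {
  dbPath: string;
  keepRecentRaw: number;
  leafChunkSize: number;
  fanIn: number;
  tokenBudget: number;
  retrievalLimit: number;
}

const DEFAULT_CONFIG: AlmaConfig = {
  dbPath: "./alma.db",
  keepRecentRaw: 20,
  leafChunkSize: 800,
  fanIn: 4,
  tokenBudget: 4000,
  retrievalLimit: 30,
};

// Merge order: defaults < config file < ALMA_DB environment variable.
function resolveConfig(
  fileConfig: Partial<AlmaConfig>,
  env: Record<string, string | undefined>,
): AlmaConfig {
  const merged: AlmaConfig = { ...DEFAULT_CONFIG, ...fileConfig };
  const envDb = env["ALMA_DB"];
  // Assumption: an empty ALMA_DB is treated the same as an unset one.
  if (envDb !== undefined && envDb !== "") {
    merged.dbPath = envDb;
  }
  return merged;
}

// Only two fields come from the file; the rest fall back to defaults.
const fromFile = resolveConfig({ dbPath: "./my-project.db", tokenBudget: 8000 }, {});
// The environment variable wins over the file for the database path.
const fromEnv = resolveConfig({ dbPath: "./my-project.db" }, { ALMA_DB: "/tmp/override.db" });
```

Passing the environment in as a plain record keeps the function pure, so it can be exercised in tests without touching the real process environment.
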

Commands:

| Command | Description |
|---|---|
| `alm add <session> <role> <content>` | Insert a memory |
| `alm ingest <session> --json` | Bulk-ingest JSON from stdin (object or array) |
| `alm list <session>` | List memories for a session |
| `alm search <query>` | Full-text search |
| `alm sessions` | List all session IDs |
| `alm summarize <session>` | Build/update DAG summaries (incremental) |
| `alm roots <session>` | Show root summaries |
| `alm context <session>` | Assemble context window |
| `alm delete <id>` | Delete a memory by ID |
| `alm tg-poll` | Long-poll Telegram and ingest into SQLite |
| `alm alma-tail` | Tail Alma logs and ingest into SQLite |
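
Since `ingest --json` accepts either a single object or an array, the input has to be normalized before insertion. The sketch below shows one way to do that; the `IngestMessage` shape is inferred from the `role` and `content` fields in the bulk-ingest example, and `parseIngestPayload` is a hypothetical helper, not the CLI's actual code.

```typescript
// Hypothetical message shape based on the `role`/`content` fields in the ingest example.
interface IngestMessage {
  role: string;
  content: string;
}

// Runtime check so malformed items are rejected instead of silently stored.
function isIngestMessage(value: unknown): value is IngestMessage {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record["role"] === "string" && typeof record["content"] === "string";
}

// Accept either one object or an array of objects and always return an array.
function parseIngestPayload(raw: string): IngestMessage[] {
  const parsed: unknown = JSON.parse(raw);
  const items: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
  return items.map((item, index) => {
    if (!isIngestMessage(item)) {
      throw new Error(`item ${index} is missing a string "role" or "content"`);
    }
    return item;
  });
}

const single = parseIngestPayload('{"role":"user","content":"msg1"}');
const many = parseIngestPayload(
  '[{"role":"user","content":"msg1"},{"role":"assistant","content":"reply1"}]',
);
```
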

Global options:

- `--config <path>`: path to JSON config file

Command options:

- `add -i, --importance <n>`: importance score 0-1 (default: 0.5)
- `list -n, --limit <n>`: max results (default: 50)
- `search -s, --session <id>`: filter by session
- `search -n, --limit <n>`: max results (default: 10)
- `context -q, --query <text>`: relevance query for FTS boost
- `context -t, --max-tokens <n>`: token budget (default: config `tokenBudget`)
- `context --debug`: print token budget usage breakdown
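
As an illustration of what the `context` options control, here is a minimal sketch of token-budget-aware assembly with de-duplication. It assumes candidates arrive already ordered by priority (for example, recent messages first, then FTS hits), and it uses a rough four-characters-per-token estimate. `ContextItem`, `estimateTokens`, and `assembleContext` are hypothetical names and do not reflect the project's `assembler.ts` or `tokenizer.ts`.

```typescript
interface ContextItem {
  id: string;
  content: string;
}

// Rough heuristic (about 4 characters per token); an assumption, not the project's tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface AssembledContext {
  items: ContextItem[];
  usedTokens: number;
  skippedDuplicates: number;
  skippedOverBudget: number;
}

// Walk candidates in priority order, drop repeated ids, and skip anything that would exceed the budget.
function assembleContext(candidates: ContextItem[], maxTokens: number): AssembledContext {
  const seen = new Set<string>();
  const result: AssembledContext = {
    items: [],
    usedTokens: 0,
    skippedDuplicates: 0,
    skippedOverBudget: 0,
  };
  for (const item of candidates) {
    if (seen.has(item.id)) {
      result.skippedDuplicates += 1;
      continue;
    }
    seen.add(item.id);
    const cost = estimateTokens(item.content);
    if (result.usedTokens + cost > maxTokens) {
      result.skippedOverBudget += 1;
      continue; // a smaller item later in the list may still fit
    }
    result.items.push(item);
    result.usedTokens += cost;
  }
  return result;
}

const assembled = assembleContext(
  [
    { id: "m1", content: "a".repeat(40) }, // 10 tokens
    { id: "m1", content: "a".repeat(40) }, // duplicate id, skipped
    { id: "m2", content: "b".repeat(100) }, // 25 tokens, does not fit
    { id: "m3", content: "c".repeat(20) }, // 5 tokens, fits
  ],
  16,
);
```

The counters in the result play a role similar in spirit to the `--debug` breakdown: they show how much of the budget was used and why items were left out.
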

Environment variables:

- `ALMA_DB`: path to SQLite database file (default: `./alma.db`)

The `tg-poll` command long-polls Telegram and ingests all incoming updates into the same SQLite database. It stores the raw update JSON in metadata and uses deterministic IDs so that re-ingested updates do not create duplicates.

    # Bot token (environment variable)
    export TELEGRAM_BOT_TOKEN="..."
    # Using an allowlist is recommended
    node dist/cli.js tg-poll \
      --allowlist <CHAT_ID_1>,<CHAT_ID_2> \
      --session-mode chat_topic \
      --offset-file ./tg.offset.json

Session mapping:

- `chat` → `tg:<chat_id>`
- `chat_topic` → `tg:<chat_id>:<message_thread_id>` (when a thread ID exists)

Deterministic IDs:

- normal: `tg:<chat_id>:<message_id>`
- edited: `tg:<chat_id>:<message_id>:edit:<edit_date>`
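
The session mapping and ID formats above can be expressed as two small pure functions. This is a sketch under assumptions: `TgMessageLike` is a hypothetical subset of the Telegram Bot API `Message` object, the helper names are invented for illustration, and falling back to `tg:<chat_id>` when `chat_topic` is selected but no thread ID exists is an assumption rather than documented behavior.

```typescript
type TgSessionMode = "chat" | "chat_topic";

// Minimal subset of a Telegram message; field names follow the Bot API Message object.
interface TgMessageLike {
  chat: { id: number };
  message_id: number;
  message_thread_id?: number;
  edit_date?: number;
}

// Map a message to a session id according to the selected session mode.
function tgSessionId(msg: TgMessageLike, mode: TgSessionMode): string {
  const base = `tg:${msg.chat.id}`;
  if (mode === "chat_topic" && msg.message_thread_id !== undefined) {
    return `${base}:${msg.message_thread_id}`;
  }
  // Assumption: without a thread id, chat_topic falls back to the chat-level session.
  return base;
}

// Build the deterministic memory id; edits get a distinct id per edit timestamp.
function tgMemoryId(msg: TgMessageLike): string {
  const base = `tg:${msg.chat.id}:${msg.message_id}`;
  return msg.edit_date !== undefined ? `${base}:edit:${msg.edit_date}` : base;
}

const original: TgMessageLike = { chat: { id: -100123 }, message_id: 42, message_thread_id: 7 };
const edited: TgMessageLike = { ...original, edit_date: 1700000000 };

const originalId = tgMemoryId(original); // "tg:-100123:42"
const editedId = tgMemoryId(edited); // "tg:-100123:42:edit:1700000000"
const topicSession = tgSessionId(original, "chat_topic"); // "tg:-100123:7"
```

Because the ID depends only on fields of the update itself, ingesting the same update twice yields the same ID, so a uniqueness constraint on that ID is enough to reject the repeat.
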

The `alma-tail` command tails Alma log files under `~/.config/alma/chats` and `~/.config/alma/groups` and ingests new lines incrementally.

    node dist/cli.js alma-tail \
      --allowlist <CHAT_ID_1>,<CHAT_ID_2> \
      --session-mode chat_date \
      --state-file ./alma-tail.state.json

Session mapping:

- `chat` → `alma:<chatId>`
- `chat_date` → `alma:<chatId>:<YYYY-MM-DD>`
- `chat_msg` → `alma:<chatId>:<YYYY-MM-DD>:<msgId>`

Deterministic IDs:

- `alma:<chatId>:<YYYY-MM-DD>:<msgId>:<who>` (`who` is `alma` or `user`)
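
The three Alma session modes and the ID format can be sketched in the same way. The `AlmaLogEntry` shape and both helper names are hypothetical; the actual log line format is not described here, so the sketch starts from already-parsed fields.

```typescript
type AlmaSessionMode = "chat" | "chat_date" | "chat_msg";
type AlmaSpeaker = "alma" | "user";

// Hypothetical parsed log entry; the raw log format is not specified in this README.
interface AlmaLogEntry {
  chatId: string;
  date: string; // YYYY-MM-DD
  msgId: string;
  who: AlmaSpeaker;
}

// Each mode adds one more component to the session id.
function almaSessionId(entry: AlmaLogEntry, mode: AlmaSessionMode): string {
  if (mode === "chat") {
    return `alma:${entry.chatId}`;
  }
  if (mode === "chat_date") {
    return `alma:${entry.chatId}:${entry.date}`;
  }
  return `alma:${entry.chatId}:${entry.date}:${entry.msgId}`;
}

// The memory id always includes every component plus the speaker.
function almaMemoryId(entry: AlmaLogEntry): string {
  return `alma:${entry.chatId}:${entry.date}:${entry.msgId}:${entry.who}`;
}

const sampleEntry: AlmaLogEntry = { chatId: "12345", date: "2024-01-15", msgId: "9", who: "user" };
const sampleSession = almaSessionId(sampleEntry, "chat_date"); // "alma:12345:2024-01-15"
const sampleId = almaMemoryId(sampleEntry); // "alma:12345:2024-01-15:9:user"
```

Including `who` in the ID keeps a user message and an Alma reply that share the same `msgId` from colliding.
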

Project layout:

    alma-lossless-memory/
    ├── src/
    │   ├── db/
    │   │   ├── database.ts      # SQLite connection + migrations
    │   │   ├── migrations.ts    # Migration runner
    │   │   └── schema.ts        # Schema SQL (v1-v4)
    │   ├── memory/
    │   │   ├── types.ts         # TypeScript interfaces
    │   │   ├── tokenizer.ts     # Token estimation
    │   │   └── store.ts         # CRUD + FTS5 search
    │   ├── dag/
    │   │   └── summarizer.ts    # Incremental DAG summary builder
    │   ├── context/
    │   │   └── assembler.ts     # Context window assembler (de-dup + debug)
    │   ├── fts/
    │   │   └── sanitizer.ts     # FTS5 query sanitizer
    │   ├── config.ts            # JSON config loader
    │   ├── cli.ts               # Commander CLI
    │   └── index.ts             # Public API exports
    ├── tests/
    │   ├── memory.test.ts
    │   ├── dag.test.ts
    │   ├── context.test.ts
    │   └── tokenizer.test.ts
    └── .github/
        └── workflows/
            └── ci.yml           # GitHub Actions CI (Node 20 & 22)

Development:

    npm ci          # install dependencies
    npm run build   # compile TypeScript
    npm test        # run Jest suite
    npm run lint    # type-check only

License: MIT