Batch/throttle graph extraction instead of triggering an LLM call on every Stop hook

`mem::graph-extract` (and the LLM-backed `provider.compress()` call it makes) is wired only to `event::session::stopped`, which the Claude Code plugin fires via the `Stop` hook at the end of *every assistant turn*, not just when a logical session actually ends. With `GRAPH_EXTRACTION_ENABLED=true`, this means an Anthropic API call fires every few minutes during active use, even though the graph merge itself is incremental and cheap.

Could graph extraction support a batched or throttled mode so users don't pay for one LLM call per turn? For example, it could use a debounce/interval (only extract if N minutes have passed since the last extraction for a session), or an opt-in nightly cron-style batch via the existing `/agentmemory/graph/build` endpoint. Right now the only way to avoid the per-turn cost is to disable graph extraction entirely, which seems like an unnecessarily blunt tradeoff between cost and graph coverage.
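To make the debounce/interval idea concrete, here is a minimal sketch of a per-session throttle. It is not taken from the agentmemory codebase: `ExtractionThrottle`, `shouldExtract`, and the injectable clock are all hypothetical names, and where the check would actually sit relative to the `event::session::stopped` handler is an assumption.

```typescript
// Hypothetical sketch: none of these names exist in agentmemory.
type Clock = () => number;

class ExtractionThrottle {
  // Timestamp (ms) of the last extraction that was allowed, per session.
  private readonly lastRun = new Map<string, number>();
  private readonly minIntervalMs: number;
  private readonly now: Clock;

  constructor(minIntervalMs: number, now: Clock = () => Date.now()) {
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Returns true and records the time if extraction is due for this session;
  // returns false if the last allowed extraction was too recent.
  shouldExtract(sessionId: string): boolean {
    const current = this.now();
    const last = this.lastRun.get(sessionId);
    if (last !== undefined && current - last < this.minIntervalMs) {
      return false;
    }
    this.lastRun.set(sessionId, current);
    return true;
  }
}

// Assumed usage inside a Stop-hook handler (handler shape is illustrative):
//   if (throttle.shouldExtract(sessionId)) { /* run graph extraction */ }
const throttle = new ExtractionThrottle(10 * 60_000); // at most once per 10 min
```

One caveat with this approach: turns that arrive inside the interval are skipped, so the last few turns of a session would still need a final flush or a later batch pass to reach the graph.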

I also noticed that `GRAPH_EXTRACTION_BATCH_SIZE` already exists in `config.ts` but isn't read anywhere in the codebase. It might be a leftover hook for exactly this kind of batching.


Batch/throttle graph extraction instead of triggering an LLM call on every Stop hook #978
