feat(hermes): add on_pre_compress hook to preserve context before compression #870

Open
pyrate-llama wants to merge 1 commit into vectorize-io:main from pyrate-llama:feat/hermes-on-pre-compress

@pyrate-llama

Summary

  • Adds an on_pre_compress lifecycle hook to the Hermes integration that fires before context window compression
  • Captures the last 10 user/assistant messages and persists them to the Hindsight knowledge graph via retain()
  • Runs the retain call in a background thread so it never blocks the compression pipeline
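The flow in the bullets above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the message shape (dicts with `role` and `content` keys), the `select_recent_messages` helper, and the `retain` callable taking a single string are all assumptions made for the example, and the real Hermes hook signature and Hindsight `retain()` parameters may differ.

```python
import threading
from typing import Callable, Dict, List, Optional

MAX_MESSAGES = 10  # "last 10 user/assistant messages", per the summary


def select_recent_messages(
    messages: List[Dict[str, str]], limit: int = MAX_MESSAGES
) -> List[Dict[str, str]]:
    """Return the last `limit` user/assistant messages, in original order."""
    relevant = [m for m in messages if m.get("role") in ("user", "assistant")]
    return relevant[-limit:]


def on_pre_compress(
    messages: List[Dict[str, str]],
    retain: Callable[[str], object],  # hypothetical stand-in for Hindsight's retain()
) -> Optional[threading.Thread]:
    """Hand the recent messages to `retain` on a background thread.

    Returns the started thread (or None if there was nothing to retain)
    so that callers and tests can join it; the caller is never blocked.
    """
    recent = select_recent_messages(messages)
    if not recent:
        return None
    content = "\n".join(f"{m['role']}: {m.get('content', '')}" for m in recent)

    def _worker() -> None:
        try:
            retain(content)
        except Exception:
            # A failed retain must not propagate into the compression pipeline.
            pass

    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    return thread
```

Using a daemon thread means a slow or hung retain call cannot keep the process alive at shutdown. The trade-off is that a retain still in flight when the process exits may be dropped.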

Motivation

When Hermes compresses its context window to save tokens, the original messages are summarized and the full text is discarded, so any detail the summary omits is lost. By hooking into the compression lifecycle, we can preserve those messages in Hindsight's knowledge graph, where they remain searchable via recall() and reflect().

This was inspired by ByteRover's on_pre_compress implementation, adapted to use Hindsight's synchronous retain() API.

Details

  • Messages are truncated to 500 chars each to keep the retain payload reasonable
  • Tagged with [Pre-compression context] prefix for easy identification
  • Requires hermes-agent >= 0.7.0, which supports the on_pre_compress hook
  • On older hermes-agent versions the hook is never called, so registering it is safe
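As a rough sketch of the truncation and tagging described above, assuming messages are dicts with `role` and `content` keys and that the prefix is applied once per payload (the PR text does not say whether it is per payload or per message). The `format_for_retain` name is hypothetical and is not taken from the PR.

```python
from typing import Dict, List

MAX_CHARS = 500                    # per-message cap stated in the PR
TAG = "[Pre-compression context]"  # prefix stated in the PR


def format_for_retain(messages: List[Dict[str, str]]) -> str:
    """Build one retain payload: the tag line, then one truncated line per message."""
    lines = [TAG]
    for m in messages:
        # Cut each message body to MAX_CHARS so the payload stays bounded.
        text = str(m.get("content", ""))[:MAX_CHARS]
        lines.append(f"{m.get('role', 'unknown')}: {text}")
    return "\n".join(lines)
```

With at most 10 messages of 500 characters each, the payload stays near 5,000 characters plus the role labels and tag line.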

Test plan

  • Tested locally with Hermes v0.7.0 and Hindsight in local/embedded mode
  • Verified pre-compression messages appear in knowledge graph after context compression
  • Confirmed background thread does not block the compression pipeline

…pression

When Hermes compresses its context window to save tokens, the original
messages are summarized and the full text is discarded. This adds an
on_pre_compress lifecycle hook that fires just before compression,
capturing the last 10 user/assistant messages and persisting them to
the Hindsight knowledge graph via retain().

The retain call runs in a background thread so it never blocks the
compression pipeline. Messages are truncated to 500 chars each and
tagged with [Pre-compression context] for easy identification.

Requires hermes-agent >= 0.7.0 which supports the on_pre_compress hook.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>