feat: multi-signal checkpoint linkage for resilience across git rewrites #840
peyton-alt wants to merge 12 commits into main
Pull request overview
Stores a commit’s git tree hash in checkpoint metadata during condensation so the web side can re-link checkpoints after history rewrites that drop the Entire-Checkpoint trailer, and tweaks prepare-commit-msg behavior so agent-driven revert/cherry-pick operations still get checkpoint trailers when a session is active.
Changes:
- Add `tree_hash` to committed checkpoint metadata (`CommittedMetadata`, `WriteCommittedOptions`) and populate it during PostCommit condensation from the HEAD commit’s tree hash.
- Allow prepare-commit-msg to proceed during git sequence operations (revert/cherry-pick/rebase) when an ACTIVE session exists in the current worktree; keep skipping when no active session.
- Add unit tests for `tree_hash` persistence and the new revert trailer behavior split by agent-active vs user/manual.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| cmd/entire/cli/strategy/manual_commit_test.go | Adds tests ensuring agent revert gets a trailer when session is ACTIVE and user revert is skipped when not active. |
| cmd/entire/cli/strategy/manual_commit_session.go | Adds helper to detect whether any session in the current worktree is ACTIVE. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Adjusts PrepareCommitMsg sequence-operation skip logic; passes commit tree hash into condensation options. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Threads treeHash through condensation options into committed checkpoint write options. |
| cmd/entire/cli/checkpoint/committed.go | Writes TreeHash into per-session committed metadata.json. |
| cmd/entire/cli/checkpoint/checkpoint.go | Extends checkpoint option/metadata structs to include TreeHash serialized as tree_hash. |
| cmd/entire/cli/checkpoint/checkpoint_test.go | Adds coverage that tree_hash is written and read back from committed metadata. |
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7b9d87d to 8302847
182d479 to e70d0eb
When an agent runs git revert or cherry-pick as part of its work, the commit should be checkpointed. Previously prepare-commit-msg unconditionally skipped during sequence operations, making the agent's work invisible to Entire. Now checks for active sessions: if an agent session is ACTIVE, the operation is agent-initiated and gets a trailer. If no active session, it's user-initiated and is skipped as before. Part of fix for #834. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 85df9ac94bc7
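The rule in this commit message reduces to a two-input decision. A minimal sketch, assuming a hypothetical helper named `skipTrailer` (the PR's real logic lives in PrepareCommitMsg and is not reproduced here):

```go
package main

import "fmt"

// skipTrailer is a hypothetical helper, not the PR's actual code. It reports
// whether prepare-commit-msg should skip adding a checkpoint trailer.
//
// inSequenceOp: git is in the middle of a revert, cherry-pick, or rebase.
// hasActiveSession: at least one session in the current worktree is ACTIVE.
func skipTrailer(inSequenceOp, hasActiveSession bool) bool {
	// Only sequence operations without an active agent session are skipped;
	// those are treated as user-initiated.
	return inSequenceOp && !hasActiveSession
}

func main() {
	fmt.Println("agent revert skipped:", skipTrailer(true, true))
	fmt.Println("user revert skipped:", skipTrailer(true, false))
}
```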
Add tree_hash field to committed checkpoint metadata. Records the git tree hash of the commit being condensed, enabling fallback checkpoint lookup by tree hash when the Entire-Checkpoint trailer is stripped by git history rewrites (rebase, filter-branch, amend). Part of fix for #834. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 77773a25069e
- Add debug logging to hasActiveSessionInWorktree error paths
- Remove unrelated files (greetings.md, agent configs) from PR

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: bacd9b68b1c0
Define content-based linkage signals (tree_hash, patch_id, files_changed_hash, session_files_hash) for re-linking checkpoints after git history rewrites. Stored at checkpoint level, not per-session. Part of fix for #834. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 4661e8c50610
Needed by linkage signal tests that verify patch ID stability across rebase. Part of fix for #834. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 4129c08c80b0
ComputePatchID: git patch-id of the commit diff, survives rebase. ComputeFilesChangedHash: SHA256 of sorted file:blob pairs, survives rebase even with conflicts in non-agent files. Uses single git ls-tree call for all files (O(1) subprocess). Part of fix for #834. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: c100b592e3a0
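The "SHA256 of sorted file:blob pairs" idea can be sketched as a pure function. This is an illustration only: the function name `hashFileBlobPairs`, the `path:blob\n` serialization, and the map input are assumptions, and the real `ComputeFilesChangedHash` obtains blob hashes from `git ls-tree`, which is omitted here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// hashFileBlobPairs is a hypothetical stand-in for the hashing step.
// blobs maps a file path to its git blob hash.
func hashFileBlobPairs(blobs map[string]string) string {
	// Sort paths so the digest does not depend on map iteration order.
	paths := make([]string, 0, len(blobs))
	for p := range blobs {
		paths = append(paths, p)
	}
	sort.Strings(paths)

	// Assumed serialization: one "path:blob" pair per line.
	var b strings.Builder
	for _, p := range paths {
		b.WriteString(p)
		b.WriteByte(':')
		b.WriteString(blobs[p])
		b.WriteByte('\n')
	}

	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	h := hashFileBlobPairs(map[string]string{
		"cmd/main.go": "1111111111111111111111111111111111111111",
		"README.md":   "2222222222222222222222222222222222222222",
	})
	fmt.Println(h)
}
```

Because only the touched files and their blob contents feed the digest, a rebase that changes other files leaves the value unchanged, which is the property the commit message relies on.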
Replace per-session TreeHash with checkpoint-level LinkageMetadata containing tree_hash, patch_id, files_changed_hash, and session_files_hash. Computed in PostCommit handlers, passed through condenseOpts to CondenseSession, written to CheckpointSummary on entire/checkpoints/v1. Part of fix for #834. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 75ce05cfe11b
Verify LinkageMetadata is stored in CheckpointSummary and readable. Also verify nil linkage is omitted (backward compat with old checkpoints). Part of fix for #834. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 10dae87903d7
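A minimal sketch of why a nil linkage disappears from the JSON, assuming a pointer field with `omitempty`. The Go field names, the `checkpoint_id` field, and the `encode` helper are illustrative assumptions; only the `linkage` key and the four signal names come from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LinkageMetadata mirrors the four signals named in this PR.
// Go field names are assumed for illustration.
type LinkageMetadata struct {
	TreeHash         string `json:"tree_hash,omitempty"`
	PatchID          string `json:"patch_id,omitempty"`
	FilesChangedHash string `json:"files_changed_hash,omitempty"`
	SessionFilesHash string `json:"session_files_hash,omitempty"`
}

// CheckpointSummary is reduced to two fields here; checkpoint_id is hypothetical.
type CheckpointSummary struct {
	CheckpointID string           `json:"checkpoint_id"`
	Linkage      *LinkageMetadata `json:"linkage,omitempty"`
}

// encode marshals a summary to a JSON string.
func encode(s CheckpointSummary) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Old-style checkpoint: no linkage key is emitted at all.
	fmt.Println(encode(CheckpointSummary{CheckpointID: "abc123"}))
	// New-style checkpoint: only populated signals are emitted.
	fmt.Println(encode(CheckpointSummary{
		CheckpointID: "abc123",
		Linkage:      &LinkageMetadata{TreeHash: "t1"},
	}))
}
```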
gofmt stripped nolint directives from capabilities.go. Restore from main. Add encoding/hex import for ComputeFilesChangedHash. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: c26d3dce1d32
- Restore nolint:ireturn on capabilities.go (gofmt stripped them)
- Set user.name/email in gitops initTestRepo for CI compatibility (git rebase fails without repo-level config on CI runners)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: f1ca53b63c79
- Add omitempty to all LinkageMetadata JSON tags for consistency
- Return error for malformed git ls-tree lines instead of silent skip
- Compute commit-level linkage once (not per-session) via baseLinkage cache; only SessionFilesHash varies per session
- Add code comment explaining deferred condensation for agent reverts
- Add integration test verifying full linkage pipeline (PostCommit → condensation → ReadCommitted with all four signals populated)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: e3539c5bfa31
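The "compute once, vary per session" idea can be sketched as a lazily filled cache. The `linkageCache` type and its method are hypothetical and only illustrate the pattern; the PR's own helpers are not shown here.

```go
package main

import "fmt"

// LinkageMetadata carries the four signals; Go field names are assumed.
type LinkageMetadata struct {
	TreeHash         string
	PatchID          string
	FilesChangedHash string
	SessionFilesHash string
}

// linkageCache is a hypothetical illustration of a per-commit cache.
type linkageCache struct {
	compute func() LinkageMetadata // expensive in practice (git subprocesses)
	base    *LinkageMetadata       // nil until the first session asks
}

// forSession returns the commit-level signals plus one session-specific hash.
func (c *linkageCache) forSession(sessionFilesHash string) LinkageMetadata {
	if c.base == nil {
		b := c.compute()
		c.base = &b
	}
	l := *c.base // copy, so sessions never share mutable state
	l.SessionFilesHash = sessionFilesHash
	return l
}

func main() {
	calls := 0
	cache := &linkageCache{compute: func() LinkageMetadata {
		calls++
		return LinkageMetadata{TreeHash: "t1", PatchID: "p1", FilesChangedHash: "f1"}
	}}
	first := cache.forSession("s1")
	second := cache.forSession("s2")
	fmt.Println(first, second)
	fmt.Println("compute calls:", calls)
}
```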
- Replace unreachable Fields/len guard with strings.Cut in ComputePatchID
- Use logCtx variable in linkageForSession for logging consistency
- Use strings.TrimSpace in revParse test helper instead of raw byte slice

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 91b06c4aabdb
0dfcba0 to 703bd67

Summary
Store multiple content-based linkage signals in checkpoint metadata so the web can automatically re-link checkpoints after git history rewrites (rebase, reword, amend, filter-branch) without user intervention.
- Add `linkage` block to `CheckpointSummary` on `entire/checkpoints/v1` with four signals: `tree_hash`, `patch_id`, `files_changed_hash`, `session_files_hash`
- Remove `TreeHash` from `CommittedMetadata` (superseded by checkpoint-level `linkage`)

Problem (#834)
When users rewrite git history (`git rebase`, `git rebase -i` reword, `git filter-branch`), the `Entire-Checkpoint` trailer can be stripped from the commit message. The checkpoint data still exists on `entire/checkpoints/v1`, but the rewritten commit no longer points to it.

Tree hash alone doesn't survive rebase: rebasing onto a new base changes the full tree snapshot even if the feature's own changes are identical.
How multi-signal linkage fixes this
Each signal captures a different aspect of the commit's identity. The web uses a fallback chain:
- `tree_hash`
- `patch_id` (`git patch-id --stable`)
- `files_changed_hash`
- `session_files_hash`

The only case with no match is when the agent's actual code was modified (e.g., conflict resolution in agent-touched files), which is semantically correct.
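A fallback chain over the four signals might look like the following sketch. It is written in Go for consistency with the CLI; the web-side implementation is not part of this PR, and the `relink` function, the `storedCheckpoint` type, and the rule of skipping empty signals are all assumptions.

```go
package main

import "fmt"

// LinkageMetadata carries the four signals; Go field names are assumed.
type LinkageMetadata struct {
	TreeHash         string
	PatchID          string
	FilesChangedHash string
	SessionFilesHash string
}

// storedCheckpoint is a hypothetical record of a checkpoint and its linkage.
type storedCheckpoint struct {
	ID      string
	Linkage LinkageMetadata
}

// relink tries each signal in priority order and returns the first checkpoint
// whose stored value equals the rewritten commit's value for that signal.
func relink(commit LinkageMetadata, stored []storedCheckpoint) (id, matchedBy string, ok bool) {
	signals := []struct {
		name string
		get  func(LinkageMetadata) string
	}{
		{"tree_hash", func(l LinkageMetadata) string { return l.TreeHash }},
		{"patch_id", func(l LinkageMetadata) string { return l.PatchID }},
		{"files_changed_hash", func(l LinkageMetadata) string { return l.FilesChangedHash }},
		{"session_files_hash", func(l LinkageMetadata) string { return l.SessionFilesHash }},
	}
	for _, s := range signals {
		want := s.get(commit)
		if want == "" {
			continue // an unset signal can never match
		}
		for _, cp := range stored {
			if s.get(cp.Linkage) == want {
				return cp.ID, s.name, true
			}
		}
	}
	return "", "", false
}

func main() {
	stored := []storedCheckpoint{{ID: "cp1", Linkage: LinkageMetadata{TreeHash: "t-old", PatchID: "p1"}}}
	// After a rebase the tree hash differs but the patch ID is unchanged.
	id, by, ok := relink(LinkageMetadata{TreeHash: "t-new", PatchID: "p1"}, stored)
	fmt.Println(id, by, ok)
}
```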
What's in this PR (CLI-side)
- `LinkageMetadata` struct with four fixed-size hash fields (all `omitempty` for clean JSON)
- `ComputePatchID`: pipes `git diff-tree -p` through `git patch-id --stable`
- `ComputeFilesChangedHash`: single `git ls-tree` call, SHA256 of sorted file:blob pairs
- Commit-level linkage computed via `computeBaseLinkage()` (cached), with `SessionFilesHash` added per-session via `linkageForSession()`
- Linkage written to `CheckpointSummary` on `entire/checkpoints/v1`
- Agent-driven `git revert`/`cherry-pick` gets a checkpoint trailer when a session is ACTIVE (condensation deferred to next normal commit)

What's needed on the web side
- Add `patch_id`, `files_changed_hash`, `session_files_hash` columns to `repo_checkpoints`

Testing
Automated
- `TestComputePatchID` / `TestComputePatchID_StableAcrossRebase` / `TestComputePatchID_InitialCommit`
- `TestComputeFilesChangedHash` / `TestComputeFilesChangedHash_StableAcrossRebase`
- `TestWriteCommitted_IncludesLinkage` / `TestWriteCommitted_NilLinkageOmitted`
- `TestShadowStrategy_PostCommit_LinkagePopulated`: full pipeline integration test
- `TestShadowStrategy_PrepareCommitMsg_AgentRevertGetsTrailer` / `_UserRevertSkipped`

Manual verification (local binary)
Tested with a real Claude Code session against a local repo:
- `entire/checkpoints/v1` metadata contains the full linkage block: `{ "linkage": { "tree_hash": "d82e8f38e91d7e1efca4993ff7c4023313a55292", "patch_id": "e00217d29a078e2bd1e2d16e289a9cdc78c41df7", "files_changed_hash": "4c22e14d5ce80bdd2f..." } }`
- Patch ID survives rebase: created a feature branch, committed, rebased onto main with new commits. Patch ID `7a94b75a...` was identical before and after rebase, while tree hash changed (expected).
- Tree hash survives reword: `git commit --amend -m "new msg"` preserved the tree hash exactly.

Test plan
- Check `metadata.json` on `entire/checkpoints/v1` after a commit

🤖 Generated with Claude Code