Entire-Checkpoint: 209a37190167
Pull request overview
Adds an initial `entire migrate` CLI command intended to migrate v1 checkpoints to the v2 checkpoint ref/layout for testing and rollout prep.
Changes:
- Registers a new `migrate` subcommand on the root CLI.
- Introduces v1→v2 checkpoint migration logic, including transcript compaction and attempted task-metadata tree copying.
- Adds unit tests covering basic migration flows and idempotency.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| cmd/entire/cli/root.go | Registers the new migrate command in the root CLI. |
| cmd/entire/cli/migrate.go | Implements v1→v2 migration logic, transcript compaction, and task metadata tree splicing. |
| cmd/entire/cli/migrate_test.go | Adds tests for migration behavior (basic/idempotent/multi-session/flag validation). |
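The idempotency mentioned in the change list can be illustrated with a minimal sketch. It assumes an in-memory map in place of the git-backed stores, and every name in it (`checkpointStore`, `migrateResult`, `migrateAll`) is hypothetical rather than taken from the PR:

```go
package main

// checkpointStore is a hypothetical stand-in for a checkpoint store.
// The stores in the PR are backed by git refs, not an in-memory map.
type checkpointStore map[string][]byte

// migrateResult holds per-run counters (field names are assumptions).
type migrateResult struct {
	migrated int
	skipped  int
}

// migrateAll copies every v1 checkpoint that is not yet present in v2.
// Running it a second time skips everything that was already copied.
func migrateAll(v1, v2 checkpointStore) migrateResult {
	var res migrateResult
	for id, data := range v1 {
		if _, ok := v2[id]; ok {
			res.skipped++ // already migrated, leave the v2 copy untouched
			continue
		}
		// Copy the bytes so the two stores do not share a backing array.
		v2[id] = append([]byte(nil), data...)
		res.migrated++
	}
	return res
}
```

Calling `migrateAll` twice with the same stores should report all checkpoints as migrated on the first run and all as skipped on the second, which is the property the idempotency tests are described as covering.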
Entire-Checkpoint: c9595c52ab4a
Entire-Checkpoint: 9f07aeebbf93
Entire-Checkpoint: f1c37c8efc47
Entire-Checkpoint: 36db97269a69
Entire-Checkpoint: 93066e1dac3c
Entire-Checkpoint: 4fdb72622b7f
bugbot run
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Root-level tasks overwritten by per-session tasks splice
  `copyTaskMetadataToV2` now merges root-level and latest-session task trees before splicing so both task sets are preserved instead of one overwriting the other.
Or push these changes by commenting:
@cursor push 033e8c1ad0
Preview (033e8c1ad0)
diff --git a/cmd/entire/cli/migrate.go b/cmd/entire/cli/migrate.go
--- a/cmd/entire/cli/migrate.go
+++ b/cmd/entire/cli/migrate.go
@@ -359,14 +359,18 @@
 		return err
 	}
+	latestSessionIdx := -1
+	if len(summary.Sessions) > 0 {
+		latestSessionIdx = len(summary.Sessions) - 1
+	}
+
 	// Legacy v1 layout stores task metadata at checkpoint root: <cp>/tasks/<tool-use-id>/...
-	// Prefer attaching this tree to the latest session in v2.
-	if rootTasksTree, rootTasksErr := v1Tree.Tree("tasks"); rootTasksErr == nil {
-		if len(summary.Sessions) > 0 {
-			latestSessionIdx := len(summary.Sessions) - 1
-			if spliceErr := spliceTasksTreeToV2(repo, v2Store, cpID, latestSessionIdx, rootTasksTree.Hash); spliceErr != nil {
-				return fmt.Errorf("latest session task tree splice failed: %w", spliceErr)
-			}
+	// Attach this to the latest session in v2, and merge with that session's own tasks if present.
+	var rootTasksTree *object.Tree
+	rootTasksSpliced := false
+	if latestSessionIdx >= 0 {
+		if tasksTree, rootTasksErr := v1Tree.Tree("tasks"); rootTasksErr == nil {
+			rootTasksTree = tasksTree
 		}
 	}
@@ -382,11 +386,33 @@
 			continue // No tasks directory in this session
 		}
-		if spliceErr := spliceTasksTreeToV2(repo, v2Store, cpID, sessionIdx, tasksTree.Hash); spliceErr != nil {
+		tasksTreeHash := tasksTree.Hash
+		if rootTasksTree != nil && sessionIdx == latestSessionIdx {
+			mergedTasksTreeHash, mergeErr := checkpoint.UpdateSubtree(
+				repo,
+				rootTasksTree.Hash,
+				nil,
+				tasksTree.Entries,
+				checkpoint.UpdateSubtreeOptions{MergeMode: checkpoint.MergeKeepExisting},
+			)
+			if mergeErr != nil {
+				return fmt.Errorf("failed to merge root and session task trees for session %d: %w", sessionIdx, mergeErr)
+			}
+			tasksTreeHash = mergedTasksTreeHash
+			rootTasksSpliced = true
+		}
+
+		if spliceErr := spliceTasksTreeToV2(repo, v2Store, cpID, sessionIdx, tasksTreeHash); spliceErr != nil {
 			return fmt.Errorf("session %d task tree splice failed: %w", sessionIdx, spliceErr)
 		}
 	}
+	if rootTasksTree != nil && !rootTasksSpliced {
+		if spliceErr := spliceTasksTreeToV2(repo, v2Store, cpID, latestSessionIdx, rootTasksTree.Hash); spliceErr != nil {
+			return fmt.Errorf("latest session task tree splice failed: %w", spliceErr)
+		}
+	}
+
 	return nil
 }
diff --git a/cmd/entire/cli/migrate_test.go b/cmd/entire/cli/migrate_test.go
--- a/cmd/entire/cli/migrate_test.go
+++ b/cmd/entire/cli/migrate_test.go
@@ -228,6 +228,52 @@
 	require.NoError(t, taskFileErr, "expected migrated task checkpoint metadata in /full/current")
 }
+func TestMigrateCheckpointsV2_TaskMetadataMergesRootAndSessionTasks(t *testing.T) {
+	t.Parallel()
+	repo := initMigrateTestRepo(t)
+	v1Store, v2Store := newMigrateStores(repo)
+
+	cpID := id.MustCheckpointID("c1d2e3f4a5b6")
+
+	metadataDir := t.TempDir()
+	sessionTaskFile := filepath.Join(metadataDir, "tasks", "toolu_01SESSION", "checkpoint.json")
+	require.NoError(t, os.MkdirAll(filepath.Dir(sessionTaskFile), 0o755))
+	require.NoError(t, os.WriteFile(sessionTaskFile, []byte(`{"source":"session"}`), 0o644))
+
+	// Write one v1 task checkpoint that has both:
+	// 1) root-level task metadata (legacy layout, from IsTask/ToolUseID)
+	// 2) session-level task metadata (from MetadataDir copy into session subtree)
+	err := v1Store.WriteCommitted(context.Background(), checkpoint.WriteCommittedOptions{
+		CheckpointID: cpID,
+		SessionID: "session-task-merge-001",
+		Strategy: "manual-commit",
+		Transcript: []byte("{\"type\":\"assistant\",\"message\":\"task merge\"}\n"),
+		Prompts: []string{"task merge prompt"},
+		IsTask: true,
+		ToolUseID: "toolu_01ROOT",
+		MetadataDir: metadataDir,
+		AuthorName: "Test",
+		AuthorEmail: "test@test.com",
+	})
+	require.NoError(t, err)
+
+	var stdout bytes.Buffer
+	result, migrateErr := migrateCheckpointsV2(context.Background(), repo, v1Store, v2Store, &stdout)
+	require.NoError(t, migrateErr)
+	assert.Equal(t, 1, result.migrated)
+
+	_, rootTreeHash, refErr := v2Store.GetRefState(plumbing.ReferenceName(paths.V2FullCurrentRefName))
+	require.NoError(t, refErr)
+	rootTree, treeErr := repo.TreeObject(rootTreeHash)
+	require.NoError(t, treeErr)
+
+	// Both root-level and per-session tasks must exist after migration.
+	_, rootTaskErr := rootTree.File(cpID.Path() + "/0/tasks/toolu_01ROOT/checkpoint.json")
+	require.NoError(t, rootTaskErr, "expected root-level task metadata in /full/current")
+	_, sessionTaskErr := rootTree.File(cpID.Path() + "/0/tasks/toolu_01SESSION/checkpoint.json")
+	require.NoError(t, sessionTaskErr, "expected session-level task metadata in /full/current")
+}
+
 func TestMigrateCheckpointsV2_AllSkippedOnRerun(t *testing.T) {
 	t.Parallel()
 	repo := initMigrateTestRepo(t)
Reviewed by Cursor Bugbot for commit d7e367f.
cmd/entire/cli/migrate.go
Outdated
	// Already in v2 — check if any sessions are missing transcript.jsonl and backfill
	if existing != nil {
		return backfillCompactTranscripts(ctx, v1Store, v2Store, info, existing, out, prefix)
I'm wondering if we should validate the contents of the v2 checkpoint more thoroughly. Since we can't write /main and /full/current transactionally, someone could interrupt the migrate command and restart it later, and we'd then assume that a partially migrated checkpoint was fully migrated.
Would it be useful to do some spot checks, such as checking the number of sessions or the presence of the full.jsonl transcript, before attempting to backfill the compact transcript?
Pushed a change that looks for any missing components, not just transcript.jsonl, which I think addresses this case. Thank you for calling it out; previously it would only have checked for and regenerated a missing transcript.jsonl.
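The kind of spot check discussed in this thread could look roughly like the sketch below. It is not the change that was pushed: the `v2Session` type, its fields, and `missingComponents` are all hypothetical names, and the real check would read from the v2 refs rather than from booleans.

```go
package main

import "fmt"

// v2Session is a hypothetical, simplified view of one session as found in v2.
// The field names are assumptions for illustration, not the PR's types.
type v2Session struct {
	HasFullTranscript    bool // full.jsonl was found for this session
	HasCompactTranscript bool // transcript.jsonl was found for this session
}

// missingComponents compares the session count recorded in v1 with what is
// present in v2 and lists every gap, so that an interrupted run is not
// mistaken for a completed one.
func missingComponents(v1SessionCount int, v2Sessions []v2Session) []string {
	var missing []string
	if len(v2Sessions) < v1SessionCount {
		missing = append(missing,
			fmt.Sprintf("sessions: have %d, want %d", len(v2Sessions), v1SessionCount))
	}
	for i, s := range v2Sessions {
		if !s.HasFullTranscript {
			missing = append(missing, fmt.Sprintf("session %d: full.jsonl", i))
		}
		if !s.HasCompactTranscript {
			missing = append(missing, fmt.Sprintf("session %d: transcript.jsonl", i))
		}
	}
	return missing
}
```

An empty result would mean the checkpoint can be skipped. A non-empty one would trigger repair of just the listed pieces. Because the PR description notes that some transcript.jsonl files cannot be created, a real check would probably need to treat that component as best-effort.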
Entire-Checkpoint: 51d95c3209d7


Adds a (hidden) migration CLI command that allows a `checkpoints` parameter to be passed in to migrate from `"v1"` to `"v2"`. I think this will be automated for end users when we are ready to green light v2, but for now, it's handy for testing.

Follows the `migrate` command validation for a test repo as written in https://github.com/entireio/cli/pull/839/changes#diff-f8101f182954049d140980fd56caf1e09e5c85a771f21c972d986ce2229d7e6eR439.

Validated that for checkpoints with a transcript.jsonl in v1, the full.jsonl + transcript.jsonl files were created in the proper places for v2.

Rerunning the command skips checkpoints that are already migrated + shows which transcript.jsonl files couldn't be created.
Inspecting an example transcript.jsonl file:
Note
Medium Risk
Introduces new migration logic that writes to v2 checkpoint refs and performs git tree/commit surgery, which could affect checkpoint data integrity if bugs exist; command is hidden and guarded by an explicit flag, limiting user impact.
Overview
Adds a hidden `entire migrate --checkpoints v2` command to bulk-migrate committed checkpoints from v1 storage into the v2 refs.

Migration iterates v1 checkpoints, writes each session into v2 (optionally generating `transcript.jsonl` via compaction), and is idempotent by skipping already-migrated checkpoints while backfilling missing compact transcripts when possible. For task checkpoints, it also copies task metadata trees into v2 `/full/current` via subtree updates and commits.

Separately standardizes prompt serialization by introducing `PromptSeparator`, `JoinPrompts`, and `SplitPromptContent`, switching existing v1/v2 checkpoint writers to use the shared join helper and adding focused tests for prompt round-tripping and migration behavior.
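To illustrate what prompt round-tripping means here, a minimal sketch follows. The separator value and the lowercase helper names are assumptions chosen for the example; the actual `PromptSeparator` value and the signatures of `JoinPrompts` and `SplitPromptContent` are not shown in this conversation.

```go
package main

import "strings"

// promptSeparator is an assumed delimiter used only for this example.
// The value of the PR's PromptSeparator is not shown on this page.
const promptSeparator = "\n\n---\n\n"

// joinPrompts serializes a list of prompts into a single string.
func joinPrompts(prompts []string) string {
	return strings.Join(prompts, promptSeparator)
}

// splitPromptContent reverses joinPrompts. Empty content yields no prompts
// rather than a single empty prompt.
func splitPromptContent(content string) []string {
	if content == "" {
		return nil
	}
	return strings.Split(content, promptSeparator)
}
```

Sharing one join helper and one split helper keeps writers and readers agreed on the delimiter. Note that in this sketch a prompt that itself contains the separator would not round-trip; whether the PR handles that case is not stated here.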