Checkpoints V2: add migration option#855

Open
computermode wants to merge 13 commits into main from add-migrate-v2-command

Conversation

@computermode
Contributor

@computermode computermode commented Apr 4, 2026

Adds a hidden migration CLI command that takes a checkpoints parameter for migrating from "v1" to "v2". I expect this to be automated for end users once we are ready to green-light v2, but for now it's handy for testing.

Follows the migrate command validation for a test repo as written in https://github.com/entireio/cli/pull/839/changes#diff-f8101f182954049d140980fd56caf1e09e5c85a771f21c972d986ce2229d7e6eR439.

Validated that, for checkpoints with a transcript.jsonl in v1, the full.jsonl and transcript.jsonl files were created in the proper places for v2.
Rerunning the command skips checkpoints that are already migrated and shows which transcript.jsonl files couldn't be created.

➜  test-repo git:(entire/checkpoints/v1) ✗ git show-ref -- refs/entire/checkpoints/v2/main
c1641139fb51a64faec6434e804368e8d8006e00 refs/entire/checkpoints/v2/main

➜  test-repo git:(entire/checkpoints/v1) ✗ git show-ref -- refs/entire/checkpoints/v2/full/current
faf645d2b1f21aa6e019f6252a7d5d521a7acb4e refs/entire/checkpoints/v2/full/current

Migrating v1 checkpoints to v2...
  [1/25] Migrating checkpoint 528c39637ed4... skipped (already in v2)
  [2/25] Migrating checkpoint e8e062073afb... skipped (already in v2)
  [3/25] Migrating checkpoint 3ced2b94d513... skipped (already in v2)
  [4/25] Migrating checkpoint 223e6f793eed... skipped (already in v2)
  [5/25] Migrating checkpoint 699504f045e2... in v2, but transcript.jsonl could not be generated: agent "Copilot CLI"
  [6/25] Migrating checkpoint a0f8612fcb64... added transcript.jsonl for 1 session(s)
  [7/25] Migrating checkpoint f55b1d3b434a... skipped (already in v2)
  [8/25] Migrating checkpoint 943126df1f2a... skipped (already in v2)
  [9/25] Migrating checkpoint e6b2f769a273... skipped (already in v2)
  [10/25] Migrating checkpoint 4be62319bf98... skipped (already in v2)
  [11/25] Migrating checkpoint b8b6fbd5f55f... skipped (already in v2)
  [12/25] Migrating checkpoint b1b2d74d8ef1... added transcript.jsonl for 1 session(s)
  [13/25] Migrating checkpoint 6215534dfb8e... skipped (already in v2)
  [14/25] Migrating checkpoint 5a31697679ee... skipped (already in v2)
  [15/25] Migrating checkpoint 581d7b2848de... skipped (already in v2)
  [16/25] Migrating checkpoint 2151d9f18a50... skipped (already in v2)
  [17/25] Migrating checkpoint ecf782729563... skipped (already in v2)
  [18/25] Migrating checkpoint c81467b72ca0... skipped (already in v2)
  [19/25] Migrating checkpoint c8eb421389c0... skipped (already in v2)
  [20/25] Migrating checkpoint 14d75ebe3a6d... skipped (already in v2)
  [21/25] Migrating checkpoint 729ac91fce19... skipped (already in v2)
  [22/25] Migrating checkpoint 79d52455f160... skipped (already in v2)
  [23/25] Migrating checkpoint edd2df3466e5... skipped (already in v2)
  [24/25] Migrating checkpoint a18a21695d2e... skipped (already in v2)
  [25/25] Migrating checkpoint 6284d02bfdae... skipped (already in v2)

Inspecting an example transcript.jsonl file:

git show refs/entire/checkpoints/v2/main:f5/5b1d3b434a/0/transcript.jsonl

{"v":1,"agent":"Cursor","cli_version":"dev","type":"user","content":[{"text":"create a markdown file stating this is for testing cursor"}]}
{"v":1,"agent":"Cursor","cli_version":"dev","type":"assistant","content":[{"text":"The user wants me to create a markdown file stating this is for testing cursor. Let me create it in the workspace directory.","type":"text"},{"input":{"path":"/Users/ninawork/entire/test-repos/test-repo/cursor-attach/testing-cursor.md","contents":"# Testing Cursor\n\nThis file is for testing Cursor.\n"},"name":"Write","type":"tool_use"},{"text":"Created `testing-cursor.md` in the `cursor-attach` directory.","type":"text"}]}
{"v":1,"agent":"Cursor","cli_version":"dev","type":"user","content":[{"text":"how do I quit this cli"}]}
{"v":1,"agent":"Cursor","cli_version":"dev","type":"assistant","content":[{"text":"You can quit by pressing **Ctrl+C** or typing **exit**.","type":"text"}]}
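The compact transcript lines above can be read back with a small decoder. The following is a minimal sketch, assuming only the fields visible in the sample; the `entry` and `block` types and the `parseTranscript` helper are hypothetical names for illustration and are not taken from the PR.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// block mirrors one element of "content" in the sample lines.
// The field set is inferred from the sample and may be incomplete.
type block struct {
	Type  string          `json:"type,omitempty"`
	Text  string          `json:"text,omitempty"`
	Name  string          `json:"name,omitempty"`
	Input json.RawMessage `json:"input,omitempty"`
}

// entry mirrors one line of transcript.jsonl as shown in the sample.
type entry struct {
	V          int     `json:"v"`
	Agent      string  `json:"agent"`
	CLIVersion string  `json:"cli_version"`
	Type       string  `json:"type"`
	Content    []block `json:"content"`
}

// parseTranscript decodes JSONL text one line at a time, skipping blank lines.
func parseTranscript(data string) ([]entry, error) {
	var out []entry
	sc := bufio.NewScanner(strings.NewReader(data))
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // tolerate long lines
	lineNo := 0
	for sc.Scan() {
		lineNo++
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var e entry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			return nil, fmt.Errorf("line %d: %w", lineNo, err)
		}
		out = append(out, e)
	}
	return out, sc.Err()
}

// sample holds the last two lines of the example transcript.
const sample = `{"v":1,"agent":"Cursor","cli_version":"dev","type":"user","content":[{"text":"how do I quit this cli"}]}
{"v":1,"agent":"Cursor","cli_version":"dev","type":"assistant","content":[{"text":"You can quit by pressing **Ctrl+C** or typing **exit**.","type":"text"}]}
`

func main() {
	entries, err := parseTranscript(sample)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Printf("%s (%s): %d content block(s)\n", e.Type, e.Agent, len(e.Content))
	}
}
```

Keeping `input` as `json.RawMessage` avoids guessing the shape of tool arguments, which differ per tool.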

Note

Medium Risk
Introduces new migration logic that writes to v2 checkpoint refs and performs git tree/commit surgery, which could affect checkpoint data integrity if bugs exist; command is hidden and guarded by an explicit flag, limiting user impact.

Overview
Adds a hidden entire migrate --checkpoints v2 command to bulk-migrate committed checkpoints from v1 storage into the v2 refs.

Migration iterates v1 checkpoints, writes each session into v2 (optionally generating transcript.jsonl via compaction), and is idempotent by skipping already-migrated checkpoints while backfilling missing compact transcripts when possible. For task checkpoints, it also copies task metadata trees into v2 /full/current via subtree updates and commits.
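The skip-or-backfill behavior described here can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the code in migrate.go: `store`, `session`, `result`, and `migrateAll` are hypothetical stand-ins, and the real command works against git refs rather than maps.

```go
package main

import (
	"fmt"
	"sort"
)

// session is a hypothetical stand-in for one session stored in v2.
type session struct {
	hasCompact bool // whether transcript.jsonl exists for this session
}

// store is a hypothetical in-memory stand-in for the v2 refs, keyed by checkpoint ID.
type store map[string][]session

type result struct{ migrated, skipped, backfilled int }

// migrateAll writes checkpoints missing from v2, skips those already present,
// and backfills a missing compact transcript when canCompact allows it.
// v1 maps each checkpoint ID to its number of sessions.
func migrateAll(v1 map[string]int, v2 store, canCompact func(id string) bool) result {
	ids := make([]string, 0, len(v1))
	for id := range v1 {
		ids = append(ids, id)
	}
	sort.Strings(ids) // deterministic processing order

	var r result
	for _, id := range ids {
		existing, ok := v2[id]
		if !ok {
			sessions := make([]session, v1[id])
			for i := range sessions {
				sessions[i].hasCompact = canCompact(id)
			}
			v2[id] = sessions
			r.migrated++
			continue
		}
		filled := 0
		for i := range existing {
			if !existing[i].hasCompact && canCompact(id) {
				existing[i].hasCompact = true
				filled++
			}
		}
		if filled > 0 {
			r.backfilled++
		} else {
			r.skipped++
		}
	}
	return r
}

func main() {
	v1 := map[string]int{"528c39637ed4": 1, "a0f8612fcb64": 1}
	v2 := store{}
	always := func(string) bool { return true }
	fmt.Printf("first run:  %+v\n", migrateAll(v1, v2, always))
	fmt.Printf("second run: %+v\n", migrateAll(v1, v2, always))
}
```

Because every decision is derived from the current v2 state, running the function a second time performs no writes, which is the property the rerun log in the description demonstrates.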

Separately standardizes prompt serialization by introducing PromptSeparator, JoinPrompts, and SplitPromptContent, switching existing v1/v2 checkpoint writers to use the shared join helper and adding focused tests for prompt round-tripping and migration behavior.
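As an illustration of the round-trip these helpers are meant to guarantee, here is a minimal sketch. Only the three names come from the PR; the separator value and both function bodies are assumptions made for this example and may differ from the real implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// PromptSeparator is an ASSUMED value; the PR text does not show the real one.
const PromptSeparator = "\n\n---\n\n"

// JoinPrompts serializes prompts into one string (assumed behavior).
func JoinPrompts(prompts []string) string {
	return strings.Join(prompts, PromptSeparator)
}

// SplitPromptContent reverses JoinPrompts (assumed behavior).
// Empty content yields no prompts rather than one empty prompt.
func SplitPromptContent(content string) []string {
	if content == "" {
		return nil
	}
	return strings.Split(content, PromptSeparator)
}

func main() {
	prompts := []string{
		"create a markdown file stating this is for testing cursor",
		"how do I quit this cli",
	}
	restored := SplitPromptContent(JoinPrompts(prompts))
	fmt.Println(len(restored), "prompts restored")
}
```

With any delimiter-based scheme like this one, a prompt that itself contains the separator will not round-trip, so a test for that case is worth having alongside the happy path.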

Reviewed by Cursor Bugbot for commit d7e367f.

Entire-Checkpoint: 209a37190167
Copilot AI review requested due to automatic review settings April 4, 2026 00:12
Contributor

Copilot AI left a comment

Pull request overview

Adds an initial entire migrate CLI command intended to migrate v1 checkpoints to the v2 checkpoint ref/layout for testing and rollout prep.

Changes:

  • Registers a new migrate subcommand on the root CLI.
  • Introduces v1→v2 checkpoint migration logic, including transcript compaction and attempted task-metadata tree copying.
  • Adds unit tests covering basic migration flows and idempotency.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File | Description
cmd/entire/cli/root.go | Registers the new migrate command in the root CLI.
cmd/entire/cli/migrate.go | Implements v1→v2 migration logic, transcript compaction, and task metadata tree splicing.
cmd/entire/cli/migrate_test.go | Adds tests for migration behavior (basic/idempotent/multi-session/flag validation).

Entire-Checkpoint: c9595c52ab4a
Base automatically changed from feat/checkpoints-v2-push-logic to main April 6, 2026 20:57
Entire-Checkpoint: 9f07aeebbf93
Entire-Checkpoint: f1c37c8efc47
@computermode
Contributor Author

bugbot run

@computermode computermode changed the title from "WIP: Checkpoints V2: add migration option" to "Checkpoints V2: add migration option" Apr 7, 2026

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Root-level tasks overwritten by per-session tasks splice
    • copyTaskMetadataToV2 now merges root-level and latest-session task trees before splicing so both task sets are preserved instead of one overwriting the other.
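The merge described in this fix can be pictured with plain maps standing in for git tree entries. This is a sketch under assumptions: `mergeKeepExisting` is a hypothetical helper, and reading "keep existing" as "the base entry wins on a name collision" is an interpretation for this example, not a statement about how checkpoint.UpdateSubtree behaves.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeKeepExisting returns the union of base and extra, mapping entry
// name to object hash. On a name collision the base entry is kept
// (an assumed reading of the merge mode, for illustration only).
func mergeKeepExisting(base, extra map[string]string) map[string]string {
	merged := make(map[string]string, len(base)+len(extra))
	for name, hash := range base {
		merged[name] = hash
	}
	for name, hash := range extra {
		if _, exists := merged[name]; !exists {
			merged[name] = hash
		}
	}
	return merged
}

func main() {
	// Hash values are placeholders.
	rootTasks := map[string]string{"toolu_01ROOT": "hash-root"}
	sessionTasks := map[string]string{"toolu_01SESSION": "hash-session"}

	merged := mergeKeepExisting(rootTasks, sessionTasks)
	names := make([]string, 0, len(merged))
	for name := range merged {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names)
}
```

Splicing the merged result once, rather than splicing each tree in turn, is what prevents the second splice from replacing the first.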

Push these changes by commenting:

@cursor push 033e8c1ad0
Preview (033e8c1ad0)
diff --git a/cmd/entire/cli/migrate.go b/cmd/entire/cli/migrate.go
--- a/cmd/entire/cli/migrate.go
+++ b/cmd/entire/cli/migrate.go
@@ -359,14 +359,18 @@
 		return err
 	}
 
+	latestSessionIdx := -1
+	if len(summary.Sessions) > 0 {
+		latestSessionIdx = len(summary.Sessions) - 1
+	}
+
 	// Legacy v1 layout stores task metadata at checkpoint root: <cp>/tasks/<tool-use-id>/...
-	// Prefer attaching this tree to the latest session in v2.
-	if rootTasksTree, rootTasksErr := v1Tree.Tree("tasks"); rootTasksErr == nil {
-		if len(summary.Sessions) > 0 {
-			latestSessionIdx := len(summary.Sessions) - 1
-			if spliceErr := spliceTasksTreeToV2(repo, v2Store, cpID, latestSessionIdx, rootTasksTree.Hash); spliceErr != nil {
-				return fmt.Errorf("latest session task tree splice failed: %w", spliceErr)
-			}
+	// Attach this to the latest session in v2, and merge with that session's own tasks if present.
+	var rootTasksTree *object.Tree
+	rootTasksSpliced := false
+	if latestSessionIdx >= 0 {
+		if tasksTree, rootTasksErr := v1Tree.Tree("tasks"); rootTasksErr == nil {
+			rootTasksTree = tasksTree
 		}
 	}
 
@@ -382,11 +386,33 @@
 			continue // No tasks directory in this session
 		}
 
-		if spliceErr := spliceTasksTreeToV2(repo, v2Store, cpID, sessionIdx, tasksTree.Hash); spliceErr != nil {
+		tasksTreeHash := tasksTree.Hash
+		if rootTasksTree != nil && sessionIdx == latestSessionIdx {
+			mergedTasksTreeHash, mergeErr := checkpoint.UpdateSubtree(
+				repo,
+				rootTasksTree.Hash,
+				nil,
+				tasksTree.Entries,
+				checkpoint.UpdateSubtreeOptions{MergeMode: checkpoint.MergeKeepExisting},
+			)
+			if mergeErr != nil {
+				return fmt.Errorf("failed to merge root and session task trees for session %d: %w", sessionIdx, mergeErr)
+			}
+			tasksTreeHash = mergedTasksTreeHash
+			rootTasksSpliced = true
+		}
+
+		if spliceErr := spliceTasksTreeToV2(repo, v2Store, cpID, sessionIdx, tasksTreeHash); spliceErr != nil {
 			return fmt.Errorf("session %d task tree splice failed: %w", sessionIdx, spliceErr)
 		}
 	}
 
+	if rootTasksTree != nil && !rootTasksSpliced {
+		if spliceErr := spliceTasksTreeToV2(repo, v2Store, cpID, latestSessionIdx, rootTasksTree.Hash); spliceErr != nil {
+			return fmt.Errorf("latest session task tree splice failed: %w", spliceErr)
+		}
+	}
+
 	return nil
 }
 

diff --git a/cmd/entire/cli/migrate_test.go b/cmd/entire/cli/migrate_test.go
--- a/cmd/entire/cli/migrate_test.go
+++ b/cmd/entire/cli/migrate_test.go
@@ -228,6 +228,52 @@
 	require.NoError(t, taskFileErr, "expected migrated task checkpoint metadata in /full/current")
 }
 
+func TestMigrateCheckpointsV2_TaskMetadataMergesRootAndSessionTasks(t *testing.T) {
+	t.Parallel()
+	repo := initMigrateTestRepo(t)
+	v1Store, v2Store := newMigrateStores(repo)
+
+	cpID := id.MustCheckpointID("c1d2e3f4a5b6")
+
+	metadataDir := t.TempDir()
+	sessionTaskFile := filepath.Join(metadataDir, "tasks", "toolu_01SESSION", "checkpoint.json")
+	require.NoError(t, os.MkdirAll(filepath.Dir(sessionTaskFile), 0o755))
+	require.NoError(t, os.WriteFile(sessionTaskFile, []byte(`{"source":"session"}`), 0o644))
+
+	// Write one v1 task checkpoint that has both:
+	// 1) root-level task metadata (legacy layout, from IsTask/ToolUseID)
+	// 2) session-level task metadata (from MetadataDir copy into session subtree)
+	err := v1Store.WriteCommitted(context.Background(), checkpoint.WriteCommittedOptions{
+		CheckpointID: cpID,
+		SessionID:    "session-task-merge-001",
+		Strategy:     "manual-commit",
+		Transcript:   []byte("{\"type\":\"assistant\",\"message\":\"task merge\"}\n"),
+		Prompts:      []string{"task merge prompt"},
+		IsTask:       true,
+		ToolUseID:    "toolu_01ROOT",
+		MetadataDir:  metadataDir,
+		AuthorName:   "Test",
+		AuthorEmail:  "test@test.com",
+	})
+	require.NoError(t, err)
+
+	var stdout bytes.Buffer
+	result, migrateErr := migrateCheckpointsV2(context.Background(), repo, v1Store, v2Store, &stdout)
+	require.NoError(t, migrateErr)
+	assert.Equal(t, 1, result.migrated)
+
+	_, rootTreeHash, refErr := v2Store.GetRefState(plumbing.ReferenceName(paths.V2FullCurrentRefName))
+	require.NoError(t, refErr)
+	rootTree, treeErr := repo.TreeObject(rootTreeHash)
+	require.NoError(t, treeErr)
+
+	// Both root-level and per-session tasks must exist after migration.
+	_, rootTaskErr := rootTree.File(cpID.Path() + "/0/tasks/toolu_01ROOT/checkpoint.json")
+	require.NoError(t, rootTaskErr, "expected root-level task metadata in /full/current")
+	_, sessionTaskErr := rootTree.File(cpID.Path() + "/0/tasks/toolu_01SESSION/checkpoint.json")
+	require.NoError(t, sessionTaskErr, "expected session-level task metadata in /full/current")
+}
+
 func TestMigrateCheckpointsV2_AllSkippedOnRerun(t *testing.T) {
 	t.Parallel()
 	repo := initMigrateTestRepo(t)

@computermode computermode marked this pull request as ready for review April 7, 2026 20:55
@computermode computermode requested a review from a team as a code owner April 7, 2026 20:55

// Already in v2 — check if any sessions are missing transcript.jsonl and backfill
	if existing != nil {
		return backfillCompactTranscripts(ctx, v1Store, v2Store, info, existing, out, prefix)
Contributor

I'm wondering if we should add a way to validate the contents of the v2 checkpoint more thoroughly. Since we can't write /main and /full/current in a transactional manner, someone could interrupt the migrate command and restart it later, and we would then assume that a partially migrated checkpoint was fully migrated.

Would it be useful to do some spot checks, like checking the number of sessions or the presence of the full.jsonl transcript, before attempting to backfill the compact transcript?

Contributor Author

Pushed up a change to look for any missing components beyond just the transcript.jsonl which I think addresses this case. Thank you for calling it out - previously it would have only checked for and regenerated a missing transcript.jsonl.
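A completeness check of the kind discussed in this thread could look roughly like the sketch below. It is not the change that was pushed: `checkpointPath` and `missingComponents` are hypothetical helpers, the sharded path is inferred from the example f5/5b1d3b434a/0/transcript.jsonl in the description, and placing full.jsonl under /full/current is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// checkpointPath shards an ID as <first two chars>/<rest>, as inferred from
// the example path for checkpoint f55b1d3b434a in the PR description.
func checkpointPath(id string) string {
	if len(id) < 3 {
		return id
	}
	return id[:2] + "/" + id[2:]
}

// missingComponents lists per-session files that are expected but absent.
// mainFiles and fullFiles are sets of paths present in the v2 /main and
// /full/current trees (the full.jsonl location is an assumption).
func missingComponents(id string, sessions int, mainFiles, fullFiles map[string]bool) []string {
	var missing []string
	for i := 0; i < sessions; i++ {
		base := checkpointPath(id) + "/" + strconv.Itoa(i) + "/"
		if !mainFiles[base+"transcript.jsonl"] {
			missing = append(missing, "main:"+base+"transcript.jsonl")
		}
		if !fullFiles[base+"full.jsonl"] {
			missing = append(missing, "full/current:"+base+"full.jsonl")
		}
	}
	return missing
}

func main() {
	id := "f55b1d3b434a"
	// Simulate an interrupted run: /main was written, /full/current was not.
	mainFiles := map[string]bool{checkpointPath(id) + "/0/transcript.jsonl": true}
	fullFiles := map[string]bool{}

	for _, m := range missingComponents(id, 1, mainFiles, fullFiles) {
		fmt.Println("missing:", m)
	}
}
```

A checkpoint would be treated as already migrated only when this list is empty; otherwise the missing pieces are rewritten from v1.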

Entire-Checkpoint: 730e93f6b572
3 participants