Add Codex reviewer backend to structured review by resodo · Pull Request #15 · resodo/agent-protocols

resodo · 2026-06-10T05:07:40Z

Summary

Add a cross-vendor reviewer backend to the bundled structured-review runner so a Claude Code driver gets a Codex reviewer by default (and vice versa):

codex exec backend: JSONL event parsing, last-message-file-first text capture, sandboxed write-mode commits via git-dir --add-dir (linked worktrees supported), --sandbox read-only print mode.
--reviewer-backend {auto,claude,codex} with cross-vendor auto keyed off driver env markers (CLAUDECODE vs CODEX_THREAD_ID/CODEX_SANDBOX); both markers -> hard error; neither -> claude (today's behavior); marker-resolved selections preflight the binary with an actionable error. No silent fallback.
Pinned Codex defaults gpt-5.5/xhigh with --codex-bin/--codex-model/--codex-effort overrides; --model/--effort/--claude-bin stay Claude-only for backward compatibility.
Backend-aware metadata (backend, active model/effort, reviewer_version, best-effort token usage) and reviewer identity in every pass heading.
verify_write_mode/verify_print_mode semantics unchanged — the same post-hoc contract gates both backends.
Protocol text backend-neutral: SKILL Default Runner Rule, README usage, CURRENT map.
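The auto-selection rule described above can be sketched as follows. This is a minimal illustration, not the runner's actual code: the function name `resolve_backend`, the `BackendError` class, the default binary names, and the choice to treat any non-empty marker value as "set" are all assumptions made for the example.

```python
import shutil

# Driver env markers named in the summary above.
CLAUDE_MARKERS = ("CLAUDECODE",)
CODEX_MARKERS = ("CODEX_THREAD_ID", "CODEX_SANDBOX")
# Assumed default binary names; the PR exposes --claude-bin/--codex-bin overrides.
DEFAULT_BINS = {"claude": "claude", "codex": "codex"}


class BackendError(RuntimeError):
    """Hypothetical error type for an unresolvable reviewer backend."""


def resolve_backend(choice, env, bins=DEFAULT_BINS, which=shutil.which):
    """Map a --reviewer-backend choice to 'claude' or 'codex'."""
    if choice in ("claude", "codex"):
        return choice  # explicit selection bypasses marker logic
    if choice != "auto":
        raise BackendError(f"unknown reviewer backend: {choice!r}")

    # Assumption: a marker counts as set when its value is non-empty.
    driver_is_claude = any(env.get(name) for name in CLAUDE_MARKERS)
    driver_is_codex = any(env.get(name) for name in CODEX_MARKERS)

    if driver_is_claude and driver_is_codex:
        raise BackendError(
            "both Claude and Codex driver markers are set; "
            "pass --reviewer-backend explicitly"
        )
    if not (driver_is_claude or driver_is_codex):
        return "claude"  # neither marker: keep the previous default

    # Cross-vendor: the reviewer is the opposite vendor of the driver.
    backend = "codex" if driver_is_claude else "claude"
    # Preflight only marker-resolved selections; never fall back silently.
    if which(bins[backend]) is None:
        raise BackendError(
            f"auto selected {backend!r} but {bins[backend]!r} is not on PATH; "
            "install it or pass --reviewer-backend explicitly"
        )
    return backend
```

Passing `env` and `which` as parameters keeps the rule testable without touching the real environment, which is one way the selection and preflight tests mentioned in this PR could be written.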

Gates (strict chain, all in the plan thread)

Plan review (claude reviewer): no blocking; 5 non-blocking refinements accepted in driver response 1.
Implementation review (codex reviewer, dogfooded): the new backend reviewed its own implementation live — real codex exec under --sandbox workspace-write + --add-dir on a linked worktree appended threads and committed; runner verification passed. Verdict: no blocking, no non-blocking, "Ready for closeout", full traceability, reviewer-rerun validation green.

Validation

structured-review/tests 56 OK (16 new), tests 15 OK, scout/tests 15 OK, check_backlog.py pass, compileall clean, git diff --check clean — driver-run and reviewer-rerun; CI runs the same set.

Notes

closeout/SKILL.md still says "bundled Claude runner" in one line; protocol changes there were an explicit plan non-goal. Optional follow-up backlog item.
Live spike + dogfood evidence covers real Codex sandbox semantics that fake-bin tests cannot reach.

🤖 Generated with Claude Code

Add the implementation plan for a cross-vendor reviewer backend in the bundled structured-review runner: codex exec integration, driver-keyed auto default, pinned gpt-5.5/xhigh, reviewer identity in thread headings, and backend-neutral protocol text. Spike evidence included. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Accept all five non-blocking findings: name the finalize text-capture seam and preserve module-level helper names (N1), pin print-mode -o behavior with read-only-sandbox spike evidence (N2, N3), specify backend-aware metadata semantics with reviewer_version (N4), and add a marker-resolved auto preflight with an actionable error (N5). Mark the plan accepted and implementation in progress. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Add a codex exec reviewer backend to the bundled runner with a cross-vendor auto default keyed off driver env markers, marker-resolved preflight, pinned gpt-5.5/xhigh defaults with override flags, sandboxed write-mode commits via git-dir --add-dir, JSONL parsing with last-message-file preference, backend-aware metadata with reviewer_version, and reviewer identity in pass headings. Update the skill, README, and CURRENT map to the backend-neutral default runner rule; extend tests to 56 with selection, argv, parsing, preflight, and fake-codex end-to-end coverage. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
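The "last-message-file preference" for text capture might look roughly like the sketch below. The helper names are hypothetical, and the event shape (`item.completed` events carrying an `agent_message` item with a `text` field) is an assumption about the JSONL stream rather than something this PR states; the real parser may key on different fields.

```python
import json
from pathlib import Path


def last_agent_message(jsonl_text):
    """Return the text of the last agent message event, or None."""
    last = None
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise mixed into the stream
        if not isinstance(event, dict):
            continue
        item = event.get("item")
        # Assumed event shape; adjust to the actual stream schema.
        if (
            event.get("type") == "item.completed"
            and isinstance(item, dict)
            and item.get("type") == "agent_message"
            and isinstance(item.get("text"), str)
        ):
            last = item["text"]
    return last


def capture_review_text(last_message_file, jsonl_text):
    """Prefer the last-message file; fall back to the JSONL events."""
    path = Path(last_message_file)
    if path.is_file():
        text = path.read_text(encoding="utf-8")
        if text.strip():
            return text  # file wins whenever it has content
    return last_agent_message(jsonl_text)
```

Reading the file first means the review text does not depend on the event schema staying stable; the JSONL path is only a fallback for runs where the file is missing or empty.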


Mark the plan historical with its PR reference, add it to the agent plans index, and refresh the CURRENT map timestamp for the backend-neutral runner entry. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Reviewer agents frequently violated the append-only thread contract (deleting placeholder lines while appending), invalidating roughly half of historical write-mode passes and forcing manual driver recovery. The reviewer now always runs read-only and returns the review text; the runner verifies it (untouched worktree, review-like, no local paths or secrets), appends it verbatim under Review Threads, and creates the structured-review commit itself, so append-only holds by construction. Codex drops the write-mode workspace-write/--add-dir grant and runs read-only in both modes. verify_write_mode stays as a defense-in-depth self-check. Also delegate closeout's runner policy to structured-review per the human-approved scope addendum. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>


Accept pass-3 B1: append the reviewer text byte-for-byte (only a separating blank line and a missing terminating newline are added) with an exact-output test. Treat the pass-3 environment-blocked validation as a finding: codex now runs --sandbox workspace-write with no .git grant in both modes so reviewers can run tests again, while commits stay impossible and stray worktree writes still fail the run. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
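One reading of the byte-for-byte append rule, as a sketch: the function name `append_verbatim` is hypothetical, and it assumes the Review Threads section is the last thing in the plan file so that appending at end-of-file is equivalent to appending under that heading.

```python
def append_verbatim(existing: bytes, review: bytes) -> bytes:
    """Append review bytes unchanged after the existing thread content.

    Only two kinds of bytes are ever added: newlines that separate the
    review from the existing content with one blank line, and a final
    newline when the review does not already end with one.
    """
    if not existing or existing.endswith(b"\n\n"):
        separator = b""  # nothing to separate, or blank line already present
    elif existing.endswith(b"\n"):
        separator = b"\n"  # complete the blank line
    else:
        separator = b"\n\n"  # end the last line, then add the blank line
    terminator = b"" if review.endswith(b"\n") else b"\n"
    return existing + separator + review + terminator
```

Working on `bytes` rather than `str` avoids newline translation and re-encoding, so trailing spaces and CRLF sequences inside the review survive untouched. An exact-output test of the kind the commit mentions can then compare the result against a literal byte string.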


resodo · 2026-06-10T10:40:09Z

Scope addendum (human-approved during closeout discussion), now included:

closeout/SKILL.md delegation — the duplicated (and stale) runner policy is removed; closeout keeps its trigger list and defers runner/backend/mode policy to structured-review.
Runner-owned write mode — the reviewer always runs read-only-for-the-repo and returns the review text; the runner verifies it (untouched worktree, review-like, no local paths/secrets), appends it byte-for-byte under ## Review Threads, and creates the commit itself. The append-only contract that reviewer agents historically violated in ~half of write-mode runs now holds by construction. Codex runs --sandbox workspace-write with no .git grant in both modes, so reviewers can run tests while commits remain impossible.
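The runner-side verification step could be sketched like this. Everything here is illustrative: the function name, the regular expressions, and the "review-like" heuristic (requiring the word "blocking", as in the verdicts quoted in this PR) are placeholder assumptions, and the worktree check simply compares two status snapshots taken before and after the reviewer runs (for example, the output of `git status --porcelain`).

```python
import re

# Hypothetical patterns; the actual runner's checks are not shown in this PR.
LOCAL_PATH = re.compile(r"(?:/Users/|/home/|[A-Za-z]:\\Users\\)\S+")
SECRET = re.compile(
    r"sk-[A-Za-z0-9_-]{20,}"
    r"|ghp_[A-Za-z0-9]{36}"
    r"|-----BEGIN [A-Z ]*PRIVATE KEY-----"
)


def review_problems(text, status_before, status_after):
    """Return a list of reasons to reject a reviewer run (empty means OK)."""
    problems = []
    if status_before != status_after:
        problems.append("worktree changed during review")
    if not text.strip():
        problems.append("empty review text")
    elif "blocking" not in text.lower():
        # Placeholder heuristic for "review-like" text.
        problems.append("text does not look like a review")
    if LOCAL_PATH.search(text):
        problems.append("local path in review text")
    if SECRET.search(text):
        problems.append("secret-like token in review text")
    return problems
```

Under this design the runner would append and commit only when the list comes back empty, so a failed check stops the run before anything is written to the thread.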

Gate evidence in the plan thread: pass 3 (codex) found one blocking issue (non-verbatim append) plus an environment-blocked-validation residual — both fixed; pass 4 (codex) confirmed: no blocking, no non-blocking, "Ready for closeout", with the reviewer rerunning the full validation suite itself under the new sandbox. Validation green at e8919dd: 61+15+15 tests, backlog check, compileall, git diff --check.

resodo and others added 10 commits June 10, 2026 06:38

structured-review: add reviewer comments for Codex reviewer backend plan

16ca168

structured-review: add reviewer comments for Codex reviewer backend implementation

1c4aeb7

docs: close out Codex reviewer backend plan

739c30b


structured-review: add reviewer comments for Codex reviewer backend runner-write refactor

c768805

structured-review: add reviewer comments for Codex reviewer backend runner-write fixes

e8919dd

resodo merged commit 4bb87b0 into main Jun 10, 2026
1 check passed

resodo merged 10 commits into main from feature/codex-reviewer-backend
