Skip to content

Preserve iteration work when agent final JSON is malformed #147

@Arcadia822

Description

@Arcadia822

Problem

GNHF currently treats an unparsable agent final response as an iteration error and resets the repository before preserving the useful code changes from that iteration. On current main (208c424 / gnhf-v0.1.41), the Copilot path still appends the final JSON contract in src/core/agents/copilot.ts#L113 and then parses the final assistant message with JSON.parse(stripJsonFence(lastAgentMessage)) in src/core/agents/copilot.ts#L287. That handles pure JSON and a whole-message fenced JSON block, but it still rejects prose followed by JSON.

This is especially fragile for long Copilot/OpenCode-style runs. The agent may complete the implementation and verification, then finish with natural-language summary plus JSON, fenced JSON after prose, or otherwise non-pure JSON. GNHF then fails to parse the final message, records the iteration as an error through src/core/orchestrator.ts#L571, and reaches recordFailure(...), which calls resetHard(this.cwd) in src/core/orchestrator.ts#L683. resetHard(...) then runs git reset --hard HEAD and git clean -fd in src/core/git.ts#L293, discarding the code even when the work itself was valid.

Observed example from a local development task:

  • Iteration 1 completed typecheck and tests, then ended with Good - all... followed by JSON.
  • GNHF failed with Failed to parse copilot output: Unexpected token 'G', "Good - all"... is not valid JSON.
  • Iteration 2 completed typecheck and tests, then ended with This is a... followed by inline JSON.
  • GNHF failed with Failed to parse copilot output: Unexpected token 'T', "This is a "... is not valid JSON.
  • Because no valid control JSON was parsed, GNHF treated both iterations as agent errors and reset the working tree.

The failure rate appears to be model-dependent. In my local usage, GPT-5.4 runs failed the final JSON message contract roughly once every three runs, while switching the same workflow to GLM-5.1 pushed the observed failure rate above 80%. This is only a qualitative/experiential estimate, not a carefully measured benchmark, but it is frequent enough that the current reset behavior becomes a practical data-loss risk rather than a rare edge case.

This is not primarily an implementation failure. It is a protocol-boundary failure: task execution and orchestrator control output are coupled into one long agent turn. There is already shared recovery machinery in src/core/agents/json-extract.ts#L88 that can extract a schema-looking JSON object from prose, and it is used by other adapters, but the Copilot path above does not currently use it.

Desired Behavior

When the agent output is not parseable as the required GNHF JSON, GNHF should not immediately discard the iteration's code changes.

Instead:

  1. Before any reset/clean after an agent error, save a binary patch of the current working tree.
  2. If the error is only final-output JSON parsing, ask the same agent for a short JSON-only follow-up.
  3. Only reset the worktree if the JSON finalizer also fails or reports success=false.

Proposed MVP Fix

1. Save a rescue patch before reset

Before calling resetHard(this.cwd) for any failed iteration that may have modified files, save a recovery artifact from the orchestrator failure path. The relevant reset sites are recordFailure(...) in src/core/orchestrator.ts#L683 and the direct abort/permanent-error resets around src/core/orchestrator.ts#L529; the low-level destructive behavior is centralized in src/core/git.ts#L293.

  • Run git diff --binary.
  • Write it to the run directory, for example:
    • .gnhf/runs/<run-id>/iteration-<n>-rescue.patch
  • Record the patch path in gnhf.log and notes.md. The run directory is created in src/core/run.ts#L190, and debug events can be emitted through appendDebugLog(...) in src/core/debug-log.ts#L70.
  • If there is no diff, skip writing an empty patch.

This must happen before git reset --hard and git clean -fd.

2. Add a JSON finalizer retry

If the agent process exits successfully but parsing the final output fails, do not immediately let the adapter throw an unrecoverable parse error into the orchestrator. For Copilot, the current throw is in src/core/agents/copilot.ts#L293; for OpenCode, a comparable parse-error path is in src/core/agents/opencode.ts#L1106. The MVP can first reuse parseAgentJson(...) from src/core/agents/json-extract.ts#L88 where appropriate, then fall back to one JSON-only finalizer retry when the final message still cannot be parsed or validated by validateAgentOutput(...) in src/core/agents/types.ts#L28.

  • Do not immediately call resetHard.
  • Send a short follow-up prompt to the same agent:
Your previous response was not valid JSON for the GNHF final output contract.
Do not edit files.
Return only one valid JSON object matching this schema:
<schema>
No Markdown. No prose. No code fences.
  • Parse the follow-up response.
  • If it parses, continue normal iteration handling.
  • If it fails, then mark the iteration as failed, but keep the rescue patch before resetting.

Limit this to 1 retry for the MVP.

3. Keep reset behavior, but make it recoverable

GNHF can still reset failed iterations. The change is that failure should leave a durable recovery artifact.

This preserves the current deterministic cleanup model while avoiding silent loss of valid work.

Local Validation Plan

I will implement this change locally first and test the effect before proposing it upstream. The initial local validation will focus on whether malformed final messages can be recovered without losing the iteration diff, and whether the rescue patch path makes failed iterations manually recoverable when the finalizer still cannot produce valid control JSON.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions