Problem
GNHF currently treats an unparsable agent final response as an iteration error and resets the repository before preserving the useful code changes from that iteration. On current main (208c424 / gnhf-v0.1.41), the Copilot path still appends the final JSON contract in src/core/agents/copilot.ts#L113 and then parses the final assistant message with JSON.parse(stripJsonFence(lastAgentMessage)) in src/core/agents/copilot.ts#L287. That handles pure JSON and a whole-message fenced JSON block, but it still rejects prose followed by JSON.
This is especially fragile for long Copilot/OpenCode-style runs. The agent may complete the implementation and verification, then finish with natural-language summary plus JSON, fenced JSON after prose, or otherwise non-pure JSON. GNHF then fails to parse the final message, records the iteration as an error through src/core/orchestrator.ts#L571, and reaches recordFailure(...), which calls resetHard(this.cwd) in src/core/orchestrator.ts#L683. resetHard(...) then runs git reset --hard HEAD and git clean -fd in src/core/git.ts#L293, discarding the code even when the work itself was valid.
Observed example from a local development task:
- Iteration 1 completed typecheck and tests, then ended with
Good - all... followed by JSON.
- GNHF failed with
Failed to parse copilot output: Unexpected token 'G', "Good - all"... is not valid JSON.
- Iteration 2 completed typecheck and tests, then ended with
This is a... followed by inline JSON.
- GNHF failed with
Failed to parse copilot output: Unexpected token 'T', "This is a "... is not valid JSON.
- Because no valid control JSON was parsed, GNHF treated both iterations as agent errors and reset the working tree.
The failure rate appears to be model-dependent. In my local usage, GPT-5.4 runs failed the final JSON message contract roughly once every three runs, while switching the same workflow to GLM-5.1 pushed the observed failure rate above 80%. This is only a qualitative/experiential estimate, not a carefully measured benchmark, but it is frequent enough that the current reset behavior becomes a practical data-loss risk rather than a rare edge case.
This is not primarily an implementation failure. It is a protocol-boundary failure: task execution and orchestrator control output are coupled into one long agent turn. There is already shared recovery machinery in src/core/agents/json-extract.ts#L88 that can extract a schema-looking JSON object from prose, and it is used by other adapters, but the Copilot path above does not currently use it.
Desired Behavior
When the agent output is not parseable as the required GNHF JSON, GNHF should not immediately discard the iteration's code changes.
Instead:
- Before any reset/clean after an agent error, save a binary patch of the current working tree.
- If the error is only final-output JSON parsing, ask the same agent for a short JSON-only follow-up.
- Only reset the worktree if the JSON finalizer also fails or reports
success=false.
Proposed MVP Fix
1. Save a rescue patch before reset
Before calling resetHard(this.cwd) for any failed iteration that may have modified files, save a recovery artifact from the orchestrator failure path. The relevant reset sites are recordFailure(...) in src/core/orchestrator.ts#L683 and the direct abort/permanent-error resets around src/core/orchestrator.ts#L529; the low-level destructive behavior is centralized in src/core/git.ts#L293.
- Run
git diff --binary.
- Write it to the run directory, for example:
.gnhf/runs/<run-id>/iteration-<n>-rescue.patch
- Record the patch path in
gnhf.log and notes.md. The run directory is created in src/core/run.ts#L190, and debug events can be emitted through appendDebugLog(...) in src/core/debug-log.ts#L70.
- If there is no diff, skip writing an empty patch.
This must happen before git reset --hard and git clean -fd.
2. Add a JSON finalizer retry
If the agent process exits successfully but parsing the final output fails, do not immediately let the adapter throw an unrecoverable parse error into the orchestrator. For Copilot, the current throw is in src/core/agents/copilot.ts#L293; for OpenCode, a comparable parse-error path is in src/core/agents/opencode.ts#L1106. The MVP can first reuse parseAgentJson(...) from src/core/agents/json-extract.ts#L88 where appropriate, then fall back to one JSON-only finalizer retry when the final message still cannot be parsed or validated by validateAgentOutput(...) in src/core/agents/types.ts#L28.
- Do not immediately call
resetHard.
- Send a short follow-up prompt to the same agent:
Your previous response was not valid JSON for the GNHF final output contract.
Do not edit files.
Return only one valid JSON object matching this schema:
<schema>
No Markdown. No prose. No code fences.
- Parse the follow-up response.
- If it parses, continue normal iteration handling.
- If it fails, then mark the iteration as failed, but keep the rescue patch before resetting.
Limit this to 1 retry for the MVP.
3. Keep reset behavior, but make it recoverable
GNHF can still reset failed iterations. The change is that failure should leave a durable recovery artifact.
This preserves the current deterministic cleanup model while avoiding silent loss of valid work.
Local Validation Plan
I will implement this change locally first and test the effect before proposing it upstream. The initial local validation will focus on whether malformed final messages can be recovered without losing the iteration diff, and whether the rescue patch path makes failed iterations manually recoverable when the finalizer still cannot produce valid control JSON.
Problem
GNHF currently treats an unparsable agent final response as an iteration error and resets the repository before preserving the useful code changes from that iteration. On current
main(208c424/gnhf-v0.1.41), the Copilot path still appends the final JSON contract in src/core/agents/copilot.ts#L113 and then parses the final assistant message withJSON.parse(stripJsonFence(lastAgentMessage))in src/core/agents/copilot.ts#L287. That handles pure JSON and a whole-message fenced JSON block, but it still rejects prose followed by JSON.This is especially fragile for long Copilot/OpenCode-style runs. The agent may complete the implementation and verification, then finish with natural-language summary plus JSON, fenced JSON after prose, or otherwise non-pure JSON. GNHF then fails to parse the final message, records the iteration as an error through src/core/orchestrator.ts#L571, and reaches
recordFailure(...), which callsresetHard(this.cwd)in src/core/orchestrator.ts#L683.resetHard(...)then runsgit reset --hard HEADandgit clean -fdin src/core/git.ts#L293, discarding the code even when the work itself was valid.Observed example from a local development task:
Good - all...followed by JSON.Failed to parse copilot output: Unexpected token 'G', "Good - all"... is not valid JSON.This is a...followed by inline JSON.Failed to parse copilot output: Unexpected token 'T', "This is a "... is not valid JSON.The failure rate appears to be model-dependent. In my local usage, GPT-5.4 runs failed the final JSON message contract roughly once every three runs, while switching the same workflow to GLM-5.1 pushed the observed failure rate above 80%. This is only a qualitative/experiential estimate, not a carefully measured benchmark, but it is frequent enough that the current reset behavior becomes a practical data-loss risk rather than a rare edge case.
This is not primarily an implementation failure. It is a protocol-boundary failure: task execution and orchestrator control output are coupled into one long agent turn. There is already shared recovery machinery in src/core/agents/json-extract.ts#L88 that can extract a schema-looking JSON object from prose, and it is used by other adapters, but the Copilot path above does not currently use it.
Desired Behavior
When the agent output is not parseable as the required GNHF JSON, GNHF should not immediately discard the iteration's code changes.
Instead:
success=false.Proposed MVP Fix
1. Save a rescue patch before reset
Before calling
resetHard(this.cwd)for any failed iteration that may have modified files, save a recovery artifact from the orchestrator failure path. The relevant reset sites arerecordFailure(...)in src/core/orchestrator.ts#L683 and the direct abort/permanent-error resets around src/core/orchestrator.ts#L529; the low-level destructive behavior is centralized in src/core/git.ts#L293.git diff --binary..gnhf/runs/<run-id>/iteration-<n>-rescue.patchgnhf.logandnotes.md. The run directory is created in src/core/run.ts#L190, and debug events can be emitted throughappendDebugLog(...)in src/core/debug-log.ts#L70.This must happen before
git reset --hardandgit clean -fd.2. Add a JSON finalizer retry
If the agent process exits successfully but parsing the final output fails, do not immediately let the adapter throw an unrecoverable parse error into the orchestrator. For Copilot, the current throw is in src/core/agents/copilot.ts#L293; for OpenCode, a comparable parse-error path is in src/core/agents/opencode.ts#L1106. The MVP can first reuse
parseAgentJson(...)from src/core/agents/json-extract.ts#L88 where appropriate, then fall back to one JSON-only finalizer retry when the final message still cannot be parsed or validated byvalidateAgentOutput(...)in src/core/agents/types.ts#L28.resetHard.Limit this to 1 retry for the MVP.
3. Keep reset behavior, but make it recoverable
GNHF can still reset failed iterations. The change is that failure should leave a durable recovery artifact.
This preserves the current deterministic cleanup model while avoiding silent loss of valid work.
Local Validation Plan
I will implement this change locally first and test the effect before proposing it upstream. The initial local validation will focus on whether malformed final messages can be recovered without losing the iteration diff, and whether the rescue patch path makes failed iterations manually recoverable when the finalizer still cannot produce valid control JSON.