genty live-stack: non-gpt-5.5 models pass orchestration but skip the file write (file-creation fails) #956

## Summary

After #936 was fully resolved, genty `vanilla NI` is **GREEN on gpt-5.5 across all 3 OSes** ([Ubuntu](https://github.com/a5c-ai/babysitter/actions/runs/27366900253), [macOS](https://github.com/a5c-ai/babysitter/actions/runs/27372411571), [Windows](https://github.com/a5c-ai/babysitter/actions/runs/27372413000)).

The **other 5 models** (gpt-5.4-mini, claude-sonnet-4-6, gemini-3.5-flash, gemini-3.1-pro-preview, DeepSeek-V4-Pro) **fail only the `file-creation` check**, on all 3 OSes (runs `27372409793` / `27372411571` / `27372413000`).

The rest of the #936 infrastructure works for these models; every other check passes:
```
✓ model-response (agent responded, ~12k chars)
✓ proxy-communication
✗ file-creation: agent did not create .a5c-live-test/<id>-odyssey.md (output: 12092 chars)
✓ babysitter-run-completion: run exists with 6 journal events (>=5)
✓ babysitter-completion-proof: completed with processId + completionProof
```

## Diagnosis

The model produces the full odyssey content as **agent output text** (~12k chars) and the run completes with a valid completion proof, but the content is never **written to the expected file** `.a5c-live-test/<sessionId>-odyssey.md`. gpt-5.5 reliably authors a process whose delegated worker writes the file; the other models author or execute a process that returns the content instead of writing it (or writes it to the wrong path).

This is a **model-adherence / prompt-robustness** gap, not an orchestration bug. Likely fix: strengthen the authoring + delegated-worker prompts so the file write to the exact target path is mandatory and verified, independent of model strength.
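To make "mandatory and verified" concrete, here is a minimal sketch of a post-run check that separates the two failure modes from the diagnosis: content never written, versus written to the wrong path. The function and type names (`checkOdysseyWrite`, `WriteCheck`, `findBySuffix`) are hypothetical and not part of genty; only the target path pattern comes from this issue.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";
import * as os from "node:os";

// Hypothetical result type: not part of genty, used only for this sketch.
type WriteCheck =
  | { status: "ok"; bytes: number }
  | { status: "empty" }
  | { status: "wrong-path"; found: string[] }
  | { status: "missing" };

// Recursively collect files whose name ends with `suffix`,
// skipping directories that are never relevant to the check.
function findBySuffix(root: string, suffix: string): string[] {
  if (!fs.existsSync(root)) return [];
  const hits: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      if (entry.name === "node_modules" || entry.name === ".git") continue;
      hits.push(...findBySuffix(full, suffix));
    } else if (entry.isFile() && entry.name.endsWith(suffix)) {
      hits.push(full);
    }
  }
  return hits;
}

// Check the exact target path first; if it is absent, look for a
// misplaced "*-odyssey.md" anywhere else in the workspace.
function checkOdysseyWrite(workspace: string, sessionId: string): WriteCheck {
  const expected = path.join(workspace, ".a5c-live-test", `${sessionId}-odyssey.md`);
  if (fs.existsSync(expected) && fs.statSync(expected).isFile()) {
    const bytes = fs.statSync(expected).size;
    return bytes > 0 ? { status: "ok", bytes } : { status: "empty" };
  }
  const found = findBySuffix(workspace, "-odyssey.md");
  return found.length > 0 ? { status: "wrong-path", found } : { status: "missing" };
}
```

A `missing` result would match the "returns the content instead of writing it" case, while `wrong-path` would match the misplaced-file case, so the two could be counted separately per model.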

## Repro (local, fast)

Set `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_PROJECT_NAME`, then run `node packages/genty/cli/dist/cli/main.js yolo --prompt "<odyssey...save to .a5c-live-test/x.md>" --model gpt-5.4-mini --no-interactive --workspace <tmp>` and check whether the file is created.
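The repro can be wrapped in a small script so it can be looped over models. This is a sketch under assumptions: the helper names (`buildReproArgs`, `missingEnv`, `runRepro`) are hypothetical, the script is assumed to run from the repository root, and the prompt is left as a parameter because the issue elides its full text. The CLI path and flags are copied from the command above.

```typescript
import { spawnSync } from "node:child_process";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// CLI entry point as given in the repro command (relative to the repo root).
const CLI = "packages/genty/cli/dist/cli/main.js";

// Build the argument list for `node`, mirroring the flags in the repro command.
function buildReproArgs(prompt: string, model: string, workspace: string): string[] {
  return [CLI, "yolo", "--prompt", prompt, "--model", model, "--no-interactive", "--workspace", workspace];
}

// Return the names of required environment variables that are unset or empty.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_PROJECT_NAME"].filter((k) => !env[k]);
}

// Run the CLI in a fresh temp workspace and report whether the target file
// (a path relative to the workspace, matching the one named in the prompt)
// exists and is non-empty afterwards.
function runRepro(prompt: string, relTarget: string, model = "gpt-5.4-mini"): boolean {
  const missing = missingEnv(process.env);
  if (missing.length > 0) throw new Error(`missing env: ${missing.join(", ")}`);
  const workspace = fs.mkdtempSync(path.join(os.tmpdir(), "genty-repro-"));
  const result = spawnSync(process.execPath, buildReproArgs(prompt, model, workspace), {
    stdio: "inherit",
  });
  if (result.error) throw result.error;
  const target = path.join(workspace, relTarget);
  return fs.existsSync(target) && fs.statSync(target).size > 0;
}
```

Calling `runRepro(prompt, ".a5c-live-test/x.md")` with the full odyssey prompt would return `false` for the failure described in this issue. The CLI exit code is deliberately ignored, since the run reportedly completes successfully even when the file is missing.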
