BP mode fails with gpt-5.4-mini: model doesn't follow babysitter orchestration instructions

## Summary

All agents fail BP/Predefined and BP/Create Interactive with gpt-5.4-mini on Ubuntu. The model produces output and writes the file, but it never goes through the babysitter SDK run/iterate lifecycle.

## Verification Report (codex + gpt-5.4-mini BP/Predefined)

- model-response: PASS (558K chars)
- file-creation: PASS (3487 bytes)
- stop-hooks: PASS
- hooks-adapter-session: PASS
- **babysitter-run-completion: FAIL** — no .a5c/runs/ directory
- **babysitter-completion-proof: FAIL** — no .a5c/runs/ directory
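
Both failing checks come down to the missing `.a5c/runs/` directory. A minimal sketch of that kind of check, assuming only that a completed run leaves at least one entry under `.a5c/runs/`. The helper name `has_babysitter_run` is hypothetical and not taken from the harness, and the real verifier may inspect the run contents as well:

```python
from pathlib import Path


def has_babysitter_run(workspace: str) -> bool:
    """Return True if the workspace has at least one entry under .a5c/runs/.

    Hypothetical helper: only presence is checked, not the run's contents.
    """
    runs_dir = Path(workspace) / ".a5c" / "runs"
    # A missing directory is the failure mode reported above.
    if not runs_dir.is_dir():
        return False
    # An empty directory is also treated as "no run" in this sketch.
    return any(runs_dir.iterdir())
```

With gpt-5.4-mini, a check like this would return `False` even though the output file exists, which matches the PASS on file-creation and the FAIL on both babysitter checks.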

## Root Cause

gpt-5.4-mini ignores the `$babysitter:yolo` command prefix and performs the task directly instead of invoking the babysitter plugin. The same agents pass with gpt-5.5 and gemini-3.1-pro-preview, which points to a model capability issue rather than an agent or harness problem.

## Affected Cells

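12 cells reported: BP/Predefined Interactive (6 agents), BP/Predefined BH (5 agents), BP/Create Interactive (2 agents), all on Ubuntu with gpt-5.4-mini. The per-group counts add up to 13, so either the total or one of the group counts needs rechecking.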

## Possible Fixes

1. Stronger babysitter invocation prompt for smaller models
2. Force babysitter invocation via a pre-hook that wraps the prompt
3. Mark gpt-5.4-mini as unsupported for BP mode
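
Fixes 2 and 3 could be sketched together as a pre-hook that runs before the prompt reaches the model. This is an illustration only: `wrap_prompt`, `UNSUPPORTED_BP_MODELS`, and the wording of the reinforcement text are hypothetical, and the only detail taken from this issue is the `$babysitter:yolo` prefix. Whether appended text actually changes the model's behaviour is untested.

```python
BABYSITTER_PREFIX = "$babysitter:yolo"  # command prefix named in this issue

# Hypothetical denylist for fix 3.
UNSUPPORTED_BP_MODELS = {"gpt-5.4-mini"}

# Hypothetical reinforcement text for fix 2; the wording is an assumption.
REINFORCEMENT = (
    "Invoke the babysitter plugin for this task. "
    "Do not perform the task directly."
)


def wrap_prompt(prompt: str, model: str, allow_unsupported: bool = False) -> str:
    """Ensure the prompt carries the prefix exactly once, plus a reminder."""
    # Fix 3: refuse models marked unsupported unless explicitly overridden.
    if model in UNSUPPORTED_BP_MODELS and not allow_unsupported:
        raise ValueError(f"{model} is marked unsupported for BP mode")

    body = prompt.strip()
    # Strip an existing prefix so it is never duplicated.
    if body.startswith(BABYSITTER_PREFIX):
        body = body[len(BABYSITTER_PREFIX):].lstrip()

    # Fix 2: keep the command form first, then append the reminder.
    return f"{BABYSITTER_PREFIX} {body}\n\n{REINFORCEMENT}"
```

Keeping the prefix at the very start preserves the original command form, and the `allow_unsupported` override leaves room to keep testing gpt-5.4-mini while fix 1 is being explored.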
