v0.8.61: Make goal mode a real persistent work loop before adding swarm UX #3215

## Problem

The current goal-mode surface does not yet behave like a durable, first-class work loop. It can look like a label or prompt hint while the actual turn still behaves like a normal one-shot interaction. Before adding more fanout UX on top, goal mode needs to own the objective, progress accounting, continuation decisions, and user steering contract.

This sharpens older broad trackers like #891, #1976, and #2058 into a v0.8.61 release blocker.

## Desired behavior

Goal mode should mean: there is a persisted objective, visible progress, a continuation scheduler, stop/complete/blocked semantics, and a reliable way for Hunter to steer the run while it is active. It should not depend on the model remembering everything in the conversation.

## Acceptance criteria

- `/goal` creates or resumes a persisted goal record with objective, status, started/updated timestamps, and optional budget/accounting fields.
- The TUI status surface shows the active goal, progress, and whether the agent is continuing, waiting on tools, blocked, or complete.
- User steering during a goal is queued or delivered reliably and is visible to the next continuation turn.
- Goal completion/blocking is explicit and auditable, not inferred from a normal assistant message.
- The agent can continue meaningful work across turns without re-reading the whole prompt stack every time.
- Existing goal trackers (#891, #1976, #2058) are reviewed and either linked, narrowed, retargeted, or superseded with positive comments.
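As a rough illustration of the first criterion, a persisted goal record and its create-or-resume rule might look like the sketch below. The `GoalRecord` type, the `GoalStatus` variants, and `create_or_resume` are hypothetical names for illustration only; they are not the existing `ThreadGoal` or `ThreadGoalRecord` types. The field names `token_budget`, `tokens_used`, and `time_used_seconds` follow the ones mentioned in the implementation plan.

```rust
/// Hypothetical status set; the real variants live in the protocol/state crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Blocked,
    Complete,
}

/// Illustrative persisted goal record (an assumption, not the actual schema).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GoalRecord {
    pub objective: String,
    pub status: GoalStatus,
    pub started_at: u64, // Unix seconds
    pub updated_at: u64, // Unix seconds
    pub token_budget: Option<u64>,
    pub tokens_used: u64,
    pub time_used_seconds: u64,
}

/// Create a new goal, or resume an unfinished one without losing its accounting.
/// `now` is passed in so the rule stays deterministic and easy to test.
pub fn create_or_resume(
    existing: Option<GoalRecord>,
    objective: &str,
    token_budget: Option<u64>,
    now: u64,
) -> GoalRecord {
    match existing {
        // An unfinished goal is resumed: objective, budget, and usage are kept.
        Some(mut rec) if rec.status != GoalStatus::Complete => {
            if rec.status == GoalStatus::Paused {
                rec.status = GoalStatus::Active;
            }
            rec.updated_at = now;
            rec
        }
        // No goal, or a completed one: start a fresh record.
        _ => GoalRecord {
            objective: objective.to_string(),
            status: GoalStatus::Active,
            started_at: now,
            updated_at: now,
            token_budget,
            tokens_used: 0,
            time_used_seconds: 0,
        },
    }
}
```

In this sketch a blocked goal stays blocked on resume, so clearing a blocked state remains an explicit, auditable action rather than a side effect of `/goal`.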

## Verification

Add tests that exercise goal creation, resume, steering, completion, blocked state, and TUI projection. Add at least one dogfood scenario where a long release task continues after tool work without losing the active goal.
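The steering case could be exercised against something like the following minimal sketch. `SteeringQueue` and `render_continuation` are hypothetical helpers invented for this example, and the prompt wording is an assumption; the real continuation prompt is rendered in `crates/tui/src/tools/goal.rs`.

```rust
use std::collections::VecDeque;

/// Hypothetical queue of user guidance received while a goal is running.
#[derive(Debug, Default)]
pub struct SteeringQueue {
    pending: VecDeque<String>,
}

impl SteeringQueue {
    /// Record guidance immediately, even if a model call or shell job is in flight.
    pub fn push(&mut self, guidance: &str) {
        self.pending.push_back(guidance.to_string());
    }

    /// Hand over everything queued so far, in arrival order, and leave the queue empty.
    pub fn drain_for_next_turn(&mut self) -> Vec<String> {
        self.pending.drain(..).collect()
    }

    pub fn is_empty(&self) -> bool {
        self.pending.is_empty()
    }
}

/// Build an illustrative continuation prompt that carries the drained steering notes.
pub fn render_continuation(objective: &str, steering: &[String]) -> String {
    let mut out = format!("Continue working toward: {}", objective);
    for note in steering {
        out.push_str("\nUser steering: ");
        out.push_str(note);
    }
    out
}
```

A test would push guidance while the goal is "busy", drain the queue at the next continuation, and assert that every note appears in the rendered prompt in order and that nothing is delivered twice.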

## Release note framing

This is not a new visual flourish. It is the foundation that makes always-on work, fleet workers, and later swarm-style UX make sense.

## Implementation plan

The code already has several partial goal systems. The v0.8.61 work is to unify them rather than add another surface.

Current owners to inspect first:

- `crates/tui/src/commands/groups/project/goal.rs`: `/goal` command, currently backed by legacy `app.hunt` state.
- `crates/tui/src/tools/goal.rs`: model-visible `create_goal`, `get_goal`, and `update_goal` runtime tools plus continuation prompt rendering.
- `crates/tui/src/core/engine/turn_loop.rs`: `goal_continuation_message_if_needed`, currently a bounded in-turn continuation prompt injection.
- `crates/tui/src/tui/ui.rs`: `apply_goal_snapshot_to_app` bridges engine goal snapshots back into visible TUI state.
- `crates/tui/src/tui/sidebar.rs` and `crates/tui/src/context_report.rs`: visible goal projection and context reporting.
- `crates/protocol/src/lib.rs`, `crates/state/src/lib.rs`, and `crates/app-server/src/lib.rs`: persisted thread-goal protocol and app-server APIs already exist and need to become the source of durable truth.
- Reference shape, for design comparison only: `/Volumes/VIXinSSD/codex-main/codex-rs/ext/goal/src/{runtime.rs,tool.rs,steering.rs,accounting.rs}`. Do not copy blindly; adapt the architecture to CodeWhale.

Concrete changes to make:

1. **Create one goal state bridge.** Add a small service/module that can translate between `app.hunt`, `tools::goal::GoalState`, protocol `ThreadGoal`, and state-store `ThreadGoalRecord`. Avoid having each caller mutate its own goal copy.

2. **Make `/goal` durable.** When `/goal <objective>` is used, persist a thread goal through the existing state/app-server path where available, seed the runtime `SharedGoalState`, update the visible app state, and emit the same goal-updated event used by model tool updates. `/goal pause|resume|blocked|complete|clear` should go through that same bridge.

3. **Tighten model-visible tool authority.** Keep `create_goal` model-visible only when the user explicitly asked for a persistent objective. Keep `update_goal` limited to the terminal statuses the model can assert (`complete`, `blocked`), while the user and system control the pause/resume/budget/usage states. Mirror the stricter blocked audit already described in the goal continuation prompt.

4. **Move continuation out of ad-hoc prompt-only behavior.** `goal_continuation_message_if_needed` should not be the only mechanism. Add a runtime continuation scheduler that can decide whether to continue now, wait for tools/workers, stop because of budget/usage limits, or surface blocked/complete. The bounded in-turn continuation can remain as a guardrail, but the durable goal loop must survive across turns.

5. **Account for progress.** Track tokens/time per active goal using existing fields (`token_budget`, `tokens_used`, `time_used_seconds`) instead of only `app.session.total_conversation_tokens`. Budget-limited and usage-limited states should be explicit.

6. **Make steering reliable.** If the user sends guidance while a goal is running, queue or inject it as goal-context steering for the next continuation turn. It must not disappear behind a model call, shell job, or worker wait.

7. **Update projections.** Sidebar, context inspector, status/header/footer, and app-server notifications should all read the same effective goal state. Remove mismatches where `/goal` says active but the runtime tool state says none, or vice versa.

8. **Add tests before broad UI polish.** Minimum focused tests:
   - `/goal <objective> budget: N` persists and seeds runtime state.
   - `/goal resume` injects continuation and preserves budget/accounting.
   - `create_goal` cannot silently replace an existing user goal without the expected policy.
   - `update_goal complete` and `update_goal blocked` update runtime state and visible TUI state.
   - app-server `thread/goal/set|get|clear` stays compatible.
   - continuation scheduler does not spin forever and does not mark complete without evidence.
   - user steering during a running goal is delivered to the next continuation.

9. **Document the contract.** Update `docs/MODES.md` and any runtime docs so Goal is not described as a mere mode label. It is a persisted objective and scheduler contract that Agent/Plan/YOLO/Fleet can all run under.
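The scheduler decision in step 4 can be sketched as a pure function over a goal snapshot. This is a minimal sketch under stated assumptions: `GoalSnapshot`, `Decision`, `decide`, and the `MAX_CONTINUATIONS` cap are hypothetical names and values, not existing CodeWhale APIs. Only the `token_budget` and `tokens_used` field names come from the plan above.

```rust
/// Hypothetical status set, mirroring the states named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Blocked,
    Complete,
}

/// What the scheduler tells the runtime to do next (illustrative variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    ContinueNow,
    WaitForTools,
    StopBudgetLimited,
    StopContinuationCap,
    SurfaceBlocked,
    SurfaceComplete,
    Idle,
}

/// Minimal view of goal state the scheduler needs (assumed shape).
pub struct GoalSnapshot {
    pub status: GoalStatus,
    pub token_budget: Option<u64>,
    pub tokens_used: u64,
    pub tools_in_flight: usize,
    pub continuations_this_run: u32,
}

/// Assumed guardrail so the loop cannot spin forever; the real cap is a design choice.
pub const MAX_CONTINUATIONS: u32 = 8;

pub fn decide(s: &GoalSnapshot) -> Decision {
    // Explicit statuses win. The scheduler never infers completion on its own.
    match s.status {
        GoalStatus::Complete => return Decision::SurfaceComplete,
        GoalStatus::Blocked => return Decision::SurfaceBlocked,
        GoalStatus::Paused => return Decision::Idle,
        GoalStatus::Active => {}
    }
    // Budget exhaustion is an explicit stop reason, not a silent halt.
    if let Some(budget) = s.token_budget {
        if s.tokens_used >= budget {
            return Decision::StopBudgetLimited;
        }
    }
    // Pending tool or worker results must arrive before the next turn.
    if s.tools_in_flight > 0 {
        return Decision::WaitForTools;
    }
    // Bounded continuation keeps a misbehaving goal from looping indefinitely.
    if s.continuations_this_run >= MAX_CONTINUATIONS {
        return Decision::StopContinuationCap;
    }
    Decision::ContinueNow
}
```

Keeping the decision as a pure function over a snapshot makes the "does not spin forever" and "does not mark complete without evidence" tests from step 8 straightforward to write, because no engine or TUI state is required to call it.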

## Non-goals for this issue

- Do not ship `/swarm` as the main UX until this issue and the durable fanout issue are complete.
- Do not rewrite WhaleFlow here. Goal mode should be the durable objective layer that WhaleFlow/Fleet can use later.
- Do not remove `/hunt` compatibility abruptly; treat it as a compatibility alias over the unified goal bridge.

