Problem
The current goal-mode surface does not yet behave like a durable, first-class work loop. It can look like a label or prompt hint while the actual turn still behaves like a normal one-shot interaction. Before adding more fanout UX on top, goal mode needs to own the objective, progress accounting, continuation decisions, and user steering contract.
This sharpens older broad trackers like #891, #1976, and #2058 into a v0.8.61 release blocker.
Desired behavior
Goal mode should mean: there is a persisted objective, visible progress, a continuation scheduler, stop/complete/blocked semantics, and a reliable way for Hunter to steer the run while it is active. It should not depend on the model remembering everything in the conversation.
Acceptance criteria
/goal creates or resumes a persisted goal record with objective, status, started/updated timestamps, and optional budget/accounting fields.
The TUI status surface shows the active goal, progress, and whether the agent is continuing, waiting on tools, blocked, or complete.
User steering during a goal is queued or delivered reliably and is visible to the next continuation turn.
Goal completion/blocking is explicit and auditable, not inferred from a normal assistant message.
The agent can continue meaningful work across turns without re-reading the whole prompt stack every time.
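A minimal sketch of what the persisted goal record and its explicit, auditable status changes could look like. Every name here (GoalRecord, GoalStatus, Actor, AuditEntry, set_status, create_or_resume) is hypothetical and chosen for illustration; the existing ThreadGoal and ThreadGoalRecord types may be shaped differently. Timestamps are passed in as plain Unix seconds to keep the example deterministic.

```rust
/// Hypothetical status set; the real protocol may name these differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Blocked,
    Complete,
}

/// Who requested a status change; recorded so transitions are auditable.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Actor {
    User,
    Model,
    System,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AuditEntry {
    pub actor: Actor,
    pub from: GoalStatus,
    pub to: GoalStatus,
    pub reason: String,
    pub at: u64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GoalRecord {
    pub objective: String,
    pub status: GoalStatus,
    pub started_at: u64,
    pub updated_at: u64,
    pub token_budget: Option<u64>,
    pub tokens_used: u64,
    pub audit: Vec<AuditEntry>,
}

impl GoalRecord {
    pub fn new(objective: &str, token_budget: Option<u64>, now: u64) -> Self {
        GoalRecord {
            objective: objective.to_string(),
            status: GoalStatus::Active,
            started_at: now,
            updated_at: now,
            token_budget,
            tokens_used: 0,
            audit: Vec::new(),
        }
    }

    /// Explicit transition. A reason is required, so completion or blocking
    /// is never inferred from an ordinary assistant message.
    pub fn set_status(
        &mut self,
        actor: Actor,
        to: GoalStatus,
        reason: &str,
        now: u64,
    ) -> Result<(), String> {
        if reason.trim().is_empty() {
            return Err("a status change needs a reason".to_string());
        }
        if self.status == GoalStatus::Complete {
            return Err("goal is already complete".to_string());
        }
        self.audit.push(AuditEntry {
            actor,
            from: self.status,
            to,
            reason: reason.to_string(),
            at: now,
        });
        self.status = to;
        self.updated_at = now;
        Ok(())
    }
}

/// `/goal` semantics in this sketch: resume the stored record if one exists,
/// otherwise create a new one. Blocked or complete records are returned as-is.
pub fn create_or_resume(existing: Option<GoalRecord>, objective: &str, now: u64) -> GoalRecord {
    match existing {
        Some(mut goal) => {
            if goal.status == GoalStatus::Paused {
                // Cannot fail here: the reason is non-empty and the goal is not complete.
                let _ = goal.set_status(Actor::User, GoalStatus::Active, "resumed via /goal", now);
            }
            goal
        }
        None => GoalRecord::new(objective, None, now),
    }
}
```

The audit list is one possible way to meet the "explicit and auditable" criterion; a real implementation might emit events to the state store instead of keeping entries on the record.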
Verification
Add tests that exercise goal creation, resume, steering, completion, blocked state, and TUI projection. Add at least one dogfood scenario where a long release task continues after tool work without losing the active goal.
Release note framing
This is not a new visual flourish. It is the foundation that makes always-on work, fleet workers, and later swarm-style UX make sense.
Implementation plan
The code already has several partial goal systems. The v0.8.61 work is to unify them rather than adding another surface.
Current owners to inspect first:
crates/tui/src/commands/groups/project/goal.rs: /goal command, currently backed by legacy app.hunt state.
crates/tui/src/tools/goal.rs: model-visible create_goal, get_goal, and update_goal runtime tools plus continuation prompt rendering.
crates/tui/src/core/engine/turn_loop.rs: goal_continuation_message_if_needed, currently a bounded in-turn continuation prompt injection.
crates/tui/src/tui/ui.rs: apply_goal_snapshot_to_app bridges engine goal snapshots back into visible TUI state.
crates/tui/src/tui/sidebar.rs and crates/tui/src/context_report.rs: visible goal projection and context reporting.
crates/protocol/src/lib.rs, crates/state/src/lib.rs, and crates/app-server/src/lib.rs: persisted thread-goal protocol and app-server APIs already exist and need to become the source of durable truth.
Reference shape, for design comparison only: /Volumes/VIXinSSD/codex-main/codex-rs/ext/goal/src/{runtime.rs,tool.rs,steering.rs,accounting.rs}. Do not copy blindly; adapt the architecture to CodeWhale.
Concrete changes to make:
Create one goal state bridge. Add a small service/module that can translate between app.hunt, tools::goal::GoalState, protocol ThreadGoal, and state-store ThreadGoalRecord. Avoid having each caller mutate its own goal copy.
Make /goal durable. When /goal <objective> is used, persist a thread goal through the existing state/app-server path where available, seed the runtime SharedGoalState, update the visible app state, and emit the same goal-updated event used by model tool updates. /goal pause|resume|blocked|complete|clear should go through that same bridge.
Tighten model-visible tool authority. Keep create_goal as model-visible only when the user explicitly asked for a persistent objective. Keep update_goal limited to terminal statuses the model can assert (complete, blocked) while user/system controls pause/resume/budget/usage states. Mirror the stricter blocked audit already described in the goal continuation prompt.
Move continuation out of ad-hoc prompt-only behavior. goal_continuation_message_if_needed should not be the only mechanism. Add a runtime continuation scheduler that can decide: continue now, wait for tools/workers, stop because budget/usage, or surface blocked/complete. The bounded in-turn continuation can remain as a guardrail, but the durable goal loop must survive across turns.
Account progress. Track tokens/time per active goal using existing fields (token_budget, tokens_used, time_used_seconds) instead of only app.session.total_conversation_tokens. Budget-limited and usage-limited states should be explicit.
Make steering reliable. If the user sends guidance while a goal is running, queue or inject it as goal-context steering for the next continuation turn. It must not disappear behind a model call, shell job, or worker wait.
Update projections. Sidebar, context inspector, status/header/footer, and app-server notifications should all read the same effective goal state. Remove mismatches where /goal says active but the runtime tool state says none, or vice versa.
Add tests before broad UI polish. Minimum focused tests:
/goal <objective> budget: N persists and seeds runtime state.
/goal resume injects continuation and preserves budget/accounting.
create_goal cannot silently replace an existing user goal without the expected policy.
update_goal complete and update_goal blocked update runtime state and visible TUI state.
continuation scheduler does not spin forever and does not mark complete without evidence.
user steering during a running goal is delivered to the next continuation.
thread/goal/set|get|clear stays compatible.
Document the contract. Update docs/MODES.md and any runtime docs so Goal is not described as a mere mode label. It is a persisted objective and scheduler contract that Agent/Plan/YOLO/Fleet can all run under.
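The continuation scheduler and steering queue described in the plan above could be sketched roughly as follows. This is an illustration under stated assumptions, not the existing turn_loop.rs code: GoalRuntime, Decision, queue_steering, next_decision, and the max_continuations guard are all hypothetical names, and the ordering of the checks is one reasonable choice among several.

```rust
use std::collections::VecDeque;

/// Hypothetical status set, mirroring the states named in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Blocked,
    Complete,
}

/// What the scheduler decides before each potential continuation turn.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Decision {
    /// Run another turn, delivering any queued user steering with it.
    ContinueNow { steering: Vec<String> },
    /// Tools or workers are still running; try again later.
    WaitForTools,
    /// The per-goal token budget is used up.
    StopBudgetExhausted,
    /// Guardrail so the loop cannot spin forever.
    StopContinuationLimit,
    /// Paused, blocked, or complete: report the state instead of continuing.
    Surface(GoalStatus),
}

pub struct GoalRuntime {
    pub status: GoalStatus,
    pub token_budget: Option<u64>,
    pub tokens_used: u64,
    pub pending_tool_calls: usize,
    pub continuations_this_run: u32,
    pub max_continuations: u32,
    pub steering_queue: VecDeque<String>,
}

impl GoalRuntime {
    /// User guidance is queued, never dropped, while a model call,
    /// shell job, or worker wait is in flight.
    pub fn queue_steering(&mut self, text: &str) {
        self.steering_queue.push_back(text.to_string());
    }

    pub fn next_decision(&mut self) -> Decision {
        if self.status != GoalStatus::Active {
            return Decision::Surface(self.status);
        }
        if let Some(budget) = self.token_budget {
            if self.tokens_used >= budget {
                return Decision::StopBudgetExhausted;
            }
        }
        if self.pending_tool_calls > 0 {
            // Steering stays in the queue until a turn actually runs.
            return Decision::WaitForTools;
        }
        if self.continuations_this_run >= self.max_continuations {
            return Decision::StopContinuationLimit;
        }
        self.continuations_this_run += 1;
        let steering = self.steering_queue.drain(..).collect();
        Decision::ContinueNow { steering }
    }
}
```

Note that nothing in this sketch ever sets the status to Complete on its own. That matches the requirement that the scheduler must not mark a goal complete without evidence; completion would arrive through an explicit status update elsewhere.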
Non-goals for this issue
Do not ship /swarm as the main UX until this issue and the durable fanout issue are complete.
Do not rewrite WhaleFlow here. Goal mode should be the durable objective layer that WhaleFlow/Fleet can use later.
Do not remove /hunt compatibility abruptly; treat it as a compatibility alias over the unified goal bridge.
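One way to keep /hunt working as a compatibility alias is to route both command names through the same parser, so they produce identical bridge requests. The sketch below assumes the subcommands and the "budget: N" suffix mentioned in the implementation plan; GoalCommand, parse_goal_command, and the choice to treat a bare /goal as "show the current goal" are hypothetical and not taken from the existing goal.rs.

```rust
/// Hypothetical parsed form of a `/goal` (or legacy `/hunt`) command.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GoalCommand {
    Set { objective: String, token_budget: Option<u64> },
    Pause,
    Resume,
    Blocked,
    Complete,
    Clear,
    /// Assumption: a bare `/goal` just shows the current goal.
    Show,
}

/// Parses `/goal ...` and the legacy alias `/hunt ...` into one command type.
/// Returns `None` for any other input.
pub fn parse_goal_command(input: &str) -> Option<GoalCommand> {
    let input = input.trim();
    let (name, rest) = match input.split_once(char::is_whitespace) {
        Some((name, rest)) => (name, rest.trim()),
        None => (input, ""),
    };
    if name != "/goal" && name != "/hunt" {
        return None;
    }
    let command = match rest {
        "" => GoalCommand::Show,
        "pause" => GoalCommand::Pause,
        "resume" => GoalCommand::Resume,
        "blocked" => GoalCommand::Blocked,
        "complete" => GoalCommand::Complete,
        "clear" => GoalCommand::Clear,
        objective => parse_set(objective),
    };
    Some(command)
}

/// Splits an optional trailing `budget: N` off the objective text.
fn parse_set(rest: &str) -> GoalCommand {
    if let Some((objective, budget)) = rest.rsplit_once("budget:") {
        if let Ok(n) = budget.trim().parse::<u64>() {
            let objective = objective.trim();
            if !objective.is_empty() {
                return GoalCommand::Set {
                    objective: objective.to_string(),
                    token_budget: Some(n),
                };
            }
        }
    }
    // No valid budget suffix: the whole remainder is the objective.
    GoalCommand::Set { objective: rest.to_string(), token_budget: None }
}
```

Because both names yield the same GoalCommand value, everything downstream of parsing needs only one code path, which is the point of treating /hunt as an alias rather than a second system.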