Skip to content

feat: align cargo defaults with brew + reconcile hook docs (#343)#350

Merged
mercurialsolo merged 1 commit into
mainfrom
feat/supervisor-pr1-default-features-hook-docs
Jun 9, 2026
Merged

feat: align cargo defaults with brew + reconcile hook docs (#343)#350
mercurialsolo merged 1 commit into
mainfrom
feat/supervisor-pr1-default-features-hook-docs

Conversation

@mercurialsolo

Copy link
Copy Markdown
Owner

PR1 of the supervisor RFC (#342). Closes #343.

Why

Cargo and brew users have been getting different binaries. `cargo install claudectl` shipped `["hive"]` only; the Homebrew bottle ships `["bus","coord","relay","hive"]`. Cargo users hit silent no-ops on `claudectl bus` and `claudectl coord`, and the supervisor (#342) would widen the gap. Flip defaults so both install paths produce the same ~6 MB binary.

The minimal sync-only build (~3.5 MB) survives as `cargo install --no-default-features --features hive` and is now CI-tested as the lower bound.

Hook docs reconcile

The quickstart's hook table listed only the legacy `claudectl --json` hooks written to `/.claude/settings.json`. Reality: `init` also installs the embedded plugin under `/.claude/plugins/claudectl/`, whose `hooks/hooks.json` adds brain-gate, outcome-record, session-briefing, and inbox-drain on top. The inbox-drain hook returns `decision:"block"` for the continue-in-turn delivery path documented in `docs/AGENT_BUS.md`.

The new table enumerates both — this is the consent surface the supervisor's ingest path (#345) will sit on top of.

Changes

  • `Cargo.toml` default features → `["bus","coord","relay","hive"]`.
  • `.github/workflows/ci.yml` — check/clippy/test the new default plus a `--no-default-features --features hive` matrix to keep the minimal build covered.
  • `.github/workflows/release.yml` — comment update; the explicit `--features` flag is now redundant but kept defensive.
  • `README.md` install section + per-feature blurbs simplified.
  • `CLAUDE.md` build commands + the "Minimal dependencies" invariant updated.
  • `docs/quickstart.md` hook table now lists both settings.json and plugin hooks.
  • `src/relay/mesh.rs` — collapsible_match lint that was masked by old defaults (relay wasn't in the old default set, so `cargo clippy --all-targets` never saw it). Pure cleanup.

Verification

```
cargo check --all-targets ✓
cargo check --all-targets --no-default-features --features hive ✓
cargo clippy --all-targets -- -D warnings ✓
cargo clippy --all-targets --no-default-features --features hive -- -D warnings ✓
cargo test --all-targets → 700 + 708 + 78 + 8 passed
cargo test --all-targets --no-default-features --features hive ✓
cargo build -p claudectl-core ✓
cargo build -p claudectl-tui ✓
cargo fmt --all -- --check ✓
```

Test plan

  • Confirm `cargo install claudectl` from this branch produces a binary where `claudectl bus stdio` and `claudectl coord leases` both work.
  • Confirm `cargo install claudectl --no-default-features --features hive` still produces a ~3.5 MB sync-only build.
  • Skim `docs/quickstart.md` § "What gets added" — both hook tables make sense end-to-end.

🤖 Generated with Claude Code

)

PR1 of the supervisor RFC (#342). Two related quickstart fixes that
unblock everything else in the epic.

## Default features

`cargo install claudectl` shipped `["hive"]` only; the Homebrew bottle
ships `["bus","coord","relay","hive"]`. So cargo users hit silent
no-ops on `claudectl bus`, `claudectl coord`, and (under #345 onward)
the supervisor. Flip defaults to match brew — same ~6 MB binary from
either install path.

The minimal sync-only build path (~3.5 MB) survives as
`--no-default-features --features hive` and is now CI-tested as the
lower bound (previously the default).

Knock-on: `cargo clippy --all-targets` under new defaults caught a
pre-existing `collapsible_match` lint in `relay/mesh.rs` that the
old defaults never exercised. Fixed inline — pure cleanup, no
behavior change.

## Hook docs

The quickstart's hook table listed only the legacy `claudectl --json`
hooks written to `~/.claude/settings.json`. Reality: `init` also
installs the embedded plugin under `~/.claude/plugins/claudectl/`,
whose `hooks/hooks.json` adds brain-gate, outcome-record,
session-briefing, and inbox-drain on top. The inbox-drain hook
returns `decision:"block"` for the Trigger A continue-in-turn path
documented in `docs/AGENT_BUS.md`.

Both sets coexist; Claude Code merges them. The table now enumerates
both, with the actual installed commands and what each does. This is
the consent surface for the supervisor's future hook consumer
(#345 ingest path) — a skeptical reader needs to see the full
picture before that lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mercurialsolo mercurialsolo merged commit c0f1c4e into main Jun 9, 2026
8 checks passed
mercurialsolo added a commit that referenced this pull request Jun 9, 2026
…a-version gate (#345)

PR3 of the supervisor RFC (#342). The foundation everything else (#346/#347/#348/#349) sits on. Closes #345.

## Schema v2 (coord.db)

`migrate()` adds five tables, all idempotent (`CREATE TABLE IF NOT EXISTS`):

- `tasks`           — desired state per task. Cattle-vs-pets: the task,
                      not the session, carries identity, budget, attempts.
- `task_attempts`   — one row per attempt; carries cwd_hash for resume
                      drift detection (#348).
- `task_verifications` — verifier verdicts (PR5 / #347 consumes these).
- `task_transitions`   — append-only ledger; crash recovery is "replay
                      from coord.db" — write state + transition in one
                      tx so the invariant "every state has a cause" holds.
- `hook_events`     — low-latency signal path (RFC v2 §6). Named
                      distinctly from the pre-existing coord `events`
                      table to avoid confusion.

Migrations ride the existing `init --upgrade` path (#340's
`upgrade_db_migrations`) — no new install hop for users on PR1 (#350).

## Schema-version gate (RFC v2 §11, §12)

New `EXPECTED_COORD_SCHEMA_VERSION = 2` const + `check_schema_version()`.
The gate guards the manual-upgrade gap RFC v2 §12 calls out — `brew
upgrade` without a follow-up `init --upgrade` would land a newer binary
that bumps `user_version`, then if the user runs an older binary it
would silently corrupt rows it doesn't understand. The gate refuses
loudly instead, with the exact `init --upgrade` remediation in the error.

Behavior:
- `migrate()` always brings forward; it never down-migrates.
- `user_version` is stamped AFTER tables are created so a partial
  migration never gets the version.
- The gate refuses only overrun (DB ahead of binary). Equal-or-behind
  is the normal flow — migrate handled it.

Tested:
- triple `migrate()` is idempotent
- `user_version` lands at EXPECTED after migrate
- gate refuses simulated `EXPECTED+1` DB with "init --upgrade" in message
- gate accepts equal-or-lower
- all five v2 tables exist after migrate

## `claudectl ingest --hook <name>` (RFC v2 §6)

New top-level subcommand. Reads JSON from stdin, appends to
`hook_events`. The bash hooks will move to
`claudectl ingest --hook PostToolUse 2>/dev/null || true` (separate
follow-up so the embedded plugin's hook scripts can roll out via
`init --upgrade` without breaking older binaries mid-upgrade).

Division of labor preserved:
- **JSONL tail + ps** stay authoritative (source of record).
- **`hook_events`** is latency-only. `|| true` keeps Claude Code
  unblockable, which is exactly why this table cannot be authoritative.

Six accepted hook names; anything else returns exit 1 to keep typos out
of the table. Empty stdin is a no-op (some hook invocations send
nothing). Field aliases (`session_id`/`sessionId`, `tool_name`/`toolName`)
absorb Claude Code schema variance.

## Reconciler skeleton (`coord::supervisor`)

Pure `tick(&conn, &sensors) -> Vec<Action>`. Actuator is intentionally
absent in this PR — it lands in PR4 (#346) along with bus-mailbox
assignment. The split is what makes the reconciler testable without
spinning up real sessions and makes "kill -9 mid-run → restart
converges" a property we can write down.

`Action` enum spans the full forward surface — AssignViaMailbox / Spawn /
RunVerifier / WriteSessionPolicy / ClearSessionPolicy / EscalateHuman —
so PR4–PR6 add behavior without churning the type. `Sensors` trait keeps
JSONL/ps/hook-events stitching at the trait edge.

`ObservedStatus::Unknown` is load-bearing: tested that tick() emits no
actions when a session is in Unknown state. RFC v2 §6's "no actuation"
backstop.

Wired into `run_headless()` with a `NoopSensors` stub today. The real
sensor lands when the actuator lands (PR4); for now the skeleton runs
against the live coord DB on every tick so any panic / lock issue
surfaces in headless before behavior goes hot.

## Session-policy file contract (RFC v2 §8)

New `coord::session_policy` module. `~/.claudectl/coord/session-policy/
<session_id>.json`. Atomic write via sibling tempfile + rename. Three
contracts:

1. **Atomic write** — partial files cannot be observed by the brain-gate
   hook.
2. **Tighten-only** — file's only valid effect is *more* manual
   approval. Missing/unreadable/malformed degrades to `Inherit`, i.e.
   today's behavior. Fail-open is what makes the manual-upgrade gap
   non-breaking.
3. **Lifetime-bound** — written on ASSIGNED/RUNNING, deleted on any
   terminal state. PR4 wires the actuator side; the brain-gate hook
   side (single `fs::read_to_string` per tool call) is the follow-up
   plugin-asset bump.

Six tests cover round-trip, missing-file fail-open, malformed-JSON
fail-open, atomic-write semantics (no leftover tempfile), and
delete-missing-is-not-an-error.

## Doctor rows

- `coord schema` — opens the coord DB; Pass when version matches binary,
  Fail with `init --upgrade` hint when the schema-version gate rejects.
- `coord session-policy dir` — writability probe via sentinel file;
  Advisory when the dir doesn't exist yet (fresh install before
  supervisor has assigned anything).

## Verification

- `cargo check --all-targets`                            ✓
- `cargo check --all-targets --no-default-features --features hive` ✓
- `cargo clippy` both feature sets, -D warnings          ✓
- `cargo fmt --all -- --check`                           ✓
- `cargo test --all-targets`         727 + 738 + 78 + 8 = 1551 pass
- All new tests live alongside their module, no integration scaffolding.

Pre-existing flake: `brain::sequences::tests::save_and_load_roundtrip`
mutates global `HOME` and races with parallel test execution. Surfaced
during this PR's verification but unchanged by it; fix is a separate
issue.

## Out of scope (next PRs)

- Real `Sensors` impl reading JSONL/ps/hook_events (PR4).
- Actuator (mailbox writes, spawn, verifier dispatch, policy file
  writes) — PR4/PR5.
- Updating bash hook scripts to call `claudectl ingest` — bundled with
  PR4's plugin-asset bump so the upgrade lands atomically.
- Crash-safety integration test — meaningful once the actuator writes
  state; today's reconciler is a no-op and any crash test would be
  testing nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mercurialsolo added a commit that referenced this pull request Jun 9, 2026
…a-version gate (#345) (#352)

* feat(coord): supervisor schema + reconciler skeleton + ingest + schema-version gate (#345)

PR3 of the supervisor RFC (#342). The foundation everything else (#346/#347/#348/#349) sits on. Closes #345.

## Schema v2 (coord.db)

`migrate()` adds five tables, all idempotent (`CREATE TABLE IF NOT EXISTS`):

- `tasks`           — desired state per task. Cattle-vs-pets: the task,
                      not the session, carries identity, budget, attempts.
- `task_attempts`   — one row per attempt; carries cwd_hash for resume
                      drift detection (#348).
- `task_verifications` — verifier verdicts (PR5 / #347 consumes these).
- `task_transitions`   — append-only ledger; crash recovery is "replay
                      from coord.db" — write state + transition in one
                      tx so the invariant "every state has a cause" holds.
- `hook_events`     — low-latency signal path (RFC v2 §6). Named
                      distinctly from the pre-existing coord `events`
                      table to avoid confusion.

Migrations ride the existing `init --upgrade` path (#340's
`upgrade_db_migrations`) — no new install hop for users on PR1 (#350).

## Schema-version gate (RFC v2 §11, §12)

New `EXPECTED_COORD_SCHEMA_VERSION = 2` const + `check_schema_version()`.
The gate guards the manual-upgrade gap RFC v2 §12 calls out — `brew
upgrade` without a follow-up `init --upgrade` would land a newer binary
that bumps `user_version`, then if the user runs an older binary it
would silently corrupt rows it doesn't understand. The gate refuses
loudly instead, with the exact `init --upgrade` remediation in the error.

Behavior:
- `migrate()` always brings forward; it never down-migrates.
- `user_version` is stamped AFTER tables are created so a partial
  migration never gets the version.
- The gate refuses only overrun (DB ahead of binary). Equal-or-behind
  is the normal flow — migrate handled it.

Tested:
- triple `migrate()` is idempotent
- `user_version` lands at EXPECTED after migrate
- gate refuses simulated `EXPECTED+1` DB with "init --upgrade" in message
- gate accepts equal-or-lower
- all five v2 tables exist after migrate

## `claudectl ingest --hook <name>` (RFC v2 §6)

New top-level subcommand. Reads JSON from stdin, appends to
`hook_events`. The bash hooks will move to
`claudectl ingest --hook PostToolUse 2>/dev/null || true` (separate
follow-up so the embedded plugin's hook scripts can roll out via
`init --upgrade` without breaking older binaries mid-upgrade).

Division of labor preserved:
- **JSONL tail + ps** stay authoritative (source of record).
- **`hook_events`** is latency-only. `|| true` keeps Claude Code
  unblockable, which is exactly why this table cannot be authoritative.

Six accepted hook names; anything else returns exit 1 to keep typos out
of the table. Empty stdin is a no-op (some hook invocations send
nothing). Field aliases (`session_id`/`sessionId`, `tool_name`/`toolName`)
absorb Claude Code schema variance.

## Reconciler skeleton (`coord::supervisor`)

Pure `tick(&conn, &sensors) -> Vec<Action>`. Actuator is intentionally
absent in this PR — it lands in PR4 (#346) along with bus-mailbox
assignment. The split is what makes the reconciler testable without
spinning up real sessions and makes "kill -9 mid-run → restart
converges" a property we can write down.

`Action` enum spans the full forward surface — AssignViaMailbox / Spawn /
RunVerifier / WriteSessionPolicy / ClearSessionPolicy / EscalateHuman —
so PR4–PR6 add behavior without churning the type. `Sensors` trait keeps
JSONL/ps/hook-events stitching at the trait edge.

`ObservedStatus::Unknown` is load-bearing: tested that tick() emits no
actions when a session is in Unknown state. RFC v2 §6's "no actuation"
backstop.

Wired into `run_headless()` with a `NoopSensors` stub today. The real
sensor lands when the actuator lands (PR4); for now the skeleton runs
against the live coord DB on every tick so any panic / lock issue
surfaces in headless before behavior goes hot.

## Session-policy file contract (RFC v2 §8)

New `coord::session_policy` module. `~/.claudectl/coord/session-policy/
<session_id>.json`. Atomic write via sibling tempfile + rename. Three
contracts:

1. **Atomic write** — partial files cannot be observed by the brain-gate
   hook.
2. **Tighten-only** — file's only valid effect is *more* manual
   approval. Missing/unreadable/malformed degrades to `Inherit`, i.e.
   today's behavior. Fail-open is what makes the manual-upgrade gap
   non-breaking.
3. **Lifetime-bound** — written on ASSIGNED/RUNNING, deleted on any
   terminal state. PR4 wires the actuator side; the brain-gate hook
   side (single `fs::read_to_string` per tool call) is the follow-up
   plugin-asset bump.

Six tests cover round-trip, missing-file fail-open, malformed-JSON
fail-open, atomic-write semantics (no leftover tempfile), and
delete-missing-is-not-an-error.

## Doctor rows

- `coord schema` — opens the coord DB; Pass when version matches binary,
  Fail with `init --upgrade` hint when the schema-version gate rejects.
- `coord session-policy dir` — writability probe via sentinel file;
  Advisory when the dir doesn't exist yet (fresh install before
  supervisor has assigned anything).

## Verification

- `cargo check --all-targets`                            ✓
- `cargo check --all-targets --no-default-features --features hive` ✓
- `cargo clippy` both feature sets, -D warnings          ✓
- `cargo fmt --all -- --check`                           ✓
- `cargo test --all-targets`         727 + 738 + 78 + 8 = 1551 pass
- All new tests live alongside their module, no integration scaffolding.

Pre-existing flake: `brain::sequences::tests::save_and_load_roundtrip`
mutates global `HOME` and races with parallel test execution. Surfaced
during this PR's verification but unchanged by it; fix is a separate
issue.

## Out of scope (next PRs)

- Real `Sensors` impl reading JSONL/ps/hook_events (PR4).
- Actuator (mailbox writes, spawn, verifier dispatch, policy file
  writes) — PR4/PR5.
- Updating bash hook scripts to call `claudectl ingest` — bundled with
  PR4's plugin-asset bump so the upgrade lands atomically.
- Crash-safety integration test — meaningful once the actuator writes
  state; today's reconciler is a no-op and any crash test would be
  testing nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: kick CI on rebased branch (#352)

Re-trigger CI after the rebase + retarget to main. CI's pull_request
filter only fires for the synchronize event against the new base, and
the rebase fired before the retarget — leaving the PR without checks.
No code change.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mercurialsolo added a commit that referenced this pull request Jun 10, 2026
The supervisor RFC (#342) shipped across PRs #350#356 but the
user-facing docs were never updated. A reader of the README right now
sees "the supervisor for long-horizon role persistence" listed as
"not yet built", which is exactly backwards.

## Changes

- **README.md** — Fix the stale "not yet built" claim (flow guards and
  the supervisor are both shipped now) and add a new "## Supervisor"
  section between "Agent Bus" and "Hive Mind". Covers: the design
  argument (parallel runners commoditized, durable supervision is the
  unowned layer); the CLI surface with inline + TOML examples; the
  three-verifier (`run` / `brain` / `agent`) `tasks.toml` shape from
  the RFC; the three load-bearing properties (cattle vs. pets,
  crash-safe from coord.db, fail-closed verifiers); the Prometheus
  exporter metric names.
- **docs/AGENT_BUS.md** — Implementation status table. Phase 6
  (content validation) and Phase 8 (flow guards) marked Shipped with
  the PR references. Phase 10 (Supervisor) now Shipped, with the
  module list pointing at every coord/* file the work landed in.
  Phase 9 (managed-artifact lifecycle) corrected from Not Started to
  Partial since `init --upgrade` covers it.
- **docs/reference.md** — New "Supervisor" command table (run /
  submit / status / logs / cancel / drain / undrain) and a new
  "Ingest" row for the hook signal path. Schema-version-gate
  behavior called out so readers don't hit the refusal in the wild
  with no context. Also drops the "Requires --features coord" note
  since coord is now in the default feature set.
- **docs/quickstart.md** — New "Submit a task to the supervisor"
  optional section, parked next to the existing brain / insights
  optional sections. Points at the README for the full design.
- **CLAUDE.md** — Architecture section gets a new Supervisor block
  listing every file in `src/coord/` and `src/ingest.rs`. The
  schema-version gate (`EXPECTED_COORD_SCHEMA_VERSION = 3`) is
  documented so a future reader knows where to look when migration
  drifts.

## What's deliberately out of scope

- `docs/configuration.md` doesn't get a supervisor section yet —
  `~/.claudectl/coord/policy.toml` isn't being loaded by the
  reconciler yet (the type lives in `supervisor.rs::Policy` with
  defaults). When config loading lands, the TOML schema lands with it.
- `docs/AGENT_BUS.md` §13 (Longevity / supervisor design discussion)
  isn't reworked. That whole section is the RFC narrative; the
  status table at the top is what readers check for "is this real
  yet" and that's now accurate.

No code changes. Existing tests still pass; `cargo fmt --check` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mercurialsolo added a commit that referenced this pull request Jun 10, 2026
…or (#357)

The supervisor RFC (#342) shipped across PRs #350#356 but the
user-facing docs were never updated. A reader of the README right now
sees "the supervisor for long-horizon role persistence" listed as
"not yet built", which is exactly backwards.

## Changes

- **README.md** — Fix the stale "not yet built" claim (flow guards and
  the supervisor are both shipped now) and add a new "## Supervisor"
  section between "Agent Bus" and "Hive Mind". Covers: the design
  argument (parallel runners commoditized, durable supervision is the
  unowned layer); the CLI surface with inline + TOML examples; the
  three-verifier (`run` / `brain` / `agent`) `tasks.toml` shape from
  the RFC; the three load-bearing properties (cattle vs. pets,
  crash-safe from coord.db, fail-closed verifiers); the Prometheus
  exporter metric names.
- **docs/AGENT_BUS.md** — Implementation status table. Phase 6
  (content validation) and Phase 8 (flow guards) marked Shipped with
  the PR references. Phase 10 (Supervisor) now Shipped, with the
  module list pointing at every coord/* file the work landed in.
  Phase 9 (managed-artifact lifecycle) corrected from Not Started to
  Partial since `init --upgrade` covers it.
- **docs/reference.md** — New "Supervisor" command table (run /
  submit / status / logs / cancel / drain / undrain) and a new
  "Ingest" row for the hook signal path. Schema-version-gate
  behavior called out so readers don't hit the refusal in the wild
  with no context. Also drops the "Requires --features coord" note
  since coord is now in the default feature set.
- **docs/quickstart.md** — New "Submit a task to the supervisor"
  optional section, parked next to the existing brain / insights
  optional sections. Points at the README for the full design.
- **CLAUDE.md** — Architecture section gets a new Supervisor block
  listing every file in `src/coord/` and `src/ingest.rs`. The
  schema-version gate (`EXPECTED_COORD_SCHEMA_VERSION = 3`) is
  documented so a future reader knows where to look when migration
  drifts.

## What's deliberately out of scope

- `docs/configuration.md` doesn't get a supervisor section yet —
  `~/.claudectl/coord/policy.toml` isn't being loaded by the
  reconciler yet (the type lives in `supervisor.rs::Policy` with
  defaults). When config loading lands, the TOML schema lands with it.
- `docs/AGENT_BUS.md` §13 (Longevity / supervisor design discussion)
  isn't reworked. That whole section is the RFC narrative; the
  status table at the top is what readers check for "is this real
  yet" and that's now accurate.

No code changes. Existing tests still pass; `cargo fmt --check` clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

supervisor M0/PR1: align cargo default features with brew + reconcile hook docs

1 participant