Minor Changes
-
86a62d9:
`colony bridge lifecycle` gains `--replay <file>` and `--dry-run` so a saved `colony-omx-lifecycle-v1` envelope (e.g. captured `.pre.json`) can be routed offline through the real lifecycle logic without touching the live data dir. Combined with `--json`, this gives runtime integrators a CI-shaped harness for asserting on `route`, `event_type`, `extracted_paths`, and `ok`.
-
99936fa:
`colony bridge replay <file.pre.json>` is now a first-class subcommand for
offline debugging of captured pre-tool-use envelopes. Default is `--dry-run`
(ephemeral in-memory SQLite, no side effects); pass `--apply` to write to
the live store. A new `--rewrite-root <from>=<to>` flag rewrites absolute
paths in the envelope before dispatch so captures from another machine can
be replayed locally. Reuses the existing
`packages/contracts/fixtures/colony-omx-lifecycle-v1/` fixtures and does not
require the worker daemon. The shell shim at `apps/cli/bin/colony.sh`
short-circuits only `bridge lifecycle` to the daemon, so `bridge replay`
runs in-process automatically.
-
edc318f:
`colony gain --summary` now renders an rtk-style compact view over the same
`mcp_metrics` receipts: headline KPI stack (total calls, input/output/total
tokens, tokens saved, total exec time), efficiency meter, top-N By
Operation table with proportional impact bars, a 30-day Daily Activity
bar graph, and a 12-day Daily Breakdown table. `--graph` and `--daily`
narrow the output to a single section; `--days <n>` and `--top-ops <n>` tune
the window and table size. Per-operation saved-token credit is distributed
across each comparison row's `matched_operations` proportionally to call share
so the `Saved` column lines up with the headline total.

Storage gains
`Storage.aggregateMcpMetricsDaily({ since, until, operation })`
returning per-UTC-day rollups (`{ day, calls, input_tokens, output_tokens, total_tokens, total_duration_ms }`) ordered newest-first. Type exports
`AggregateMcpMetricsDailyOptions` and `McpMetricsDailyRow` come along.
-
9f6b502: Add
`colony grab` command group: a per-project localhost intake daemon that
turns a react-grab "Add context" submit into a colony task on a fresh
`agent/*` worktree and starts a detached `tmux` session running `codex`
inside it.

  - `colony grab serve`: long-lived HTTP daemon on 127.0.0.1 with strict
    request gating (bearer token, `Origin` allowlist, JSON content-type,
    CORS preflight). On accepted `POST /grab`, creates a colony task,
    posts the react-grab payload as a `kind: "note"` observation, writes
    `.colony/INTAKE.md` into the worktree, and spawns `tmux new-session -d`
    running `codex` in the worktree.
  - `colony grab attach <task-id>`: convenience attach to the spawned
    tmux session `rg-<task-id>`.
  - `colony grab status`: list grab daemons known to `$COLONY_HOME`.

In-memory dedup (default 5 min window) keyed by
`sha256(repo_root|file_path|content|extra_prompt)` collapses repeat
submits into `task_post` notes on the existing task.

The daemon is off by default; it must be started explicitly.
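
The dedup rule for `colony grab serve` could look roughly like the sketch below. Only the key format and the 5-minute default come from the entry above; the `GrabDedup` class, its method names, and the `GrabSubmit` shape are hypothetical and are not taken from the colony source.

```typescript
import { createHash } from "node:crypto";

// Fields named in the dedup key; the interface itself is an assumption.
interface GrabSubmit {
  repo_root: string;
  file_path: string;
  content: string;
  extra_prompt: string;
}

// sha256(repo_root|file_path|content|extra_prompt), hex-encoded.
function dedupKey(s: GrabSubmit): string {
  const joined = [s.repo_root, s.file_path, s.content, s.extra_prompt].join("|");
  return createHash("sha256").update(joined).digest("hex");
}

// Hypothetical in-memory window keyed by the hash above.
class GrabDedup {
  private seen = new Map<string, { taskId: string; at: number }>();
  private windowMs: number;

  constructor(windowMs: number = 5 * 60 * 1000) {
    this.windowMs = windowMs;
  }

  // Returns the task id of a matching submit still inside the window, if any.
  lookup(s: GrabSubmit, now: number): string | undefined {
    const key = dedupKey(s);
    const hit = this.seen.get(key);
    if (hit === undefined) return undefined;
    if (now - hit.at > this.windowMs) {
      this.seen.delete(key); // expired: treat the submit as fresh
      return undefined;
    }
    return hit.taskId;
  }

  // Records the task created for a fresh submit.
  remember(s: GrabSubmit, taskId: string, now: number): void {
    this.seen.set(dedupKey(s), { taskId, at: now });
  }
}
```

A caller would check `lookup` first and, on a hit, post a note to the returned task instead of creating a new one.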
-
86a3d1a:
`colony install` now auto-wires the Colony skill into the active cue profile so
the agent discovers Colony as a pullable capability (loads on a real trigger)
instead of relying on forced session prefaces. New `colony skills wire` /
`colony skills unwire` subcommands shell out to cue's own `cue skills add-to-profile` / `remove-from-profile`, targeting the cue-resident
`colony/colony` skill. Wiring is best-effort: a missing cue is a soft no-op that
prints the manual `npx skills add` fallback (now with the `recodee` typo fixed),
never a failed install. Opt out with `colony install --no-skills` or
`COLONY_SKILL_WIRE=0`; uninstall removes the skill symmetrically.
-
a83eeea:
`colony gain drift` and a matching `savings_drift_report` MCP tool flag
tools whose median tokens-per-call has drifted up or down. Default windows
are non-overlapping: recent = last 3 days, baseline = 14 days ending 3 days
before recent. Default thresholds: `--threshold 1.25` (up), `--down-threshold 0.75`, `--min-calls 20` per window. Classifications: `up_drift`,
`down_drift`, `new_tool` (no baseline), `gone` (no recent), `insufficient_data`,
`stable`.

Storage gains
`Storage.mcpTokenDriftPerOperation()`, which computes per-operation
medians with a `ROW_NUMBER() OVER (PARTITION BY operation ORDER BY tpc)`
window function, chosen over the correlated `LIMIT 1 OFFSET (COUNT-1)/2`
form because SQLite forbids outer aggregate references in scalar-subquery
OFFSET. A `mcpMetricsMinTs()` helper surfaces a one-line warning when the
baseline window starts before the first recorded metric.
-
53836ff:
`colony health --coach` walks a repo through first-week setup. It detects
adoption stage (`fresh` / `installed_no_signal` / `early` / `mid_adoption`)
from cheap signals (`countObservations`, installed-IDE flags,
`firstObservationTs`, `Math.max(toolCallsSince, countMcpMetricsSince)`),
then surfaces the NEXT incomplete step from a fixed 7-step ladder:
`install_runtime` → `first_task_post` → `first_task_claim_file` →
`first_task_hand_off` → `first_plan_claim` → `first_quota_release` →
`first_gain_review`. Each step carries an exact `cmd:` and `tool:` string.

Progress is persisted in a new
`coach_progress` SQLite table (migration
`014-coach-progress.ts`, schema_version 13 → 14). Step completion is
event-observed via `mcp_metrics` / `observations`, never user-clicked.
`colony gain` records a `coach_gain_review` observation so step 7 can
self-detect. `--coach` is mutually exclusive with `--fix-plan` and respects
`--json`.
-
0950b42: ICM slice 3 — observation importance + temporal decay.
Every observation now carries an `importance` tier
(`critical | high | medium | low`, default `medium`), a rolling
`access_count`, a `last_accessed_at` timestamp, and a `weight` value.
Critical/high pin their weight to the base value and never decay;
medium/low decay as `baseWeight / (1 + access_count * 0.1)` whenever
they are read. Read paths (`MemoryStore.search`, `getObservations`,
`semanticSearch`) coalesce ids into a debounced 50ms batch and flush
the access bookkeeping in one transactional UPDATE, so heavy read
loops cost at most one extra write per ~50ms window.

Search and `get_observations` MCP responses now include `importance`
and `weight` on each row (additive; older callers ignore them).
`task_post` accepts an optional `importance` parameter forwarded to
the underlying observation insert.

New CLI subcommand `colony memory prune` deletes near-zero-weight
medium/low rows; `--min-weight <n>` overrides the default 0.1
threshold and `--dry-run` reports the candidate count without
deleting. Critical/high are never affected.

Storage: schema bumped to version 17 with four additive columns on
`observations` and two new indexes. `Storage.recordAccess`,
`Storage.pruneLowDecay`, and `Storage.countLowDecayCandidates` are
the public primitives. (Originally targeted version 15 in isolation;
landed at 17 alongside slice 1 memoirs and slice 2 feedback.)
-
66fa52c: Surface unpublished on-disk plan workspaces in
`task_plan_list`, and chain `plan create` into `plan publish`

Two related improvements so orchestrators (and fleets of codex workers) don't waste cap on a plan they have a workspace for but never registered in Colony:

  - `task_plan_list` now scans `openspec/plans/*` and merges any disk workspace whose slug is not already registered, marked `registry_status: 'unpublished'`. Workers cannot claim from these, but seeing them lets the orchestrator notice and run `colony plan publish <slug>`. Pass `include_unpublished: false` to mirror the legacy registered-only behavior.
  - `colony plan create` now accepts `--publish` (plus optional `--publish-session`, `--publish-agent`, `--publish-auto-archive`), which chains into the same publish path immediately after the workspace is created, eliminating the "I created a plan but workers don't see it" failure mode.

`PlanInfo.registry_status` gains an `'unpublished'` variant.
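
The merge that `task_plan_list` performs can be pictured with the minimal sketch below. The function name `mergePlanList`, the `PlanEntry` shape, and the `"registered"` status label are illustrative assumptions; only the `'unpublished'` marker and the `include_unpublished` switch come from the entry above.

```typescript
// Simplified stand-in for a plan row; the real PlanInfo has more fields.
interface PlanEntry {
  slug: string;
  registry_status: string;
}

// Appends disk workspaces whose slug is not already registered.
function mergePlanList(
  registered: PlanEntry[],
  diskSlugs: string[],
  includeUnpublished: boolean = true,
): PlanEntry[] {
  // Legacy behavior: registered plans only.
  if (!includeUnpublished) return [...registered];

  const known = new Set(registered.map((p) => p.slug));
  const merged = [...registered];
  for (const slug of diskSlugs) {
    if (known.has(slug)) continue;
    known.add(slug); // guard against duplicate slugs in the disk listing
    merged.push({ slug, registry_status: "unpublished" });
  }
  return merged;
}
```

Registered rows keep their position, so existing consumers that read only the head of the list see the same order as before.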
-
61150c7: Add a
`Movers` section to `colony gain` that splits the queried window into a
trailing "recent" segment and a "prior" segment, then surfaces operations whose
per-hour call rate, token rate, or error count has shifted materially between
the two. Top 3 risers (▲), top 3 fallers (▼), and top 3 error risers (!) are
listed inline above the existing Operations table. New ops (no prior activity)
are tagged `(new)` and disappeared ops `(gone)`. Two new flags: `--recent-hours <n>` to override the split (default: `window / 7`) and `--no-movers` to
suppress the section. JSON output gains a `live.movers` payload with the same
shape as the rendered rows.
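
As a rough illustration of the Movers split, the sketch below computes per-hour call rates for the two segments and tags new and gone operations. The type and function names are hypothetical, and ranking by raw rate difference is an assumption: the entry does not say how "shifted materially" is measured.

```typescript
interface OpWindowCounts {
  op: string;
  recentCalls: number; // calls inside the trailing "recent" segment
  priorCalls: number; // calls in the rest of the window
}

interface MoverRow {
  op: string;
  recentPerHour: number;
  priorPerHour: number;
  delta: number;
  tag: "" | "(new)" | "(gone)";
}

// The default split follows the documented `window / 7`.
function moverRows(
  ops: OpWindowCounts[],
  windowHours: number,
  recentHours: number = windowHours / 7,
): MoverRow[] {
  const priorHours = windowHours - recentHours;
  return ops.map((o) => {
    const recentPerHour = o.recentCalls / recentHours;
    const priorPerHour = o.priorCalls / priorHours;
    let tag: MoverRow["tag"] = "";
    if (o.priorCalls === 0 && o.recentCalls > 0) tag = "(new)";
    else if (o.recentCalls === 0 && o.priorCalls > 0) tag = "(gone)";
    return { op: o.op, recentPerHour, priorPerHour, delta: recentPerHour - priorPerHour, tag };
  });
}

// Top-N risers by rate increase (assumed ranking rule).
function topRisers(rows: MoverRow[], n: number = 3): MoverRow[] {
  return rows
    .filter((r) => r.delta > 0)
    .sort((a, b) => b.delta - a.delta)
    .slice(0, n);
}
```

For a 7-day window (168 hours) the default split gives a 24-hour recent segment and a 144-hour prior segment, so the two rates are compared per hour rather than by raw call totals.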
Patch Changes
-
55581ed: Add
`colony demo`: a 60-second guided walkthrough of file-claim contention prevention. Two simulated agents (`claude-code` and `codex`) join the same task and try to claim `src/api.ts`; the second agent gets `blocked_active_owner`, then `claude-code` releases and `codex` retries successfully. The demo runs against an isolated temp data dir and cleans up on exit, with `--json` for a structured transcript and `--keep-data` for inspection. Also ship pre-baked `~/.colony/settings.json` fragments under `examples/policies/` for Next.js monorepos, Python packages, and Rust workspaces; each fragment lists stack-appropriate `privacy.excludePatterns` (build output, caches, `.env`) and `protected_files` (lockfiles, root config). README points to both surfaces from the install block.
-
8917c73: Fix two
`colony health` scoring bugs that surfaced as "bad" readiness areas with no real defect:

  - `colony_mcp_share.mcp_tool_calls = 0` despite live MCP traffic. The counter only read `tool_calls` rows, missing MCP traffic when the calling agent's PostToolUse hook didn't fire for `mcp__*` tools. The counter now takes the max of that observed count and the `mcp_metrics` row count (the colony MCP server's own per-call receipt), with the source surfaced in `source_breakdown.colony_mcp_metrics`. New storage helper `countMcpMetricsSince(since, until?)`.
  - `claim_before_edit_ratio = null` when any edits lacked file_path metadata. Forcing the ratio to null whenever `edit_tool_calls !== edits_with_file_path` turned a real 200/363 = 55% signal into a bare `n/a` headline. The ratio is now computed over measurable edits whenever `edits_with_file_path > 0`; the `status` field still communicates partial measurability for downstream consumers.
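
The two scoring fixes reduce to small pieces of arithmetic, sketched below under stated assumptions. `edit_tool_calls` and `edits_with_file_path` are the names used above; `claimed_before_edit`, the `partial` flag, and both function names are hypothetical stand-ins (the real `status` field's values are not given in this entry).

```typescript
interface EditCounts {
  edit_tool_calls: number;
  edits_with_file_path: number;
  claimed_before_edit: number; // hypothetical name for the ratio's numerator
}

// Ratio over measurable edits; null only when nothing is measurable.
function claimBeforeEditRatio(c: EditCounts): { ratio: number | null; partial: boolean } {
  // Some edits lacked file_path metadata, so the ratio covers a subset.
  const partial = c.edit_tool_calls !== c.edits_with_file_path;
  if (c.edits_with_file_path <= 0) return { ratio: null, partial };
  return { ratio: c.claimed_before_edit / c.edits_with_file_path, partial };
}

// MCP call count: the larger of hook-observed calls and server-side receipts.
function mcpToolCalls(observedToolCalls: number, mcpMetricsRows: number): number {
  return Math.max(observedToolCalls, mcpMetricsRows);
}
```

With 200 claimed edits out of 363 measurable ones, the ratio rounds to 55% even when further edits lack a file path, which is the signal the old null-forcing rule discarded.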
-
e52cd83: Fix
`aggregateMcpMetrics` error_reasons grouping so per-row counts sum to
error_count. The grouping previously partitioned by `(operation, error_code, error_message)`, but several handlers embed unique session IDs in their error
messages (e.g. `sub-task is claimed by codex-session-XYZ`), so each race loss
produced a distinct group. Combined with a 3-row truncation per operation, the
result was that nearly all errors were hidden: `task_plan_claim_subtask` would
report 7 errors in the Top error reasons table while the Operations table showed
93 for the same row. Grouping now drops `error_message` from the key (SQLite
picks the row with the latest `ts` for the sample message via its
bare-column-with-`MAX` optimization) and the per-operation cap is bumped from 3 to 8 since
codes are low-cardinality. Sum-of-reasons now matches error_count exactly.
-
9376314: Stop mislabelling generic MCP errors and reduce SQLite contention failures.
  - `mcpError` now codes non-`TaskThreadError` throws as `INTERNAL_ERROR` instead of `OBSERVATION_NOT_ON_TASK`, so validation failures and SQLite "database is locked" errors surface honestly in `mcp_metrics` and `colony gain`.
  - Storage now sets `PRAGMA busy_timeout = 5000` on every connection (worker daemon, MCP server, CLI hooks all open separate handles to the same WAL DB), so concurrent writers wait for the lock instead of throwing `SQLITE_BUSY` immediately.
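
The error-coding change can be sketched as follows. The `TaskThreadError` class here is a stand-in with an assumed constructor shape, and `mcpErrorCode` is a hypothetical helper name; only the two error codes and the fallback rule come from the entry above.

```typescript
// Stand-in for the project's TaskThreadError; its real shape may differ.
class TaskThreadError extends Error {
  code: string;

  constructor(code: string, message: string) {
    super(message);
    this.name = "TaskThreadError";
    this.code = code;
  }
}

// Domain errors keep their own code; anything else is an internal error.
function mcpErrorCode(err: unknown): string {
  if (err instanceof TaskThreadError) return err.code;
  return "INTERNAL_ERROR";
}
```

Under this rule a plain `Error("database is locked")` is reported as `INTERNAL_ERROR`, so contention no longer inflates the `OBSERVATION_NOT_ON_TASK` count.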
-
5efdb52: Native Windows support for the
`colony` CLI. The bin entry was a POSIX shell
script (`bin/colony.sh`) that npm could not execute on Windows without WSL,
breaking every Windows install of the package. The shim is now a Node ES
module (`bin/colony.mjs`) using only `node:*` builtins, so npm's generated
`.cmd` / `.ps1` wrappers run it natively under cmd, PowerShell, and Git Bash.

The daemon fast-path for `colony bridge lifecycle --json` is preserved: the
HTTP POST to `127.0.0.1:$COLONY_WORKER_PORT/api/bridge/lifecycle` now goes
through `node:http`, with a `node:net` connect probe (1s) before the request
(2s) so the fallback latency stays close to the curl-based version when the
daemon isn't running. Stdin is buffered and replayed on fallback, preserving
rule #10 (a dead daemon must never lose or block a write).

CI now runs the build matrix on `ubuntu-latest`, `macos-latest`, and
`windows-latest` across Node 20 and 22 so this regression cannot recur.
-
1d78c99: Push awareness: when an edit touches a file another live session holds, the PostToolUse hook now injects a one-line
`[Colony] session X recently claimed <file> …` note into the agent's context immediately (Claude Code `additionalContext`), instead of waiting for the next turn's preface. Debounced to once per 2 minutes per session via an `awareness-push` observation marker (hook processes are one-shot, so the marker doubles as audit trail). `autoClaimFromToolUse` now reports live-owner blocked takeovers in its `conflicts` result.
-
08482b1: Surface hot-loop dominance and drop double-"saved" labels in
`colony gain`.

Top spend now reports the operation's share of total tokens, and a `Hot loop:`
callout fires when one operation owns ≥70% of token spend across ≥100 calls.
The "Saved:" / "USD saved:" labels are renamed to "Net:" / "Net USD:" so the
phrase no longer reads "Saved: X saved", and the live sessions header drops the
trailing `, -` when cost isn't configured.
-
658b722: Raise SQLite contention headroom in
`@colony/storage` so the worker daemon,
MCP server, CLI hooks, and codex-fleet panes can share `~/.colony/data.db`
without surfacing `SQLITE_BUSY: database is locked` to callers. The
`Storage` constructor now sets `busy_timeout=15000` (was 5000), and
`withBusyRetry` defaults bump to 8 attempts with up-to-1s backoff (was 5
attempts / 250ms cap). Happy-path callers are unaffected because no busy
error still means no retry sleep; sustained contention from ~30+ concurrent
writers (the codex-fleet shape that triggered this) now has ~3.85s of
combined SQLite + Node retry headroom before raising.
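
A bounded retry with capped backoff, in the spirit of `withBusyRetry`, might look like the sketch below. It is not the package's implementation: `retryOnBusy`, the `RetryOptions` shape, the 50ms base delay, the doubling schedule, and the check on an error `code` property are all assumptions, so the delays here are not expected to reproduce the ~3.85s figure. Only the 8-attempt limit and the 1s cap come from the entry above.

```typescript
interface RetryOptions {
  attempts: number; // total tries, including the first
  baseMs: number; // hypothetical first delay
  capMs: number; // upper bound on any single delay
}

// Assumes the driver reports contention via an error `code` property.
function isBusy(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "SQLITE_BUSY"
  );
}

// Retries `fn` on busy errors with doubling delays, capped at `capMs`.
function retryOnBusy<T>(
  fn: () => T,
  sleep: (ms: number) => void,
  opts: RetryOptions = { attempts: 8, baseMs: 50, capMs: 1000 },
): T {
  for (let attempt = 1; ; attempt++) {
    try {
      return fn(); // happy path: no sleep at all
    } catch (err) {
      // Rethrow non-busy errors at once, and busy errors once attempts run out.
      if (!isBusy(err) || attempt >= opts.attempts) throw err;
      sleep(Math.min(opts.capMs, opts.baseMs * Math.pow(2, attempt - 1)));
    }
  }
}
```

The sleep function is injected so the schedule can be observed in tests without waiting; a caller that never hits a busy error pays no delay.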