Merged
5 changes: 5 additions & 0 deletions .changeset/startup-panel-one-call.md
@@ -0,0 +1,5 @@
---
'@colony/mcp-server': minor
---

One-call startup + registration-cost telemetry. `startup_panel` now carries `compact_hivemind` (lane map), `attention_summary` ({unread, blocking, pending_handoffs}), and `tool_profile` (lean/full, so agents know whether to restart with COLONY_TOOL_PROFILE=full for plan/spec/memoir tools) — AGENTS.md blesses it as THE startup call, with the legacy 4-call sweep deprecated but working. `savings_report` gains `registration_cost` ({profile, tool_count, name_description_tokens}) so the per-session schema-injection cost is observable.
18 changes: 12 additions & 6 deletions AGENTS.md
@@ -65,12 +65,18 @@ Promote to full Guardex / OMX orchestration only when scope grows into:

 Use Colony as the primary coordination surface.

-On every startup, resume, follow-up, or "continue" request, run this order:
+On every startup, resume, follow-up, or "continue" request, make ONE call:

-1. `mcp__colony__hivemind_context`
-2. `mcp__colony__attention_inbox`
-3. `mcp__colony__task_ready_for_agent`
-4. `mcp__colony__search` only when prior decisions, earlier lanes, file history, or error context matter.
+1. `mcp__colony__startup_panel` — returns the active task, compact lane map (who else is working — what `mcp__colony__hivemind_context` used to provide), attention summary, blocking items, ready work, claims, blocker/next/evidence, and the exact next MCP call.
+
+Escalate only when the panel says so:
+
+- `attention_summary.blocking` is true or `blocking_items` is non-empty → `mcp__colony__attention_inbox`.
+- Picking new work and `ready_task` is null → `mcp__colony__task_ready_for_agent`.
+- Prior decisions, earlier lanes, file history, or error context matter → `mcp__colony__search`.
+- `tool_profile` is `lean` and you need plan/spec/memoir tools → restart the MCP server with `COLONY_TOOL_PROFILE=full`.
+
+The legacy 4-call sweep (`mcp__colony__hivemind_context` → `mcp__colony__attention_inbox` → `mcp__colony__task_ready_for_agent` → `mcp__colony__search`) still works but spends ~3 extra calls of context per session; prefer the panel.

 Rules:

@@ -85,7 +91,7 @@ Rules:
 Fallback:

 - Colony is considered unavailable only when the MCP namespace is missing, the tool call fails, or the installed Colony server does not expose the required tool.
-- If `attention_inbox` or `task_ready_for_agent` is missing, fall back to `hivemind_context`, then `task_list`, then hydrate only the relevant task IDs.
+- If `startup_panel` is missing, fall back to the legacy sweep (`hivemind_context` → `attention_inbox` → `task_ready_for_agent`); if those are missing too, `task_list`, then hydrate only the relevant task IDs.
 - Do not skip Colony just because OMX state exists. OMX is fallback, not the first coordination source.
 - Read `.omx/state` and `.omx/notepad.md` only when Colony is unavailable, missing the needed state, or the task explicitly depends on legacy OMX state.
 - Keep `.omx/notepad.md` lean: live handoffs only.
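
Reviewer sketch (not part of this change): the escalation rules in the new AGENTS.md text can be read as a small decision function over the panel. The `PanelView` shape below models only the fields the rules mention, `blocking_items` is assumed to be an array, and `Intent`, `FollowUp`, and `followUpsFor` are hypothetical names made up for illustration.

```typescript
// Hypothetical view of the startup_panel payload: only the fields the
// escalation rules mention. `blocking_items` is assumed to be an array.
interface PanelView {
  tool_profile: 'lean' | 'full';
  attention_summary: { unread: number; blocking: boolean; pending_handoffs: number };
  blocking_items: readonly unknown[];
  ready_task: object | null;
}

// What the agent is about to do. These flags are illustrative, not part of Colony.
interface Intent {
  pickingNewWork: boolean;
  needsHistory: boolean;
  needsFullTools: boolean;
}

type FollowUp =
  | { kind: 'call'; tool: string }
  | { kind: 'restart'; env: { COLONY_TOOL_PROFILE: 'full' } };

// Returns the follow-up actions the rules ask for, in the order they are listed.
function followUpsFor(panel: PanelView, intent: Intent): FollowUp[] {
  const followUps: FollowUp[] = [];
  if (panel.attention_summary.blocking || panel.blocking_items.length > 0) {
    followUps.push({ kind: 'call', tool: 'mcp__colony__attention_inbox' });
  }
  if (intent.pickingNewWork && panel.ready_task === null) {
    followUps.push({ kind: 'call', tool: 'mcp__colony__task_ready_for_agent' });
  }
  if (intent.needsHistory) {
    followUps.push({ kind: 'call', tool: 'mcp__colony__search' });
  }
  if (panel.tool_profile === 'lean' && intent.needsFullTools) {
    followUps.push({ kind: 'restart', env: { COLONY_TOOL_PROFILE: 'full' } });
  }
  return followUps;
}
```

An empty result means the single `startup_panel` call was enough for this turn.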
25 changes: 22 additions & 3 deletions apps/mcp-server/src/server.ts
@@ -2,6 +2,7 @@
import { readSync } from 'node:fs';
import { join } from 'node:path';
import { PassThrough, type Readable } from 'node:stream';
import { countTokens } from '@colony/compress';
import { type Settings, loadSettings, resolveDataDir } from '@colony/config';
import { type Embedder, MemoryStore } from '@colony/core';
import { createEmbedder } from '@colony/embedding';
@@ -39,7 +40,12 @@ import * as spec from './tools/spec.js';
 import * as startupPanel from './tools/startup-panel.js';
 import * as suggest from './tools/suggest.js';
 import * as task from './tools/task.js';
-import { LEAN_TOOLS, gateToolRegistration, resolveToolProfile } from './tools/tool-profile.js';
+import {
+  LEAN_TOOLS,
+  type ToolRegistrationStats,
+  gateToolRegistration,
+  resolveToolProfile,
+} from './tools/tool-profile.js';

 export { buildBridgeStatusPayload } from './tools/bridge.js';
 export type { BridgeStatus, BridgeStatusOptions } from './tools/bridge.js';
@@ -71,8 +77,20 @@ export function buildServer(
   // COLONY_TOOL_PROFILE=full (or settings.mcp.toolProfile) restores the
   // whole surface for plan/spec/queen lanes.
   const toolProfile = options.toolProfile ?? resolveToolProfile(settings);
-  const registrar =
-    toolProfile === 'lean' ? gateToolRegistration(server, (name) => LEAN_TOOLS.has(name)) : server;
+  const registrationStats: ToolRegistrationStats = {
+    profile: toolProfile,
+    tool_count: 0,
+    name_description_tokens: 0,
+  };
+  const recordRegistration = (name: string, description: string): void => {
+    registrationStats.tool_count += 1;
+    registrationStats.name_description_tokens += countTokens(`${name} ${description}`);
+  };
+  const registrar = gateToolRegistration(
+    server,
+    toolProfile === 'lean' ? (name) => LEAN_TOOLS.has(name) : () => true,
+    recordRegistration,
+  );

   // Make this MCP client visible to hivemind even when the IDE never ran
   // colony's lifecycle hooks (codex, custom MCP clients, background tools).
@@ -102,6 +120,7 @@ export function buildServer(
store,
settings,
toolProfile,
registrationStats,
...(options.planValidation !== undefined ? { planValidation: options.planValidation } : {}),
resolveEmbedder,
// Heartbeat outer touches the active-session row before the handler runs;
2 changes: 2 additions & 0 deletions apps/mcp-server/src/tools/context.ts
@@ -14,6 +14,8 @@ export interface ToolContext {
settings: Settings;
/** Active MCP tool surface. Absent means 'full' (direct register() callers in tests). */
toolProfile?: McpToolProfile;
/** Registration-cost telemetry filled while buildServer registers tools. */
registrationStats?: import('./tool-profile.js').ToolRegistrationStats;
planValidation?: PlanValidationRuntime;
/**
* Lazy-singleton embedder. Returns null when the provider is `none` or the
10 changes: 10 additions & 0 deletions apps/mcp-server/src/tools/savings.ts
@@ -130,13 +130,22 @@ export function register(server: McpServer, ctx: ToolContext): void {
session_summary: live.session_summary,
sessions: live.sessions,
};
const registration_cost = ctx.registrationStats
? {
profile: ctx.registrationStats.profile,
tool_count: ctx.registrationStats.tool_count,
name_description_tokens: ctx.registrationStats.name_description_tokens,
note: 'Per-session schema-injection cost basis: name+description tokens only. Schema-inclusive budgets are enforced by apps/mcp-server/test/tool-budget.test.ts (lean <=4200, full <=15000).',
}
: null;
if (honest === true) {
return {
content: [
{
type: 'text',
text: JSON.stringify({
mode: 'honest_live_receipts',
registration_cost,
live: livePayload,
}),
},
@@ -148,6 +157,7 @@ export function register(server: McpServer, ctx: ToolContext): void {
{
type: 'text',
text: JSON.stringify({
registration_cost,
live: livePayload,
comparison,
comparison_cost: live.cost_basis.configured
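
Reviewer sketch (not part of this change): a consumer of `savings_report` might read the new `registration_cost` field roughly as follows. The field names come from this diff. The helper name `describeRegistrationCost` and the numbers in the usage note are hypothetical, and the `null` branch mirrors the case where `ctx.registrationStats` is absent.

```typescript
// Shape of the registration_cost object emitted by savings_report in this diff.
interface RegistrationCost {
  profile: string;
  tool_count: number;
  name_description_tokens: number;
  note?: string;
}

// Hypothetical helper: turn the JSON text of a savings_report response into
// a one-line summary of the per-session registration cost.
function describeRegistrationCost(reportText: string): string {
  const parsed = JSON.parse(reportText) as { registration_cost?: RegistrationCost | null };
  const cost = parsed.registration_cost ?? null;
  if (cost === null) {
    // Direct register() callers (for example tests) may not supply stats.
    return 'registration cost not recorded';
  }
  const perTool = cost.tool_count > 0 ? cost.name_description_tokens / cost.tool_count : 0;
  return (
    `${cost.profile}: ${cost.tool_count} tools, ` +
    `${cost.name_description_tokens} name+description tokens ` +
    `(~${perTool.toFixed(1)} per tool)`
  );
}
```

With an illustrative payload of 20 tools and 900 tokens on the lean profile, the helper reports about 45.0 tokens per tool. The figure covers names and descriptions only, so it understates the schema-inclusive cost.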
29 changes: 29 additions & 0 deletions apps/mcp-server/src/tools/startup-panel.ts
@@ -81,11 +81,23 @@ interface StartupWarning {
next_args?: Record<string, unknown>;
}

interface StartupLane {
agent: string;
branch: string;
activity: HivemindSession['activity'];
task: string;
}

interface StartupPanel {
session_id: string;
agent: string;
/** Active MCP tool surface; lean callers needing plan/spec/memoir tools restart with COLONY_TOOL_PROFILE=full. */
tool_profile: 'lean' | 'full';
repo_root: string | null;
branch: string | null;
/** Compact lane map so startup_panel alone answers "who else is active". */
compact_hivemind: { lane_count: number; lanes: StartupLane[] };
attention_summary: { unread: number; blocking: boolean; pending_handoffs: number };
active_task: StartupPanelTask | null;
ready_task: StartupReadyTask | null;
active_queen_plan: StartupQueenPlan | null;
@@ -125,6 +137,7 @@ export function register(server: McpServer, ctx: ToolContext): void {
const panel = await buildStartupPanel(store, {
session_id,
agent,
tool_profile: ctx.toolProfile ?? 'full',
...(repo_root !== undefined ? { repo_root } : {}),
...(branch !== undefined ? { branch } : {}),
ready_limit: ready_limit ?? DEFAULT_READY_LIMIT,
@@ -141,6 +154,7 @@ export async function buildStartupPanel(
args: {
session_id: string;
agent: string;
tool_profile?: 'lean' | 'full';
repo_root?: string;
branch?: string;
ready_limit?: number;
@@ -194,8 +208,23 @@ export async function buildStartupPanel(
return {
session_id: args.session_id,
agent: args.agent,
tool_profile: args.tool_profile ?? 'full',
repo_root: scopedRepoRoot,
branch: activeBranch,
compact_hivemind: {
lane_count: snapshot.session_count,
lanes: snapshot.sessions.map((lane) => ({
agent: lane.agent,
branch: lane.branch,
activity: lane.activity,
task: lane.task.replace(/[\r\n]+/g, ' ').slice(0, 80),
})),
},
attention_summary: {
unread: inbox.summary.unread_message_count,
blocking: inbox.summary.blocked,
pending_handoffs: inbox.summary.pending_handoff_count,
},
active_task: activeTask ? compactTask(activeTask) : null,
ready_task: compactReadyTask(ready.ready[0] ?? null),
active_queen_plan: activeQueenPlan,
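
Reviewer sketch (not part of this change): the lane compaction in `compact_hivemind` can be exercised in isolation. `compactLane` and `TASK_PREVIEW_CHARS` are hypothetical names; the regex and the 80-character cut are copied from the hunk above. Note that `slice` counts UTF-16 code units, so a cut can land in the middle of a surrogate pair.

```typescript
// Mirrors the StartupLane fields added in this diff; `activity` is widened to
// string here because HivemindSession is not available in this sketch.
interface Lane {
  agent: string;
  branch: string;
  activity: string;
  task: string;
}

const TASK_PREVIEW_CHARS = 80;

// Collapse each run of CR/LF characters into one space, then truncate the
// task text so every lane stays a single short line in the panel.
function compactLane(lane: Lane): Lane {
  return {
    agent: lane.agent,
    branch: lane.branch,
    activity: lane.activity,
    task: lane.task.replace(/[\r\n]+/g, ' ').slice(0, TASK_PREVIEW_CHARS),
  };
}
```

Because the replacement runs before the slice, a multi-line task description contributes at most 80 characters to the panel regardless of how many line breaks it contained.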
18 changes: 18 additions & 0 deletions apps/mcp-server/src/tools/tool-profile.ts
@@ -59,13 +59,17 @@ export function resolveToolProfile(
export function gateToolRegistration(
server: McpServer,
allow: (name: string) => boolean,
onRegister?: (name: string, description: string) => void,
): McpServer {
return new Proxy(server, {
get(target, prop, _receiver) {
if (prop === 'tool') {
return (...args: unknown[]) => {
const name = args[0];
if (typeof name === 'string' && !allow(name)) return undefined;
if (typeof name === 'string' && onRegister) {
onRegister(name, typeof args[1] === 'string' ? args[1] : '');
}
return (target.tool as (...a: unknown[]) => unknown).apply(target, args);
};
}
@@ -76,3 +80,17 @@ export function gateToolRegistration(
},
}) as McpServer;
}

/**
* Registration-cost telemetry captured while tools register. Token figure
* covers name + description only — input schemas are zod shapes here and only
* become countable JSON schema at listTools time; the schema-inclusive budget
* lives in apps/mcp-server/test/tool-budget.test.ts. Tools registered via the
* SDK's schema-first overload (no description string) count name-only — a
* known undercount, acceptable for trend telemetry.
*/
export interface ToolRegistrationStats {
profile: McpToolProfile;
tool_count: number;
name_description_tokens: number;
}
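
Reviewer sketch (not part of this change): the gate-plus-telemetry pattern can be tried without the MCP SDK. Everything below is a stand-in: `RecordingRegistrar` replaces `McpServer`, `roughTokens` is a crude whitespace counter standing in for `countTokens` from `@colony/compress` (whose real behaviour is not shown in this diff), and `gate` is a simplified analogue of `gateToolRegistration`.

```typescript
interface Registrar {
  tool(name: string, ...rest: unknown[]): void;
}

// Stand-in for McpServer: remembers which tool names were actually registered.
class RecordingRegistrar implements Registrar {
  readonly registered: string[] = [];
  tool(name: string, ..._rest: unknown[]): void {
    this.registered.push(name);
  }
}

interface Stats {
  tool_count: number;
  name_description_tokens: number;
}

// Crude stand-in for a tokenizer: counts whitespace-separated words.
function roughTokens(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

// Wrap `tool` so disallowed names are skipped and allowed ones are reported
// to onRegister before the real registration runs.
function gate<T extends Registrar>(
  target: T,
  allow: (name: string) => boolean,
  onRegister: (name: string, description: string) => void,
): T {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      if (prop === 'tool') {
        return (...args: unknown[]) => {
          const name = args[0];
          if (typeof name === 'string' && !allow(name)) return undefined;
          if (typeof name === 'string') {
            // A non-string second argument means no description was passed.
            onRegister(name, typeof args[1] === 'string' ? args[1] : '');
          }
          return Reflect.apply(obj.tool, obj, args);
        };
      }
      return Reflect.get(obj, prop, receiver);
    },
  });
}

const stats: Stats = { tool_count: 0, name_description_tokens: 0 };
const lean = new Set(['startup_panel', 'search']);
const base = new RecordingRegistrar();
const gated = gate(base, (name) => lean.has(name), (name, description) => {
  stats.tool_count += 1;
  stats.name_description_tokens += roughTokens(`${name} ${description}`);
});

gated.tool('startup_panel', 'One call startup');
gated.tool('search', 'Search memory');
gated.tool('plan_create', 'Create plan'); // filtered out: not in the lean set
```

Only tools that pass the `allow` check are counted, so the stats describe the surface the client actually sees. Passing `() => true` as the filter, as the full profile does in this diff, counts every registration.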
10 changes: 10 additions & 0 deletions apps/mcp-server/test/server.test.ts
@@ -280,6 +280,16 @@ describe('MCP server', () => {
} | null;
};

const withRegistration = JSON.parse(text) as {
registration_cost: {
profile: string;
tool_count: number;
name_description_tokens: number;
} | null;
};
expect(withRegistration.registration_cost).toMatchObject({ profile: 'full' });
expect(withRegistration.registration_cost?.tool_count).toBeGreaterThan(70);
expect(withRegistration.registration_cost?.name_description_tokens).toBeGreaterThan(1000);
expect(payload.live.cost_basis.configured).toBe(true);
expect(payload.live.totals.total_cost_usd).toBeCloseTo(0.005, 12);
expect(payload.live.totals.avg_cost_usd).toBeCloseTo(0.005, 12);
16 changes: 16 additions & 0 deletions apps/mcp-server/test/startup-panel.test.ts
@@ -15,6 +15,12 @@ let client: Client;

interface StartupPanel {
session_id: string;
tool_profile: 'lean' | 'full';
compact_hivemind: {
lane_count: number;
lanes: Array<{ agent: string; branch: string; activity: string; task: string }>;
};
attention_summary: { unread: number; blocking: boolean; pending_handoffs: number };
repo_root: string | null;
branch: string | null;
active_task: { id: number; title: string; branch: string } | null;
@@ -116,6 +122,16 @@ describe('startup_panel', () => {
recommended_next_tool: 'task_ready_for_agent',
});
expect(panel.copy_paste_next_mcp_calls[0]).toContain('mcp__colony__task_ready_for_agent');
// One-call startup additions: the panel alone answers "who else is
// active", "anything blocking", and "which tool surface am I on".
expect(panel.tool_profile).toBe('full');
expect(panel.compact_hivemind).toMatchObject({ lane_count: expect.any(Number) });
expect(Array.isArray(panel.compact_hivemind.lanes)).toBe(true);
expect(panel.attention_summary).toEqual({
unread: 0,
blocking: false,
pending_handoffs: 0,
});
});

it('summarizes an active task with blocker, next step, evidence, and claims', async () => {
14 changes: 9 additions & 5 deletions docs/mcp.md
@@ -23,13 +23,17 @@ Use `list_sessions` -> `timeline` when you need to navigate a known session inst

 ## Agent startup loop

-Agent startup, resume, "what needs me?", and "what should I do next?" flows should call these first:
+Agent startup, resume, "what needs me?", and "what should I do next?" flows make ONE call:

-1. `hivemind_context` to see active agents, owned branches, live lanes, compact memory hits, and relevant negative warnings.
-2. `attention_inbox` to see what needs your attention: handoffs, messages, wakes, stalled lanes, fresh claims, stale-claim cleanup signals, and decaying hot files.
-3. `task_ready_for_agent` to choose available work matched to the current agent.
+1. `startup_panel` — active task, compact lane map (`compact_hivemind`), `attention_summary`, blocking items, ready work, claims, blocker/next/evidence, `tool_profile`, and the exact next MCP call.

-Do not choose work before attention_inbox.
+Escalate only when the panel says so:
+
+1. `hivemind_context` when you need full progressive-disclosure lane detail, compact memory hits, or negative warnings beyond the panel's lane map.
+2. `attention_inbox` when `attention_summary.blocking` is true or `blocking_items` is non-empty: handoffs, messages, wakes, stalled lanes, fresh claims, stale-claim cleanup signals, and decaying hot files.
+3. `task_ready_for_agent` to choose available work when the panel's `ready_task` is null.
+
+Do not choose work before checking blocking items (panel or attention_inbox).

 Codex-style MCP tool names include the server prefix:
 `mcp__colony__hivemind_context`,
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-06-12
@@ -0,0 +1,10 @@
# startup_panel one-call startup + registration cost telemetry (T1)

Why: AGENTS.md mandated a 4-call startup sweep; startup_panel already existed but lacked the lane map, attention summary, and profile hint needed to stand alone. The lean-profile work (#588) also left registration cost unmeasured at runtime.

What:
- startup_panel payload gains compact_hivemind {lane_count, lanes[{agent,branch,activity,task}]}, attention_summary {unread, blocking, pending_handoffs}, tool_profile.
- AGENTS.md Colony loop: ONE startup_panel call, escalation rules for attention_inbox / task_ready_for_agent / search, legacy sweep deprecated (agents-contract test updated; mentions kept in required order).
- buildServer instruments registrations (gateToolRegistration onRegister) into ToolRegistrationStats; savings_report emits registration_cost {profile, tool_count, name_description_tokens} (schema-inclusive budgets stay in tool-budget.test.ts).

Verification: pnpm typecheck/lint/test/build green; startup-panel + server suites extended.