feat: per-runtime telemetry adapters + per-agent token attribution by josstei · Pull Request #68 · josstei/maestro-orchestrate

josstei · 2026-04-27T02:16:35Z

Summary

Implements Phase 4 of the gemini-runtime-hardening plan: per-runtime telemetry adapters and per-agent token attribution in transition_phase. Closes finding F2 (token_usage was zero across 5 invocations in the captured Gemini test run).

Stacks on PR #67 (Phase 2: phase.kind discriminator). Once #67 merges, this PR's base will move automatically.

What changed

TelemetryAdapter contract (src/platforms/shared/adapters/telemetry-adapter-{types,factory}.js):

Canonical Usage shape { input, output, cached } matches the existing transition_phase token_usage parameter — adapter outputs flow into the session-state aggregator without translation.
defineTelemetryAdapter({ runtime, extractUsage, isAvailable }) enforces typed runtime, function signatures, and defensive ZERO_USAGE fallback when underlying extractUsage returns malformed shape.
isTelemetryUsage predicate, TELEMETRY_USAGE_FIELDS and TELEMETRY_RUNTIMES exposed as frozen constants.

Per-runtime adapters (src/platforms/{claude,codex,gemini,qwen}/telemetry-adapter.js):

Runtime	Status	Source
Claude	Real	Anthropic SDK Usage envelope (input_tokens, output_tokens, cache_read_input_tokens + cache_creation_input_tokens)
Codex	Real	OpenAI-style prompt_tokens/completion_tokens with input_tokens/output_tokens fallback for older CLI versions; cached_tokens
Gemini	Stub	`isAvailable: false` — Gemini CLI's invoke_agent envelope returns only textual output. Follow-up candidates documented inline.
Qwen	Stub	Same architecture as Gemini, same TODO.

Each runtime-config.js now exposes its adapter via a canonical telemetry field so the orchestrator has one resolution surface.

Generator allowlist (src/generator/payload-builder.js):

RUNTIME_PAYLOAD_FILES constant (runtime-config.js, telemetry-adapter.js) replaces the single hardcoded path. Adding a future per-runtime surface (e.g., a tracing-adapter.js) is one line.
Required so runtime-config.js's require('./telemetry-adapter') resolves in detached payloads (claude/, plugins/maestro/) — without this, generated mirrors would crash at load time.

Per-agent attribution in transition_phase (src/mcp/handlers/session-state-tools.js):

New optional agent_name parameter (string for solo phases or array for multi-agent batches). Falls back to phase.agents from session state, then to 'unknown' when no attribution is available.
Multi-agent phases split usage equally with Math.floor. The total_input/output/cached fields preserve exact totals so the per-agent sum may underattribute by up to (n−1) tokens — acceptable for budget tracking.
Defensive init for legacy state shapes: missing token_usage block, missing by_agent field, or by_agent mutated into an array all fall back to empty {} without throwing.
Number(...) || 0 coercion protects against malformed counts in legacy YAML.

Tool schema (src/mcp/tool-packs/session/index.js):

agent_name added with oneOf: [string, array] so clients reading tools/list discover the parameter.

Orchestration steps (src/references/orchestration-steps.md):

Step 25 now instructs the orchestrator to load the runtime's telemetry adapter via runtime-config.telemetry, call extractUsage(invocationResult), and pass the result as token_usage plus agent_name. Adapters reporting isAvailable: false (Gemini/Qwen) signal the orchestrator to OMIT token_usage rather than recording zeros.

Test plan

Telemetry contract: 23 unit tests in tests/unit/telemetry-adapter-factory.test.js — frozen constants, factory validation, isTelemetryUsage edge cases, malformed-extractUsage fallback, isAvailable error containment.
Runtime adapters: 18 unit tests in tests/unit/runtime-telemetry-adapters.test.js — Claude full extraction, Codex prompt_tokens preferred over input_tokens, Gemini/Qwen always-zero stubs, runtime-config telemetry export.
Generator allowlist: payload-builder tests updated to assert RUNTIME_PAYLOAD_FILES is frozen and contains both runtime-config.js and telemetry-adapter.js.
Per-agent attribution integration: 10 tests in tests/integration/telemetry-attribution.test.js — single string, array split, phase.agents fallback, empty-array regression guard, "unknown" fallback, missing by_agent, missing token_usage, array by_agent, omitted token_usage, Math.floor underattribution.
Full suite: 1230 passing (was 1177 on Phase 2 base).
just check-layers clean.
just check zero drift.

Verification limit (mention for reviewers)

The Claude/Codex extraction shapes are plan-assumptions, not live-verified against captured SDK responses. The contract structure is what this PR ships; empirical validation that real invocations produce non-zero token_usage requires a follow-up integration run. The Phase 1 fixture pattern (tests/fixtures/runtime-contracts/...) could host real Anthropic/Codex usage envelopes for adapter tests — proposed as a Phase 4 follow-up.

Findings addressed

F2 — token_usage: { total_input: 0, total_output: 0, total_cached: 0 } across 5 invocations in the original test run. Claude and Codex now extract real usage. Gemini and Qwen explicitly report unavailable so the orchestrator omits token_usage rather than recording false zeros.

Cross-runtime coverage

Runtime	Telemetry adapter	runtime-config exposure
Claude	Real (Anthropic SDK Usage)	✓
Codex	Real (OpenAI-style + fallback)	✓
Gemini	Stub (`isAvailable: false`)	✓
Qwen	Stub (`isAvailable: false`)	✓

… shape Adds canonical Usage shape (input, output, cached) matching the existing transition_phase token_usage parameter so adapter outputs flow into the session-state aggregator without translation. Adds defineTelemetryAdapter factory with strict spec validation (runtime in TELEMETRY_RUNTIMES, function signatures) and defensive ZERO_USAGE fallback when underlying extractUsage returns malformed shape.

Adds real telemetry adapters for Claude (Anthropic SDK usage envelope) and Codex (OpenAI-style prompt_tokens/completion_tokens with input_tokens/output_tokens fallback for older CLI versions). Adds stub adapters for Gemini and Qwen reporting isAvailable=false until a real telemetry source is identified. Each runtime-config.js now exposes its telemetry adapter via a canonical `telemetry` field so the orchestrator has one resolution surface. Extends generator allowlist with RUNTIME_PAYLOAD_FILES so runtime-specific surfaces beyond runtime-config.js (now telemetry, in the future tracing or others) flow into detached payloads alongside the config file. Without this, runtime-config's require('./telemetry-adapter') would fail to resolve in mirrors.

…t attribution Extends the token_usage accumulator to attribute spend per agent. Adds optional agent_name parameter to transition_phase (string for solo phases or array for multi-agent batches). Falls back to phase.agents from session state, then to "unknown" when no attribution is available. Multi-agent phases split usage equally with Math.floor; the total_input/output/cached fields preserve exact totals so the per-agent sum may underattribute by up to (n-1) tokens — acceptable for budget tracking. Defensive init for legacy state shapes: missing token_usage block, missing by_agent field, or by_agent that mutated into an array all fall back to empty {} without throwing. Numeric coercion via Number(...) || 0 protects against malformed counts in legacy YAML. Adds agent_name to the transition_phase input schema with oneOf {string, array} so clients reading tools/list discover the parameter. Updates orchestration-steps step 25 with telemetry guidance: load the runtime's telemetry adapter from runtime-config.telemetry, call extractUsage on the invocation result, omit token_usage entirely when isAvailable returns false (Gemini/Qwen stub case). Adds an `phaseAgents` option to the prepareSession test helper for multi-agent fixtures via direct state mutation, since create_session input only supports a single agent per phase.

Base automatically changed from feat/phase-kind-discriminator to refactor/mcp-handlers-complexity April 27, 2026 02:18

josstei added 3 commits April 26, 2026 22:20

josstei force-pushed the feat/telemetry-adapter branch from c5fe4d0 to e5ce06f Compare April 27, 2026 02:20

josstei merged commit 2110955 into refactor/mcp-handlers-complexity Apr 27, 2026
2 checks passed

josstei deleted the feat/telemetry-adapter branch April 27, 2026 02:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: per-runtime telemetry adapters + per-agent token attribution#68

feat: per-runtime telemetry adapters + per-agent token attribution#68
josstei merged 3 commits into
refactor/mcp-handlers-complexityfrom
feat/telemetry-adapter

josstei commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

josstei commented Apr 27, 2026

Summary

What changed

Test plan

Verification limit (mention for reviewers)

Findings addressed

Cross-runtime coverage

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant