diff --git a/PROTOCOL.md b/PROTOCOL.md
index 29ea314..3f657bd 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -1,7 +1,7 @@
-# Agent Chorus Protocol v0.14.0
+# Agent Chorus Protocol v0.16.0
 ## Purpose
-Define a lightweight, local-first standard for reading and coordinating cross-agent session evidence across Codex, Gemini, Claude, Cursor, and Hermes.
+Define a lightweight, local-first standard for reading and coordinating cross-agent session evidence across Codex, Gemini, Claude, Cursor (both cursor-agent CLI transcripts and Cursor IDE chat store), and Hermes.
 ## Tenets
 1. Local-first: read local session logs only by default.
@@ -20,22 +20,28 @@ Define a lightweight, local-first standard for reading and coordinating cross-ag
 ### Dual-implementation commands (Node + Rust parity required)
 ```bash
-chorus read --agent [--id=] [--cwd=] [--chats-dir=] [--last=] [--json] [--metadata-only] [--audit-redactions]
+chorus read --agent [--id=] [--cwd=] [--chats-dir=] [--last=] [--include-user] [--tool-calls] [--history=] [--format=] [--json] [--metadata-only] [--audit-redactions]
 chorus compare --source ... [--cwd=] [--last=] [--json]
 chorus report --handoff [--cwd=] [--json]
 chorus list --agent [--cwd=] [--limit=] [--json]
 chorus search --agent [--cwd=] [--limit=] [--json]
 chorus diff --agent --from --to [--cwd=] [--last=] [--json]
+chorus summary --agent [--cwd=] [--format=] [--json]
+chorus timeline [--agent ]... [--cwd=] [--limit=] [--format=] [--json]
 chorus relevance --list | --test | --suggest [--cwd=] [--json]
 chorus send --from --to --message [--cwd=]
 chorus messages --agent [--cwd=] [--clear] [--json]
-chorus agent-context [--ci] [--base=] [--json]
+chorus checkpoint --from [--cwd=] [--message=] [--json]
+chorus agent-context [--ci] [--base=] [--enforce-separate-commits] [--json]
 chorus teardown [--cwd=] [--dry-run] [--global] [--json]
 ```
-### Node-only administrative commands
+### Setup and doctor (dual-parity since v0.13.0)
-The following commands are provided by the Node CLI only. They are not part of the dual-parity contract and are not implemented in the Rust CLI:
+`setup` and `doctor` were Node-only in v0.7. As of v0.13.0 they have
+full Node+Rust parity (byte-identical JSON for the same inputs) and are
+part of the dual-implementation contract enforced by
+`scripts/conformance.sh`:
 ```bash
 chorus setup [--cwd=] [--dry-run] [--force] [--agent-context] [--json]
@@ -59,12 +65,20 @@ Rules:
 14. `messages` reads (and optionally clears with `--clear`) the message queue for an agent.
 15. `teardown` removes managed blocks from provider files, deletes `.agent-chorus/` directory, removes `.agent-chorus/` from `.gitignore`, and removes hook sentinels. `--dry-run` previews without changes. `--global` also removes `~/.cache/agent-chorus/`. The Claude Code plugin is NOT removed by teardown.
 16. `setup` creates `.agent-chorus/` scaffolding, injects managed blocks into CLAUDE.md/AGENTS.md/GEMINI.md, appends `.agent-chorus/` to `.gitignore`, and auto-installs the Claude Code skill plugin if the `claude` CLI is present. `--agent-context` runs `init` + `install-hooks`. Safe to re-run; idempotent unless `--force` is given.
-17. `doctor` checks: version, session directory availability, setup completeness, provider instruction wiring, session discoverability, context pack state, Claude Code plugin installation, and update status.
+17. `doctor` checks: version, session directory availability, setup completeness, provider instruction wiring, session discoverability (codex / claude / gemini / cursor-cli / cursor-app / hermes), context pack state, git hook state, env-override health, snippet / managed-block freshness, Claude Code plugin installation, and update status.
+18. **`read --history` (v0.16.0)**: takes one of `on-demand` (default), `none`, or `eager`. `on-demand` returns only the latest session for the cwd — chorus does NOT auto-pull prior sessions; consumers call `chorus list / timeline / search` explicitly when they need historical context. `none` is equivalent to `--metadata-only`. `eager` is reserved for a future multi-session merge; it currently behaves identically to `on-demand` AND pushes a warning into `warnings[]` so consumers cannot silently rely on it. Any other value MUST exit non-zero with `Invalid --history value: . Allowed: on-demand | none | eager.` on both runtimes.
+19. **`read` cwd-fallback contract (v0.16.0)**: when `--cwd ` was passed but no session matched and the adapter fell back to the latest session, the JSON output MUST set `cwd_mismatch: true` AND the fallback warning string MUST be mirrored to stderr prefixed with `chorus: `. `cwd_mismatch` is only present when the fallback fires — it MUST NOT be emitted as `false`. Schema: `schemas/read-output.schema.json`.
+20. **`--tool-calls` uniform NOT_AVAILABLE warning (v0.16.0)**: for agents whose transcript format does not carry tool-call structure (currently `gemini` and `hermes`), passing `--tool-calls` MUST run without error, MUST still set `included_tool_calls: true` (the flag was honored), and MUST push this exact warning into `warnings[]`: `--tool-calls has no effect for sessions: this agent's transcript format does not carry tool calls.` The phrasing is byte-identical between Node and Rust dispatch.
+21. **`--history` and `cwd_mismatch` are dual-runtime contract (v0.16.0)**: both fields and their semantics are gated by `scripts/conformance.sh`. Any change to the allowed `--history` values, the eager warning string, the cwd-fallback warning string, or the `cwd_mismatch` boolean emission rule requires updating both runtimes, both schemas (where relevant), and golden fixtures in the same PR.
+22. **Unknown-flag rejection (F11, v0.16.0)**: both runtimes MUST fail closed on unknown flags. Per-command allowlists live in `cli/src/main.rs` (clap) and `scripts/read_session.cjs:ALLOWED_FLAGS`. Unknown flags MUST exit non-zero with an error that names the offending flag and subcommand.
+23. **Search invariant (v0.16.0)**: for every adapter, `read(text) ⊆ search(tokens-from-text)` MUST hold. If `chorus read --agent ` returns content for a session, `chorus search --agent ` with tokens from that content MUST return that session. Enforced in `scripts/conformance.sh` for claude, codex, gemini, cursor (both CLI and IDE app surfaces), and hermes.
+24. **Cursor IDE app surface (v0.16.0)**: chorus reads Cursor sessions from BOTH `~/.cursor/projects//agent-transcripts//*.jsonl` (CLI surface) AND `~/.cursor/chats///store.db` (SQLite, IDE app surface). `chorus list --agent cursor` and `chorus search --agent cursor` entries carry a cursor-only `source: "cli" | "app"` string field; other agents' list/search entries MUST NOT emit this field. The Node CLI requires Node >= 22.5 for the IDE app surface (via `node:sqlite`); on older Node, the IDE surface is gracefully omitted rather than failing.
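Reviewer sketch, not part of the patch: rule 18 (`read --history`) can be illustrated in Python roughly as follows. The names `ALLOWED_HISTORY` and `resolve_history` are hypothetical and do not come from the chorus codebase. Only the three allowed values, the `on-demand` default, the `none` / `--metadata-only` equivalence, and the requirement that `eager` adds a warning are taken from the rule text; the warning and error wording below are placeholders, not the contract strings.

```python
# Hypothetical helper illustrating rule 18; not the chorus implementation.
ALLOWED_HISTORY = ("on-demand", "none", "eager")


def resolve_history(value=None):
    """Map a --history value to a mode, a metadata-only flag, and warnings.

    Raises ValueError for any value outside the allowed set; the real CLIs
    exit non-zero with a contract-defined message instead.
    """
    mode = "on-demand" if value is None else value
    if mode not in ALLOWED_HISTORY:
        # Placeholder wording, not the string gated by conformance.sh.
        raise ValueError(f"unsupported --history value: {mode!r}")

    warnings = []
    if mode == "eager":
        # eager currently behaves like on-demand but must never be silent.
        warnings.append("placeholder: eager is reserved; behaving as on-demand")

    return {
        "mode": mode,
        # Per the rule, `none` is equivalent to --metadata-only.
        "metadata_only": mode == "none",
        "warnings": warnings,
    }
```

Calling `resolve_history()` with no argument yields the `on-demand` mode with an empty warning list, while `resolve_history("eager")` yields exactly one warning, which is the property the rule cares about.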
 ## JSON Output Contract (`chorus read --json`)
 ```json
 {
+  "chorus_output_version": 1,
   "agent": "codex",
   "source": "/absolute/path/to/session-file",
   "content": "last assistant/model turn or fallback text",
@@ -73,6 +87,9 @@ Rules:
   "timestamp": "2026-02-08T15:30:00Z",
   "message_count": 10,
   "messages_returned": 1,
+  "included_roles": ["assistant"],
+  "included_tool_calls": false,
+  "cwd_mismatch": true,
   "warnings": [
     "Warning: no Codex session matched cwd /path; falling back to latest session."
   ]
@@ -80,12 +97,23 @@ Rules:
 ```
 Schema is defined in `schemas/read-output.schema.json`.
-`chorus list --json` and `chorus search --json` outputs are defined by `schemas/list-output.schema.json`.
+
+**v0.16.0 conditional fields on `read --json`:**
+
+| Field | Type | Emitted when |
+|---|---|---|
+| `included_roles` | `string[]` | `--include-user` was passed (otherwise omitted; assistant-only is the default). |
+| `included_tool_calls` | `boolean` | `--tool-calls` was passed. Set to `true` even for agents whose transcript format carries no tool calls (`gemini`, `hermes`) — the flag was honored even if the data isn't structurally available; a uniform warning is added to `warnings[]` in that case. |
+| `cwd_mismatch` | `boolean` (always `true` when present) | `--cwd ` was passed but no session matched and the adapter fell back to the latest session. NOT emitted as `false`; absence means "no fallback occurred". The matching warning is also mirrored to stderr prefixed with `chorus: `. |
+| `redactions` | `object[]` | `--audit-redactions` was passed. Each entry is `{pattern, count}`. |
+
+`chorus list --json` and `chorus search --json` outputs are defined by `schemas/list-output.schema.json`. Entries for `--agent cursor` carry an extra string field `source: "cli" | "app"` (cursor-only; absent for other agents) distinguishing the cursor-agent CLI transcript surface from the Cursor IDE `store.db` surface.
+
 Errors with `--json` are defined by `schemas/error.schema.json`.
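Reviewer sketch, not part of the patch: on the consumer side, the conditional `read --json` fields can be read defensively by treating absence as the documented default. The helper name `interpret_read_output` and the sample payload are hypothetical; only the field names and their presence rules come from the contract text above.

```python
import json


def interpret_read_output(raw):
    """Normalize the conditional fields of a `chorus read --json` payload."""
    data = json.loads(raw)
    return {
        # Omitted unless --include-user was passed; assistant-only is the default.
        "roles": data.get("included_roles", ["assistant"]),
        # Treated here as "not requested" when the field is absent.
        "tool_calls_requested": data.get("included_tool_calls", False),
        # Only ever present as true; absence means no cwd fallback occurred.
        "fell_back": data.get("cwd_mismatch", False),
        # Each entry is described as {pattern, count} when --audit-redactions is set.
        "redactions": data.get("redactions", []),
        "warnings": data.get("warnings", []),
    }


# Illustrative payload only; real output carries more fields.
sample = json.dumps({"agent": "codex", "cwd_mismatch": True, "warnings": ["fallback"]})
summary = interpret_read_output(sample)
```

A consumer that wants to refuse fallback data can check `summary["fell_back"]` before trusting `content`, rather than parsing the warning text.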
 `chorus search --json` results include a `match_snippet` field showing a ~120-character context window around the first match.
 `chorus report --json` outputs the coordinator report object defined by `schemas/report.schema.json`.
-`chorus report --handoff` consumes packets defined by `schemas/handoff.schema.json`.
+`chorus report --handoff` consumes packets defined by `schemas/handoff.schema.json`; the full handoff shape is also surfaced inline in `chorus report --help` as of v0.16.0.
 `chorus messages --json` outputs an array of message objects defined by `schemas/message.schema.json`.
 ## Agent-to-Agent Messaging
@@ -117,27 +145,31 @@ Implementations must redact likely secrets from returned content before printing
 - `CHORUS_CODEX_SESSIONS_DIR`
 - `CHORUS_GEMINI_TMP_DIR`
 - `CHORUS_CLAUDE_PROJECTS_DIR`
-- `CHORUS_CURSOR_DATA_DIR` (cursor-agent projects root; default `~/.cursor/projects`)
+- `CHORUS_CURSOR_DATA_DIR` (cursor-agent CLI projects root; default `~/.cursor/projects`)
+- `CHORUS_CURSOR_APP_DATA_DIR` (Cursor IDE chat-store root; default `~/.cursor/chats`; v0.16.0)
 - `CHORUS_HERMES_DATA_DIR` (provisional Hermes sessions root; default `~/.hermes/sessions`)
 - `CHORUS_SKIP_UPDATE_CHECK`
+Every `CHORUS_*` variable has a backward-compatible `BRIDGE_*` alias (e.g. `BRIDGE_CURSOR_APP_DATA_DIR`); when both are set, `CHORUS_*` wins. `chorus doctor` emits `env_override_dangling: warn` when any of these point at a non-existent directory.
+
 ## Doctor Contract
-`chorus doctor --json` may include:
-```json
-{
-  "update": {
-    "available": true,
-    "current": "0.7.0",
-    "latest": "0.7.1",
-    "checked_at": "2026-02-15T..."
-  },
-  "context_pack_state": {
-    "valid": true,
-    "last_modified": "..."
-  }
-}
-```
+`chorus doctor --json` returns `{ cwd, overall, checks: [...] }` where
+each `check` is `{ id, status, detail }`. As of v0.16.0, `status` is one
+of four values:
+
+| Severity | Meaning | Elevates `overall`? |
+|---|---|---|
+| `pass` | Check passed. | no |
+| `info` | Informational state — typically "this feature is intentionally not configured" (e.g. Hermes not installed; cwd is not a git repo). | **no** |
+| `warn` | Soft failure — misconfigured but install still works. | yes → `overall: warn` |
+| `fail` | Hard failure — install is broken or an adapter errored. | yes → `overall: fail` |
+
+`overall` is computed as `fail` if any check is `fail`, else `warn` if any check is `warn`, else `pass`. `info` does NOT elevate `overall`.
+
+The exit code is `0` when `overall ∈ {pass, warn}` and non-zero when `overall == fail`. Tooling that wants to catch all non-perfect states should compare `overall != "pass"`.
+
+The check IDs (v0.16.0) include: `version`, `codex_sessions_dir`, `claude_projects_dir`, `gemini_tmp_dir`, `setup_intents`, `snippet_`, `integration_`, `sessions_codex`, `sessions_claude`, `sessions_gemini`, `sessions_cursor_cli` (replaces `sessions_cursor`), `sessions_cursor_app` (new), `sessions_hermes` (now downgrades to `info` when data dir absent), `env_override_dangling` (new), `snippet__stale` (new — pre-v0.16.0 history-contract snippet), `integration__stale` (new — pre-v0.16.0 history-contract managed block), `context_pack_state`, `context_pack_guidance`, `context_pack_hooks_path` (now reports `info` when cwd is not a git repo), `context_pack_pre_push` (same — `info` outside a git repo), `update_status`, and `claude_plugin`. See `docs/CLI_REFERENCE.md` for the full catalogue with per-check severity rules.
 ## Conformance
-Any release must pass `scripts/conformance.sh`, which runs both implementations against shared fixtures and verifies equivalent JSON output for `read`, `compare`, `report`, `list`, `search`, `diff`, `relevance`, `send`, `messages`, and `teardown`.
+Any release must pass `scripts/conformance.sh`, which runs both implementations against shared fixtures and verifies equivalent JSON output for `read`, `compare`, `report`, `list`, `search`, `diff`, `summary`, `timeline`, `relevance`, `send`, `messages`, `checkpoint`, `setup`, `doctor`, and `teardown`. v0.16.0 added a `search-read-parity` gate that enforces `read(text) ⊆ search(tokens-from-text)` for every adapter (claude, codex, gemini, cursor-cli, cursor-app, hermes).
diff --git a/README.md b/README.md
index fe9c971..b3f6abd 100644
--- a/README.md
+++ b/README.md
@@ -2,14 +2,14 @@
 ![CI Status](https://github.com/cote-star/agent-chorus/actions/workflows/ci.yml/badge.svg)
 ![License](https://img.shields.io/badge/license-MIT-blue.svg)
-![Version](https://img.shields.io/badge/version-0.14.1-green.svg)
+![Version](https://img.shields.io/badge/version-0.16.0-green.svg)
 [![Star History](https://img.shields.io/github/stars/cote-star/agent-chorus?style=social)](https://github.com/cote-star/agent-chorus)
 **Let your AI agents talk about each other.** Ask one agent what another is doing, and get an evidence-backed answer. No copy-pasting, no tab-switching, no guessing.
-> If you use 2+ AI coding agents (Codex, Claude, Gemini, Cursor), Chorus gives them shared visibility — no orchestrator required.
+> If you use 2+ AI coding agents (Codex, Claude, Gemini, Cursor CLI, Cursor IDE), Chorus gives them shared visibility — no orchestrator required.
 ![Before/after workflow](docs/silo-tax-before-after.webp)
@@ -35,405 +35,127 @@ Three agents working on checkout. You ask Codex what the others are doing.
![Status Check Demo](docs/demo-status.webp) -### What You Get Back +### From Zero to a Working Query -Every response is structured, source-tracked, and redacted: - -```bash -chorus read --agent codex --include-user --json -``` - -```json -{ - "agent": "codex", - "session_id": "session-abc123", - "content": "USER:\nInvestigate the auth regression...\n---\nASSISTANT:\nI am tracing the auth middleware...", - "timestamp": "2026-03-12T10:30:00Z", - "message_count": 12, - "source": "/home/user/.codex/sessions/2026/03/12/session-abc123.jsonl" -} -``` - -Source file, session ID, and timestamp on every response. Secrets auto-redacted before output. - -Prefer markdown over JSON for human-facing output: - -```bash -chorus read --agent codex --include-user --format markdown -``` +`chorus setup` wires every agent on the box in under a minute. -Full JSON schema and field reference: [`docs/CLI_REFERENCE.md`](./docs/CLI_REFERENCE.md) +![Setup Demo](docs/demo-setup.webp) ## Quick Start -### 1. Install - ```bash -npm install -g agent-chorus # requires Node >= 18 +# 1. Install +npm install -g agent-chorus # requires Node >= 18 # or -cargo install agent-chorus # requires Rust >= 1.74 -``` +cargo install agent-chorus # requires Rust >= 1.74 -### 2. Setup +# 2. Wire your agents +chorus setup # patches CLAUDE.md / GEMINI.md / AGENTS.md, adds .gitignore entries +chorus doctor # verify session paths, provider wiring, updates -```bash -chorus setup -chorus doctor # Check session paths, provider wiring, and updates +# 3. Ask any agent in natural language +# "What is Claude doing?" / "Compare Codex and Gemini outputs." / "Pick up where Gemini left off." ``` -`setup` also appends `.agent-chorus/` to `.gitignore` automatically and, if the `claude` CLI is present, installs the Agent Chorus Claude Code plugin. 
- -From zero to a working skill query in under a minute: - -![Setup Demo](docs/demo-setup.webp) - -This wires skill triggers into your agent configs (`CLAUDE.md`, `GEMINI.md`, `AGENTS.md`) so agents know how to use chorus. - -To cleanly reverse everything setup does (managed blocks, scaffolding, hooks): +Or call chorus directly: ```bash -chorus teardown # reverse setup for this project -chorus teardown --global # also remove ~/.cache/agent-chorus/ +chorus read --agent codex --include-user --json ``` -### Claude Code integration (optional hook) - -Wire a `SessionEnd` hook so Claude Code automatically broadcasts its -state to every other agent's inbox when a session ends — even on crash -or force-close. Add to `~/.claude/settings.json`: +Every response is structured, source-tracked, and redacted: ```json { - "hooks": { - "SessionEnd": [{ - "hooks": [ - { - "type": "command", - "command": "bash /absolute/path/to/agent-chorus/scripts/hooks/chorus-session-end.sh", - "timeout": 10 - } - ] - }] - } + "agent": "codex", + "session_id": "session-abc123", + "content": "USER:\nInvestigate the auth regression...\n---\nASSISTANT:\nI am tracing the auth middleware...", + "timestamp": "2026-06-02T10:30:00Z", + "message_count": 12, + "source": "/home/user/.codex/sessions/2026/06/02/session-abc123.jsonl" } ``` -The script delegates to `chorus checkpoint --from claude` and no-ops -silently on projects without `.agent-chorus/`, so it is safe to install -globally. Full wiring details: -[`docs/session-handoff-guide.md`](./docs/session-handoff-guide.md). - -### 3. Ask - -Tell any agent: - -> "What is Claude doing?" -> "Compare Codex and Gemini outputs." -> "Pick up where Gemini left off." +Source file, session ID, and timestamp on every response. Secrets auto-redacted before output. Prefer `--format markdown` for human review. -The agent runs chorus commands behind the scenes and gives you an evidence-backed answer. 
+To reverse everything `setup` did: `chorus teardown` (add `--global` to also drop `~/.cache/agent-chorus/`). -
Session selection behavior +## What's New in v0.16.0 -After `chorus setup`, provider instructions follow this behavior: +- **Cursor IDE adapter.** Chorus now reads both the `cursor-agent` CLI transcripts *and* Cursor IDE app sessions through one adapter. If you use the Cursor app, your sessions are now first-class. +- **`--history=on-demand` default.** `chorus read` now returns just the latest session for the current `cwd`. Closes the 2.5x token-inflation issue measured in the v0.15 field study. Provider snippets carry the contract so consumer agents inherit it automatically. +- **`cwd_mismatch` is now explicit.** When `--cwd` matches no session, the output says so. No more silent fallbacks that read like real data. +- **Doctor honesty pass.** New `info` severity, env-var dangling-path detection, git-aware hooks checks, stale-snippet detection. Doctor tells the truth or stays quiet. +- **Codex search parity fix.** `chorus search --agent codex` no longer silently returns empty. The `read ⊆ search` invariant is now enforced for every adapter. +- **`--help` overhaul.** Per-subcommand help leads with that subcommand. `chorus report --help` ships a copy-pasteable handoff JSON schema. -- If no session is specified, read the latest session in the current project. -- "past session" / "previous session" means one session before latest. -- "last N sessions" includes latest. -- "past N sessions" excludes latest (older N sessions). -- Ask for a session ID only if initial fetch fails or exact ID is explicitly requested. - -
+Full changelog and upgrade notes: [`RELEASE_NOTES.md`](./RELEASE_NOTES.md). ## How It Works -1. **Ask naturally** - "What is Claude doing?" / "Did Gemini finish the API?" -2. **Agent runs chorus** - Your agent calls `chorus summary`, `chorus read`, `chorus timeline`, `chorus compare`, `chorus search`, `chorus diff`, `chorus send`, `chorus messages`, `chorus checkpoint`, etc. behind the scenes. -3. **Evidence-backed answer** - Sources cited, divergences flagged, no hallucination. +1. **Ask naturally** — "What is Claude doing?" / "Did Gemini finish the API?" +2. **Your agent runs chorus** — `chorus summary`, `read`, `timeline`, `compare`, `search`, `diff`, `send`, `messages`, `checkpoint`, etc. +3. **Evidence-backed answer** — sources cited, divergences flagged, no hallucination. **Tenets:** -- **Local-first** - reads directly from agent session logs on your machine. No data leaves. -- **Evidence-based** - every claim tracks to a specific source session file. -- **Privacy-focused** - automatically redacts API keys, tokens, and passwords. -- **Dual parity** - ships Node.js + Rust CLIs with identical output contracts. - -## Real-World Recipes +- **Local-first** — reads agent session logs directly on your machine. No data leaves. +- **Evidence-based** — every claim tracks to a specific source session file. +- **Privacy-focused** — auto-redacts API keys, tokens, and passwords. +- **Dual parity** — Node.js + Rust CLIs ship identical output contracts, conformance-tested against shared fixtures. -### Quick Status Check +## Key Capabilities -What is Claude working on right now? Get a structured digest — files touched, tools used, duration — without reading the full session. No LLM calls. +A taste — see [`docs/CLI_REFERENCE.md`](./docs/CLI_REFERENCE.md) for the full surface. ```bash +# Structured digest — files, tools, duration. No LLM calls. chorus summary --agent claude --cwd . 
--json -``` - -### Cross-Agent Timeline -See what every agent did across your project, in chronological order. - -```bash +# Chronological view across every agent on the project chorus timeline --cwd . --format markdown -``` - -### Tool Call Forensics - -See exactly which files an agent read and edited — not just what it said. -```bash +# What an agent actually touched (Read/Edit/Bash/Write) chorus read --agent codex --tool-calls --json -``` - -### Handoff Recovery - -Gemini crashed mid-task. Claude picks up where it left off, with full context. - -```bash -chorus read --agent gemini --cwd . --include-user --json -``` - -### Cross-Agent Verification - -Codex says it fixed the payment bug. Verify against Claude's analysis before deploying. -```bash +# Verify one agent's claim against another chorus compare --source codex --source claude --cwd . --json -``` - -### Security Audit -Check what secrets appeared in agent sessions and were redacted. - -```bash +# Audit what got redacted, and why chorus read --agent claude --audit-redactions --json -``` - -### Agent Coordination - -Tell Codex the auth module is ready — without switching tabs. -```bash -chorus send --from claude --to codex --message "auth module ready for review" --cwd . +# Coordinate without switching tabs +chorus send --from claude --to codex --message "auth module ready" --cwd . chorus messages --agent codex --cwd . --json -``` -### Session Handoff - -Broadcast where you left off to every other agent before ending a -session. Auto-composes branch, uncommitted-file count, and last commit; -override with `--message` when you want custom text. - -```bash +# Broadcast where you left off before ending a session chorus checkpoint --from claude --cwd . -chorus checkpoint --from codex --message "auth refactor half-done; types still broken" --cwd . ``` -Pair with `chorus messages --agent --clear` at standup to read -and drain the inbox other agents left you. 
Full protocol: -[`docs/session-handoff-guide.md`](./docs/session-handoff-guide.md). - -## Supported Agents - -Full multi-agent coverage. No other tool matches this breadth across 4 agents and 12 capabilities. +Supported agents: **Codex, Claude, Gemini, Cursor CLI, Cursor IDE.** Full capability matrix in [`docs/CLI_REFERENCE.md`](./docs/CLI_REFERENCE.md). -| Feature | Codex | Gemini | Claude | Cursor | -| :-------------------- | :---: | :----: | :----: | :----: | -| **Read Content** | Yes | Yes | Yes | Yes | -| **Session Summary*** | Yes | Yes | Yes | Yes | -| **Timeline*** | Yes | Yes | Yes | Yes | -| **Auto-Discovery** | Yes | Yes | Yes | Yes | -| **CWD Scoping** | Yes | No | Yes | No | -| **List Sessions** | Yes | Yes | Yes | Yes | -| **Search** | Yes | Yes | Yes | Yes | -| **Comparisons** | Yes | Yes | Yes | Yes | -| **Session Diff** | Yes | Yes | Yes | Yes | -| **Redaction Audit** | Yes | Yes | Yes | Yes | -| **Messaging** | Yes | Yes | Yes | Yes | -| **Session Handoff** | Yes | Yes | Yes | Yes | - -*\*Added in v0.11.0. As of v0.13.0, all four features have Rust parity and are conformance-tested against shared golden fixtures. `--tool-calls` on Gemini and Cursor is a no-op in both runtimes — those adapters don't yet parse a tool-call schema.* - -Both Node.js and Rust implementations pass identical conformance tests against shared fixtures for every command listed above. - -## Key Capabilities - -### Session Summary - -Structured session digest — files touched, tools used, duration — without reading the full content. No LLM calls required. - -```bash -chorus summary --agent claude --cwd . --json -``` - -### Cross-Agent Timeline - -Chronological view interleaving sessions from multiple agents. See what happened across your entire project. - -```bash -chorus timeline --cwd . --agent claude --agent codex --limit 5 --json -``` - -### Tool Call Visibility - -Surface every `Read`, `Edit`, `Bash`, and `Write` call an agent made — not just the text it produced. 
- -```bash -chorus read --agent codex --tool-calls --json -``` - -### Markdown Output - -Render any read, summary, or timeline as formatted markdown instead of JSON. Useful for demos, docs, and human review. - -```bash -chorus summary --agent claude --format markdown -``` - -### Session Diff - -Compare two sessions from the same agent with line-level precision. - -```bash -chorus diff --agent codex --from session-abc --to session-def --cwd . --json -``` - -### Redaction Audit Trail - -See exactly what was redacted and why in any session read. - -```bash -chorus read --agent claude --audit-redactions --json -``` - -### Agent-to-Agent Messaging - -Agents leave messages for each other through a local JSONL queue. - -```bash -chorus send --from claude --to codex --message "auth module ready for review" --cwd . -``` - -### Session Handoff Protocol - -Standup + conclude rituals so cross-agent messaging actually gets used. -At standup, drain the inbox. At conclude, either `chorus send` a -targeted note or `chorus checkpoint` a state broadcast to every other -agent. - -```bash -# At standup — read and drain -chorus messages --agent claude --clear --cwd . - -# At conclude — broadcast state -chorus checkpoint --from claude --cwd . -``` - -`chorus checkpoint` is idempotent and no-ops silently when -`.agent-chorus/` is absent. For Claude Code, wire -`scripts/hooks/chorus-session-end.sh` into `~/.claude/settings.json` so -an abrupt session end still leaves a checkpoint. Gemini sessions stored -as protobuf (`.pb`) fall back via `--chats-dir` — see -[`docs/session-handoff-guide.md`](./docs/session-handoff-guide.md) for -the full recipe. - -### Relevance Introspection +## Context Pack -Inspect and test the agent-context filtering patterns that decide which files matter. +A context pack is an agent-first, token-efficient repo briefing for end-to-end understanding tasks. 
Instead of re-reading the full repository on every request, agents start from `.agent-context/current/` and open project files only when needed. Local-first, no need to make your repo public. ```bash -chorus relevance --list --cwd . # Show current include/exclude patterns -chorus relevance --test src/main.rs --cwd . # Test if a file matches +chorus agent-context init # creates .agent-context/current/ with templates +# ...agent fills in the sections... +chorus agent-context seal # validates and locks the pack ``` -Full flag reference and JSON output schemas: [`docs/CLI_REFERENCE.md`](./docs/CLI_REFERENCE.md) +Ask your agent: *"Understand this repo end-to-end using the context pack first, then deep dive only where needed."* -## How It Compares +![Context Pack Read-Order](docs/cold-start-agent-context-hero.webp) -| | agent-chorus | CrewAI / AutoGen | ccswarm / claude-squad | -| :--- | :---: | :---: | :---: | -| **Approach** | Read-only evidence layer | Full orchestration framework | Parallel agent spawning | -| **Install** | `npm i -g agent-chorus` or `cargo install` | pip + ecosystem | git clone | -| **Agents** | Codex, Claude, Gemini, Cursor | Provider-specific | Usually Claude-only | -| **Dependencies** | Zero npm prod deps | Heavy Python/TS stack | Moderate | -| **Privacy** | Local-first, auto-redaction | Cloud-optional | Varies | -| **Session summaries** | Built-in (no LLM) | None | None | -| **Cross-agent timeline** | Built-in | None | None | -| **Markdown output** | Built-in | N/A | None | -| **Cold-start solution** | Context Pack (5-doc briefing) | None | None | -| **Language** | Node.js + Rust (conformance-tested) | Python or TypeScript | Single language | -| **Agent messaging** | Built-in JSONL queue | Framework-specific | None | -| **Session handoff protocol** | Built-in checkpoint + hook | None | None | -| **Philosophy** | Visibility first, orchestration optional | Orchestration first | Task spawning | +CI gate: `chorus agent-context verify --ci` exits 
non-zero if the pack is stale or corrupt. Internals, sync policy, enforcement: [`AGENT_CONTEXT.md`](./AGENT_CONTEXT.md). -## Architecture +## Architecture, in one diagram -Chorus sits between your agent and other agents' session logs. The workflow is evidence-first: one agent reads another agent's session evidence and continues with a local decision, without a central control plane. +Chorus sits between your agent and other agents' session logs. Read-only, evidence-first, no central control plane. ![Claude to Codex handoff via read-only evidence](docs/orchestrator-handoff-flow.svg) -```mermaid -sequenceDiagram - participant User - participant Agent as Your Agent (Codex, Claude, etc.) - participant Chorus as chorus CLI - participant Sessions as Other Agent Sessions - - User->>Agent: "What is Claude doing?" - Agent->>Chorus: chorus read --agent claude --include-user --json - Chorus->>Sessions: Scan ~/.claude/projects/*.jsonl - Sessions-->>Chorus: Raw session data - Chorus->>Chorus: Redact secrets, format - Chorus-->>Agent: Structured JSON - Agent-->>User: Evidence-backed natural language answer -``` - -
Diagram not rendering? View as image - -![Architecture sequence diagram](docs/architecture.svg) - -
- -### Current Boundaries - -- No orchestration control plane: no task router, scheduler, or work queues. -- No autonomous agent chaining by default; handoffs are human-directed. -- No live synchronization stream; reads are snapshot-based from local session logs. - -## Context Pack - -A context pack is an agent-first, token-efficient repo briefing for end-to-end understanding tasks. -Instead of re-reading the full repository on every request, agents start from `.agent-context/current/` and open project files only when needed. -This works the same for private repositories: the pack is local-first and does not require making your code public. - -- `5` ordered docs + `manifest.json` (compact index, not a repo rewrite). -- Deterministic read order: `00` -> `10` -> `20` -> `30` -> `40`. -- Agent-maintained in the intended workflow; verify with `chorus agent-context verify`. -- CI gate available: `chorus agent-context verify --ci` for PR freshness checks. -- Local recovery snapshots with rollback support. - -```bash -# Recommended workflow: -chorus agent-context init # Creates .agent-context/current/ with templates -# ...agent fills in sections... -chorus agent-context seal # Validates content and locks the pack - -# Manual rebuild (backward-compatible wrapper) -chorus agent-context build - -# Install pre-push hook (advisory-only check on main push) -chorus agent-context install-hooks -``` - -Ask your agent explicitly: - -> "Understand this repo end-to-end using the context pack first, then deep dive only where needed." - -![Context Pack Read-Order](docs/cold-start-agent-context-hero.webp) - -![Context Pack Demo](docs/demo-agent-context.webp) - -CI gate available: `chorus agent-context verify --ci` exits non-zero if the pack is stale or corrupt — wire it into your PR checks. 
- -Full agent-context internals, sync policy, layered model, and enforcement details: [`AGENT_CONTEXT.md`](./AGENT_CONTEXT.md) +Boundaries: no task router, no scheduler, no autonomous chaining, no live sync stream. Snapshot-based reads from local logs, by design. ## Easter Egg @@ -441,21 +163,16 @@ Full agent-context internals, sync policy, layered model, and enforcement detail ![Trash Talk Demo](docs/demo-trash-talk.webp) -## Roadmap - -- **Context Pack customization** - user-defined doc structure, custom sections, team templates. -- **Windows installation** - native Windows support (currently macOS/Linux). -- **Cross-agent context sharing** - agents share context snippets (still read-only, still local). - ## Go Deeper | If you need... | Go here | | :--- | :--- | | Full command syntax and JSON outputs | [`docs/CLI_REFERENCE.md`](./docs/CLI_REFERENCE.md) | -| Agent-context internals and policy details | [`AGENT_CONTEXT.md`](./AGENT_CONTEXT.md) | -| Protocol and schema contract details | [`PROTOCOL.md`](./PROTOCOL.md) | -| Contributing or extending the codebase | [`docs/DEVELOPMENT.md`](./docs/DEVELOPMENT.md) / [`CONTRIBUTING.md`](./CONTRIBUTING.md) | +| Adapter formats, schema contracts, redaction rules | [`PROTOCOL.md`](./PROTOCOL.md) | +| Session handoff protocol, hooks, Gemini `.pb` fallback | [`docs/session-handoff-guide.md`](./docs/session-handoff-guide.md) | +| Agent-context internals and policy | [`AGENT_CONTEXT.md`](./AGENT_CONTEXT.md) | | Release-level changes and upgrade notes | [`RELEASE_NOTES.md`](./RELEASE_NOTES.md) | +| Contributing or extending the codebase | [`docs/DEVELOPMENT.md`](./docs/DEVELOPMENT.md) / [`CONTRIBUTING.md`](./CONTRIBUTING.md) | --- diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md index bb13b13..0ced763 100644 --- a/RELEASE_NOTES.md +++ b/RELEASE_NOTES.md @@ -1,5 +1,91 @@ # Release Notes +## v0.16.0 — 2026-06-03 + +**UAT-driven hardening release. 
Cursor IDE (SQLite) is now a first-class adapter alongside the v0.15.0 cursor-agent CLI surface, the read/search/doctor contracts are tightened so silent wrong-answer modes are eliminated, and the on-demand history contract is made explicit so consumer agents stop paying a 2.5× token tax for eager prior-session reads.** + +The release closes every PRIO 1 item from the v0.15.0 UAT (`research/uat-cli-features-2026-06-03.md`) plus the Tier-1/2/3 follow-ups surfaced by the independent UAT replay (`research/uat-replay-followups-2026-06-03.md`) and one further round (R2) of fresh-context review defects. Two breaking changes (a doctor JSON id split, a behavior tightening on the Rust cwd matcher) and one new warning on previously-silent paths; everything else is additive. Conformance covers all six adapter surfaces (claude, codex, gemini, cursor CLI, cursor IDE app, hermes) at Node↔Rust byte parity. + +### Highlights + +- **Cursor IDE (app) is a real adapter.** `chorus list/read/search/timeline --agent cursor` now merges the SQLite store (`~/.cursor/chats///store.db`) with the v0.15.0 cursor-agent CLI JSONL surface into one unified cursor view. The workspace path is recovered from the first user-role message header. Tool-call rendering, redaction, and Node/Rust parity are all available on day one. +- **`--tool-calls` is uniform across every adapter.** Cursor IDE renders `[TOOL: ...]` blocks like claude/codex/cursor-cli. Gemini and hermes, whose on-disk format does not carry tool calls, now emit an explicit `NOT_AVAILABLE` warning instead of a silent no-op. `included_tool_calls: true` still gets set, but consumers can no longer misread the silence as "this session had no tools". +- **`chorus read` defaults to `--history=on-demand`.** The flag is new; the behavior is contracted. Chorus does not eagerly pull prior sessions for the cwd: the field study (`context-pack-field-findings-2026-03-20.md` Finding 3) measured a 2.5× token inflation when agents looped over history at session start.
`--history=eager` is reserved (currently behaves like on-demand and warns); `--history=none` is metadata-only. +- **Codex search returns results again.** The codex search extractor walked a schema no real codex session has ever used (`{role:"assistant", content:"..."}` instead of the actual nested `response_item`/`event_msg` envelopes), so every `chorus search --agent codex ` returned `[]`. Both runtimes now share the `parse_codex_jsonl` shape, and a new conformance invariant — `read(text) ⊆ search(text-tokens)` — enforces the contract for claude, codex, gemini, cursor CLI, and cursor IDE app. +- **Doctor stops contradicting itself and stops lying about the local install.** `info` severity replaces `warn` for optional/absent features, the hooks-path and pre-push checks share one truth source, both are git-aware (a non-git cwd no longer reports a pre-push hook as installed), and a new `env_override_dangling` warning fires when a `CHORUS_*_DIR` / `BRIDGE_*_DIR` env var points at a path that does not exist. + +### What's new + +#### Adapter coverage + +- **N1 — Cursor IDE (app) SQLite adapter.** New `cli/src/cursor_app.rs` (Rust) and `scripts/adapters/cursor_app.cjs` (Node) read SHA-256 content-addressed message blobs from `store.db` and merge into the existing cursor `list/read/search/timeline` paths. Workspace path is recovered from a `Workspace Path:` header in the first user-role message (mirroring cursor-cli's `.workspace-trusted` recovery). Doctor splits the previous `sessions_cursor` check into `sessions_cursor_cli` and `sessions_cursor_app` so the two surfaces are diagnosed independently; warning is raised only when both are empty. Node uses the built-in `node:sqlite` (Node ≥ 22.5) with a graceful fallback that leaves the surface invisible on older runtimes — the CLI/JSONL surface stays accessible regardless. +- **N6 — `--tool-calls` parity.** Cursor IDE renders `[TOOL: ]` blocks via the shared claude content extractor. 
Gemini and hermes (whose transcripts have no tool-call concept) now emit a uniform `"--tool-calls has no effect for sessions: this agent's transcript format does not carry tool calls."` warning. Both runtimes share an `AGENTS_WITHOUT_TOOL_CALLS` / `agent_has_no_tool_calls()` set so the policy is centrally enforced. +- **N2 — Codex search extractor parity with read.** Search now handles both real codex envelopes (`response_item` with nested `payload.content[]` and `event_msg` with `payload.message`) via the same `parse_codex_jsonl` shape used by `read`. Closes the silent-empty `chorus search --agent codex` regression that has been latent since the original codex adapter shipped. + +#### Read contract + +- **N7 — `--history=on-demand|none|eager`** on `chorus read`. `on-demand` (default) returns only the latest session for the cwd, no auto-recall. `none` is metadata-only (alias for `--metadata-only`). `eager` is reserved for a future multi-session merge and currently behaves identically to on-demand with an explicit warning so consumers cannot silently depend on a behavior chorus does not yet implement. Mirrored across Rust + Node; invalid values fail closed. +- **F1 — `cwd_mismatch` structured field.** When `--cwd` is passed but no session matches, every adapter that falls back to the latest session (codex, claude, cursor CLI, cursor IDE app, hermes) now emits `cwd_mismatch: true` on the read output and echoes a `chorus: ...` warning to stderr. JSON-only consumers can detect the fallback without parsing warning strings; humans piping stdout still see the warning. Gemini does not scope by absolute cwd so the field is never set for it. Schema added as optional boolean in `schemas/read-output.schema.json`; existing goldens unaffected. +- **F4 — `read(text) ⊆ search(text-tokens)` invariant.** The conformance harness now asserts the parity for claude, codex, gemini, cursor CLI, and cursor IDE app. 
Each fixture carries a distinctive assistant string the test searches for; deliberately breaking any extractor flips the corresponding case red. Codex was the only adapter failing pre-fix; the invariant guards every adapter going forward. + +#### Doctor honesty + +- **N3 — `info` severity + hooks-path reconciliation.** `info` is a real fourth severity now (alongside `pass`/`warn`/`fail`). `integration_*`, `snippet_*`, and `setup_intents` emit `info` when the repo intentionally has not run `chorus setup` (detected via `INTENTS.md` or `providers/` presence — the bare `.agent-chorus/messages/` directory created on first `send` does not count). `context_pack_hooks_path` is now informational and reports the effective hooks path; `context_pack_pre_push` is the single authority on whether the hook is installed. +- **F2 — `env_override_dangling`.** A new warn-severity check enumerates `CHORUS_*_DIR` and `BRIDGE_*_DIR` env vars and flags any that point at non-existent directories. Surfaces the silent-partial-coverage failure mode where, for example, `CHORUS_CURSOR_DATA_DIR` left over from the v0.14.x bridge era would hide the CLI surface while the app surface appears healthy. +- **F3 — git-aware hooks checks.** `context_pack_hooks_path` and `context_pack_pre_push` now guard on `git rev-parse --git-dir` succeeding in the target cwd. On a non-git cwd they report `info: cwd is not a git repository` instead of inheriting the user's global `core.hooksPath` and claiming a hook is installed in a directory that has no `.git/`. +- **F12 — optional-adapter absence is `info`.** `sessions_cursor_cli`, `sessions_cursor_app`, and `sessions_hermes` report `info` when the data directory is absent (adapter not installed) and reserve `warn` for "directory exists but no sessions" (installed but quiet). Same pattern N3 used for integration/snippet/intents. 
+- **R2 — stale-snippet detection.** Users who ran `chorus setup` before v0.16.0 will not have the on-demand history contract in their provider snippets or managed blocks, and `chorus setup` without `--force` silently leaves them outdated. New `snippet__stale` and `integration__stale` checks probe for the load-bearing "History contract" phrase and emit `warn` with an explicit `Run \`chorus setup --force\` to refresh.` remediation when absent. + +#### Help & docs + +- **N4 — per-subcommand `--help` leads with that subcommand.** `chorus --help` now emits the subcommand's usage, options, and examples first; the global blob is reserved for top-level `chorus --help` / `chorus help`. `chorus report --help` carries the full handoff JSON schema with field annotations plus a copy-pasteable minimal example that loads without `INVALID_HANDOFF`. `chorus messages --help` documents `--clear` above the fold. `chorus doctor --help` documents the four severity levels and the overall-elevation rule from N3. +- **F5 — provider snippets carry the on-demand history contract.** The managed-block template (`CLAUDE.md` / `AGENTS.md` / `GEMINI.md`) and the provider snippet template (`.agent-chorus/providers/.md`) now spell out the `--history=on-demand` default, the 2.5× inflation finding, and the reserved-status of `--history=eager`. The managed block also gained the missing support-command list (`diff`, `audit-redactions`, `relevance`, `send`, `messages`) so a regenerated block does not lose richer hand-authored content. +- **F9 — Rust `--help` parity.** Rust's `report --help` and `doctor --help` now carry the same handoff schema and severity block Node does. Clap was previously collapsing the multiline doc comments; both subcommands now use `long_about` with explicit formatting. 
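The stale checks described above probe provider snippets and managed blocks for the "History contract" phrase that F5 adds. A minimal Rust sketch of that probe follows. The `StaleCheck` struct, the `check_snippet_stale` name, and the `pass` result for fresh content are assumptions made for illustration; only the probed phrase, the `warn` severity, and the remediation text come from these notes.

```rust
/// Phrase the release notes describe as load-bearing for the stale checks.
const HISTORY_CONTRACT_MARKER: &str = "History contract";

/// Hypothetical result shape; the real doctor check structure is not shown here.
#[derive(Debug, PartialEq)]
struct StaleCheck {
    status: &'static str,
    detail: &'static str,
}

/// Reports `warn` when the snippet text lacks the history-contract phrase.
/// Returning `pass` for fresh content is an assumption, not a documented rule.
fn check_snippet_stale(content: &str) -> StaleCheck {
    if content.contains(HISTORY_CONTRACT_MARKER) {
        StaleCheck {
            status: "pass",
            detail: "History contract present.",
        }
    } else {
        StaleCheck {
            status: "warn",
            detail: "Run `chorus setup --force` to refresh.",
        }
    }
}
```

A plain substring probe keeps the check cheap and independent of the surrounding template wording, at the cost of passing any file that merely mentions the phrase.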
+ +#### Hygiene + +- **F10 — `node:sqlite` experimental warning suppressed.** Every chorus invocation that loaded the cursor adapter previously emitted `(node:NNNNN) ExperimentalWarning: SQLite is an experimental feature ...` to stderr, corrupting naive stderr-capture tooling. A scoped warning listener at module load drops only the SQLite Experimental category and forwards everything else. +- **F11 — Node CLI rejects unknown flags.** A typo like `chorus list --Json` or `chorus read --limt 3` previously fell through silently and produced default-behavior output; Rust (clap) already failed closed. Node now drives a `validateFlags()` pass against an `ALLOWED_FLAGS` map per subcommand and errors with `Unknown flag for '': --. Run \`chorus --help\` to see allowed flags.` (R2 extended the allow-list to cover `trash-talk` and `agent-context`.) +- **F13 — `cli/src/cursor_app.rs` dead-code warnings cleared.** `CursorAppSession::{name, mode, created_at_ms}` and `find_session_db` are reserved for a future verbose-listing surface; tagged `#[allow(dead_code)]` with comments explaining the intent. +- **R2 — Rust `cwd_matches_project` matcher fix.** The hierarchical matcher used `Path::starts_with`, which is component-aware and treats root `/` as a prefix of every absolute path. A session whose recorded cwd was `/` would silently match every `--cwd ` and short-circuit the cwd_mismatch fallback in Rust+codex specifically. Aligned with Node's algorithm: string-based with an explicit trailing `/` separator. Any-path sessions no longer act as wildcards. +- **R2 — Rust setup `relative_path` symlink fix.** `chorus setup --cwd /tmp/...` on macOS wrote `Provider snippet: ../../../tmp/.../providers/.md` instead of the expected `.agent-chorus/providers/.md`. Root cause: `canonicalize()` resolved `/tmp` → `/private/tmp` for `base` but left `target` literal because the snippet file did not exist yet. 
New `canonicalize_via_parent()` walks up to the nearest existing ancestor and re-appends the missing tail so both paths share a canonical prefix. + +### Breaking changes + +- **`sessions_cursor` doctor check split.** Replaced by `sessions_cursor_cli` (cursor-agent CLI / JSONL surface) and `sessions_cursor_app` (Cursor IDE / SQLite surface). Consumers parsing `chorus doctor --json` by `id` need to update to read both. The combined "cursor adapter healthy" signal is now `sessions_cursor_cli.status == "pass" OR sessions_cursor_app.status == "pass"`. +- **`--tool-calls` on gemini/hermes emits a warning.** Previously silent no-op; now emits a uniform `NOT_AVAILABLE`-style warning in `warnings[]`. Consumers strictly comparing the warnings array to a baseline will see new entries on these adapters. +- **Rust `cwd_matches_project` no longer treats `cwd: "/"` as a wildcard.** A session whose recorded cwd was root would previously match every `--cwd` filter in the Rust runtime. Real bug fix — sessions recorded at the filesystem root are extraordinarily rare and almost certainly indicate a malformed record — but a consumer relying on the broken-matcher behavior will see different list/read results. Node was always correct here. +- **New `cwd_mismatch: true` field on `read` output.** Optional, only present on fallback. Consumers using strict-schema validators may need to allow the additional field; `schemas/read-output.schema.json` is updated. +- **History-contract preamble at top of managed block.** `chorus setup` (and `--force` refresh) now writes the History contract section above the routing bullets in `CLAUDE.md` / `AGENTS.md` / `GEMINI.md`. Consumer agents reading the file encounter the on-demand rule before the command examples — placement reflects that violation is the most expensive failure mode in the field study. + +### Upgrade notes + +- Run `chorus setup --force` to refresh the provider snippets and managed blocks with the new on-demand history contract. 
Until you do, `chorus doctor` will surface `snippet__stale` / `integration__stale` warnings against the older content. +- Node 22.5+ recommended for full Cursor IDE app support (the adapter uses built-in `node:sqlite`). Older Node still works — the CLI/JSONL cursor surface remains fully accessible; only the SQLite surface is hidden, and doctor reports it as `info` (not `warn`) on older runtimes. +- If you have stale `BRIDGE_*` or `CHORUS_*_DIR` env vars left over from the v0.14.x bridge era pointing at directories that no longer exist, `chorus doctor` will now flag them via `env_override_dangling`. Clean them up — they have been silently hiding sessions until now. +- Consumers parsing `chorus doctor --json` by check `id`: replace `sessions_cursor` with the pair `sessions_cursor_cli` + `sessions_cursor_app`. + +### Acceptance + +- `cargo test --manifest-path cli/Cargo.toml` → **164 tests** pass. +- `scripts/conformance.sh` → all parity rows green, including the new cursor-app surface, the `read-cursor-app-redaction` SQLite-redaction case, the hermes `--tool-calls` no-op warning case, the gemini `--tool-calls` no-op warning case, and the `read ⊆ search` invariant for claude, codex, gemini, cursor CLI, and cursor IDE app. +- Live `chorus doctor` on this repo correctly fires `integration_claude_stale` against the hand-authored `CLAUDE.md` whose managed block predates v0.16.0 — verification that the stale-snippet detection works end-to-end on a real install. +- F1 cwd_mismatch fallback regression-tested across every adapter that has the fallback path. + +### Known limitations + +- **`--history=eager` is reserved.** Currently behaves identically to `on-demand` and emits a warning. The future multi-session merge surface lands later; do not depend on `eager` behavior in scripts. 
+- **Hermes fixture is synthetic.** The adapter still has no production transcripts in the wild; the new fixture (`fixtures/session-store/hermes/sessions/session-hermes-fixture.jsonl`) exercises the no-tool-calls warning path end-to-end but the format is still assumed (matches the v0.15.0 provisional state). +- **Gemini `cwd_mismatch` field never fires.** Gemini scopes by named project under `~/.gemini/tmp/`, not by absolute filesystem path, so the abspath-mismatch detection doesn't apply. UAT lesson carried forward into the future agy adapter, not retrofitted into legacy gemini. + +### Credits + +Driven by three local research documents (gitignored, see `docs/DEVELOPMENT.md` for the local-only policy): + +- `research/uat-cli-features-2026-06-03.md` — original UAT against the installed v0.15.0 binary; identified P1–P4 findings that became N1–N7. +- `research/next-scopes-post-v0.15.0-2026-06-03.md` — scoping that selected PRIO 1 (UAT gap close) as the headline lane for v0.16.0 and deferred the agent-context split (S-lane) to a later release. +- `research/uat-replay-followups-2026-06-03.md` — independent fresh-context UAT replay against the in-flight branch; surfaced the Tier-1/2/3 follow-ups (F1–F13) plus the R2-round defects. 
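The R2 matcher fix listed under Hygiene can be illustrated with a short Rust sketch that places the component-wise comparison next to the string-based one. The two function names below are hypothetical, and the sketch assumes Unix-style absolute paths; the matching logic follows the description in these notes.

```rust
use std::path::Path;

/// Pre-fix behavior: `Path::starts_with` compares components, so the root
/// path `/` is a prefix of every absolute path.
fn cwd_matches_component_wise(session_cwd: &Path, expected_cwd: &Path) -> bool {
    session_cwd == expected_cwd
        || expected_cwd.starts_with(session_cwd)
        || session_cwd.starts_with(expected_cwd)
}

/// Post-fix behavior: string prefix with an explicit trailing `/`, so a
/// session recorded at `/` no longer acts as a wildcard.
fn cwd_matches_with_separator(session_cwd: &Path, expected_cwd: &Path) -> bool {
    let a = session_cwd.to_string_lossy();
    let b = expected_cwd.to_string_lossy();
    if a == b {
        return true;
    }
    // Appending the separator turns "/" into "//", which no normal path starts with.
    let a_pref = format!("{}/", a);
    let b_pref = format!("{}/", b);
    b.starts_with(&a_pref) || a.starts_with(&b_pref)
}
```

With a session recorded at `/` and `--cwd /home/user/project`, the first function reports a match and the second does not, while genuine ancestor and descendant directories still match under both.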
+ ## v0.15.0 — 2026-06-03 **Native Cursor adapter (Rust + Node) — Cursor is now a first-class agent — plus a provisional Hermes adapter and agent-context hardening.** diff --git a/cli/Cargo.lock b/cli/Cargo.lock index 84cd724..b2eb01f 100644 --- a/cli/Cargo.lock +++ b/cli/Cargo.lock @@ -10,18 +10,31 @@ checksum = "320119579fcad9c21884f5c4861d16174d0e06250625266f50fe6898340abefa" [[package]] name = "agent-chorus" -version = "0.15.0" +version = "0.16.0" dependencies = [ "anyhow", "clap", "dirs", "globset", + "rusqlite", "serde", "serde_json", "sha2", "ureq", ] +[[package]] +name = "ahash" +version = "0.8.12" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5a15f179cd60c4584b8a8c596927aadc462e27f2ca70c04e0071964a73ba7a75" +dependencies = [ + "cfg-if", + "once_cell", + "version_check", + "zerocopy", +] + [[package]] name = "aho-corasick" version = "1.1.4" @@ -93,6 +106,12 @@ version = "0.22.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "72b3254f16251a8381aa12e40e3c4d2f0199f8c6508fbecb9d91f575e0fbb8c6" +[[package]] +name = "bitflags" +version = "2.12.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "84d7ced0ae9557296835c32bf1b1e02b44c746701f898460fb000d7eaa84f00a" + [[package]] name = "block-buffer" version = "0.10.4" @@ -250,6 +269,18 @@ version = "1.0.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "877a4ace8713b0bcf2a4e7eec82529c029f1d0619886d18145fea96c3ffe5c0f" +[[package]] +name = "fallible-iterator" +version = "0.3.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2acce4a10f12dc2fb14a218589d4f1f62ef011b2d0cc4b3cb1bba8e94da14649" + +[[package]] +name = "fallible-streaming-iterator" +version = "0.1.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7360491ce676a36bf9bb3c56c1aa791658183a54d2744120f27285738d90465a" + [[package]] name = "find-msvc-tools" version = "0.1.9" @@ -309,12 
+340,30 @@ dependencies = [ "regex-syntax", ] +[[package]] +name = "hashbrown" +version = "0.14.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e5274423e17b7c9fc20b6e7e208532f9b19825d82dfd615708b70edd83df41f1" +dependencies = [ + "ahash", +] + [[package]] name = "hashbrown" version = "0.16.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "841d1cc9bed7f9236f321df977030373f4a4163ae1a7dbfe1a51a2c1a51d9100" +[[package]] +name = "hashlink" +version = "0.9.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6ba4ff7128dee98c7dc9794b6a411377e1404dba1c97deb8d1a55297bd25d8af" +dependencies = [ + "hashbrown 0.14.5", +] + [[package]] name = "heck" version = "0.5.0" @@ -430,7 +479,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "7714e70437a7dc3ac8eb7e6f8df75fd8eb422675fc7678aff7364301092b1017" dependencies = [ "equivalent", - "hashbrown", + "hashbrown 0.16.1", ] [[package]] @@ -460,6 +509,17 @@ dependencies = [ "libc", ] +[[package]] +name = "libsqlite3-sys" +version = "0.28.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0c10584274047cb335c23d3e61bcef8e323adae7c5c8c760540f73610177fc3f" +dependencies = [ + "cc", + "pkg-config", + "vcpkg", +] + [[package]] name = "litemap" version = "0.8.1" @@ -512,6 +572,12 @@ version = "2.3.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "9b4f627cb1b25917193a259e49bdad08f671f8d9708acfd5fe0a8c1455d87220" +[[package]] +name = "pkg-config" +version = "0.3.33" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "19f132c84eca552bf34cab8ec81f1c1dcc229b811638f9d283dceabe58c5569e" + [[package]] name = "potential_utf" version = "0.1.4" @@ -581,6 +647,20 @@ dependencies = [ "windows-sys 0.52.0", ] +[[package]] +name = "rusqlite" +version = "0.31.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"b838eba278d213a8beaf485bd313fd580ca4505a00d5871caeb1457c55322cae" +dependencies = [ + "bitflags", + "fallible-iterator", + "fallible-streaming-iterator", + "hashlink", + "libsqlite3-sys", + "smallvec", +] + [[package]] name = "rustls" version = "0.23.37" @@ -819,6 +899,12 @@ version = "0.2.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821" +[[package]] +name = "vcpkg" +version = "0.2.15" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "accd4ea62f7bb7a82fe23066fb0957d48ef677f6eeb8215f372f52e48bb32426" + [[package]] name = "version_check" version = "0.9.5" @@ -966,6 +1052,26 @@ dependencies = [ "synstructure", ] +[[package]] +name = "zerocopy" +version = "0.8.50" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3b065d4f0e55f82fae73202e189638116a87c55ab6b8e6c2721e13dd9d854ad1" +dependencies = [ + "zerocopy-derive", +] + +[[package]] +name = "zerocopy-derive" +version = "0.8.50" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0b631b19d36a892ab55420c92dbc83ccd79274f25be714855d3074aa71cab639" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + [[package]] name = "zerofrom" version = "0.1.6" diff --git a/cli/Cargo.toml b/cli/Cargo.toml index 429a8bf..7eb77f8 100644 --- a/cli/Cargo.toml +++ b/cli/Cargo.toml @@ -1,6 +1,6 @@ [package] name = "agent-chorus" -version = "0.15.0" +version = "0.16.0" edition = "2021" rust-version = "1.74" description = "Local-first CLI to read, compare, and hand off context across Codex, Claude, Gemini, and Cursor sessions." @@ -26,6 +26,11 @@ anyhow = "1.0.101" clap = { version = "4.5.57", features = ["derive"] } dirs = "6.0.0" globset = "0.4.14" +# `bundled` ships SQLite inside the crate so the chorus binary has no host +# SQLite dependency. 
Required by the Cursor IDE adapter (`~/.cursor/chats/ +# //store.db`) introduced in v0.16.0 — see N1 in +# research/next-scopes-post-v0.15.0-2026-06-03.md. +rusqlite = { version = "0.31", features = ["bundled"] } serde = { version = "1.0", features = ["derive"] } serde_json = { version = "1.0.149", features = ["preserve_order"] } sha2 = "0.10.9" diff --git a/cli/src/agents.rs b/cli/src/agents.rs index 51cc855..7a9e291 100644 --- a/cli/src/agents.rs +++ b/cli/src/agents.rs @@ -65,6 +65,12 @@ pub struct Session { pub timestamp: Option, pub message_count: usize, pub messages_returned: usize, + /// F1: true when the caller passed `--cwd ` but no session matched + /// that cwd, and the adapter fell back to the latest session. The + /// fallback is intentional (a warning is also pushed to `warnings`) + /// but consumers that parse the JSON without scanning the warnings + /// array can use this boolean to detect the silent-fallback case. + pub cwd_mismatch: bool, } #[derive(Clone)] @@ -97,6 +103,7 @@ pub fn read_codex_session_with_options( } let mut warnings = Vec::new(); + let mut cwd_mismatch = false; let target_file = if let Some(id_value) = id { let files = collect_matching_files(&base_dir, true, &|file_path| { has_extension(file_path, "jsonl") && path_contains(file_path, id_value) @@ -119,6 +126,7 @@ pub fn read_codex_session_with_options( "Warning: no Codex session matched cwd {}; falling back to latest session.", expected_cwd.display() )); + cwd_mismatch = true; files[0].path.clone() } }; @@ -136,6 +144,7 @@ pub fn read_codex_session_with_options( timestamp: parsed.timestamp, message_count: parsed.message_count, messages_returned: parsed.messages_returned, + cwd_mismatch, }) } @@ -163,6 +172,7 @@ pub fn read_claude_session_with_options( } let mut warnings = Vec::new(); + let mut cwd_mismatch = false; let target_file = if let Some(id_value) = id { let files = collect_matching_files(&base_dir, true, &|file_path| { has_extension(file_path, "jsonl") && 
path_contains(file_path, id_value) @@ -185,6 +195,7 @@ pub fn read_claude_session_with_options( "Warning: no Claude session matched cwd {}; falling back to latest session.", expected_cwd.display() )); + cwd_mismatch = true; files[0].path.clone() } }; @@ -202,6 +213,7 @@ pub fn read_claude_session_with_options( timestamp: parsed.timestamp, message_count: parsed.message_count, messages_returned: parsed.messages_returned, + cwd_mismatch, }) } @@ -291,6 +303,10 @@ pub fn read_gemini_session_with_options( timestamp: parsed.timestamp, message_count: parsed.message_count, messages_returned: parsed.messages_returned, + // gemini scopes by project nickname rather than absolute cwd; the + // cwd_mismatch concept (cwd given but no session matched) does + // not apply here. + cwd_mismatch: false, }) } @@ -1085,7 +1101,7 @@ pub(crate) fn extract_text_with_tool_calls(value: &Value) -> String { /// A single turn in a reconstructed conversation, used by `--include-user` /// interleaving. Role is always "user" or "assistant". 
#[derive(Debug, Clone)] -pub(crate) struct ConversationTurn { +pub struct ConversationTurn { pub role: String, pub text: String, } @@ -1136,7 +1152,7 @@ pub(crate) fn select_conversation_turns( selected } -fn file_modified_iso(path: &Path) -> Option { +pub fn file_modified_iso(path: &Path) -> Option { fs::metadata(path) .ok() .and_then(|m| m.modified().ok()) @@ -1207,10 +1223,40 @@ fn extract_assistant_text_jsonl(path: &Path, agent: &str) -> String { }; match agent { "codex" => { - if json.get("role").and_then(|v| v.as_str()) == Some("assistant") { - if let Some(content) = json.get("content").and_then(|v| v.as_str()) { - text.push_str(content); - text.push('\n'); + // Codex stores messages in two nested envelopes: + // - {type:"response_item", payload:{type:"message", role:"assistant", + // content:[{text:"..."}, ...]}} + // - {type:"event_msg", payload:{type:"agent_message", message:"..."}} + // The previous shape (role/content at top level) never existed in any + // real codex session and produced an unconditionally empty result + // for search — that was UAT P3. 
+ let envelope_type = json.get("type").and_then(|v| v.as_str()).unwrap_or(""); + let payload = json.get("payload"); + if envelope_type == "response_item" { + if let Some(p) = payload { + if p.get("type").and_then(|v| v.as_str()) == Some("message") + && p.get("role").and_then(|v| v.as_str()) == Some("assistant") + { + let t = extract_text(&p["content"]); + if !t.is_empty() { + text.push_str(&t); + text.push('\n'); + } + } + } + } else if envelope_type == "event_msg" { + if let Some(p) = payload { + if p.get("type").and_then(|v| v.as_str()) == Some("agent_message") { + let t = if let Some(s) = p.get("message").and_then(|v| v.as_str()) { + s.to_string() + } else { + extract_text(&p["message"]) + }; + if !t.is_empty() { + text.push_str(&t); + text.push('\n'); + } + } } } } @@ -1320,10 +1366,24 @@ fn compute_match_snippet(text: &str, query: &str) -> Option { } /// Hierarchical CWD matching: exact match, ancestor, or descendant. +/// +/// Uses string-based prefix with an explicit trailing separator to match +/// Node's `cwdMatchesProject` (scripts/adapters/utils.cjs). The earlier +/// version used `Path::starts_with`, which is component-wise and treats +/// root `/` as a prefix of every absolute path — meaning a session whose +/// recorded cwd was just `/` would silently match every `--cwd ` +/// request and short-circuit the cwd-mismatch fallback. The trailing-`/` +/// rule rejects that case cleanly while still permitting genuine +/// ancestor/descendant relationships in a project tree. 
fn cwd_matches_project(session_cwd: &Path, expected_cwd: &Path) -> bool { - session_cwd == expected_cwd - || expected_cwd.starts_with(session_cwd) - || session_cwd.starts_with(expected_cwd) + let a = session_cwd.to_string_lossy(); + let b = expected_cwd.to_string_lossy(); + if a == b { + return true; + } + let a_pref = format!("{}/", a); + let b_pref = format!("{}/", b); + b.starts_with(&a_pref) || a.starts_with(&b_pref) } fn find_latest_by_cwd( @@ -2324,56 +2384,129 @@ pub fn search_gemini_sessions(query: &str, cwd: Option<&str>, limit: usize) -> R } pub fn search_cursor_sessions(query: &str, cwd: Option<&str>, limit: usize) -> Result> { - let base_dir = cursor_base_dir(); - if !base_dir.exists() { return Ok(Vec::new()); } - - let files = collect_cursor_transcripts(&base_dir, None)?; - let query_lower = query.to_ascii_lowercase(); let expected_cwd = cwd.map(normalize_path).transpose()?; - let mut entries = Vec::new(); + let mut entries: Vec = Vec::new(); - for file in files { - if entries.len() >= limit { break; } + // Surface 1: cursor-agent CLI JSONL transcripts. 
+ let base_dir = cursor_base_dir(); + if base_dir.exists() { + let files = collect_cursor_transcripts(&base_dir, None)?; + for file in files { + if fs::metadata(&file.path).map(|m| m.len() > MAX_FILE_SIZE).unwrap_or(false) { + continue; + } + let session_cwd = get_cursor_session_cwd(&file.path); + if let Some(expected) = expected_cwd.as_ref() { + match session_cwd.as_ref() { + Some(sc) if cwd_matches_project(sc, expected) => {} + _ => continue, + } + } + let assistant_text = crate::cursor_parse::read_cursor_turns(&file.path) + .into_iter() + .filter(|t| t.role == "assistant") + .map(|t| t.text) + .collect::>() + .join("\n"); + if assistant_text.to_ascii_lowercase().contains(&query_lower) { + let snippet = compute_match_snippet(&assistant_text, query); + let session_id = file.path.file_stem().and_then(|s| s.to_str()).unwrap_or("unknown").to_string(); + entries.push(serde_json::json!({ + "session_id": session_id, + "agent": "cursor", + "source": "cli", + "cwd": session_cwd.map(|p| p.to_string_lossy().to_string()), + "modified_at": file_modified_iso(&file.path), + "file_path": file.path.to_string_lossy().to_string(), + "match_snippet": snippet, + })); + } + } + } - if fs::metadata(&file.path).map(|m| m.len() > MAX_FILE_SIZE).unwrap_or(false) { - continue; + // Surface 2: Cursor IDE store.db sessions. 
+ let app_base = crate::cursor_app::cursor_app_base_dir(); + if app_base.exists() { + let sessions = crate::cursor_app::collect_cursor_app_sessions(&app_base); + for s in sessions { + let session_cwd = crate::cursor_app::cursor_app_session_workspace(&s.db_path); + if let Some(expected) = expected_cwd.as_ref() { + match session_cwd.as_ref() { + Some(sc) if cwd_matches_project(sc, expected) => {} + _ => continue, + } + } + let assistant_text = crate::cursor_app::read_cursor_app_turns(&s.db_path, false) + .into_iter() + .filter(|t| t.role == "assistant") + .map(|t| t.text) + .collect::>() + .join("\n"); + if assistant_text.to_ascii_lowercase().contains(&query_lower) { + let snippet = compute_match_snippet(&assistant_text, query); + entries.push(serde_json::json!({ + "session_id": s.agent_id.clone(), + "agent": "cursor", + "source": "app", + "cwd": session_cwd.map(|p| p.to_string_lossy().to_string()), + "modified_at": crate::cursor_app::cursor_app_modified_iso(&s.db_path), + "file_path": s.db_path.to_string_lossy().to_string(), + "match_snippet": snippet, + })); + } } + } + + // Newest-first across both surfaces, then truncate to limit. + entries.sort_by(|a, b| { + let am = a.get("modified_at").and_then(|v| v.as_str()).unwrap_or(""); + let bm = b.get("modified_at").and_then(|v| v.as_str()).unwrap_or(""); + bm.cmp(am) + }); + entries.truncate(limit); + Ok(entries) +} + +// --- Cursor support --- - // Project-scope by the session's real cwd (derived from .workspace-trusted - // or filesystem-validated demangling), matching codex/claude semantics. +pub fn cursor_base_dir_public() -> PathBuf { + cursor_base_dir() +} +
+pub fn hermes_base_dir_public() -> PathBuf { + hermes_base_dir() +} + +/// Count cursor-agent CLI transcripts matching `cwd` (capped to keep +/// doctor cheap). Used by the per-surface doctor split. +pub fn list_cursor_cli_sessions_count(cwd: Option<&str>, limit: usize) -> usize { + let base_dir = cursor_base_dir(); + if !base_dir.exists() { + return 0; + } + let files = match collect_cursor_transcripts(&base_dir, None) { + Ok(f) => f, + Err(_) => return 0, + }; + let expected = cwd.and_then(|c| normalize_path(c).ok()); + let mut n = 0usize; + for file in files { let session_cwd = get_cursor_session_cwd(&file.path); - if let Some(expected) = expected_cwd.as_ref() { + if let Some(exp) = expected.as_ref() { match session_cwd.as_ref() { - Some(sc) if cwd_matches_project(sc, expected) => {} + Some(sc) if cwd_matches_project(sc, exp) => {} _ => continue, } } - - let assistant_text = crate::cursor_parse::read_cursor_turns(&file.path) - .into_iter() - .filter(|t| t.role == "assistant") - .map(|t| t.text) - .collect::>() - .join("\n"); - if assistant_text.to_ascii_lowercase().contains(&query_lower) { - let snippet = compute_match_snippet(&assistant_text, query); - let session_id = file.path.file_stem().and_then(|s| s.to_str()).unwrap_or("unknown").to_string(); - entries.push(serde_json::json!({ - "session_id": session_id, - "agent": "cursor", - "cwd": session_cwd.map(|p| p.to_string_lossy().to_string()), - "modified_at": file_modified_iso(&file.path), - "file_path": file.path.to_string_lossy().to_string(), - "match_snippet": snippet, - })); + n += 1; + if n >= limit { + break; } } - Ok(entries) + n } -// --- Cursor support --- - fn cursor_base_dir() -> PathBuf { std::env::var("CHORUS_CURSOR_DATA_DIR") .or_else(|_| std::env::var("BRIDGE_CURSOR_DATA_DIR")) @@ -2402,43 +2535,109 @@ pub fn read_cursor_session_with_options( opts: ReadOptions, ) -> Result { let base_dir = cursor_base_dir(); + let app_base = crate::cursor_app::cursor_app_base_dir(); + if 
is_system_directory(&base_dir) { return Err(anyhow!("Refusing to scan system directory: {}", base_dir.display())); } - if !base_dir.exists() { - return 
Err(anyhow!(cursor_not_found_message(&format!("No Cursor session found. Data directory not found: {}", base_dir.display())))); + + // Read sees both surfaces (cursor-agent CLI JSONL + Cursor IDE store.db). + // We assemble a unified candidate list, then choose the target by id + // (if given) or by latest-mtime-matching-cwd (mirroring codex/claude). + enum Candidate { + Cli(FileEntry), + App(crate::cursor_app::CursorAppSession), } - let files = collect_cursor_transcripts(&base_dir, id)?; - if files.is_empty() { + let mut candidates: Vec = Vec::new(); + if base_dir.exists() { + let files = collect_cursor_transcripts(&base_dir, id)?; + for f in files { + candidates.push(Candidate::Cli(f)); + } + } + if app_base.exists() { + let sessions = crate::cursor_app::collect_cursor_app_sessions(&app_base); + for s in sessions { + if let Some(needle) = id { + if !s.agent_id.contains(needle) + && !s.db_path.to_string_lossy().contains(needle) + { + continue; + } + } + candidates.push(Candidate::App(s)); + } + } + if candidates.is_empty() { return Err(anyhow!(cursor_not_found_message("No Cursor session found."))); } - // Scope to --cwd when no explicit id was given (mirror codex/claude). The - // session's real cwd comes from .workspace-trusted or a filesystem-validated - // demangle of the project dir name (see crate::cursor_cwd). + // Newest-first by mtime across both surfaces. 
+ candidates.sort_by(|a, b| { + let am = match a { + Candidate::Cli(f) => file_modified_iso(&f.path), + Candidate::App(s) => file_modified_iso(&s.db_path), + }; + let bm = match b { + Candidate::Cli(f) => file_modified_iso(&f.path), + Candidate::App(s) => file_modified_iso(&s.db_path), + }; + bm.unwrap_or_default().cmp(&am.unwrap_or_default()) + }); + + let resolve_cwd = |c: &Candidate| -> Option { + match c { + Candidate::Cli(f) => get_cursor_session_cwd(&f.path), + Candidate::App(s) => crate::cursor_app::cursor_app_session_workspace(&s.db_path), + } + }; + let mut warnings: Vec = Vec::new(); - let target_file = if id.is_some() { - files[0].path.clone() + let mut cwd_mismatch = false; + let target: &Candidate = if id.is_some() { + &candidates[0] } else { match normalize_path(cwd) { - Ok(expected) => match find_latest_by_cwd(&files, &expected, get_cursor_session_cwd) { - Some(p) => p, - None => { - warnings.push(format!( - "No Cursor session matched cwd {}; falling back to latest session.", - expected.display() - )); - files[0].path.clone() + Ok(expected) => { + let matched = candidates.iter().find(|c| { + resolve_cwd(c) + .map(|sc| cwd_matches_project(&sc, &expected)) + .unwrap_or(false) + }); + match matched { + Some(t) => t, + None => { + warnings.push(format!( + "No Cursor session matched cwd {}; falling back to latest session.", + expected.display() + )); + cwd_mismatch = true; + &candidates[0] + } } - }, - Err(_) => files[0].path.clone(), + } + Err(_) => &candidates[0], } }; - // Parse the cursor-agent transcript into ordered turns. Text-only by default; - // with --tool-calls, tool_use/tool_result segments are rendered too. 
- let turns: Vec = cursor_turns_for_read(&target_file, opts.include_tool_calls); + let (turns, source_path, session_id, timestamp, cwd_out) = match target { + Candidate::Cli(f) => { + let t = cursor_turns_for_read(&f.path, opts.include_tool_calls); + let sid = f.path.file_stem().and_then(|s| s.to_str()).map(|s| s.to_string()); + let ts = file_modified_iso(&f.path); + let cwd_o = get_cursor_session_cwd(&f.path).map(|p| p.to_string_lossy().to_string()); + (t, f.path.to_string_lossy().to_string(), sid, ts, cwd_o) + } + Candidate::App(s) => { + let t = crate::cursor_app::read_cursor_app_turns(&s.db_path, opts.include_tool_calls); + let sid = Some(s.agent_id.clone()); + let ts = crate::cursor_app::cursor_app_modified_iso(&s.db_path); + let cwd_o = crate::cursor_app::cursor_app_session_workspace(&s.db_path) + .map(|p| p.to_string_lossy().to_string()); + (t, s.db_path.to_string_lossy().to_string(), sid, ts, cwd_o) + } + }; let assistant_msgs: Vec = turns .iter() @@ -2467,20 +2666,17 @@ pub fn read_cursor_session_with_options( ("[No assistant messages found]".to_string(), 0) }; - let session_id = target_file.file_stem().and_then(|s| s.to_str()).map(|s| s.to_string()); - let timestamp = file_modified_iso(&target_file); - let cwd_out = get_cursor_session_cwd(&target_file).map(|p| p.to_string_lossy().to_string()); - Ok(Session { agent: "cursor", content: redact_sensitive_text(&content), - source: target_file.to_string_lossy().to_string(), + source: source_path, warnings, session_id, cwd: cwd_out, timestamp, message_count, messages_returned, + cwd_mismatch, }) } @@ -2541,35 +2737,63 @@ fn cursor_turns_for_read(path: &Path, include_tool_calls: bool) -> Vec, limit: usize) -> Result> { - let base_dir = cursor_base_dir(); - if !base_dir.exists() { return Ok(Vec::new()); } - - let files = collect_cursor_transcripts(&base_dir, None)?; let expected_cwd = cwd.map(normalize_path).transpose()?; + let mut entries: Vec = Vec::new(); - let mut entries = Vec::new(); - for file in files { - 
// Derive the session's real cwd and project-scope on it (codex/claude parity). - let session_cwd = get_cursor_session_cwd(&file.path); - if let Some(expected) = expected_cwd.as_ref() { - match session_cwd.as_ref() { - Some(sc) if cwd_matches_project(sc, expected) => {} - _ => continue, + // Surface 1: cursor-agent CLI JSONL transcripts under ~/.cursor/projects. + let base_dir = cursor_base_dir(); + if base_dir.exists() { + let files = collect_cursor_transcripts(&base_dir, None)?; + for file in files { + let session_cwd = get_cursor_session_cwd(&file.path); + if let Some(expected) = expected_cwd.as_ref() { + match session_cwd.as_ref() { + Some(sc) if cwd_matches_project(sc, expected) => {} + _ => continue, + } } + let session_id = file.path.file_stem().and_then(|s| s.to_str()).unwrap_or("unknown").to_string(); + entries.push(serde_json::json!({ + "session_id": session_id, + "agent": "cursor", + "source": "cli", + "cwd": session_cwd.map(|p| p.to_string_lossy().to_string()), + "modified_at": file_modified_iso(&file.path), + "file_path": file.path.to_string_lossy().to_string(), + })); } + } - let session_id = file.path.file_stem().and_then(|s| s.to_str()).unwrap_or("unknown").to_string(); - entries.push(serde_json::json!({ - "session_id": session_id, - "agent": "cursor", - "cwd": session_cwd.map(|p| p.to_string_lossy().to_string()), - "modified_at": file_modified_iso(&file.path), - "file_path": file.path.to_string_lossy().to_string(), - })); - if entries.len() >= limit { - break; + // Surface 2: Cursor IDE store.db sessions under ~/.cursor/chats. 
+ let app_base = crate::cursor_app::cursor_app_base_dir(); + if app_base.exists() { + let app_sessions = crate::cursor_app::collect_cursor_app_sessions(&app_base); + for session in app_sessions { + let session_cwd = crate::cursor_app::cursor_app_session_workspace(&session.db_path); + if let Some(expected) = expected_cwd.as_ref() { + match session_cwd.as_ref() { + Some(sc) if cwd_matches_project(sc, expected) => {} + _ => continue, + } + } + entries.push(serde_json::json!({ + "session_id": session.agent_id.clone(), + "agent": "cursor", + "source": "app", + "cwd": session_cwd.map(|p| p.to_string_lossy().to_string()), + "modified_at": crate::cursor_app::cursor_app_modified_iso(&session.db_path), + "file_path": session.db_path.to_string_lossy().to_string(), + })); } } + + // Newest-first across both surfaces, then truncate. + entries.sort_by(|a, b| { + let am = a.get("modified_at").and_then(|v| v.as_str()).unwrap_or(""); + let bm = b.get("modified_at").and_then(|v| v.as_str()).unwrap_or(""); + bm.cmp(am) + }); + entries.truncate(limit); Ok(entries) } @@ -2647,6 +2871,7 @@ pub fn read_hermes_session_with_options( } let mut warnings: Vec = Vec::new(); + let mut cwd_mismatch = false; let target_file = if id.is_some() { files[0].path.clone() } else { @@ -2658,6 +2883,7 @@ pub fn read_hermes_session_with_options( "No Hermes session matched cwd {}; falling back to latest session.", expected.display() )); + cwd_mismatch = true; files[0].path.clone() } }, @@ -2724,6 +2950,7 @@ pub fn read_hermes_session_with_options( timestamp: file_modified_iso(&target_file), message_count, messages_returned, + cwd_mismatch, }) } diff --git a/cli/src/cursor_app.rs b/cli/src/cursor_app.rs new file mode 100644 index 0000000..34e90d3 --- /dev/null +++ b/cli/src/cursor_app.rs @@ -0,0 +1,421 @@ +//! Cursor IDE (app) adapter — reads sessions stored as SQLite databases. +//! +//! v0.15.0 shipped the JSONL adapter for the `cursor-agent` CLI +//! (`~/.cursor/projects//agent-transcripts//*.jsonl`). 
+//! The Cursor IDE itself writes sessions to a different location with a +//! different format: +//! +//! `~/.cursor/chats///store.db` (SQLite) +//! +//! Each `store.db` contains two tables: +//! - `meta(key TEXT PRIMARY KEY, value TEXT)`: one row whose `value` is +//! hex-encoded JSON with `agentId`, `latestRootBlobId`, `name`, `mode`, +//! `createdAt` (epoch ms). +//! - `blobs(id TEXT PRIMARY KEY, data BLOB)`: content-addressed by SHA-256. +//! The root blob (`latestRootBlobId`) is a protobuf-style envelope with +//! one or more repeated `bytes` fields whose values are the SHA-256 ids +//! of the message blobs, in order. Each message blob is JSON of the +//! shape `{"role": "user|assistant|system", "content": "..."|[...]}` — +//! identical to Claude's message shape, so the existing claude content +//! extractor renders tool_use / tool_result segments at parity. +//! +//! The workspace cwd for an IDE session is recovered from a header line +//! embedded in the first user-role message (`Workspace Path: `). +//! Future-proof: if that line is missing we return `None` and the adapter +//! falls back to the latest-session-without-cwd-match warning path +//! mirroring the JSONL adapter. + +use anyhow::{anyhow, Context, Result}; +use rusqlite::{Connection, OpenFlags}; +use serde_json::Value; +use std::path::{Path, PathBuf}; +use std::time::SystemTime; + +use crate::agents::ConversationTurn; +use crate::agents::extract_claude_content_with_tool_calls; + +/// One Cursor IDE session as enumerated from the chats root. +/// +/// `name`, `mode`, `created_at_ms` are collected from the `meta` table +/// but not yet surfaced in any chorus output. They're reserved for a +/// future `chorus list --verbose` (or similar) that exposes IDE-side +/// metadata; collecting them now means no schema change is required +/// when that lands. 
+#[derive(Debug, Clone)] +pub struct CursorAppSession { + pub agent_id: String, + pub db_path: PathBuf, + #[allow(dead_code)] + pub name: Option, + #[allow(dead_code)] + pub mode: Option, + #[allow(dead_code)] + pub created_at_ms: Option, +} + +/// `~/.cursor/chats` by default; override via `CHORUS_CURSOR_APP_DATA_DIR` +/// or `BRIDGE_CURSOR_APP_DATA_DIR` (the bridge fallback is preserved for +/// backward compatibility with the legacy environment variable convention). +pub fn cursor_app_base_dir() -> PathBuf { + if let Ok(v) = std::env::var("CHORUS_CURSOR_APP_DATA_DIR") { + return expand_home(&v); + } + if let Ok(v) = std::env::var("BRIDGE_CURSOR_APP_DATA_DIR") { + return expand_home(&v); + } + dirs::home_dir() + .map(|h| h.join(".cursor").join("chats")) + .unwrap_or_else(|| PathBuf::from("~/.cursor/chats")) +} + +fn expand_home(p: &str) -> PathBuf { + if let Some(stripped) = p.strip_prefix("~/") { + if let Some(home) = dirs::home_dir() { + return home.join(stripped); + } + } + PathBuf::from(p) +} + +/// Walk `///store.db` and return one entry +/// per discoverable session, newest mtime first. 
+pub fn collect_cursor_app_sessions(base: &Path) -> Vec { + let mut out = Vec::new(); + let hash_iter = match std::fs::read_dir(base) { + Ok(it) => it, + Err(_) => return out, + }; + for hash_entry in hash_iter.flatten() { + let hash_dir = hash_entry.path(); + if !hash_dir.is_dir() { + continue; + } + let uuid_iter = match std::fs::read_dir(&hash_dir) { + Ok(it) => it, + Err(_) => continue, + }; + for uuid_entry in uuid_iter.flatten() { + let uuid_dir = uuid_entry.path(); + let db_path = uuid_dir.join("store.db"); + if !db_path.is_file() { + continue; + } + if let Some(session) = read_session_meta(&db_path) { + out.push(session); + } + } + } + out.sort_by(|a, b| { + let am = mtime_secs(&a.db_path); + let bm = mtime_secs(&b.db_path); + bm.cmp(&am) + }); + out +} + +fn mtime_secs(p: &Path) -> u64 { + std::fs::metadata(p) + .and_then(|m| m.modified()) + .map(|t| t.duration_since(SystemTime::UNIX_EPOCH).map(|d| d.as_secs()).unwrap_or(0)) + .unwrap_or(0) +} + +fn open_ro(db_path: &Path) -> Result { + Connection::open_with_flags( + db_path, + OpenFlags::SQLITE_OPEN_READ_ONLY | OpenFlags::SQLITE_OPEN_URI, + ) + .with_context(|| format!("opening Cursor IDE store.db: {}", db_path.display())) +} + +fn read_session_meta(db_path: &Path) -> Option { + let conn = open_ro(db_path).ok()?; + let value: String = conn + .query_row("SELECT value FROM meta LIMIT 1", [], |row| row.get(0)) + .ok()?; + let bytes = hex_decode(&value).ok()?; + let json: Value = serde_json::from_slice(&bytes).ok()?; + let agent_id = json.get("agentId").and_then(|v| v.as_str())?.to_string(); + Some(CursorAppSession { + agent_id, + db_path: db_path.to_path_buf(), + name: json.get("name").and_then(|v| v.as_str()).map(|s| s.to_string()), + mode: json.get("mode").and_then(|v| v.as_str()).map(|s| s.to_string()), + created_at_ms: json.get("createdAt").and_then(|v| v.as_i64()), + }) +} + +/// Decode a lowercase hex string. Mirrors Node's `Buffer.from(hex, 'hex')`. 
+fn hex_decode(s: &str) -> Result<Vec<u8>> { + if s.len() % 2 != 0 { + return Err(anyhow!("hex string length is odd")); + } + let mut out = Vec::with_capacity(s.len() / 2); + for (i, pair) in s.as_bytes().chunks_exact(2).enumerate() { // bytes, not &str slices: no char-boundary panic on non-ASCII input + let digits = (pair[0] as char).to_digit(16).zip((pair[1] as char).to_digit(16)); + let (hi, lo) = digits.ok_or_else(|| anyhow!("invalid hex at position {}", i * 2))?; + out.push(((hi << 4) | lo) as u8); + } + Ok(out) +} + +/// Read the ordered list of message-blob SHAs from the root blob. +/// +/// The root blob is a protobuf-style stream of `(tag, length, payload)` +/// triples; we walk it greedily and accept any length-delimited (wire +/// type 2) field whose payload is exactly 32 bytes; that is the SHA-256 +/// of a child blob. Other tags / payload sizes are skipped over without +/// failing, which keeps us forward-compatible with new fields the IDE +/// adds (we observed tag 0x2a appearing after the main chain in some +/// sessions; it does not point at message blobs). +fn parse_root_blob_chain(data: &[u8]) -> Vec<String> { + let mut out = Vec::new(); + let mut i = 0usize; + while i < data.len() { + // Read varint tag. 
+ let (tag, tag_len) = match read_varint(&data[i..]) { + Some(v) => v, + None => break, + }; + i += tag_len; + let wire_type = (tag & 0x07) as u8; + match wire_type { + 2 => { + // length-delimited + let (len, len_len) = match read_varint(&data[i..]) { + Some(v) => v, + None => break, + }; + i += len_len; + let payload_len = len as usize; + if i + payload_len > data.len() { + break; + } + if payload_len == 32 { + let hash = hex_encode(&data[i..i + payload_len]); + out.push(hash); + } + i += payload_len; + } + 0 => { + // varint + match read_varint(&data[i..]) { + Some((_, n)) => i += n, + None => break, + } + } + 1 => i += 8, // 64-bit fixed + 5 => i += 4, // 32-bit fixed + _ => break, // unknown wire type + } + } + out +} + +fn read_varint(b: &[u8]) -> Option<(u64, usize)> { + let mut result: u64 = 0; + let mut shift = 0u32; + for (i, byte) in b.iter().enumerate() { + if i >= 10 { + return None; + } + result |= ((byte & 0x7f) as u64) << shift; + if byte & 0x80 == 0 { + return Some((result, i + 1)); + } + shift += 7; + } + None +} + +fn hex_encode(b: &[u8]) -> String { + let mut s = String::with_capacity(b.len() * 2); + for byte in b { + s.push_str(&format!("{:02x}", byte)); + } + s +} + +/// Read all conversation turns from a Cursor IDE store.db, in order. +/// +/// When `include_tool_calls` is false we return only the text segments of +/// each message; when true, tool_use / tool_result segments are rendered +/// via the shared claude extractor (cursor's content shape matches claude). 
+pub fn read_cursor_app_turns(db_path: &Path, include_tool_calls: bool) -> Vec { + let conn = match open_ro(db_path) { + Ok(c) => c, + Err(_) => return Vec::new(), + }; + let meta_value: String = match conn.query_row("SELECT value FROM meta LIMIT 1", [], |row| row.get(0)) { + Ok(v) => v, + Err(_) => return Vec::new(), + }; + let meta_bytes = match hex_decode(&meta_value) { + Ok(b) => b, + Err(_) => return Vec::new(), + }; + let meta_json: Value = match serde_json::from_slice(&meta_bytes) { + Ok(v) => v, + Err(_) => return Vec::new(), + }; + let root_id = match meta_json.get("latestRootBlobId").and_then(|v| v.as_str()) { + Some(s) => s.to_string(), + None => return Vec::new(), + }; + + let root_blob: Vec = match conn.query_row("SELECT data FROM blobs WHERE id = ?", [&root_id], |row| row.get(0)) { + Ok(b) => b, + Err(_) => return Vec::new(), + }; + let child_ids = parse_root_blob_chain(&root_blob); + + let mut turns = Vec::new(); + for child_id in child_ids { + let data: Vec = match conn.query_row("SELECT data FROM blobs WHERE id = ?", [&child_id], |row| row.get(0)) { + Ok(b) => b, + Err(_) => continue, + }; + let v: Value = match serde_json::from_slice(&data) { + Ok(v) => v, + Err(_) => continue, + }; + let role = match v.get("role").and_then(|r| r.as_str()) { + Some(r) if r == "user" || r == "assistant" => r.to_string(), + _ => continue, + }; + let content = v.get("content").cloned().unwrap_or(Value::Null); + let text = if include_tool_calls { + extract_claude_content_with_tool_calls(&content) + } else { + extract_text_only(&content) + }; + let text = text.trim().to_string(); + if text.is_empty() { + continue; + } + turns.push(ConversationTurn { role, text }); + } + turns +} + +/// Extract only text segments from a Cursor IDE message content value. +/// Content is either a plain string or an array of `{type, text|...}` segs. 
+fn extract_text_only(content: &Value) -> String { + if let Some(s) = content.as_str() { + return s.to_string(); + } + if let Some(arr) = content.as_array() { + let mut parts = Vec::new(); + for seg in arr { + if let Some(seg_type) = seg.get("type").and_then(|t| t.as_str()) { + if seg_type == "text" { + if let Some(t) = seg.get("text").and_then(|t| t.as_str()) { + parts.push(t.to_string()); + } + } + } + } + return parts.join("\n"); + } + String::new() +} + +/// Recover the workspace path for a Cursor IDE session by scanning the +/// first user-role message for the `Workspace Path: ` header that +/// the IDE injects. Returns `None` if not discoverable — caller falls +/// back to the no-cwd-match path of the JSONL adapter. +pub fn cursor_app_session_workspace(db_path: &Path) -> Option { + let conn = open_ro(db_path).ok()?; + let meta_value: String = conn.query_row("SELECT value FROM meta LIMIT 1", [], |row| row.get(0)).ok()?; + let meta_bytes = hex_decode(&meta_value).ok()?; + let meta_json: Value = serde_json::from_slice(&meta_bytes).ok()?; + let root_id = meta_json.get("latestRootBlobId").and_then(|v| v.as_str())?.to_string(); + + let root_blob: Vec = conn.query_row("SELECT data FROM blobs WHERE id = ?", [&root_id], |row| row.get(0)).ok()?; + let child_ids = parse_root_blob_chain(&root_blob); + + for child_id in child_ids { + let data: Vec = match conn.query_row("SELECT data FROM blobs WHERE id = ?", [&child_id], |row| row.get(0)) { + Ok(b) => b, + Err(_) => continue, + }; + let v: Value = match serde_json::from_slice(&data) { + Ok(v) => v, + Err(_) => continue, + }; + let role = v.get("role").and_then(|r| r.as_str()).unwrap_or(""); + if role != "user" { + continue; + } + let text = extract_text_only(&v.get("content").cloned().unwrap_or(Value::Null)); + if let Some(line) = text.lines().find(|l| l.trim_start().starts_with("Workspace Path:")) { + let value = line.splitn(2, ':').nth(1)?.trim(); + if !value.is_empty() { + return Some(PathBuf::from(value)); + } + } + } 
+ None +} + +/// Convenience: return the path to a session's store.db given the chats +/// base directory and a session id (the UUID). Reserved for future +/// id-targeted reads that don't want to walk every meta entry. +#[allow(dead_code)] +pub fn find_session_db(base: &Path, id: &str) -> Option { + let hash_iter = std::fs::read_dir(base).ok()?; + for hash_entry in hash_iter.flatten() { + let candidate = hash_entry.path().join(id).join("store.db"); + if candidate.is_file() { + return Some(candidate); + } + } + None +} + +/// ISO-8601 modified timestamp for the session's store.db. +pub fn cursor_app_modified_iso(db_path: &Path) -> Option { + crate::agents::file_modified_iso(db_path) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn varint_basic() { + assert_eq!(read_varint(&[0x00]), Some((0u64, 1))); + assert_eq!(read_varint(&[0x05]), Some((5u64, 1))); + // tag 0x0a = field 1, wire type 2 (length-delimited) + assert_eq!(read_varint(&[0x0a]), Some((0x0au64, 1))); + // 300 = 0xac 0x02 (varint) + assert_eq!(read_varint(&[0xac, 0x02]), Some((300u64, 2))); + } + + #[test] + fn hex_roundtrip() { + let bytes: Vec = (0..32).collect(); + let s = hex_encode(&bytes); + assert_eq!(s.len(), 64); + let back = hex_decode(&s).unwrap(); + assert_eq!(back, bytes); + } + + #[test] + fn root_blob_skips_unknown_tags() { + // Build: tag=0x0a (len-delim, field 1), len=32, 32 bytes of 0x01 + // then tag=0x2a (len-delim, field 5), len=4, payload [0x00..0x03] + let mut buf: Vec = Vec::new(); + buf.push(0x0a); + buf.push(32); + buf.extend([0x01u8; 32]); + buf.push(0x2a); + buf.push(4); + buf.extend([0x00u8, 0x01, 0x02, 0x03]); + + let chain = parse_root_blob_chain(&buf); + // Only the 32-byte payload should be picked up. 
+ assert_eq!(chain.len(), 1); + assert_eq!(chain[0], hex_encode(&[0x01u8; 32])); + } +} diff --git a/cli/src/doctor.rs b/cli/src/doctor.rs index ab0e407..ae6ed0b 100644 --- a/cli/src/doctor.rs +++ b/cli/src/doctor.rs @@ -25,7 +25,13 @@ const PROVIDERS: &[Provider] = &[ Provider { agent: "gemini", target_file: "GEMINI.md" }, ]; -const ALL_AGENTS: &[&str] = &["codex", "gemini", "claude", "cursor", "hermes"]; +// Agents enumerated for session-discovery checks. `cursor` is intentionally +// absent: it has two surfaces (CLI JSONL and IDE SQLite) reported by the +// `cursor_surface_checks` helper below, not a single combined `sessions_cursor`. +// `hermes` is also absent: it's a provisional adapter whose presence we report +// via the `hermes_surface_check` helper so we can downgrade to `info` when the +// hermes data directory is absent (F12 parity with cursor). +const ALL_AGENTS: &[&str] = &["codex", "gemini", "claude"]; #[derive(Debug)] pub struct Check { @@ -93,13 +99,24 @@ pub fn run_doctor(cwd: &str) -> Result { &fmt_existence(&gemini_base), ); - // Setup scaffolding + // Setup scaffolding. The integration/snippet/intents checks emit `info` + // rather than `warn` when the repo has not been initialized via + // `chorus setup`; un-setup is intentional state, not broken state. + // + // Initialization is detected by the presence of either INTENTS.md or + // the providers/ directory under .agent-chorus/. The bare .agent-chorus/ + // directory alone is *not* a setup signal: the messaging subsystem + // creates .agent-chorus/messages/ for inbox storage on first `send`, + // independent of any setup step. 
+ let setup_root = cwd_path.join(".agent-chorus"); + let setup_initialized = setup_root.join("INTENTS.md").exists() + || setup_root.join("providers").exists(); + let absent_status = if setup_initialized { "warn" } else { "info" }; let intents_path = setup_root.join("INTENTS.md"); push( &mut checks, "setup_intents", - if intents_path.exists() { "pass" } else { "warn" }, + if intents_path.exists() { "pass" } else { absent_status }, &fmt_existence(&intents_path), ); @@ -111,7 +128,7 @@ pub fn run_doctor(cwd: &str) -> Result { push( &mut checks, &format!("snippet_{}", provider.agent), - if snippet_path.exists() { "pass" } else { "warn" }, + if snippet_path.exists() { "pass" } else { absent_status }, &fmt_existence(&snippet_path), ); @@ -120,7 +137,7 @@ pub fn run_doctor(cwd: &str) -> Result { push( &mut checks, &format!("integration_{}", provider.agent), - "warn", + absent_status, &format!("Missing provider instruction file: {}", target_path.display()), ); continue; @@ -133,7 +150,7 @@ pub fn run_doctor(cwd: &str) -> Result { push( &mut checks, &format!("integration_{}", provider.agent), - if present { "pass" } else { "warn" }, + if present { "pass" } else { absent_status }, &if present { format!("Managed block present in {}", target_path.display()) } else { @@ -142,7 +159,7 @@ pub fn run_doctor(cwd: &str) -> Result { ); } - // Session discovery per agent + // Session discovery per agent. let normalized_cwd = utils::normalize_path(cwd) .map(|p| p.to_string_lossy().to_string()) .unwrap_or_else(|_| cwd.to_string()); @@ -177,6 +194,15 @@ pub fn run_doctor(cwd: &str) -> Result { } } + // Cursor has two on-disk surfaces; report each independently. The + // surface check answers "is this surface reachable from this host?", + // not "are there sessions for this specific cwd?"; pass-with-no-cwd + // matches Node's semantics. 
+ cursor_surface_checks(&mut checks); + hermes_surface_check(&mut checks); + env_override_checks(&mut checks); + stale_snippet_checks(&mut checks, cwd_path); + // Context pack state let pack_dir = cwd_path.join(".agent-context").join("current"); let manifest_path = pack_dir.join("manifest.json"); @@ -254,48 +280,58 @@ pub fn run_doctor(cwd: &str) -> Result { ); } - // Git hooks path + pre-push - let hooks_path = git_hooks_path(cwd_path); - match hooks_path { - Some(ref hp) => { - push( - &mut checks, - "context_pack_hooks_path", - if hp == ".githooks" { "pass" } else { "warn" }, - &if hp == ".githooks" { - "Git hooks path set to .githooks".to_string() - } else { - format!( - "Git hooks path is {} (expected .githooks for context-pack pre-push automation)", - hp - ) - }, - ); - let pre_push = if Path::new(hp).is_absolute() { - PathBuf::from(hp).join("pre-push") + // Git hooks path + pre-push. + // + // F3: doctor reports the *local* health of this install in this cwd. + // If the cwd is not a git repository, neither hooks_path nor pre_push + // checks are meaningful — `git config core.hooksPath` would resolve to + // a global value (the user's `~/.git-hooks` or similar), and we'd + // truthfully report a hook as "installed" even though the cwd has no + // `.git/` at all. That's a local lie. Gate both checks on the cwd + // actually being a git repo and report `info` otherwise. 
+ if is_git_repo(cwd_path) { + let configured = git_hooks_path(cwd_path); + let (effective_path, source) = match configured.as_deref() { + Some(hp) => (hp.to_string(), "configured"), + None => (".git/hooks".to_string(), "default"), + }; + push( + &mut checks, + "context_pack_hooks_path", + "info", + &format!("Effective git hooks path: {} ({})", effective_path, source), + ); + let pre_push = if Path::new(&effective_path).is_absolute() { + PathBuf::from(&effective_path).join("pre-push") + } else { + cwd_path.join(&effective_path).join("pre-push") + }; + push( + &mut checks, + "context_pack_pre_push", + if pre_push.exists() { "pass" } else { "warn" }, + &if pre_push.exists() { + format!("Found: {}", pre_push.display()) } else { - cwd_path.join(hp).join("pre-push") - }; - push( - &mut checks, - "context_pack_pre_push", - if pre_push.exists() { "pass" } else { "warn" }, - &if pre_push.exists() { - format!("Found: {}", pre_push.display()) - } else { - format!( - "Missing: {} (run: chorus agent-context install-hooks)", - pre_push.display() - ) - }, - ); - } - None => push( + format!( + "Missing: {} (run: chorus agent-context install-hooks)", + pre_push.display() + ) + }, + ); + } else { + push( &mut checks, "context_pack_hooks_path", - "warn", - "Git hooks path not configured", - ), + "info", + "cwd is not a git repository; git hooks checks skipped", + ); + push( + &mut checks, + "context_pack_pre_push", + "info", + "cwd is not a git repository; pre-push hook check skipped", + ); } let has_fail = checks.iter().any(|c| c.status == "fail"); @@ -315,6 +351,195 @@ pub fn run_doctor(cwd: &str) -> Result { }) } +fn cursor_surface_checks(checks: &mut Vec<Check>) { + // F12: cursor-cli and Cursor-IDE surfaces are independently optional. + // Each surface is judged on its own. When a surface's data directory + // doesn't exist, the user simply hasn't installed cursor-agent or the + // Cursor IDE in any usable way; that's intentional state, not broken + // state. 
Report `info` in that case. Report `warn` only when the data + // directory exists but contains zero sessions (meaning the user has + // the tool installed but has produced no sessions; worth flagging). + let cli_base = crate::agents::cursor_base_dir_public(); + let app_base = crate::cursor_app::cursor_app_base_dir(); + + let (cli_status, cli_detail) = if !cli_base.exists() { + ( + "info", + format!( + "cursor-agent CLI not configured (data directory absent: {})", + cli_base.display() + ), + ) + } else if crate::agents::list_cursor_cli_sessions_count(None, 1) > 0 { + ("pass", "At least one cursor-agent CLI transcript discovered".to_string()) + } else { + ( + "warn", + format!("No cursor-agent CLI transcripts discovered at {}", cli_base.display()), + ) + }; + push(checks, "sessions_cursor_cli", cli_status, &cli_detail); + + let (app_status, app_detail) = if !app_base.exists() { + ( + "info", + format!( + "Cursor IDE not configured (data directory absent: {})", + app_base.display() + ), + ) + } else if !crate::cursor_app::collect_cursor_app_sessions(&app_base).is_empty() { + ("pass", "At least one Cursor IDE store.db discovered".to_string()) + } else { + ( + "warn", + format!("No Cursor IDE store.db sessions discovered at {}", app_base.display()), + ) + }; + push(checks, "sessions_cursor_app", app_status, &app_detail); +} + +/// Stale-snippet sentinel: a provider snippet or managed block that exists +/// but predates the current contract (no "History contract" section) is +/// silently leaving consumer agents without the on-demand history rule. +/// `chorus setup --force` refreshes them; doctor surfaces the gap so the +/// user knows to do that. +fn stale_snippet_checks(checks: &mut Vec<Check>, cwd: &Path) { + // Look for the load-bearing phrase introduced in v0.16.0. If the file + // exists but lacks it, the consumer is on a pre-contract snippet. 
+ let probe = "History contract"; + let providers_dir = cwd.join(".agent-chorus").join("providers"); + let provider_files = [ + ("codex", providers_dir.join("codex.md")), + ("claude", providers_dir.join("claude.md")), + ("gemini", providers_dir.join("gemini.md")), + ]; + for (agent, path) in &provider_files { + if !path.exists() { + continue; + } + let stale = std::fs::read_to_string(path) + .map(|s| !s.contains(probe)) + .unwrap_or(false); + if stale { + push( + checks, + &format!("snippet_{}_stale", agent), + "warn", + &format!( + "{} predates the v0.16.0 history contract. Run `chorus setup --force` to refresh.", + path.display() + ), + ); + } + } + // Managed-block check: same probe, in the integration files. + let managed_files = [ + ("codex", cwd.join("AGENTS.md")), + ("claude", cwd.join("CLAUDE.md")), + ("gemini", cwd.join("GEMINI.md")), + ]; + for (agent, path) in &managed_files { + if !path.exists() { + continue; + } + let marker = format!("agent-chorus:{}:start", agent); + if let Ok(content) = std::fs::read_to_string(path) { + if !content.contains(&marker) { + continue; // no managed block; integration check handles this + } + if !content.contains(probe) { + push( + checks, + &format!("integration_{}_stale", agent), + "warn", + &format!( + "Managed block in {} predates the v0.16.0 history contract. Run `chorus setup --force` to refresh.", + path.display() + ), + ); + } + } + } +} + +fn hermes_surface_check(checks: &mut Vec) { + // F12 parity: hermes is provisional. When its data directory is + // absent, the user simply hasn't installed hermes — report `info`, + // not `warn`. `warn` is reserved for "directory exists but no + // sessions" (installed but quiet). 
+ let base = crate::agents::hermes_base_dir_public(); + let (status, detail) = if !base.exists() { + ( + "info", + format!( + "Hermes not configured (data directory absent: {})", + base.display() + ), + ) + } else { + match crate::adapters::get_adapter("hermes") + .and_then(|a| a.list_sessions(None, 1).ok()) + { + Some(entries) if !entries.is_empty() => { + ("pass", "At least one hermes session discovered".to_string()) + } + _ => ( + "warn", + format!("No hermes sessions discovered at {}", base.display()), + ), + } + }; + push(checks, "sessions_hermes", status, &detail); +} + +/// F2: env-var overrides pointing at non-existent directories produce +/// silent partial coverage that looks identical to a working install. +/// Doctor explicitly flags these as `warn` so users know their env is +/// misconfigured. The override variable name and the dangling path are +/// both included in the detail for easy diagnosis. +fn env_override_checks(checks: &mut Vec) { + let overrides = [ + ("CHORUS_CODEX_SESSIONS_DIR", "codex"), + ("BRIDGE_CODEX_SESSIONS_DIR", "codex (legacy)"), + ("CHORUS_CLAUDE_PROJECTS_DIR", "claude"), + ("BRIDGE_CLAUDE_PROJECTS_DIR", "claude (legacy)"), + ("CHORUS_GEMINI_TMP_DIR", "gemini"), + ("BRIDGE_GEMINI_TMP_DIR", "gemini (legacy)"), + ("CHORUS_CURSOR_DATA_DIR", "cursor-agent CLI"), + ("BRIDGE_CURSOR_DATA_DIR", "cursor-agent CLI (legacy)"), + ("CHORUS_CURSOR_APP_DATA_DIR", "Cursor IDE"), + ("BRIDGE_CURSOR_APP_DATA_DIR", "Cursor IDE (legacy)"), + ("CHORUS_HERMES_DATA_DIR", "hermes"), + ("BRIDGE_HERMES_DATA_DIR", "hermes (legacy)"), + ]; + for (var, label) in overrides.iter() { + if let Ok(value) = std::env::var(var) { + if value.is_empty() { + continue; + } + let expanded = if let Some(stripped) = value.strip_prefix("~/") { + dirs::home_dir() + .map(|h| h.join(stripped)) + .unwrap_or_else(|| std::path::PathBuf::from(&value)) + } else { + std::path::PathBuf::from(&value) + }; + if !expanded.exists() { + push( + checks, + "env_override_dangling", + "warn", 
+ &format!( + "{} ({}) points at non-existent directory: {}. Sessions from this adapter will be invisible until the env var is cleared or the directory exists.", + var, label, expanded.display() + ), + ); + } + } + } +} + fn push(checks: &mut Vec, id: &str, status: &str, detail: &str) { checks.push(Check { id: id.to_string(), @@ -371,6 +596,18 @@ fn claude_plugin_installed() -> bool { .unwrap_or(false) } +/// Whether the given directory (or any ancestor) is a git repository. +/// Used to gate the hooks-path / pre-push checks so we never claim a +/// hook is installed when the cwd has no `.git/` at all. +fn is_git_repo(cwd: &Path) -> bool { + Command::new("git") + .args(["rev-parse", "--git-dir"]) + .current_dir(cwd) + .output() + .map(|o| o.status.success()) + .unwrap_or(false) +} + fn git_hooks_path(cwd: &Path) -> Option { let out = Command::new("git") .args(["config", "--get", "core.hooksPath"]) diff --git a/cli/src/main.rs b/cli/src/main.rs index 220883c..09ddd7d 100644 --- a/cli/src/main.rs +++ b/cli/src/main.rs @@ -2,6 +2,7 @@ mod adapters; mod agents; mod agent_context; mod checkpoint; +mod cursor_app; mod cursor_cwd; mod cursor_parse; pub mod diff; @@ -73,6 +74,16 @@ enum Commands { #[arg(long = "tool-calls")] tool_calls: bool, + /// History scope. Default `on-demand` returns only the latest session + /// for the cwd — chorus does NOT auto-pull prior sessions into the + /// returned content; consumers should call `chorus list / timeline / + /// search` explicitly when they need historical context. `none` is + /// equivalent to `--metadata-only`. `eager` is reserved for a future + /// multi-session merge and behaves identically to `on-demand` today + /// (accepted with a warning so consumers don't silently rely on it).
+ #[arg(long, default_value = "on-demand")] + history: String, + /// Output format: json | md | markdown (default: text) #[arg(long)] format: Option, @@ -98,6 +109,38 @@ enum Commands { }, /// Build a report from a handoff packet JSON file + #[command(long_about = "Build a report from a handoff packet JSON file. + +The handoff JSON must conform to this exact shape (unknown fields produce INVALID_HANDOFF): + + { + \"mode\": \"analyze\", // required + \"task\": \"\", // required + \"success_criteria\": [\"\", ...], // required, non-empty + \"sources\": [ + { + \"agent\": \"claude\", // required + \"session_id\": \"\", // OR set current_session:true + \"current_session\": true, // use latest for cwd + \"cwd\": \"\", // optional override + \"last_n\": 10 // optional N msgs/source + } + ], + \"constraints\": [\"\", ...] // optional + } + +Minimal copy-pasteable example (write to handoff.json): + + { + \"mode\": \"analyze\", + \"task\": \"Compare claude and codex outputs\", + \"success_criteria\": [\"Identify agreements and contradictions\"], + \"sources\": [ + {\"agent\": \"claude\", \"current_session\": true}, + {\"agent\": \"codex\", \"current_session\": true} + ] + } +")] Report { /// Path to handoff JSON file #[arg(long)] @@ -163,6 +206,18 @@ enum Commands { }, /// Initialize agent-chorus in the current directory (provider blocks, scaffolding, optional context-pack) + #[command(long_about = "Initialize agent-chorus in the current directory. + +setup creates or updates: + CLAUDE.md / AGENTS.md / GEMINI.md chorus managed blocks for agent wiring + .agent-chorus/ provider snippets and intent contract + .gitignore adds .agent-chorus/ to prevent tracking + claude plugin auto-installs Claude Code plugin if claude CLI is present + +Run teardown to reverse all per-project operations. 
+ +The Claude Code plugin is global — uninstall separately if desired: + claude plugin uninstall agent-chorus")] Setup { /// Working directory (default: current directory) #[arg(long)] @@ -357,6 +412,17 @@ enum Commands { }, /// Diagnostic checks across the agent-chorus install + #[command(long_about = "Diagnostic checks across the agent-chorus install. + +Severity levels: + pass Check succeeded. + info Informational; not a problem (e.g. optional feature not configured). + warn Actionable; the install can run but something should be fixed. + fail Broken/unrecoverable. + +Overall is elevated only by warn or fail; info does not elevate it. + +Checks: version, session directories, setup completeness, provider instruction wiring, session availability, context pack state, Claude Code plugin installation, update status, hooks path + pre-push.")] Doctor { /// Working directory to scope checks #[arg(long)] @@ -769,8 +835,26 @@ fn run(cli: Cli) -> Result<()> { audit_redactions, include_user, tool_calls, + history, format, } => { + // N7: validate --history early. The flag is forward-compat + // scaffolding for an eventual multi-session merge; today the + // chorus default is `on-demand` (latest session for cwd, no + // auto-recall of older sessions). `none` is an alias for + // --metadata-only; `eager` is reserved and behaves identically + // to `on-demand` with a warning so consumers don't silently + // rely on a behavior chorus doesn't yet implement. + let history_mode = match history.as_str() { + "on-demand" | "none" | "eager" => history.clone(), + other => { + return Err(anyhow::anyhow!( + "Invalid --history value: {}. 
Allowed: on-demand | none | eager.", + other + )); + } + }; + let history_metadata_only = history_mode == "none" || metadata_only; let effective_cwd = effective_cwd(cwd); let last_n = last.max(1); let adapter = adapters::get_adapter(agent.as_str()) @@ -779,7 +863,7 @@ fn run(cli: Cli) -> Result<()> { include_user, include_tool_calls: tool_calls, }; - let session = adapter.read_session_with_options( + let mut session = adapter.read_session_with_options( id.as_deref(), &effective_cwd, chats_dir.as_deref(), @@ -787,6 +871,28 @@ fn run(cli: Cli) -> Result<()> { opts, )?; + // N6: agents whose on-disk format does not carry tool calls emit + // a uniform warning when --tool-calls is requested, so a silent + // no-op never looks like "this agent had no tool calls". The + // `included_tool_calls: true` field is still set in the output + // since the flag was honored; the warning surfaces that the data + // is structurally unavailable. + if tool_calls && agent_has_no_tool_calls(agent.as_str()) { + session.warnings.push(format!( + "--tool-calls has no effect for {} sessions: this agent's transcript format does not carry tool calls.", + agent.as_str() + )); + } + + // N7: --history=eager is reserved for a future multi-session + // merge. Today chorus does not implement it; emit a warning so + // consumers don't silently rely on the option being honored. + if history_mode == "eager" { + session.warnings.push( + "--history=eager is reserved for a future multi-session merge and currently behaves identically to --history=on-demand. 
Use `chorus list / timeline / search` to pull additional sessions explicitly.".to_string() + ); + } + // If audit mode requested, re-run redaction with audit on the raw content let redaction_audit = if audit_redactions { let (_, audit) = agents::redact_sensitive_text_with_audit(&session.content); @@ -810,7 +916,7 @@ fn run(cli: Cli) -> Result<()> { let want_markdown = !want_json && is_markdown_format(format_str); if want_json { - let content_value = if metadata_only { + let content_value = if history_metadata_only { serde_json::Value::Null } else { serde_json::Value::String(session.content.clone()) @@ -834,6 +940,23 @@ fn run(cli: Cli) -> Result<()> { serde_json::Value::Bool(true), ); } + // F1: surface fallback as a structured boolean in addition + // to the existing `warnings[]` push. JSON-only consumers + // can rely on this without scanning warning strings. + // Also echo the cwd_mismatch warning to stderr so humans + // watching the terminal see the fallback even when stdout + // is being piped into another tool. + if session.cwd_mismatch { + report.as_object_mut().unwrap().insert( + "cwd_mismatch".to_string(), + serde_json::Value::Bool(true), + ); + for w in &session.warnings { + if w.contains("falling back to latest session") { + eprintln!("chorus: {}", w); + } + } + } if let Some(ref audit) = redaction_audit { report.as_object_mut().unwrap().insert( "redactions".to_string(), @@ -1405,6 +1528,15 @@ fn emit_report_output(report_value: &serde_json::Value, json_output: bool) -> Re Ok(()) } +/// Agents whose on-disk transcript format has no tool-call concept. +/// `--tool-calls` is honored (the flag is acknowledged in +/// `included_tool_calls`) but a uniform warning surfaces that the data +/// is structurally unavailable. Mirrors the Node check in +/// `scripts/read_session.cjs`. 
+fn agent_has_no_tool_calls(agent: &str) -> bool { + matches!(agent, "gemini" | "hermes") +} + fn effective_cwd(cwd: Option) -> String { cwd.unwrap_or_else(|| { std::env::current_dir() diff --git a/cli/src/setup.rs b/cli/src/setup.rs index 82de3bc..25328bc 100644 --- a/cli/src/setup.rs +++ b/cli/src/setup.rs @@ -485,6 +485,12 @@ fn make_managed_block(agent: &str, snippet_rel_path: &str) -> String { "When a user asks for another agent status (for example \"What is Claude doing?\"),".to_string(), "run Agent Chorus commands first and answer with evidence from session output.".to_string(), String::new(), + "**History contract (READ FIRST — violating this costs 2.5x tokens):**".to_string(), + "- `chorus read` defaults to `--history=on-demand` — latest session for the cwd ONLY.".to_string(), + "- Do NOT loop through prior sessions at session start. The field study measured a 2.5x token inflation when agents eagerly read history.".to_string(), + "- When you need historical context, call `chorus list / timeline / search` EXPLICITLY. That is the on-demand recall mechanism.".to_string(), + "- `--history=eager` is reserved for a future multi-session merge and currently emits a warning; do not depend on it.".to_string(), + String::new(), "Session routing and defaults:".to_string(), "1. For status checks like \"What is Claude doing?\", start with `chorus read --agent --cwd --include-user --json` (omit `--id` for latest).".to_string(), "2. 
For plain handoff/output checks, use `chorus read --agent --cwd --json`.".to_string(), @@ -497,6 +503,13 @@ fn make_managed_block(agent: &str, snippet_rel_path: &str) -> String { "- `chorus list --agent --cwd --json`".to_string(), "- `chorus search \"\" --agent --cwd --json`".to_string(), "- `chorus compare --source codex --source gemini --source claude --cwd --json`".to_string(), + "- `chorus diff --agent --from --to --cwd --json`".to_string(), + "- `chorus read --agent --cwd --audit-redactions --json`".to_string(), + "- `chorus relevance --list --cwd --json`".to_string(), + "- `chorus send --from --to --message \"\" --cwd `".to_string(), + "- `chorus messages --agent --cwd --json`".to_string(), + String::new(), + "(History contract is at the top of this block — see above.)".to_string(), String::new(), "If command syntax is unclear, run `chorus --help`.".to_string(), format!("", marker), @@ -535,6 +548,12 @@ fn provider_snippet(agent: &str) -> String { "- `chorus search \"\" --agent --cwd --json`".to_string(), "- `chorus compare --source codex --source gemini --source claude --cwd --json`".to_string(), String::new(), + "History contract (do NOT eagerly read multiple prior sessions):".to_string(), + "- `chorus read` defaults to `--history=on-demand` — latest session for the cwd ONLY.".to_string(), + "- Do NOT loop through prior sessions at session start. The field study (research/context-pack-field-findings-2026-03-20.md, Finding 3) measured a 2.5x token inflation when agents eagerly read history. Honor on-demand by default.".to_string(), + "- When you genuinely need historical context, call `chorus list / timeline / search` explicitly. 
That's the on-demand recall mechanism.".to_string(), + "- `--history=eager` is reserved for a future multi-session merge and currently emits a warning; do not depend on it.".to_string(), + String::new(), "Use evidence from command output and explicitly report missing session data.".to_string(), ] .join("\n") @@ -586,11 +605,15 @@ fn is_system_directory(dir: &Path) -> bool { fn relative_path(base: &Path, target: &Path) -> Option { let base = base.canonicalize().unwrap_or_else(|_| base.to_path_buf()); - // target may not exist yet — canonicalize only components that do, else fall back. + // target may not exist yet (we're about to create it). Canonicalize the + // nearest existing ancestor and re-append the missing tail, so the + // result shares a canonical prefix with `base`. Without this, when + // `base` is canonicalized through a symlink (e.g. macOS /tmp → + // /private/tmp) and `target` still has the symlink form, the common + // prefix is empty and we end up with `../../../tmp/` instead of + // a clean relative path. let target_buf = target.to_path_buf(); - let target = target_buf - .canonicalize() - .unwrap_or_else(|_| target_buf.clone()); + let target = canonicalize_via_parent(&target_buf); let base_components: Vec<_> = base.components().collect(); let target_components: Vec<_> = target.components().collect(); // Find common prefix @@ -613,6 +636,21 @@ fn relative_path(base: &Path, target: &Path) -> Option { } } +/// Return a canonical-ish PathBuf for a target that may not yet exist. +/// Walks up to the nearest existing ancestor, canonicalizes that, and +/// re-appends the remaining tail. Returns the original path verbatim if +/// no ancestor exists or canonicalization fails. 
+fn canonicalize_via_parent(target: &Path) -> PathBuf { + let mut ancestors = target.ancestors(); + while let Some(ancestor) = ancestors.next() { + if let Ok(canonical) = ancestor.canonicalize() { + let tail = target.strip_prefix(ancestor).unwrap_or(Path::new("")); + return canonical.join(tail); + } + } + target.to_path_buf() +} + fn is_command_available(cmd: &str) -> bool { Command::new("which") .arg(cmd) diff --git a/cli/src/timeline.rs b/cli/src/timeline.rs index f93bb76..ba0e0ea 100644 --- a/cli/src/timeline.rs +++ b/cli/src/timeline.rs @@ -204,9 +204,14 @@ pub fn build_timeline( .and_then(|v| v.as_str()) .unwrap_or(""); - // Try to read a snippet from the session - let snippet = if !file_path.is_empty() { - read_snippet(agent, file_path) + // Try to read a snippet from the session. Pass the session_id + // straight through rather than re-deriving from file_path: + // for cursor IDE app sessions the file is `...//store.db`, + // so file_stem() returns "store" and the lookup misses. + let snippet = if !session_id.is_empty() { + read_snippet(agent, &session_id) + } else if !file_path.is_empty() { + read_snippet_by_file(agent, file_path) } else { None }; @@ -241,27 +246,29 @@ pub fn build_timeline( }) } -/// Try to read the first assistant snippet from a session file. -fn read_snippet(agent: &str, file_path: &str) -> Option { +/// Try to read the first assistant snippet from a session by id. +fn read_snippet(agent: &str, session_id: &str) -> Option { let adapter = adapters::get_adapter(agent)?; let session = adapter - .read_session( - Some( - std::path::Path::new(file_path) - .file_stem() - .and_then(|s| s.to_str()) - .unwrap_or(""), - ), - ".", - None, - 1, - ) + .read_session(Some(session_id), ".", None, 1) .ok()?; - if session.content.is_empty() { None } else { - let short: String = session.content.chars().take(200).collect(); - Some(short) + Some(session.content.chars().take(200).collect()) + } +} + +/// Fallback: derive id from file path's stem. 
Used only when the listing +/// didn't carry a session_id (defensive — current adapters all populate it). +fn read_snippet_by_file(agent: &str, file_path: &str) -> Option { + let id = std::path::Path::new(file_path) + .file_stem() + .and_then(|s| s.to_str()) + .unwrap_or(""); + if id.is_empty() { + None + } else { + read_snippet(agent, id) } } diff --git a/docs/CLI_REFERENCE.md b/docs/CLI_REFERENCE.md index f0b232f..37bf0e7 100644 --- a/docs/CLI_REFERENCE.md +++ b/docs/CLI_REFERENCE.md @@ -5,7 +5,7 @@ Use this page for full command syntax, examples, output contracts, and operation ## Command Contract ```bash -chorus read --agent [--id=] [--cwd=] [--chats-dir=] [--last=] [--include-user] [--tool-calls] [--format=] [--json] [--metadata-only] [--audit-redactions] +chorus read --agent [--id=] [--cwd=] [--chats-dir=] [--last=] [--include-user] [--tool-calls] [--history=] [--format=] [--json] [--metadata-only] [--audit-redactions] chorus summary --agent [--cwd=] [--format=] [--json] chorus timeline [--agent ]... [--cwd=] [--limit=] [--format=] [--json] chorus compare --source ... [--cwd=] [--last=] [--json] @@ -83,7 +83,30 @@ chorus read --agent claude --tool-calls --include-user --json The JSON response includes `"included_tool_calls": true` in metadata when active. Without the flag, behavior is unchanged. -**Behaviour note — Gemini and Cursor:** `--tool-calls` runs without error on these agents but currently surfaces no `[TOOL: ...]` blocks. The Gemini JSONL and Cursor state stores do not carry a tool-call schema that the adapters parse yet. Applies to both Node and Rust. +**Behaviour note — Gemini and Hermes (uniform NOT_AVAILABLE warning, v0.16.0):** +The Gemini JSONL transcript format and the (provisional) Hermes session +format do not carry tool-call structure that the adapters can surface. 
+When `--tool-calls` is passed for these agents, the command runs without +error, `included_tool_calls: true` is still emitted (the flag was +honored), and a uniform warning is pushed into `result.warnings`: + +``` +--tool-calls has no effect for sessions: this agent's transcript format does not carry tool calls. +``` + +The exact phrasing is byte-identical between Node and Rust dispatch, so +consumers can match on it deterministically. This warning is what +distinguishes "agent format genuinely has no tool calls" from the prior +silent no-op (which looked indistinguishable from "the session had no +tool calls"). Mirrors `AGENTS_WITHOUT_TOOL_CALLS` in +`scripts/read_session.cjs` and `agent_has_no_tool_calls` in +`cli/src/main.rs`. + +Cursor (both CLI JSONL and IDE `store.db` surfaces) runs `--tool-calls` +without error but does not currently emit `[TOOL: ...]` blocks; this +behavior is tracked for a follow-up rather than escalated to the uniform +warning because the cursor surfaces *do* carry tool-call data — the +adapters just don't surface it yet. ### Read Flag Reference @@ -96,6 +119,7 @@ The JSON response includes `"included_tool_calls": true` in metadata when active | `--last` | Number of trailing assistant messages to include | 1 | | `--include-user` | Include the paired user prompt(s) with each assistant message | off | | `--tool-calls` | Surface `[TOOL: ]...[/TOOL]` blocks in `content` | off | +| `--history` | History scope: `on-demand` (default, latest session only), `none` (metadata only), `eager` (reserved — emits warning) | `on-demand` | | `--format` | Output format (`json`, `md` / `markdown`) | text unless `--json` | | `--json` | Machine-readable JSON output | off | | `--metadata-only` | Return metadata without `content` | off | @@ -103,6 +127,72 @@ The JSON response includes `"included_tool_calls": true` in metadata when active **`--format` vs `--json`:** Rust treats `--format json` as an alias for `--json`. 
**Node has a bug here** — `--format json` falls through to plain-text output instead of JSON (see `scripts/read_session.cjs:1759`). The bug is documented and left in place because fixing it is a user-visible output-contract change; use `--json` for JSON output on both runtimes. +### History Contract (`--history`, v0.16.0) + +`chorus read` is single-session by design. The `--history` flag makes that +contract explicit; the default (`on-demand`) is what consumers should +nearly always use. + +| Value | Semantics | +|---|---| +| `on-demand` (default) | Return ONLY the latest session for the cwd. Chorus does NOT auto-pull prior sessions into the returned content. When historical context is needed, the consumer calls `chorus list`, `chorus timeline`, or `chorus search` EXPLICITLY. This is the "on-demand recall" pattern — field measurements found a 2.5x token inflation when agents eagerly read multiple prior sessions, so the default is deliberately narrow. | +| `none` | Equivalent to `--metadata-only`. The JSON `content` field is `null`; text output omits the content block. Useful for cheap session-existence probes and routing decisions. | +| `eager` | RESERVED for a future multi-session merge. Today it behaves identically to `on-demand` AND pushes a warning into `result.warnings` so consumers cannot silently come to depend on it: `--history=eager is reserved for a future multi-session merge and currently behaves identically to --history=on-demand. Use \`chorus list / timeline / search\` to pull additional sessions explicitly.` | + +Invalid values are rejected at parse time on both runtimes (e.g. `--history=full` exits non-zero with `Invalid --history value: full. Allowed: on-demand | none | eager.`). + +```bash +# Default — single latest session for the cwd +chorus read --agent claude --cwd . --json + +# Metadata-only probe ("does claude have any session for this cwd?") +chorus read --agent claude --cwd . 
--history none --json + +# Reserved value — works, but pushes a warning into the JSON +chorus read --agent claude --cwd . --history eager --json +``` + +The history contract is also written into provider snippets and the +`CLAUDE.md` / `AGENTS.md` / `GEMINI.md` managed blocks by `chorus setup` +(v0.16.0+), so consuming agents are reminded of the on-demand rule +inside their own instruction files. See "Setup" below and the +stale-snippet checks under "Doctor". + +### `cwd_mismatch` (v0.16.0) + +When `--cwd ` is passed but no session matches and the adapter +falls back to the latest session anyway (the long-standing Codex / +Claude / Cursor behavior — see Rule 4 in `PROTOCOL.md`), the JSON output +now carries an explicit boolean: + +```json +{ + "agent": "codex", + "cwd": "/workspace/missing-project", + "warnings": [ + "Warning: no Codex session matched cwd /workspace/missing-project; falling back to latest session." + ], + "cwd_mismatch": true +} +``` + +The field is **only emitted when the fallback fires**. When `--cwd` +resolves cleanly, `cwd_mismatch` is absent from the output (it is NOT +emitted as `false`). This keeps JSON consumers honest: any code that +checks `result.cwd_mismatch === true` will detect the silent-fallback +case without scanning the warnings array. + +In addition, the same warning string is mirrored to **stderr** prefixed +with `chorus:`: + +``` +chorus: Warning: no Codex session matched cwd /workspace/missing-project; falling back to latest session. +``` + +Stderr-watching humans see it immediately even when stdout is +JSON-piped. Schema: see `cwd_mismatch` in +[`schemas/read-output.schema.json`](../schemas/read-output.schema.json). + ## Session Summary Structured session digest without reading full content. Extracts metadata locally — no LLM calls. Node and Rust emit byte-identical JSON for the same inputs (Rust parity landed in v0.13.0). 
@@ -277,6 +367,44 @@ chorus list --agent codex --cwd /path/to/project --json ] ``` +### Cursor-only `source` field (v0.16.0) + +`chorus list --agent cursor` and `chorus search --agent cursor` entries +carry an extra string field — `"source": "cli" | "app"` — distinguishing +the two on-disk Cursor surfaces: + +| Value | Surface | Backing store | +|---|---|---| +| `"cli"` | cursor-agent CLI transcripts | `~/.cursor/projects//agent-transcripts//*.jsonl` | +| `"app"` | Cursor IDE workspace chats | `~/.cursor/chats///store.db` (SQLite) | + +Example: + +```json +[ + { + "session_id": "store", + "agent": "cursor", + "source": "app", + "cwd": "/Users/me/code/app", + "modified_at": "2026-05-22T18:11:00Z", + "file_path": "/Users/me/.cursor/chats/abc.../uuid.../store.db" + }, + { + "session_id": "abcd1234-...", + "agent": "cursor", + "source": "cli", + "cwd": "/Users/me/code/app", + "modified_at": "2026-05-22T17:42:00Z", + "file_path": "/Users/me/.cursor/projects/-Users-me-code-app/agent-transcripts/abcd.../abcd....jsonl" + } +] +``` + +The `source` field is **cursor-only** — it is not emitted for codex, +claude, gemini, or hermes. List/search results from those agents +retain the existing schema unchanged. + ## Searching Sessions ```bash @@ -306,10 +434,66 @@ The `--last N` flag controls how many recent assistant messages to read from eac ## Reporting +Build a structured cross-agent report from a handoff packet (a JSON file +that names the task, success criteria, and source sessions to compare). + ```bash chorus report --handoff ./handoff_packet.json --json ``` +### Handoff Schema (v0.16.0 — surfaced in `--help`) + +The full schema is now embedded in `chorus report --help` (Rust CLI) so +operators don't need to leave the terminal to see it. Reproduced here +for searchability; the canonical source is +[`schemas/handoff.schema.json`](../schemas/handoff.schema.json). 
+ +```json +{ + "mode": "analyze", + "task": "", + "success_criteria": ["", ...], + "sources": [ + { + "agent": "claude", + "session_id": "", + "current_session": true, + "cwd": "", + "last_n": 10 + } + ], + "constraints": ["", ...] +} +``` + +**Required fields:** `mode`, `task`, `success_criteria` (non-empty), +`sources` (each entry requires `agent`, plus either `session_id` OR +`current_session: true`). + +**Optional fields:** `cwd` and `last_n` per-source, top-level +`constraints`. + +**Strictness:** unknown fields produce `INVALID_HANDOFF`. `mode` must be +one of the canonical modes (`verify`, `steer`, `analyze`, `feedback`). + +Minimal copy-pasteable example (write to `handoff.json`): + +```json +{ + "mode": "analyze", + "task": "Compare claude and codex outputs", + "success_criteria": ["Identify agreements and contradictions"], + "sources": [ + {"agent": "claude", "current_session": true}, + {"agent": "codex", "current_session": true} + ] +} +``` + +```bash +chorus report --handoff handoff.json --json +``` + ## Context Pack ```bash @@ -820,11 +1004,18 @@ Setup performs these operations: | Operation | File / Target | Notes | |---|---|---| | `file` | `.agent-chorus/INTENTS.md` | Intent contract (skipped if exists unless --force) | -| `file` | `.agent-chorus/providers/{claude,codex,gemini}.md` | Per-agent trigger snippets | -| `integration` | `CLAUDE.md` / `AGENTS.md` / `GEMINI.md` | Managed blocks injected or created | +| `file` | `.agent-chorus/providers/{claude,codex,gemini}.md` | Per-agent trigger snippets. v0.16.0+ snippets carry a top-of-block "History contract" section that documents the on-demand history rule. | +| `integration` | `CLAUDE.md` / `AGENTS.md` / `GEMINI.md` | Managed blocks injected or created. v0.16.0+ blocks open with **History contract (READ FIRST — violating this costs 2.5x tokens)** and list `chorus list / timeline / search` as the on-demand recall path. 
The block's support-commands list also enumerates `diff`, `audit-redactions`, `relevance`, `send`, and `messages`. | | `gitignore` | `.gitignore` | `.agent-chorus/` appended if not already present | | `plugin` | `claude plugin` | Auto-installs Claude Code skill plugin if `claude` CLI is available | +**Stale-snippet detection (v0.16.0):** `chorus doctor` emits +`snippet__stale: warn` and `integration__stale: warn` +when these files exist but were generated before the v0.16.0 history +contract was added. The remediation is `chorus setup --force`, which +refreshes the snippet and managed block in place. See "Doctor — Check +Catalogue" below. + **JSON output:** ```json @@ -886,7 +1077,7 @@ Doctor reports on: version, session directory availability, setup completeness ( ``` Agent Chorus doctor: PASS (/path/to/project) -- PASS version: agent-chorus v0.7.0 +- PASS version: agent-chorus v0.16.0 - PASS codex_sessions_dir: Found: ~/.codex/sessions - PASS claude_projects_dir: Found: ~/.claude/projects - PASS gemini_tmp_dir: Found: ~/.gemini/tmp @@ -894,19 +1085,82 @@ Agent Chorus doctor: PASS (/path/to/project) - PASS snippet_claude: Found: .agent-chorus/providers/claude.md - PASS integration_claude: Managed block present in CLAUDE.md - PASS sessions_claude: At least one claude session discovered +- PASS sessions_cursor_cli: At least one cursor-agent CLI transcript discovered +- INFO sessions_cursor_app: Cursor IDE not configured (data directory absent: ~/.cursor/chats) +- INFO sessions_hermes: Hermes not configured (data directory absent: ~/.hermes/sessions) - PASS context_pack_state: State: SEALED_VALID -- PASS update_status: Up to date (0.7.0) +- INFO context_pack_hooks_path: Effective git hooks path: .git/hooks (default) +- PASS context_pack_pre_push: Found: .git/hooks/pre-push +- PASS update_status: Up to date (0.16.0) - PASS claude_plugin: agent-chorus Claude Code plugin installed ``` -**JSON output (`--json`):** array of `{ id, status, detail }` check objects, where
`status` is `"pass"`, `"warn"`, or `"fail"`. +**JSON output (`--json`):** object `{ cwd, overall, checks: [...] }` +where each check is `{ id, status, detail }`. `status` is one of +`"pass"`, `"info"`, `"warn"`, or `"fail"`. The top-level `overall` +collapses the checks (see severity model below). + +### Doctor — Severity Model (v0.16.0) + +Doctor returns four severity levels per check: + +| Severity | Meaning | Elevates `overall`? | +|---|---|---| +| `pass` | Check passed. | no | +| `info` | Informational state — typically "this feature is intentionally not configured" (e.g. Hermes not installed, cwd not a git repo). Distinguishable from `pass` for tooling that wants to surface configuration absence, but is NOT a problem. | **no** | +| `warn` | Soft failure — something is misconfigured but the install still works. Includes stale snippets, dangling env overrides, missing managed blocks on an initialized install. | yes (sets `overall: warn`) | +| `fail` | Hard failure — the install is broken or an adapter errored. | yes (sets `overall: fail`) | + +`overall` is computed as: +1. `fail` if any check is `fail`. +2. else `warn` if any check is `warn`. +3. else `pass`. (`info` never elevates `overall`.) + +This matters for CI: `chorus doctor --json | jq -e '.overall == "pass"'` +will succeed on an install that has `info`-tagged checks (e.g. "Hermes +not installed") and fail on `warn` or `fail`. + +### Doctor — Check Catalogue (v0.16.0) + +The set of check IDs and their possible severities. New or changed +entries in v0.16.0 are marked `[v0.16.0]`. + +| Check ID | Possible severities | What it reports | +|---|---|---| +| `version` | `pass` | The running `chorus` version. | +| `codex_sessions_dir` / `claude_projects_dir` / `gemini_tmp_dir` | `pass` / `warn` | Whether the agent's base directory exists. | +| `setup_intents` | `pass` / `warn` / `info` | Whether `.agent-chorus/INTENTS.md` exists. 
`info` when the repo is uninitialized; `warn` when the repo has been initialized but the intents file is missing. | +| `snippet_` | `pass` / `warn` / `info` | Whether the per-agent provider snippet (`.agent-chorus/providers/.md`) exists. Severity follows the same `info`-vs-`warn` rule as `setup_intents`. | +| `integration_` | `pass` / `warn` / `info` | Whether the managed block is injected in `AGENTS.md` / `CLAUDE.md` / `GEMINI.md`. | +| `sessions_codex` / `sessions_claude` / `sessions_gemini` | `pass` / `warn` / `fail` | Whether at least one session for the cwd was discovered. `fail` if the adapter errored. | +| `sessions_cursor_cli` `[v0.16.0]` | `pass` / `info` / `warn` | Cursor CLI (cursor-agent) transcript surface. `info` when the data directory is absent (tool not installed — intentional). `warn` when the directory exists but has zero sessions. Replaces the previous single `sessions_cursor` check. | +| `sessions_cursor_app` `[v0.16.0]` | `pass` / `info` / `warn` | Cursor IDE (desktop app) `store.db` surface. Same `info`-vs-`warn` rule. Replaces the previous single `sessions_cursor` check. | +| `sessions_hermes` `[v0.16.0]` | `pass` / `info` / `warn` | Hermes (provisional) sessions. Now downgrades to `info` when the data directory is absent (was `warn` in v0.15.0 — a noisy false positive for anyone who doesn't run Hermes). | +| `env_override_dangling` `[v0.16.0]` | `warn` | Emitted (potentially multiple times) for each `CHORUS_*` / `BRIDGE_*` env var that points at a non-existent directory. Sessions from that adapter would be invisible until the var is cleared or the directory exists; doctor surfaces it instead of silently hiding adapter output. | +| `snippet__stale` `[v0.16.0]` | `warn` | Emitted when `.agent-chorus/providers/.md` exists but lacks the load-bearing "History contract" section introduced in v0.16.0. Remediation: `chorus setup --force`. 
| +| `integration__stale` `[v0.16.0]` | `warn` | Emitted when the managed block inside `AGENTS.md` / `CLAUDE.md` / `GEMINI.md` exists but predates the v0.16.0 history contract. Remediation: `chorus setup --force`. | +| `context_pack_state` | `pass` / `warn` | `SEALED_VALID` / `TEMPLATE` / `UNINITIALIZED`. | +| `context_pack_guidance` | `warn` | Present only when the pack state is `UNINITIALIZED` or `TEMPLATE`. | +| `context_pack_hooks_path` `[v0.16.0]` | `info` | Reports the effective git hooks path (`configured` via `git config core.hooksPath`, else `default` = `.git/hooks`). When the cwd is **not a git repo**, this check reports `info` ("cwd is not a git repository; git hooks checks skipped") rather than falsely reporting a global hooks path as if it applied to this cwd. | +| `context_pack_pre_push` `[v0.16.0]` | `pass` / `warn` / `info` | Whether a pre-push hook exists at the effective hooks path. `info` when the cwd is not a git repo. | +| `update_status` | `pass` / `warn` | Update check result. `warn` if the update check itself errored. | +| `claude_plugin` | `pass` / `warn` | Claude Code plugin install state. `warn` if the `claude` CLI is missing or the plugin isn't installed. | + +**Why the `info` tier exists:** before v0.16.0, "Hermes not installed" +and "cwd is not a git repo" both rendered as `warn`, which polluted +`overall: warn` for installs that were fully healthy for their actual +use case. The `info` tier separates *intentional absence* from +*misconfiguration*. Tooling that only cares about real problems should +check `overall != "pass"`; tooling that wants to render full +configuration state should iterate the `checks` array and surface +`info` rows distinctly. **Exit codes** | Code | Condition | |---|---| -| `0` | All checks passed or warned (non-fatal) | -| non-zero | At least one check returned `fail`, or the doctor run itself errored before reporting | +| `0` | `overall` is `pass` or `warn` (and the doctor run completed). 
`info`-only installs exit `0` with `overall: pass`. | +| non-zero | At least one check returned `fail`, or the doctor run itself errored before reporting. | ## Teardown @@ -981,9 +1235,16 @@ Override default paths using environment variables. | `CHORUS_CODEX_SESSIONS_DIR` | Path to Codex sessions | `~/.codex/sessions` | | `CHORUS_GEMINI_TMP_DIR` | Path to Gemini temp chats | `~/.gemini/tmp` | | `CHORUS_CLAUDE_PROJECTS_DIR` | Path to Claude projects | `~/.claude/projects` | -| `CHORUS_CURSOR_DATA_DIR` | cursor-agent projects root | `~/.cursor/projects` | +| `CHORUS_CURSOR_DATA_DIR` | cursor-agent projects root (CLI surface) | `~/.cursor/projects` | +| `CHORUS_CURSOR_APP_DATA_DIR` | Cursor IDE chat store root (app surface, v0.16.0) | `~/.cursor/chats` | | `CHORUS_HERMES_DATA_DIR` | Hermes sessions (provisional) | `~/.hermes/sessions` | +Every `CHORUS_*` variable has a backward-compatible `BRIDGE_*` alias +(e.g. `BRIDGE_CURSOR_APP_DATA_DIR`). When both are set, `CHORUS_*` wins. +`chorus doctor` emits `env_override_dangling: warn` when any of these +points at a non-existent directory (see "Doctor — Severity Model" +above). + ## Agent-Specific Notes ### Gemini: protobuf (`.pb`) fallback @@ -1009,36 +1270,43 @@ For the full workaround including a JSONL-stub recipe, see [`docs/session-handoff-guide.md`](./session-handoff-guide.md) "Scenario 4 — Gemini protobuf fallback". -### Cursor: native cursor-agent transcripts - -Chorus reads Cursor sessions from the cursor-agent CLI transcript tree: - -`~/.cursor/projects//agent-transcripts//.jsonl` - -Per-session `--cwd` scoping is derived from `/.workspace-trusted` -(`workspacePath`, when present) or from a filesystem-validated demangle of the -project directory name. `--include-user` and `--tool-calls` are supported. - -Override the projects root with `CHORUS_CURSOR_DATA_DIR` (or legacy -`BRIDGE_CURSOR_DATA_DIR`). See `docs/adapters/CURSOR_HERMES_NATIVE_ADAPTER.md`.
- -### Cursor: SQLite (`state.vscdb`) fallback - -When no cursor-agent transcripts are found but SQLite chat data exists under -`~/Library/Application Support/Cursor/User/workspaceStorage//state.vscdb` -(macOS; Linux/Windows use the equivalent application-support paths), `NOT_FOUND` -errors may mention "SQLite state.vscdb". Chorus does not parse that store yet. - -For inspection / debugging, you can dump the relevant rows manually: - -```bash -DB=~/Library/Application\ Support/Cursor/User/workspaceStorage//state.vscdb -sqlite3 "$DB" "SELECT key, length(value) FROM ItemTable WHERE key LIKE '%composer%';" -``` - -Full `rusqlite`-backed reading is tracked as a follow-up. See -[`docs/session-handoff-guide.md`](./session-handoff-guide.md) "Scenario -5 — Cursor SQLite fallback" for the full context. +### Cursor: two on-disk surfaces (CLI transcripts + IDE app store, v0.16.0) + +As of v0.16.0, Chorus reads Cursor sessions from **both** the cursor-agent CLI +transcript tree and the Cursor IDE (desktop app) chat store. The two surfaces +are independent; either can be empty without breaking the other. + +| Surface | Path | Format | Override env | +|---|---|---|---| +| `cli` | `~/.cursor/projects//agent-transcripts//.jsonl` | JSONL (one event per line) | `CHORUS_CURSOR_DATA_DIR` (legacy: `BRIDGE_CURSOR_DATA_DIR`) | +| `app` | `~/.cursor/chats///store.db` | SQLite | `CHORUS_CURSOR_APP_DATA_DIR` (legacy: `BRIDGE_CURSOR_APP_DATA_DIR`) | + +Both surfaces appear in the same `chorus list --agent cursor` / +`chorus search --agent cursor` results, distinguished by the cursor-only +`source: "cli" | "app"` field (documented under "Listing Sessions" +above). `chorus read --agent cursor` selects between them via `--id` +substring match like every other adapter. + +Per-session `--cwd` scoping for the CLI surface is derived from +`/.workspace-trusted` (`workspacePath`, when present) or from a +filesystem-validated demangle of the project directory name. 
The IDE +app surface scopes via the workspace path persisted in `store.db`. +`--include-user` is supported on both surfaces; `--tool-calls` runs +without error but does not currently surface `[TOOL: ...]` blocks (see +the tool-calls behaviour note in the read section). + +**Node runtime requirement (app surface only):** the IDE app surface +requires **Node >= 22.5** for the built-in `node:sqlite` module. +On older Node versions the Rust CLI still exposes the IDE app surface +(it links `rusqlite`), but the Node CLI gracefully falls back to +showing only the CLI/JSONL surface — `chorus list --agent cursor` +will simply omit `source: "app"` rows, and `chorus doctor` reports +"Cursor IDE SQLite reader unavailable (requires Node >= 22.5 with +node:sqlite)" rather than failing. This is intentional: degraded +visibility, not a hard error. + +See `docs/adapters/CURSOR_HERMES_NATIVE_ADAPTER.md` for the full adapter +architecture. ### Hermes (provisional scaffold) @@ -1091,9 +1359,55 @@ Chorus checks for updates once per version. - **Fail-silent**: If the check fails, it says nothing. - **Opt-out**: Set `CHORUS_SKIP_UPDATE_CHECK=1`. +## Unknown Flag Handling (F11, v0.16.0) + +Both runtimes now **fail closed on unknown flags**. The Rust CLI inherits +this behavior from clap. The Node CLI previously had a hand-rolled +parser that silently ignored unknown flags — typos like `--Json` (wrong +case) or `--limt` (transposed letters) used to fall through to default +behavior, producing surprising output. As of v0.16.0 the Node CLI +mirrors clap and rejects unknown flags by name: + +``` +$ chorus list --agent codex --limt 3 +Unknown flag for 'list': --limt. Run `chorus list --help` to see allowed flags. +``` + +The validator runs at dispatch time, before the command's own parser, so +the error names the offending flag and the subcommand explicitly. +`agent-context` and `trash-talk` have their own nested parsers; the +top-level validator passes through to them. 
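
Reviewer note: the dispatch-time validation described above can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/read_session.cjs`; the names `ALLOWED_FLAGS_SKETCH`, `PASS_THROUGH`, `findUnknownFlag`, and `unknownFlagMessage` are hypothetical, and the allowlist holds only a small subset of flags taken from the command synopsis for `list` and `search`.

```javascript
// Hypothetical, trimmed-down allowlist. The real per-command allowlist is
// said to live in scripts/read_session.cjs:ALLOWED_FLAGS; this is only a sketch.
const ALLOWED_FLAGS_SKETCH = {
  list: new Set(['--agent', '--cwd', '--limit', '--json']),
  search: new Set(['--agent', '--cwd', '--limit', '--json']),
};

// Subcommands with their own nested parsers are passed through untouched.
const PASS_THROUGH = new Set(['agent-context', 'trash-talk']);

// Return the first flag that is not in the command's allowlist, or null.
// Matching is case-sensitive, so `--Json` is rejected just like `--limt`.
function findUnknownFlag(command, argv) {
  if (PASS_THROUGH.has(command)) return null;
  const allowed = ALLOWED_FLAGS_SKETCH[command];
  if (!allowed) return null; // commands outside this sketch are not validated here
  for (const arg of argv) {
    if (arg === '--') break;              // conventional end-of-flags marker
    if (!arg.startsWith('--')) continue;  // positional argument or flag value
    const name = arg.split('=')[0];       // accept both `--flag value` and `--flag=value`
    if (!allowed.has(name)) return name;
  }
  return null;
}

// Build an error message in the shape shown in the protocol text.
function unknownFlagMessage(command, flag) {
  return "Unknown flag for '" + command + "': " + flag +
    ". Run `chorus " + command + " --help` to see allowed flags.";
}
```

Under these assumptions, a dispatcher would call `findUnknownFlag` before handing `argv` to the command's own parser, and exit non-zero with `unknownFlagMessage` when it returns a flag name.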
+ +The full per-command allowlist lives in +`scripts/read_session.cjs:ALLOWED_FLAGS`. A flag not in the allowlist is +rejected even if the underlying handler would have accepted it — the +allowlist is the contract. + ## Parity Notes -As of v0.13.0, Node and Rust have full parity across every supported subcommand: `read` (including `--include-user`, `--tool-calls`, and `--format {json|md|markdown}`), `list`, `search`, `compare`, `diff`, `summary`, `timeline`, `send`, `messages`, `checkpoint`, `setup`, `doctor`, `teardown`, `agent-context`, and `relevance`. All shared outputs are conformance-tested via `scripts/conformance.sh` against golden fixtures in `fixtures/golden/`. +As of v0.16.0, Node and Rust have full parity across every supported subcommand: `read` (including `--include-user`, `--tool-calls`, `--history`, and `--format {json|md|markdown}`), `list`, `search`, `compare`, `diff`, `summary`, `timeline`, `send`, `messages`, `checkpoint`, `setup`, `doctor`, `teardown`, `agent-context`, and `relevance`. All shared outputs are conformance-tested via `scripts/conformance.sh` against golden fixtures in `fixtures/golden/`. + +### Search Invariant (`read(text) ⊆ search(text-tokens)`, CI-enforced in v0.16.0) + +Every adapter must satisfy this invariant: if `chorus read --agent ` returns +content for a session, then `chorus search --agent ` +must return that session in its results. Conformance now enforces this for +every supported adapter — claude, codex, gemini, cursor (both CLI and IDE +app surfaces), and hermes — in `scripts/conformance.sh` (see the +`search-read-parity` block). + +The codex extractor was the original motivating bug: prior to v0.16.0 it +walked a top-level `role`/`content` schema that never existed in real +codex sessions, so the read path returned content from one envelope and +the search path indexed nothing — silently returning empty results. 
The +fix walks the real `response_item.payload.message` and +`event_msg.payload.message` envelopes that codex actually emits, so +read and search now operate on the same content. + +This invariant is what makes "evidence-based" claims auditable: a +consumer that quotes content from `chorus read` can verify the source +session is discoverable via `chorus search` without trusting the read +adapter blindly. Two documented wrinkles: diff --git a/fixtures/golden/doctor.json b/fixtures/golden/doctor.json index 2b37262..a0e345b 100644 --- a/fixtures/golden/doctor.json +++ b/fixtures/golden/doctor.json @@ -28,17 +28,17 @@ { "detail": "Missing provider instruction file: __PATH__", "id": "integration_claude", - "status": "warn" + "status": "info" }, { "detail": "Missing provider instruction file: __PATH__", "id": "integration_codex", - "status": "warn" + "status": "info" }, { "detail": "Missing provider instruction file: __PATH__", "id": "integration_gemini", - "status": "warn" + "status": "info" }, { "detail": "No claude sessions discovered", @@ -51,9 +51,14 @@ "status": "warn" }, { - "detail": "No cursor sessions discovered", - "id": "sessions_cursor", - "status": "warn" + "detail": "At least one Cursor IDE store.db discovered", + "id": "sessions_cursor_app", + "status": "pass" + }, + { + "detail": "At least one cursor-agent CLI transcript discovered", + "id": "sessions_cursor_cli", + "status": "pass" }, { "detail": "No gemini sessions discovered", @@ -61,29 +66,29 @@ "status": "warn" }, { - "detail": "No hermes sessions discovered", + "detail": "At least one hermes session discovered", "id": "sessions_hermes", - "status": "warn" + "status": "pass" }, { "detail": "Missing: __PATH__", "id": "setup_intents", - "status": "warn" + "status": "info" }, { "detail": "Missing: __PATH__", "id": "snippet_claude", - "status": "warn" + "status": "info" }, { "detail": "Missing: __PATH__", "id": "snippet_codex", - "status": "warn" + "status": "info" }, { "detail": "Missing: __PATH__", 
"id": "snippet_gemini", - "status": "warn" + "status": "info" }, { "detail": "__VERSION__", diff --git a/fixtures/golden/list-codex.json b/fixtures/golden/list-codex.json index 3f4d657..4c2d452 100644 --- a/fixtures/golden/list-codex.json +++ b/fixtures/golden/list-codex.json @@ -1,30 +1,37 @@ [ { - "session_id": "session-codex-malformed", + "session_id": "session-codex-fixture-0001", "agent": "codex", "cwd": "/workspace/demo", - "modified_at": "2026-02-08T11:52:16.062Z", - "file_path": "/tmp/session-codex-malformed.jsonl" + "modified_at": null, + "file_path": "session-codex-fixture-0001.jsonl" }, { - "session_id": "session-codex-multi", + "session_id": "session-codex-malformed", "agent": "codex", "cwd": "/workspace/demo", - "modified_at": "2026-02-08T11:29:55.014Z", - "file_path": "/tmp/session-codex-multi.jsonl" + "modified_at": null, + "file_path": "session-codex-malformed.jsonl" }, { "session_id": "session-codex-mixed-schema", "agent": "codex", "cwd": "/workspace/demo", - "modified_at": "2026-02-08T11:29:29.124Z", - "file_path": "/tmp/session-codex-mixed-schema.jsonl" + "modified_at": null, + "file_path": "session-codex-mixed-schema.jsonl" }, { - "session_id": "session-codex-fixture-0001", + "session_id": "session-codex-multi", + "agent": "codex", + "cwd": "/workspace/demo", + "modified_at": null, + "file_path": "session-codex-multi.jsonl" + }, + { + "session_id": "session-codex-tool-fixture", "agent": "codex", "cwd": "/workspace/demo", - "modified_at": "2026-02-08T08:34:50.021Z", - "file_path": "/tmp/session-codex-fixture-0001.jsonl" + "modified_at": null, + "file_path": "session-codex-tool-fixture.jsonl" } ] diff --git a/fixtures/golden/read-claude-tool-fixture.json b/fixtures/golden/read-claude-tool-fixture.json new file mode 100644 index 0000000..9e77e58 --- /dev/null +++ b/fixtures/golden/read-claude-tool-fixture.json @@ -0,0 +1,16 @@ +{ + "agent": "claude", + "chorus_output_version": 1, + "content": "Checking the file.\n[TOOL: Read]\n{\n \"path\": 
\"/workspace/demo/example.txt\"\n}\n[/TOOL]\nDone reading.", + "cwd": "/workspace/demo", + "included_roles": [ + "assistant" + ], + "included_tool_calls": true, + "message_count": 1, + "messages_returned": 1, + "session_id": "session-claude-tool-fixture", + "source": "session-claude-tool-fixture.jsonl", + "timestamp": "__TS__", + "warnings": [] +} diff --git a/fixtures/golden/read-codex-tool-fixture.json b/fixtures/golden/read-codex-tool-fixture.json new file mode 100644 index 0000000..b297f11 --- /dev/null +++ b/fixtures/golden/read-codex-tool-fixture.json @@ -0,0 +1,16 @@ +{ + "agent": "codex", + "chorus_output_version": 1, + "content": "Running a shell command.\n[TOOL: shell]\n{\"command\":\"ls /workspace/demo\"}\n[/TOOL]\nListed the directory.", + "cwd": "/workspace/demo", + "included_roles": [ + "assistant" + ], + "included_tool_calls": true, + "message_count": 1, + "messages_returned": 1, + "session_id": "session-codex-tool-fixture", + "source": "session-codex-tool-fixture.jsonl", + "timestamp": "__TS__", + "warnings": [] +} diff --git a/fixtures/golden/read-cursor-app-redaction.json b/fixtures/golden/read-cursor-app-redaction.json new file mode 100644 index 0000000..c23e7e0 --- /dev/null +++ b/fixtures/golden/read-cursor-app-redaction.json @@ -0,0 +1,15 @@ +{ + "agent": "cursor", + "chorus_output_version": 1, + "content": "Found token=[REDACTED] Also Bearer [REDACTED]", + "cwd": "/workspace/demo", + "included_roles": [ + "assistant" + ], + "message_count": 1, + "messages_returned": 1, + "session_id": "cursor-app-redaction-uuid", + "source": "store.db", + "timestamp": "__TS__", + "warnings": [] +} diff --git a/fixtures/golden/read-cursor-app-tool-calls.json b/fixtures/golden/read-cursor-app-tool-calls.json new file mode 100644 index 0000000..3884bec --- /dev/null +++ b/fixtures/golden/read-cursor-app-tool-calls.json @@ -0,0 +1,16 @@ +{ + "agent": "cursor", + "chorus_output_version": 1, + "content": "Second answer, with a [TOOL:Read] mark.\n[TOOL: Read]\n{\n 
\"path\": \"/tmp/x\"\n}\n[/TOOL]", + "cwd": "/workspace/demo", + "included_roles": [ + "assistant" + ], + "included_tool_calls": true, + "message_count": 2, + "messages_returned": 1, + "session_id": "cursor-app-fixture-uuid", + "source": "store.db", + "timestamp": "__TS__", + "warnings": [] +} diff --git a/fixtures/golden/read-cursor-app.json b/fixtures/golden/read-cursor-app.json new file mode 100644 index 0000000..fbaca36 --- /dev/null +++ b/fixtures/golden/read-cursor-app.json @@ -0,0 +1,15 @@ +{ + "agent": "cursor", + "chorus_output_version": 1, + "content": "Second answer, with a [TOOL:Read] mark.", + "cwd": "/workspace/demo", + "included_roles": [ + "assistant" + ], + "message_count": 2, + "messages_returned": 1, + "session_id": "cursor-app-fixture-uuid", + "source": "store.db", + "timestamp": "__TS__", + "warnings": [] +} diff --git a/fixtures/golden/timeline.json b/fixtures/golden/timeline.json index 4ef0a03..01f2544 100644 --- a/fixtures/golden/timeline.json +++ b/fixtures/golden/timeline.json @@ -3,7 +3,8 @@ "claude", "codex", "gemini", - "cursor" + "cursor", + "hermes" ], "chorus_output_version": 1, "cwd": "/workspace/demo", @@ -29,6 +30,13 @@ "snippet": "Here are some secrets: sk-[REDACTED] and AKIA[REDACTED] and Bearer [REDACTED] and api_key=[REDACTED] and token=[REDACTED] and password=[REDACTED] and apikey=[REDACTED] and secret=[REDACTED]", "timestamp": "__TS__" }, + { + "agent": "claude", + "cwd": "/workspace/demo", + "session_id": "session-claude-tool-fixture", + "snippet": "Checking the file.Done reading.", + "timestamp": "__TS__" + }, { "agent": "codex", "cwd": "/workspace/demo", @@ -57,6 +65,27 @@ "snippet": "Third assistant response.", "timestamp": "__TS__" }, + { + "agent": "codex", + "cwd": "/workspace/demo", + "session_id": "session-codex-tool-fixture", + "snippet": "Running a shell command.Listed the directory.", + "timestamp": "__TS__" + }, + { + "agent": "cursor", + "cwd": "/workspace/demo", + "session_id": "cursor-app-fixture-uuid", + 
"snippet": "Second answer, with a [TOOL:Read] mark.", + "timestamp": "__TS__" + }, + { + "agent": "cursor", + "cwd": "/workspace/demo", + "session_id": "cursor-app-redaction-uuid", + "snippet": "Found token=[REDACTED] Also Bearer [REDACTED]", + "timestamp": "__TS__" + }, { "agent": "cursor", "cwd": "/workspace/demo", @@ -98,6 +127,13 @@ "session_id": "session-gemini-jsonl-fixture", "snippet": "Second jsonl assistant answer.", "timestamp": "__TS__" + }, + { + "agent": "hermes", + "cwd": "/workspace/demo", + "session_id": "session-hermes-fixture", + "snippet": "Hermes fixture assistant output.", + "timestamp": "__TS__" } ], "warnings": [] diff --git a/fixtures/session-store/claude/projects/sample/session-claude-tool-fixture.jsonl b/fixtures/session-store/claude/projects/sample/session-claude-tool-fixture.jsonl new file mode 100644 index 0000000..d1f494f --- /dev/null +++ b/fixtures/session-store/claude/projects/sample/session-claude-tool-fixture.jsonl @@ -0,0 +1,2 @@ +{"cwd":"/workspace/demo"} +{"type":"assistant","message":{"role":"assistant","content":[{"type":"text","text":"Checking the file."},{"type":"tool_use","name":"Read","input":{"path":"/workspace/demo/example.txt"}},{"type":"text","text":"Done reading."}]}} diff --git a/fixtures/session-store/codex/sessions/2026/01/01/session-codex-tool-fixture.jsonl b/fixtures/session-store/codex/sessions/2026/01/01/session-codex-tool-fixture.jsonl new file mode 100644 index 0000000..251a41a --- /dev/null +++ b/fixtures/session-store/codex/sessions/2026/01/01/session-codex-tool-fixture.jsonl @@ -0,0 +1,2 @@ +{"type":"session_meta","payload":{"cwd":"/workspace/demo"}} +{"type":"response_item","payload":{"type":"message","role":"assistant","content":[{"type":"text","text":"Running a shell command."},{"type":"function_call","name":"shell","arguments":"{\"command\":\"ls /workspace/demo\"}"},{"type":"text","text":"Listed the directory."}]}} diff --git 
a/fixtures/session-store/cursor/chats/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/cursor-app-fixture-uuid/store.db b/fixtures/session-store/cursor/chats/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/cursor-app-fixture-uuid/store.db new file mode 100644 index 0000000..e22ccf4 Binary files /dev/null and b/fixtures/session-store/cursor/chats/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/cursor-app-fixture-uuid/store.db differ diff --git a/fixtures/session-store/cursor/chats/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb/cursor-app-redaction-uuid/store.db b/fixtures/session-store/cursor/chats/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb/cursor-app-redaction-uuid/store.db new file mode 100644 index 0000000..6d16a2a Binary files /dev/null and b/fixtures/session-store/cursor/chats/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb/cursor-app-redaction-uuid/store.db differ diff --git a/fixtures/session-store/hermes/sessions/session-hermes-fixture.jsonl b/fixtures/session-store/hermes/sessions/session-hermes-fixture.jsonl new file mode 100644 index 0000000..02f2fb7 --- /dev/null +++ b/fixtures/session-store/hermes/sessions/session-hermes-fixture.jsonl @@ -0,0 +1,2 @@ +{"role":"user","content":"Test question","cwd":"/workspace/demo"} +{"role":"assistant","content":"Hermes fixture assistant output.","cwd":"/workspace/demo"} diff --git a/package.json b/package.json index 82f9352..26e6493 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "agent-chorus", - "version": "0.15.0", + "version": "0.16.0", "description": "Local-first CLI to read, compare, and hand off context across Codex, Claude, Gemini, and Cursor sessions.", "keywords": [ "agent-chorus", diff --git a/schemas/list-output.schema.json b/schemas/list-output.schema.json index 5346b13..4cb77e3 100644 --- a/schemas/list-output.schema.json +++ b/schemas/list-output.schema.json @@ -30,6 +30,11 @@ "scope_hash": { "type": "string", "description": "For Gemini sessions only: the hex-hash scope directory (SHA-256 of an absolute cwd) when the scope segment is not a human-named directory. 
Present only when cwd inference fell back to a hex scope; absent otherwise." + }, + "source": { + "type": "string", + "enum": ["cli", "app"], + "description": "For Cursor sessions only: which on-disk surface the session was discovered from. 'cli' = cursor-agent transcripts under ~/.cursor/projects; 'app' = Cursor IDE store.db under ~/.cursor/chats. Absent for agents with a single surface." } } } diff --git a/schemas/read-output.schema.json b/schemas/read-output.schema.json index 8d3ba8b..e0e4975 100644 --- a/schemas/read-output.schema.json +++ b/schemas/read-output.schema.json @@ -52,6 +52,10 @@ "type": "boolean", "description": "Whether tool call content is included (present when --tool-calls is used)" }, + "cwd_mismatch": { + "type": "boolean", + "description": "Present and true when --cwd was passed but no session matched, and the adapter fell back to the latest session regardless. Consumers that parse the JSON without scanning the warnings array can use this boolean to detect the silent-fallback case. Absent when no fallback occurred." + }, "redactions": { "type": "array", "description": "Redaction audit trail (present when --audit-redactions is used)", diff --git a/scripts/adapters/codex.cjs b/scripts/adapters/codex.cjs index 32379d1..b25d31d 100644 --- a/scripts/adapters/codex.cjs +++ b/scripts/adapters/codex.cjs @@ -198,14 +198,31 @@ function search(query, cwd, limit) { continue; } + // Codex stores messages in two nested envelopes: + // - {type:"response_item", payload:{type:"message", role:"assistant", + // content:[{text:"..."}, ...]}} + // - {type:"event_msg", payload:{type:"agent_message", message:"..."}} + // The previous shape (role/content at top level) never existed in any + // real codex session and produced an unconditionally empty result for + // search — UAT P3. Aligns with parse path in read(). 
let assistantText = ''; try { const lines = readJsonlLines(f.path); for (const line of lines) { try { const obj = JSON.parse(line); - if (obj.role === 'assistant' && obj.content) { - assistantText += (typeof obj.content === 'string' ? obj.content : JSON.stringify(obj.content)) + '\n'; + const envType = obj && obj.type; + const payload = obj && obj.payload; + if (envType === 'response_item' && payload + && payload.type === 'message' && payload.role === 'assistant') { + const t = extractText(payload.content); + if (t) assistantText += t + '\n'; + } else if (envType === 'event_msg' && payload + && payload.type === 'agent_message') { + const t = typeof payload.message === 'string' + ? payload.message + : extractText(payload.message); + if (t) assistantText += t + '\n'; } } catch (_e) { /* skip */ } } diff --git a/scripts/adapters/cursor.cjs b/scripts/adapters/cursor.cjs index 520fed5..4246682 100644 --- a/scripts/adapters/cursor.cjs +++ b/scripts/adapters/cursor.cjs @@ -16,6 +16,7 @@ const { } = require('./utils.cjs'); const { resolveCursorCwd } = require('./cursor_cwd.cjs'); const { readCursorTurns } = require('./cursor_parse.cjs'); +const cursorApp = require('./cursor_app.cjs'); // Build turns for the read path. Text-only by default; with --tool-calls, render // tool_use/tool_result segments too. Cursor's content array matches the shape @@ -91,30 +92,68 @@ function selectConversationTurns(turns, lastN) { } function resolve(id, cwd, opts) { - if (!fs.existsSync(cursorDataBase)) return null; - const files = collectCursorTranscripts(id); - if (files.length === 0) return null; + // Assemble candidates from BOTH surfaces (cursor-agent CLI JSONL + + // Cursor IDE store.db), newest-first, then pick by id or cwd-match. + // Each candidate carries the surface tag so `read` knows which reader + // to dispatch to. + const cliFiles = fs.existsSync(cursorDataBase) ? 
collectCursorTranscripts(id) : []; + const appBase = cursorApp.cursorAppBaseDir(); + const appSessions = cursorApp.collectCursorAppSessions(appBase); + + const candidates = []; + for (const f of cliFiles) { + candidates.push({ + surface: 'cli', + path: f.path, + mtime: getFileTimestamp(f.path), + resolveCwd: () => resolveCursorCwd(f.path), + }); + } + for (const s of appSessions) { + if (id && !s.agent_id.includes(id) && !s.db_path.includes(id)) continue; + candidates.push({ + surface: 'app', + path: s.db_path, + agent_id: s.agent_id, + mtime: cursorApp.cursorAppModifiedIso(s.db_path), + resolveCwd: () => cursorApp.cursorAppSessionWorkspace(s.db_path), + }); + } + if (candidates.length === 0) return null; + candidates.sort((a, b) => String(b.mtime || '').localeCompare(String(a.mtime || ''))); const warnings = []; - let targetPath; + let target; if (id) { - targetPath = files[0].path; + target = candidates[0]; } else if (cwd) { - targetPath = findLatestByCwd(files, resolveCursorCwd, cwd); - if (!targetPath) { + target = candidates.find((c) => { + const sc = c.resolveCwd(); + return sc && cwdMatchesProject(sc, cwd); + }); + if (!target) { warnings.push(`No Cursor session matched cwd ${normalizePath(cwd)}; falling back to latest session.`); - targetPath = files[0].path; + target = candidates[0]; } } else { - targetPath = files[0].path; + target = candidates[0]; } - return { path: targetPath, warnings }; + return target.surface === 'app' + ? { path: target.path, surface: 'app', agent_id: target.agent_id, warnings } + : { path: target.path, surface: 'cli', warnings }; } function read(filePath, lastN, opts = {}) { lastN = lastN || 1; - const turns = readCursorTurnsRich(filePath, opts.includeToolCalls === true); + // Surface detection: paths ending in `store.db` are Cursor IDE sessions; + // anything else is a cursor-agent CLI JSONL transcript. + const isApp = filePath.endsWith('/store.db') || filePath.endsWith(path.sep + 'store.db'); + + const turns = isApp + ? 
cursorApp.readCursorAppTurns(filePath, opts.includeToolCalls === true) + : readCursorTurnsRich(filePath, opts.includeToolCalls === true); + const assistantMsgs = turns.filter(t => t.role === 'assistant').map(t => t.text); const messageCount = assistantMsgs.length; @@ -138,8 +177,21 @@ function read(filePath, lastN, opts = {}) { messagesReturned = 0; } - const sessionId = path.basename(filePath, path.extname(filePath)); - const sessionCwd = resolveCursorCwd(filePath); + let sessionId; + let sessionCwd; + let timestamp; + if (isApp) { + // For Cursor IDE sessions, session_id is the UUID directory (matches the + // `agentId` field stored in meta); cwd comes from the embedded + // Workspace Path header. + sessionId = path.basename(path.dirname(filePath)); + sessionCwd = cursorApp.cursorAppSessionWorkspace(filePath); + timestamp = cursorApp.cursorAppModifiedIso(filePath); + } else { + sessionId = path.basename(filePath, path.extname(filePath)); + sessionCwd = resolveCursorCwd(filePath); + timestamp = getFileTimestamp(filePath); + } return { agent: 'cursor', @@ -148,7 +200,7 @@ function read(filePath, lastN, opts = {}) { warnings: [], session_id: sessionId, cwd: sessionCwd || null, - timestamp: getFileTimestamp(filePath), + timestamp, message_count: messageCount, messages_returned: messagesReturned, included_roles: rolesIncluded, @@ -158,69 +210,113 @@ function read(filePath, lastN, opts = {}) { function list(cwd, limit) { limit = limit || 10; - if (!fs.existsSync(cursorDataBase)) return []; - - const files = collectCursorTranscripts(null); const entries = []; - for (const f of files) { - if (entries.length >= limit) break; - const sessionCwd = resolveCursorCwd(f.path); - if (cwd && !(sessionCwd && cwdMatchesProject(sessionCwd, cwd))) { - continue; + // Surface 1: cursor-agent CLI JSONL. 
+ if (fs.existsSync(cursorDataBase)) { + const files = collectCursorTranscripts(null); + for (const f of files) { + const sessionCwd = resolveCursorCwd(f.path); + if (cwd && !(sessionCwd && cwdMatchesProject(sessionCwd, cwd))) continue; + entries.push({ + session_id: path.basename(f.path, path.extname(f.path)), + agent: 'cursor', + source: 'cli', + cwd: sessionCwd || null, + modified_at: getFileTimestamp(f.path), + file_path: f.path, + }); } + } - entries.push({ - session_id: path.basename(f.path, path.extname(f.path)), - agent: 'cursor', - cwd: sessionCwd || null, - modified_at: getFileTimestamp(f.path), - file_path: f.path, - }); + // Surface 2: Cursor IDE store.db. + const appBase = cursorApp.cursorAppBaseDir(); + if (fs.existsSync(appBase)) { + const sessions = cursorApp.collectCursorAppSessions(appBase); + for (const s of sessions) { + const sessionCwd = cursorApp.cursorAppSessionWorkspace(s.db_path); + if (cwd && !(sessionCwd && cwdMatchesProject(sessionCwd, cwd))) continue; + entries.push({ + session_id: s.agent_id, + agent: 'cursor', + source: 'app', + cwd: sessionCwd || null, + modified_at: cursorApp.cursorAppModifiedIso(s.db_path), + file_path: s.db_path, + }); + } } - return entries; + + // Newest-first across both surfaces, then truncate. + entries.sort((a, b) => String(b.modified_at || '').localeCompare(String(a.modified_at || ''))); + return entries.slice(0, limit); } function search(query, cwd, limit) { limit = limit || 10; const queryLower = String(query || '').toLowerCase(); - if (!fs.existsSync(cursorDataBase)) return []; - - const files = collectCursorTranscripts(null); const entries = []; - for (const f of files) { - if (entries.length >= limit) break; - - const sessionCwd = resolveCursorCwd(f.path); - if (cwd && !(sessionCwd && cwdMatchesProject(sessionCwd, cwd))) { - continue; + // Surface 1: cursor-agent CLI JSONL. 
+ if (fs.existsSync(cursorDataBase)) { + const files = collectCursorTranscripts(null); + for (const f of files) { + const sessionCwd = resolveCursorCwd(f.path); + if (cwd && !(sessionCwd && cwdMatchesProject(sessionCwd, cwd))) continue; + const assistantText = readCursorTurns(f.path) + .filter(t => t.role === 'assistant') + .map(t => t.text) + .join('\n'); + const lower = assistantText.toLowerCase(); + if (!lower.includes(queryLower)) continue; + const idx = lower.indexOf(queryLower); + const snippetStart = Math.max(0, idx - 60); + const snippetEnd = Math.min(assistantText.length, idx + queryLower.length + 60); + const match_snippet = assistantText.slice(snippetStart, snippetEnd).replace(/\n/g, ' '); + entries.push({ + session_id: path.basename(f.path, path.extname(f.path)), + agent: 'cursor', + source: 'cli', + cwd: sessionCwd || null, + modified_at: getFileTimestamp(f.path), + file_path: f.path, + match_snippet, + }); } + } - const assistantText = readCursorTurns(f.path) - .filter(t => t.role === 'assistant') - .map(t => t.text) - .join('\n'); - - const lower = assistantText.toLowerCase(); - if (!lower.includes(queryLower)) continue; - - const idx = lower.indexOf(queryLower); - const snippetStart = Math.max(0, idx - 60); - const snippetEnd = Math.min(assistantText.length, idx + queryLower.length + 60); - const match_snippet = assistantText.slice(snippetStart, snippetEnd).replace(/\n/g, ' '); - - entries.push({ - session_id: path.basename(f.path, path.extname(f.path)), - agent: 'cursor', - cwd: sessionCwd || null, - modified_at: getFileTimestamp(f.path), - file_path: f.path, - match_snippet, - }); + // Surface 2: Cursor IDE store.db. 
+ const appBase = cursorApp.cursorAppBaseDir(); + if (fs.existsSync(appBase)) { + const sessions = cursorApp.collectCursorAppSessions(appBase); + for (const s of sessions) { + const sessionCwd = cursorApp.cursorAppSessionWorkspace(s.db_path); + if (cwd && !(sessionCwd && cwdMatchesProject(sessionCwd, cwd))) continue; + const assistantText = cursorApp.readCursorAppTurns(s.db_path, false) + .filter(t => t.role === 'assistant') + .map(t => t.text) + .join('\n'); + const lower = assistantText.toLowerCase(); + if (!lower.includes(queryLower)) continue; + const idx = lower.indexOf(queryLower); + const snippetStart = Math.max(0, idx - 60); + const snippetEnd = Math.min(assistantText.length, idx + queryLower.length + 60); + const match_snippet = assistantText.slice(snippetStart, snippetEnd).replace(/\n/g, ' '); + entries.push({ + session_id: s.agent_id, + agent: 'cursor', + source: 'app', + cwd: sessionCwd || null, + modified_at: cursorApp.cursorAppModifiedIso(s.db_path), + file_path: s.db_path, + match_snippet, + }); + } } - return entries; + // Newest-first, truncate. + entries.sort((a, b) => String(b.modified_at || '').localeCompare(String(a.modified_at || ''))); + return entries.slice(0, limit); } module.exports = { resolve, read, list, search }; diff --git a/scripts/adapters/cursor_app.cjs b/scripts/adapters/cursor_app.cjs new file mode 100644 index 0000000..7d59b78 --- /dev/null +++ b/scripts/adapters/cursor_app.cjs @@ -0,0 +1,313 @@ +/** + * Cursor IDE (app) adapter — reads sessions stored as SQLite databases. + * + * Mirrors `cli/src/cursor_app.rs`. See that file's module-level doc for + * the full format specification (meta + blobs tables, hex-encoded JSON in + * `meta.value`, protobuf-style root blob enumerating child message SHAs, + * claude-shaped message JSON, "Workspace Path:" header for cwd recovery). 
+ * + * On-disk layout: + * ~/.cursor/chats///store.db (SQLite) + * + * Override the base directory via `CHORUS_CURSOR_APP_DATA_DIR` or + * `BRIDGE_CURSOR_APP_DATA_DIR` (bridge fallback for backward compat). + * + * SQLite access uses Node's built-in `node:sqlite` (Node >= 22.5). On + * older Node, this module gracefully returns no sessions and the rest of + * the cursor adapter falls back to JSONL-only behavior — same end state + * as a user without Cursor IDE installed. Doctor surfaces the gap via + * the `sessions_cursor_app` check. + */ + +const fs = require('fs'); +const path = require('path'); +const { normalizePath } = require('./utils.cjs'); + +// F10: node:sqlite is experimental and Node emits an ExperimentalWarning +// the first time it's required (or sometimes asynchronously when first +// used). The default warning listener writes to stderr, which spams every +// chorus invocation that touches the cursor adapter. Replace the default +// listener with one that filters out specifically the SQLite warning +// while forwarding every other warning category through. Doing this once +// at module load (before the require) is what makes it stick across both +// synchronous and async emit paths. +const NODE_DEFAULT_WARNING_LISTENER = process.listeners('warning').slice(); +process.removeAllListeners('warning'); +process.on('warning', (warning) => { + if (warning + && warning.name === 'ExperimentalWarning' + && typeof warning.message === 'string' + && warning.message.includes('SQLite')) { + return; + } + for (const listener of NODE_DEFAULT_WARNING_LISTENER) { + try { listener(warning); } catch (_err) { /* swallow */ } + } +}); + +// Optional dependency: Node 22.5+ ships node:sqlite as experimental. +// Older Node returns null and the app surface is invisible (graceful). 
+let nodeSqlite = null; +try { + // eslint-disable-next-line global-require + nodeSqlite = require('node:sqlite'); +} catch (_err) { + nodeSqlite = null; +} + +function cursorAppBaseDir() { + return normalizePath( + process.env.CHORUS_CURSOR_APP_DATA_DIR + || process.env.BRIDGE_CURSOR_APP_DATA_DIR + || '~/.cursor/chats', + ); +} + +function isSqliteAvailable() { + return nodeSqlite !== null && typeof nodeSqlite.DatabaseSync === 'function'; +} + +function openDb(dbPath) { + if (!isSqliteAvailable()) return null; + try { + return new nodeSqlite.DatabaseSync(dbPath, { readOnly: true }); + } catch (_err) { + return null; + } +} + +function decodeHex(hex) { + if (typeof hex !== 'string' || hex.length % 2 !== 0) return null; + try { + return Buffer.from(hex, 'hex'); + } catch (_err) { + return null; + } +} + +function readMetaJson(db) { + try { + const row = db.prepare('SELECT value FROM meta LIMIT 1').get(); + if (!row || typeof row.value !== 'string') return null; + const buf = decodeHex(row.value); + if (!buf) return null; + return JSON.parse(buf.toString('utf8')); + } catch (_err) { + return null; + } +} + +function readBlob(db, id) { + try { + const row = db.prepare('SELECT data FROM blobs WHERE id = ?').get(id); + if (!row || !row.data) return null; + return Buffer.isBuffer(row.data) ? row.data : Buffer.from(row.data); + } catch (_err) { + return null; + } +} + +// Parse a protobuf-like length-delimited stream. Accepts any wire-type-2 +// field whose payload is exactly 32 bytes (SHA-256 of a child blob); +// skips other wire types / payload sizes for forward compatibility. 
+function parseRootBlobChain(buf) { + const out = []; + let i = 0; + while (i < buf.length) { + const tag = readVarint(buf, i); + if (!tag) break; + i = tag.next; + const wireType = tag.value & 0x07; + if (wireType === 2) { + const lenInfo = readVarint(buf, i); + if (!lenInfo) break; + i = lenInfo.next; + const payloadLen = Number(lenInfo.value); + if (i + payloadLen > buf.length) break; + if (payloadLen === 32) { + out.push(buf.slice(i, i + payloadLen).toString('hex')); + } + i += payloadLen; + } else if (wireType === 0) { + const v = readVarint(buf, i); + if (!v) break; + i = v.next; + } else if (wireType === 1) { + i += 8; + } else if (wireType === 5) { + i += 4; + } else { + break; + } + } + return out; +} + +function readVarint(buf, start) { + let result = 0n; + let shift = 0n; + for (let i = 0; i < 10; i += 1) { + if (start + i >= buf.length) return null; + const byte = buf[start + i]; + result |= BigInt(byte & 0x7f) << shift; + if ((byte & 0x80) === 0) { + return { value: Number(result), next: start + i + 1 }; + } + shift += 7n; + } + return null; +} + +function extractTextOnly(content) { + if (typeof content === 'string') return content; + if (Array.isArray(content)) { + const parts = []; + for (const seg of content) { + if (seg && seg.type === 'text' && typeof seg.text === 'string') { + parts.push(seg.text); + } + } + return parts.join('\n'); + } + return ''; +} + +/** + * Enumerate Cursor IDE sessions under the chats base. Returns one entry + * per discoverable store.db, newest mtime first. Returns [] when Node + * lacks node:sqlite or the base directory is absent. 
+ */ +function collectCursorAppSessions(base = cursorAppBaseDir()) { + if (!isSqliteAvailable() || !fs.existsSync(base)) return []; + const out = []; + let hashDirs; + try { + hashDirs = fs.readdirSync(base, { withFileTypes: true }); + } catch (_err) { + return out; + } + for (const hashEntry of hashDirs) { + if (!hashEntry.isDirectory()) continue; + const hashDir = path.join(base, hashEntry.name); + let uuidDirs; + try { + uuidDirs = fs.readdirSync(hashDir, { withFileTypes: true }); + } catch (_err) { + continue; + } + for (const uuidEntry of uuidDirs) { + if (!uuidEntry.isDirectory()) continue; + const uuidDir = path.join(hashDir, uuidEntry.name); + const dbPath = path.join(uuidDir, 'store.db'); + try { + if (!fs.statSync(dbPath).isFile()) continue; + } catch (_err) { + continue; + } + const db = openDb(dbPath); + if (!db) continue; + const meta = readMetaJson(db); + try { db.close(); } catch (_err) {} + if (!meta || typeof meta.agentId !== 'string') continue; + out.push({ + agent_id: meta.agentId, + db_path: dbPath, + name: meta.name || null, + mode: meta.mode || null, + created_at_ms: typeof meta.createdAt === 'number' ? meta.createdAt : null, + }); + } + } + out.sort((a, b) => mtime(b.db_path) - mtime(a.db_path)); + return out; +} + +function mtime(p) { + try { + return fs.statSync(p).mtime.getTime(); + } catch (_err) { + return 0; + } +} + +/** + * Read all conversation turns from a Cursor IDE store.db, in order. + * Returns [{role, text}, ...]. Returns [] on any failure. 
+ */ +function readCursorAppTurns(dbPath, includeToolCalls) { + const db = openDb(dbPath); + if (!db) return []; + try { + const meta = readMetaJson(db); + if (!meta || typeof meta.latestRootBlobId !== 'string') return []; + const root = readBlob(db, meta.latestRootBlobId); + if (!root) return []; + const childIds = parseRootBlobChain(root); + const turns = []; + + // Reuse the shared content extractor when --tool-calls is requested + // so cursor IDE output renders tool_use/tool_result identical to the + // cursor-agent CLI and claude paths. Required for invariant 1 (Node/Rust + // parity) and invariant 4 (boundary markers + version). + const { extractContentWithToolCalls } = require('./utils.cjs'); + + for (const id of childIds) { + const data = readBlob(db, id); + if (!data) continue; + let v; + try { + v = JSON.parse(data.toString('utf8')); + } catch (_err) { + continue; + } + const role = v && v.role; + if (role !== 'user' && role !== 'assistant') continue; + const text = includeToolCalls + ? extractContentWithToolCalls(v.content) + : extractTextOnly(v.content); + const trimmed = (text || '').trim(); + if (!trimmed) continue; + turns.push({ role, text: trimmed }); + } + return turns; + } finally { + try { db.close(); } catch (_err) {} + } +} + +/** + * Recover the workspace path embedded in the first user-role message's + * `Workspace Path: ` header. Returns null if not discoverable. 
+ */ +function cursorAppSessionWorkspace(dbPath) { + const turns = readCursorAppTurns(dbPath, false); + for (const t of turns) { + if (t.role !== 'user') continue; + for (const line of t.text.split('\n')) { + const trimmed = line.replace(/^\s+/, ''); + if (trimmed.startsWith('Workspace Path:')) { + const value = trimmed.slice('Workspace Path:'.length).trim(); + if (value) return value; + } + } + } + return null; +} + +function cursorAppModifiedIso(dbPath) { + try { + const t = fs.statSync(dbPath).mtime; + return t.toISOString(); + } catch (_err) { + return null; + } +} + +module.exports = { + cursorAppBaseDir, + isSqliteAvailable, + collectCursorAppSessions, + readCursorAppTurns, + cursorAppSessionWorkspace, + cursorAppModifiedIso, +}; diff --git a/scripts/adapters/gemini.cjs b/scripts/adapters/gemini.cjs index a4705ab..8093ced 100644 --- a/scripts/adapters/gemini.cjs +++ b/scripts/adapters/gemini.cjs @@ -211,6 +211,11 @@ function readJsonl(filePath, lastN, opts = {}) { message_count: messageCount, messages_returned: messagesReturned, included_roles: rolesIncluded, + // N6: gemini sessions carry no tool calls, but when --tool-calls is + // passed we still ack the flag so the field is uniformly present + // across agents. The "no tool calls available" warning is emitted by + // the dispatcher in scripts/read_session.cjs. + ...(opts.includeToolCalls ? { included_tool_calls: true } : {}), }; } @@ -304,6 +309,7 @@ function read(filePath, lastN, opts = {}) { message_count: messageCount, messages_returned: messagesReturned, included_roles: rolesIncluded, + ...(opts.includeToolCalls ? 
{ included_tool_calls: true } : {}), }; } diff --git a/scripts/adapters/hermes.cjs b/scripts/adapters/hermes.cjs index 2d53039..943e27d 100644 --- a/scripts/adapters/hermes.cjs +++ b/scripts/adapters/hermes.cjs @@ -143,6 +143,10 @@ function read(filePath, lastN, opts = {}) { message_count: messageCount, messages_returned: messagesReturned, included_roles: rolesIncluded, + // N6: provisional hermes adapter; format not yet confirmed to carry + // tool calls. Ack the flag for uniform output; dispatcher emits the + // "not available" warning. + ...(opts.includeToolCalls ? { included_tool_calls: true } : {}), }; } diff --git a/scripts/conformance.sh b/scripts/conformance.sh index 98b926c..32a3976 100755 --- a/scripts/conformance.sh +++ b/scripts/conformance.sh @@ -27,12 +27,16 @@ run_read_case() { CHORUS_GEMINI_TMP_DIR="$STORE/gemini/tmp" \ CHORUS_CLAUDE_PROJECTS_DIR="$STORE/claude/projects" \ CHORUS_CURSOR_DATA_DIR="$STORE/cursor/projects" \ + CHORUS_CURSOR_APP_DATA_DIR="$STORE/cursor/chats" \ + CHORUS_HERMES_DATA_DIR="$STORE/hermes/sessions" \ "${node_cmd[@]}" > "$node_out" CHORUS_CODEX_SESSIONS_DIR="$STORE/codex/sessions" \ CHORUS_GEMINI_TMP_DIR="$STORE/gemini/tmp" \ CHORUS_CLAUDE_PROJECTS_DIR="$STORE/claude/projects" \ CHORUS_CURSOR_DATA_DIR="$STORE/cursor/projects" \ + CHORUS_CURSOR_APP_DATA_DIR="$STORE/cursor/chats" \ + CHORUS_HERMES_DATA_DIR="$STORE/hermes/sessions" \ "${rust_cmd[@]}" > "$rust_out" node "$ROOT/scripts/compare_read_output.cjs" "$node_out" "$rust_out" "read-${label}" @@ -202,12 +206,16 @@ run_parity_case() { CHORUS_GEMINI_TMP_DIR="$STORE/gemini/tmp" \ CHORUS_CLAUDE_PROJECTS_DIR="$STORE/claude/projects" \ CHORUS_CURSOR_DATA_DIR="$STORE/cursor/projects" \ + CHORUS_CURSOR_APP_DATA_DIR="$STORE/cursor/chats" \ + CHORUS_HERMES_DATA_DIR="$STORE/hermes/sessions" \ node "$ROOT/scripts/read_session.cjs" "${node_args[@]}" > "$node_out" CHORUS_CODEX_SESSIONS_DIR="$STORE/codex/sessions" \ CHORUS_GEMINI_TMP_DIR="$STORE/gemini/tmp" \ 
CHORUS_CLAUDE_PROJECTS_DIR="$STORE/claude/projects" \ CHORUS_CURSOR_DATA_DIR="$STORE/cursor/projects" \ + CHORUS_CURSOR_APP_DATA_DIR="$STORE/cursor/chats" \ + CHORUS_HERMES_DATA_DIR="$STORE/hermes/sessions" \ cargo run --quiet --manifest-path "$ROOT/cli/Cargo.toml" -- "${rust_args[@]}" > "$rust_out" node "$SCRUB" "$node_out" "$node_scrubbed" "$kind" @@ -308,6 +316,77 @@ run_parity_case read read-cursor-tool-calls read-cursor-tool-calls.json \ read --agent=cursor --id=session-cursor-tool-calls --tool-calls --json :: \ read --agent cursor --id session-cursor-tool-calls --tool-calls --json +# --- read --agent cursor against the IDE SQLite fixture (N1) --- +# Exercises ~/.cursor/chats///store.db reading and +# the Workspace Path cwd-recovery path. Both surfaces (CLI JSONL + IDE +# SQLite) are wired into the cursor adapter; this case proves the SQLite +# leg works at parity. Requires Node >= 22.5 (built-in node:sqlite). +run_parity_case read read-cursor-app read-cursor-app.json \ + read --agent=cursor --id=cursor-app-fixture-uuid --json :: \ + read --agent cursor --id cursor-app-fixture-uuid --json + +# F8: redaction must apply to SQLite-sourced content the same way it +# applies to JSONL content. This fixture embeds an API-token-shaped +# string and a Bearer-token-shaped string inside cursor IDE blobs; +# both runtimes' redaction pipeline must replace them with [REDACTED]. +# See research/uat-replay-followups-2026-06-03.md F8. +run_parity_case read read-cursor-app-redaction read-cursor-app-redaction.json \ + read --agent=cursor --id=cursor-app-redaction-uuid --json :: \ + read --agent cursor --id cursor-app-redaction-uuid --json + +# F6: dedicated tool-call fixtures for claude and codex. The existing +# `claude-fixture` / `codex-fixture` underlying sessions contain no +# `tool_use` / `function_call` entries, so `read-{claude,codex}-tool-calls` +# only proves the flag is honored — not that the renderer emits +# `[TOOL: ]` blocks. 
These cases exercise the renderer for real. +# See research/uat-replay-followups-2026-06-03.md F6. +run_parity_case read read-claude-tool-fixture read-claude-tool-fixture.json \ + read --agent=claude --id=claude-tool-fixture --tool-calls --json :: \ + read --agent claude --id claude-tool-fixture --tool-calls --json + +run_parity_case read read-codex-tool-fixture read-codex-tool-fixture.json \ + read --agent=codex --id=codex-tool-fixture --tool-calls --json :: \ + read --agent codex --id codex-tool-fixture --tool-calls --json + +# N6: --tool-calls parity for cursor IDE (app) sessions. Cursor's content +# array uses the same {type:"text"} / {type:"tool_use"} / {type:"tool_result"} +# shape as Claude, so the existing extractor renders them at parity. +run_parity_case read read-cursor-app-tool-calls read-cursor-app-tool-calls.json \ + read --agent=cursor --id=cursor-app-fixture-uuid --tool-calls --json :: \ + read --agent cursor --id cursor-app-fixture-uuid --tool-calls --json + +# N6: agents whose transcript format has no tool-call concept (gemini, +# hermes) MUST emit a uniform warning when --tool-calls is requested +# rather than silently returning content unchanged. Mirrors +# `agent_has_no_tool_calls` in Rust and AGENTS_WITHOUT_TOOL_CALLS in Node. +run_parity_case read read-gemini-tool-calls "" \ + read --agent=gemini --id=gemini-fixture --tool-calls --json :: \ + read --agent gemini --id gemini-fixture --tool-calls --json + +# F7: hermes is the second adapter in the AGENTS_WITHOUT_TOOL_CALLS set +# (alongside gemini). Pre-fix there was no live hermes fixture so the +# uniform no-tool-calls warning path was logically asserted only via +# code review. The synthetic fixture under +# fixtures/session-store/hermes/sessions/ exercises it end-to-end. 
+run_parity_case read read-hermes "" \ + read --agent=hermes --id=hermes-fixture --json :: \ + read --agent hermes --id hermes-fixture --json + +run_parity_case read read-hermes-tool-calls "" \ + read --agent=hermes --id=hermes-fixture --tool-calls --json :: \ + read --agent hermes --id hermes-fixture --tool-calls --json + +# N7: --history flag scaffolds the on-demand default contract. `eager` +# emits a uniform warning; `none` zeros content (alias for +# --metadata-only). Both must be byte-identical across runtimes. +run_parity_case read read-codex-history-eager "" \ + read --agent=codex --id=codex-fixture --history=eager --json :: \ + read --agent codex --id codex-fixture --history eager --json + +run_parity_case read read-codex-history-none "" \ + read --agent=codex --id=codex-fixture --history=none --json :: \ + read --agent codex --id codex-fixture --history none --json + run_read_case codex codex-fixture Codex run_read_case gemini gemini-fixture Gemini run_read_case claude claude-fixture Claude @@ -317,6 +396,57 @@ run_report_case run_list_case codex Codex /workspace/demo run_search_case codex Codex "Codex fixture assistant output." /workspace/demo +# N2 regression: search extractor must see the same assistant text that +# read sees. Pre-fix, codex search returned empty because it walked a +# top-level role/content schema that codex sessions don't use (the real +# format is response_item.payload.message). Asserts the invariant +# read(text) ⊆ search(text-tokens) for both runtimes. 
+run_search_read_parity() { + local agent="$1" + local id="$2" + local query="$3" + + CHORUS_CODEX_SESSIONS_DIR="$STORE/codex/sessions" \ + CHORUS_GEMINI_TMP_DIR="$STORE/gemini/tmp" \ + CHORUS_CLAUDE_PROJECTS_DIR="$STORE/claude/projects" \ + CHORUS_CURSOR_DATA_DIR="$STORE/cursor/projects" \ + CHORUS_CURSOR_APP_DATA_DIR="$STORE/cursor/chats" \ + CHORUS_HERMES_DATA_DIR="$STORE/hermes/sessions" \ + node "$ROOT/scripts/read_session.cjs" search "$query" --agent="$agent" --json > "$TMP_DIR/srp-${agent}-node.json" + + CHORUS_CODEX_SESSIONS_DIR="$STORE/codex/sessions" \ + CHORUS_GEMINI_TMP_DIR="$STORE/gemini/tmp" \ + CHORUS_CLAUDE_PROJECTS_DIR="$STORE/claude/projects" \ + CHORUS_CURSOR_DATA_DIR="$STORE/cursor/projects" \ + CHORUS_CURSOR_APP_DATA_DIR="$STORE/cursor/chats" \ + CHORUS_HERMES_DATA_DIR="$STORE/hermes/sessions" \ + cargo run --quiet --manifest-path "$ROOT/cli/Cargo.toml" -- search "$query" --agent "$agent" --json > "$TMP_DIR/srp-${agent}-rust.json" + + for runtime in node rust; do + local out="$TMP_DIR/srp-${agent}-${runtime}.json" + if ! node -e " + const rows = JSON.parse(require('fs').readFileSync(process.argv[1], 'utf-8')); + const hit = rows.find(r => r.session_id && r.session_id.includes(process.argv[2])); + if (!hit) { console.error('FAIL search-read-parity ${agent} ${runtime}: search did not surface ${id} for query ${query}'); process.exit(1); } + " "$out" "$id"; then + exit 1 + fi + done + echo "PASS search-read-parity ${agent} (read(text) ⊆ search(text-tokens))" +} + +run_search_read_parity codex codex-fixture "Codex fixture assistant output" + +# F4: extend the invariant to every adapter, not just codex. Each agent's +# search extractor must surface the assistant text that read returns. +# Pre-fix only codex was covered; the same class of bug could silently +# regress claude, gemini, cursor (CLI or app) without notice. See +# research/uat-replay-followups-2026-06-03.md F4. 
+run_search_read_parity claude claude-fixture "Claude fixture assistant output" +run_search_read_parity gemini gemini-fixture "Gemini fixture assistant output" +run_search_read_parity cursor session-cursor-fixture-0001 "Cursor fixture assistant output" +run_search_read_parity cursor cursor-app-fixture-uuid "Second answer" + # --- Gemini list parity: proves .jsonl files are indexed and cwd is not null --- # # The Gemini list path had two pre-existing bugs fixed in v0.14.0: diff --git a/scripts/read_session.cjs b/scripts/read_session.cjs index 0010690..206cb33 100755 --- a/scripts/read_session.cjs +++ b/scripts/read_session.cjs @@ -26,8 +26,316 @@ function getPackageVersion() { } } +// Per-subcommand help blocks. Each function returns the lines that lead +// the help output when its topic is queried via `chorus --help`. +// Leading with subcommand-specific usage (not the global blob) makes every +// flag actually discoverable — see N4 acceptance criteria in +// research/next-scopes-post-v0.15.0-2026-06-03.md. 
+const SUBCOMMAND_HELP = { + read: (bin) => [ + `Usage: ${bin} read [options]`, + '', + 'Read assistant messages from a session (the default command).', + '', + 'Options:', + ' --agent Agent to read (default: codex)', + ' --id Session match (omit for latest in scope)', + ' --cwd Restrict to sessions for this cwd', + ' --chats-dir (gemini only) explicit chats dir', + ' --last Last N assistant messages', + ' --include-user Include user prompts that anchor responses', + ' --tool-calls Include tool call content (Read, Edit, Bash, ...)', + ' --history History scope: on-demand (default) | none | eager', + ' on-demand: latest session for cwd only', + ' none: metadata only (alias for --metadata-only)', + ' eager: reserved; behaves as on-demand + warning', + ' --format Output format: json (default with --json) | markdown', + ' --metadata-only Return session metadata without content', + ' --audit-redactions Include redaction audit trail', + ' --json Emit structured JSON', + '', + 'Examples:', + ` ${bin} read --agent codex --json`, + ` ${bin} read --agent claude --id --include-user`, + ], + list: (bin) => [ + `Usage: ${bin} list [options]`, + '', + 'List recent sessions for an agent.', + '', + 'Options:', + ' --agent Agent to list', + ' --cwd Restrict to sessions for this cwd', + ' --limit Max entries (default: 10)', + ' --json Emit structured JSON', + '', + 'Examples:', + ` ${bin} list --agent claude --limit 5 --json`, + ], + search: (bin) => [ + `Usage: ${bin} search [options]`, + '', + 'Search session content by query text.', + '', + 'Arguments:', + ' Query string (positional, required)', + '', + 'Options:', + ' --agent Agent to search (required)', + ' --cwd Restrict to sessions for this cwd', + ' --limit Max matches (default: 10)', + ' --json Emit structured JSON', + '', + 'Examples:', + ` ${bin} search "authentication" --agent gemini --json`, + ], + summary: (bin) => [ + `Usage: ${bin} summary [options]`, + '', + 'Structured session digest: message count, duration 
estimate, user', + 'requests, tool call counts, files referenced, last response snippet.', + 'No LLM calls — all extraction is local.', + '', + 'Options:', + ' --agent Agent (required)', + ' --id Session match (omit for latest in scope)', + ' --cwd Restrict to sessions for this cwd', + ' --chats-dir (gemini only) explicit chats dir', + ' --format Output format: markdown/md', + ' --json Emit structured JSON', + ], + timeline: (bin) => [ + `Usage: ${bin} timeline [options]`, + '', + 'Cross-agent chronological view, interleaving sessions by timestamp.', + '', + 'Options:', + ' --agent Agent filter (repeatable; default: all)', + ' --cwd Working directory (default: current)', + ' --limit Sessions per agent (default: 5)', + ' --format Output format: markdown/md', + ' --json Emit structured JSON', + ], + compare: (bin) => [ + `Usage: ${bin} compare --source [--source ...] [options]`, + '', + 'Compare outputs across agents and emit an analysis report.', + '', + 'Options:', + ' --source Source spec (repeatable, required)', + ' --cwd Working directory', + ' --normalize Normalize content before comparison', + ' --last Messages per source (default: 10)', + ' --json Emit structured JSON', + '', + 'Examples:', + ` ${bin} compare --source codex --source claude --json`, + ], + report: (bin) => [ + `Usage: ${bin} report --handoff [options]`, + '', + 'Generate a coordinator report from a handoff JSON file. 
The file must', + 'conform to the schema below — unknown fields are rejected.', + '', + 'Options:', + ' --handoff Path to handoff JSON (required)', + ' --cwd Working directory fallback', + ' --json Emit structured JSON', + '', + 'Handoff JSON schema:', + ' {', + ' "mode": "analyze", // required', + ' "task": "", // required', + ' "success_criteria": ["", ...], // required, non-empty', + ' "sources": [ // required, non-empty', + ' { // each source:', + ' "agent": "claude", // required', + ' "session_id": "", // OR set current_session:true', + ' "current_session": true, // use latest for cwd', + ' "cwd": "", // optional override', + ' "last_n": 10 // optional N msgs/source', + ' }', + ' ],', + ' "constraints": ["", ...] // optional', + ' }', + '', + 'Allowed top-level fields are exactly: mode, task, success_criteria,', + 'sources, constraints. Any other field causes INVALID_HANDOFF.', + '', + 'Minimal copy-pasteable example (write to handoff.json):', + ' {', + ' "mode": "analyze",', + ' "task": "Compare claude and codex outputs",', + ' "success_criteria": ["Identify agreements and contradictions"],', + ' "sources": [', + ' {"agent": "claude", "current_session": true},', + ' {"agent": "codex", "current_session": true}', + ' ]', + ' }', + '', + 'Examples:', + ` ${bin} report --handoff ./handoff.json --json`, + ], + setup: (bin) => [ + `Usage: ${bin} setup [options]`, + '', + 'Install cross-provider instruction scaffolding in this project.', + '', + 'Options:', + ' --cwd Target directory (default: current)', + ' --dry-run Print planned changes, no writes', + ' --force Replace existing managed blocks', + ' --context-pack Also build agent-context + install hooks', + ' --json Emit structured JSON', + '', + 'setup creates or updates:', + ' CLAUDE.md / AGENTS.md / GEMINI.md chorus managed blocks for agent wiring', + ' .agent-chorus/ provider snippets and intent contract', + ' .gitignore adds .agent-chorus/ to prevent tracking', + ' claude plugin auto-installs Claude Code 
plugin if claude CLI is present', + '', + 'Run teardown to reverse all per-project operations.', + 'The Claude Code plugin is global — uninstall separately if desired:', + ' claude plugin uninstall agent-chorus', + ], + teardown: (bin) => [ + `Usage: ${bin} teardown [options]`, + '', + 'Reverse setup: remove managed blocks, scaffolding, and hooks.', + '', + 'Options:', + ' --cwd Target directory (default: current)', + ' --dry-run Print planned changes, no writes', + ' --global Also remove ~/.cache/agent-chorus/', + ' --json Emit structured JSON', + '', + 'Removes managed blocks from CLAUDE.md/AGENTS.md/GEMINI.md,', + 'deletes .agent-chorus/ scaffolding, removes pre-push hook sentinel,', + 'and removes .agent-chorus/ from .gitignore.', + 'Context pack (.agent-context/) is preserved — remove manually if desired.', + '', + 'Note: the Claude Code plugin is NOT removed by teardown (it is global).', + 'To uninstall the plugin: claude plugin uninstall agent-chorus', + ], + doctor: (bin) => [ + `Usage: ${bin} doctor [options]`, + '', + 'Diagnostic checks across the agent-chorus install.', + '', + 'Options:', + ' --cwd Working directory (default: current)', + ' --json Emit structured JSON', + '', + 'Severity levels:', + ' pass Check succeeded.', + ' info Informational; not a problem (e.g. 
optional feature not configured).', + ' warn Actionable; the install can run but something should be fixed.', + ' fail Broken/unrecoverable.', + '', + 'Overall is elevated only by warn or fail; info does not elevate it.', + '', + 'Checks: version, session directories, setup completeness, provider', + 'instruction wiring, session availability, context pack state,', + 'Claude Code plugin installation, update status, hooks path + pre-push.', + ], + 'agent-context': (bin) => [ + `Usage: ${bin} agent-context [options]`, + '', + 'Build, sync, and install agent-context automation.', + '', + 'Subcommands:', + ' build [--reason ] [--base ] [--head ] [--force-snapshot]', + ' init [--pack-dir ] [--cwd ] [--force]', + ' seal [--reason ] [--base ] [--head ] [--pack-dir ] [--cwd ] [--force] [--force-snapshot]', + ' sync-main --local-ref --local-sha --remote-ref --remote-sha ', + ' install-hooks', + ' rollback [--snapshot ]', + ' check-freshness [--base ]', + ], + send: (bin) => [ + `Usage: ${bin} send --from --to --message [options]`, + '', + 'Send a message from one agent to another. The message lands in the', + 'recipient agent\'s inbox under .agent-chorus/messages/.', + '', + 'Options:', + ' --from Sending agent (required)', + ' --to Target agent (required)', + ' --message Message content (required)', + ' --cwd Working directory', + ' --json Emit structured JSON', + ], + messages: (bin) => [ + `Usage: ${bin} messages --agent [options]`, + '', + 'Read messages from an agent\'s inbox.', + '', + 'Options:', + ' --agent Agent whose messages to read', + ' --clear Drain inbox after reading', + ' (use at session standup)', + ' --cwd Working directory', + ' --json Emit structured JSON', + '', + 'Examples:', + ` ${bin} messages --agent claude --json`, + ` ${bin} messages --agent claude --clear # drain inbox after read`, + ], + checkpoint: (bin) => [ + `Usage: ${bin} checkpoint --from [options]`, + '', + 'Broadcast git state to every other agent\'s inbox. 
No-ops silently', + 'when .agent-chorus/ does not exist (safe to install globally).', + '', + 'Options:', + ' --from Sending agent (claude|codex|gemini|cursor)', + ' --message Override auto-composed message', + ' --cwd Working directory', + ' --json Emit structured JSON', + ], + diff: (bin) => [ + `Usage: ${bin} diff --agent --from --to [options]`, + '', + 'Compare two sessions from the same agent.', + '', + 'Options:', + ' --agent Agent (required)', + ' --from First session (substring match)', + ' --to Second session (substring match)', + ' --last Messages per session (default: 1)', + ' --cwd Working directory', + ' --json Emit structured JSON', + ], + relevance: (bin) => [ + `Usage: ${bin} relevance [--list | --test | --suggest] [options]`, + '', + 'Inspect relevance patterns for agent-context filtering.', + '', + 'Options:', + ' --list List current include/exclude patterns', + ' --test Test whether a file path is relevant', + ' --suggest Suggest patterns from project conventions', + ' --cwd Working directory (default: current)', + ' --json Emit structured JSON', + ], +}; + function printHelp(topic = null) { const binName = path.basename(process.argv[1] || 'chorus'); + + // Per-subcommand help: lead with that subcommand's usage. The global + // command list is deliberately omitted to keep relevant flags above + // the fold — N4 acceptance. + if (topic && (SUBCOMMAND_HELP[topic] || topic === 'context-pack')) { + const builder = SUBCOMMAND_HELP[topic] || SUBCOMMAND_HELP['agent-context']; + const lines = builder(binName); + lines.push(''); + lines.push(`Run \`${binName} --help\` for the full command list.`); + console.log(lines.join('\n')); + return; + } + + // Top-level help: full command listing. 
const lines = [ `Agent Chorus CLI v${getPackageVersion()}`, '', @@ -53,7 +361,7 @@ function printHelp(topic = null) { ' relevance Inspect relevance patterns for agent-context filtering', '', 'Global Flags:', - ' -h, --help Show help', + ' -h, --help Show help (use ` --help` for per-command details)', ' -v, --version Show version', '', 'Examples:', @@ -68,165 +376,6 @@ function printHelp(topic = null) { ` ${binName} agent-context build`, ]; - if (topic === 'read') { - lines.push(''); - lines.push('read options:'); - lines.push(' --agent (default: codex)'); - lines.push(' --id (optional; omitted = latest session in scope)'); - lines.push(' --cwd '); - lines.push(' --chats-dir (gemini)'); - lines.push(' --last '); - lines.push(' --include-user Include the latest user prompt(s) that anchor returned assistant messages'); - lines.push(' --tool-calls Include tool call content (Read, Edit, Bash, etc.) in output'); - lines.push(' --format Output format: json (default with --json), markdown/md'); - lines.push(' --metadata-only Return session metadata without content'); - lines.push(' --audit-redactions Include redaction audit trail in output'); - lines.push(' --json'); - } else if (topic === 'list') { - lines.push(''); - lines.push('list options:'); - lines.push(' --agent '); - lines.push(' --cwd '); - lines.push(' --limit (default: 10)'); - lines.push(' --json'); - } else if (topic === 'search') { - lines.push(''); - lines.push('search options:'); - lines.push(' (positional, required)'); - lines.push(' --agent (required)'); - lines.push(' --cwd '); - lines.push(' --limit (default: 10)'); - lines.push(' --json'); - } else if (topic === 'summary') { - lines.push(''); - lines.push('summary options:'); - lines.push(' --agent (required)'); - lines.push(' --id (optional; omitted = latest session in scope)'); - lines.push(' --cwd '); - lines.push(' --chats-dir (gemini)'); - lines.push(' --format Output format: markdown/md'); - lines.push(' --json'); - lines.push(''); - 
lines.push(' Produces a structured digest: message count, duration estimate,'); - lines.push(' user requests, tool call counts, files referenced, last response snippet.'); - lines.push(' No LLM calls — all extraction is local.'); - } else if (topic === 'timeline') { - lines.push(''); - lines.push('timeline options:'); - lines.push(' --agent (repeatable; default: all four agents)'); - lines.push(' --cwd (default: current directory)'); - lines.push(' --limit Sessions per agent (default: 5)'); - lines.push(' --format Output format: markdown/md'); - lines.push(' --json'); - lines.push(''); - lines.push(' Cross-agent chronological view interleaving sessions by timestamp.'); - } else if (topic === 'compare') { - lines.push(''); - lines.push('compare options:'); - lines.push(' --source (repeatable, required)'); - lines.push(' --cwd '); - lines.push(' --normalize'); - lines.push(' --last Messages per source (default: 10)'); - lines.push(' --json'); - } else if (topic === 'report') { - lines.push(''); - lines.push('report options:'); - lines.push(' --handoff (required)'); - lines.push(' --cwd '); - lines.push(' --json'); - } else if (topic === 'setup') { - lines.push(''); - lines.push('setup options:'); - lines.push(' --cwd (default: current directory)'); - lines.push(' --dry-run'); - lines.push(' --force (replace existing managed blocks)'); - lines.push(' --context-pack (also build agent-context and install hooks)'); - lines.push(' --json'); - lines.push(''); - lines.push('setup creates or updates:'); - lines.push(' CLAUDE.md / AGENTS.md / GEMINI.md chorus managed blocks for agent wiring'); - lines.push(' .agent-chorus/ provider snippets and intent contract'); - lines.push(' .gitignore adds .agent-chorus/ to prevent tracking'); - lines.push(' claude plugin auto-installs Claude Code plugin if claude CLI is present'); - lines.push(''); - lines.push('Run teardown to reverse all per-project operations.'); - lines.push('The Claude Code plugin is global — uninstall separately if 
desired:'); - lines.push(' claude plugin uninstall agent-chorus'); - } else if (topic === 'teardown') { - lines.push(''); - lines.push('teardown options:'); - lines.push(' --cwd (default: current directory)'); - lines.push(' --dry-run'); - lines.push(' --global (also remove ~/.cache/agent-chorus/ update-check cache)'); - lines.push(' --json'); - lines.push(''); - lines.push('Removes managed blocks from CLAUDE.md/AGENTS.md/GEMINI.md,'); - lines.push('deletes .agent-chorus/ scaffolding, removes pre-push hook sentinel,'); - lines.push('and removes .agent-chorus/ from .gitignore.'); - lines.push('Context pack (.agent-context/) is preserved — remove manually if desired.'); - lines.push(''); - lines.push('Note: the Claude Code plugin is NOT removed by teardown (it is global).'); - lines.push('To uninstall the plugin: claude plugin uninstall agent-chorus'); - } else if (topic === 'doctor') { - lines.push(''); - lines.push('doctor options:'); - lines.push(' --cwd (default: current directory)'); - lines.push(' --json'); - lines.push(''); - lines.push('Checks: version, session directories, setup completeness, provider'); - lines.push('instruction wiring, session availability, context pack state,'); - lines.push('Claude Code plugin installation, and update status.'); - } else if (topic === 'agent-context' || topic === 'context-pack') { - lines.push(''); - lines.push('agent-context usage:'); - lines.push(' agent-context build [--reason ] [--base ] [--head ] [--force-snapshot]'); - lines.push(' agent-context init [--pack-dir ] [--cwd ] [--force]'); - lines.push(' agent-context seal [--reason ] [--base ] [--head ] [--pack-dir ] [--cwd ] [--force] [--force-snapshot]'); - lines.push(' agent-context sync-main --local-ref --local-sha --remote-ref --remote-sha '); - lines.push(' agent-context install-hooks'); - lines.push(' agent-context rollback [--snapshot ]'); - lines.push(' agent-context check-freshness [--base ]'); - } else if (topic === 'send') { - lines.push(''); - 
lines.push('send options:'); - lines.push(' --from Sending agent'); - lines.push(' --to Target agent'); - lines.push(' --message Message content'); - lines.push(' --cwd Working directory'); - lines.push(' --json Emit structured JSON'); - } else if (topic === 'messages') { - lines.push(''); - lines.push('messages options:'); - lines.push(' --agent Agent whose messages to read'); - lines.push(' --clear Clear messages after reading'); - lines.push(' --cwd Working directory'); - lines.push(' --json Emit structured JSON'); - } else if (topic === 'checkpoint') { - lines.push(''); - lines.push('checkpoint options:'); - lines.push(' --from Sending agent (claude|codex|gemini|cursor)'); - lines.push(' --message Override the auto-composed state message'); - lines.push(' --cwd Working directory'); - lines.push(' --json Emit structured JSON'); - } else if (topic === 'diff') { - lines.push(''); - lines.push('diff options:'); - lines.push(' --agent '); - lines.push(' --from First session ID (substring match)'); - lines.push(' --to Second session ID (substring match)'); - lines.push(' --last Messages per session (default: 1)'); - lines.push(' --cwd Working directory'); - lines.push(' --json Emit structured JSON'); - } else if (topic === 'relevance') { - lines.push(''); - lines.push('relevance options:'); - lines.push(' --list List current include/exclude patterns'); - lines.push(' --test Test whether a file path is relevant'); - lines.push(' --suggest Suggest patterns based on project conventions'); - lines.push(' --cwd Working directory (default: current directory)'); - lines.push(' --json Emit structured JSON'); - } - console.log(lines.join('\n')); } @@ -424,6 +573,12 @@ function makeManagedBlock(provider, snippetRelPath) { 'When a user asks for another agent status (for example "What is Claude doing?"),', 'run Agent Chorus commands first and answer with evidence from session output.', '', + '**History contract (READ FIRST — violating this costs 2.5x tokens):**', + '- `chorus 
read` defaults to `--history=on-demand` — latest session for the cwd ONLY.', + '- Do NOT loop through prior sessions at session start. The field study measured a 2.5x token inflation when agents eagerly read history.', + '- When you need historical context, call `chorus list / timeline / search` EXPLICITLY. That is the on-demand recall mechanism.', + '- `--history=eager` is reserved for a future multi-session merge and currently emits a warning; do not depend on it.', + '', 'Session routing and defaults:', '1. For status checks like "What is Claude doing?", start with `chorus read --agent --cwd --include-user --json` (omit `--id` for latest).', '2. For plain handoff/output checks, use `chorus read --agent --cwd --json`.', @@ -436,6 +591,13 @@ function makeManagedBlock(provider, snippetRelPath) { '- `chorus list --agent --cwd --json`', '- `chorus search "" --agent --cwd --json`', '- `chorus compare --source codex --source gemini --source claude --cwd --json`', + '- `chorus diff --agent --from --to --cwd --json`', + '- `chorus read --agent --cwd --audit-redactions --json`', + '- `chorus relevance --list --cwd --json`', + '- `chorus send --from --to --message "" --cwd `', + '- `chorus messages --agent --cwd --json`', + '', + '(History contract is at the top of this block — see above.)', '', 'If command syntax is unclear, run `chorus --help`.', ``, @@ -1734,6 +1896,12 @@ function renderTimelineAsMarkdown(result) { console.log(lines.join('\n')); } +// Agents whose on-disk transcript format has no tool-call concept. +// `--tool-calls` is honored (included_tool_calls is still emitted) but +// a uniform warning surfaces that the data is structurally unavailable. +// Mirrors `agent_has_no_tool_calls` in cli/src/main.rs. 
+const AGENTS_WITHOUT_TOOL_CALLS = new Set(['gemini', 'hermes']); + function runRead(inputArgs) { const agent = getOptionValue(inputArgs, '--agent', 'codex'); const id = getOptionValue(inputArgs, '--id', null); @@ -1747,6 +1915,20 @@ function runRead(inputArgs) { const includeToolCalls = hasFlag(inputArgs, '--tool-calls'); const format = getOptionValue(inputArgs, '--format', null); + // N7: --history=on-demand|none|eager. Default `on-demand` returns only + // the latest session for the cwd — chorus does NOT auto-pull prior + // sessions; consumers call `chorus list/timeline/search` explicitly + // when they need historical context. `none` is an alias for + // --metadata-only; `eager` is reserved for a future multi-session + // merge and currently emits a warning so consumers don't silently + // rely on it. + const history = getOptionValue(inputArgs, '--history', 'on-demand'); + const validHistory = new Set(['on-demand', 'none', 'eager']); + if (!validHistory.has(history)) { + throw new Error(`Invalid --history value: ${history}. Allowed: on-demand | none | eager.`); + } + const historyMetadataOnly = history === 'none' || metadataOnly; + const result = readSessionViaAdapter(agent, { id, cwd, @@ -1756,10 +1938,44 @@ function runRead(inputArgs) { includeToolCalls, }); + // N6: agents whose transcript format has no tool-call concept emit a + // uniform warning when --tool-calls is requested, so a silent no-op + // never looks like "this agent had no tool calls". included_tool_calls + // is still true (the flag was honored); the warning surfaces that the + // data is structurally unavailable. Mirrors Rust dispatch. 
+ if (includeToolCalls && AGENTS_WITHOUT_TOOL_CALLS.has(agent)) { + if (!Array.isArray(result.warnings)) result.warnings = []; + result.warnings.push(`--tool-calls has no effect for ${agent} sessions: this agent's transcript format does not carry tool calls.`); + } + + // F1: surface cwd-mismatch fallback as a structured boolean on the + // output AND escalate the warning to stderr. Adapters push a + // "falling back to latest session" warning string when --cwd was + // given but no session matched; JSON-only consumers can use + // cwd_mismatch=true to detect this without scanning warning strings, + // and stderr-watching humans see the message immediately. + if (Array.isArray(result.warnings) + && result.warnings.some((w) => typeof w === 'string' && w.includes('falling back to latest session'))) { + result.cwd_mismatch = true; + for (const w of result.warnings) { + if (typeof w === 'string' && w.includes('falling back to latest session')) { + process.stderr.write(`chorus: ${w}\n`); + } + } + } + + // N7: --history=eager is reserved; emit a warning rather than silently + // honoring the flag with on-demand behavior. Mirrors Rust dispatch. + if (history === 'eager') { + if (!Array.isArray(result.warnings)) result.warnings = []; + result.warnings.push('--history=eager is reserved for a future multi-session merge and currently behaves identically to --history=on-demand. 
Use `chorus list / timeline / search` to pull additional sessions explicitly.'); + } + + if (format === 'markdown' || format === 'md') { renderReadAsMarkdown(result); } else { - renderReadResult(result, asJson, metadataOnly, auditRedactions); + renderReadResult(result, asJson, historyMetadataOnly, auditRedactions); } } @@ -1885,6 +2101,12 @@ function runSetup(inputArgs) { '- `chorus search "" --agent --cwd --json`', '- `chorus compare --source codex --source gemini --source claude --cwd --json`', '', + 'History contract (do NOT eagerly read multiple prior sessions):', + '- `chorus read` defaults to `--history=on-demand` — latest session for the cwd ONLY.', + "- Do NOT loop through prior sessions at session start. The field study (research/context-pack-field-findings-2026-03-20.md, Finding 3) measured a 2.5x token inflation when agents eagerly read history. Honor on-demand by default.", + '- When you genuinely need historical context, call `chorus list / timeline / search` explicitly. That\'s the on-demand recall mechanism.', + '- `--history=eager` is reserved for a future multi-session merge and currently emits a warning; do not depend on it.', + '', 'Use evidence from command output and explicitly report missing session data.', ].join('\n'); @@ -2198,21 +2420,33 @@ function runDoctor(inputArgs) { addCheck(id, fs.existsSync(dirPath) ? 'pass' : 'warn', fs.existsSync(dirPath) ? `Found: ${dirPath}` : `Missing: ${dirPath}`); } + // Setup scaffolding. The integration/snippet/intents checks emit `info` + // rather than `warn` when the repo has not been initialized via + // `chorus setup` — un-setup is intentional state, not broken state. + // + // Initialization is detected by the presence of either INTENTS.md or + // the providers/ directory under .agent-chorus/. The bare .agent-chorus/ + // directory alone is *not* a setup signal: the messaging subsystem + // creates .agent-chorus/messages/ on first `send`, independent of any + // setup step. Mirrors Rust doctor. 
const setupRoot = path.join(cwd, '.agent-chorus'); + const setupInitialized = fs.existsSync(path.join(setupRoot, 'INTENTS.md')) + || fs.existsSync(path.join(setupRoot, 'providers')); + const absentStatus = setupInitialized ? 'warn' : 'info'; const intentsPath = path.join(setupRoot, 'INTENTS.md'); - addCheck('setup_intents', fs.existsSync(intentsPath) ? 'pass' : 'warn', fs.existsSync(intentsPath) ? `Found: ${intentsPath}` : `Missing: ${intentsPath}`); + addCheck('setup_intents', fs.existsSync(intentsPath) ? 'pass' : absentStatus, fs.existsSync(intentsPath) ? `Found: ${intentsPath}` : `Missing: ${intentsPath}`); for (const provider of setupProviders) { const snippetPath = path.join(setupRoot, 'providers', `${provider.agent}.md`); addCheck( `snippet_${provider.agent}`, - fs.existsSync(snippetPath) ? 'pass' : 'warn', + fs.existsSync(snippetPath) ? 'pass' : absentStatus, fs.existsSync(snippetPath) ? `Found: ${snippetPath}` : `Missing: ${snippetPath}` ); const targetPath = path.join(cwd, provider.targetFile); if (!fs.existsSync(targetPath)) { - addCheck(`integration_${provider.agent}`, 'warn', `Missing provider instruction file: ${targetPath}`); + addCheck(`integration_${provider.agent}`, absentStatus, `Missing provider instruction file: ${targetPath}`); continue; } @@ -2220,12 +2454,15 @@ function runDoctor(inputArgs) { const marker = `agent-chorus:${provider.agent}:start`; addCheck( `integration_${provider.agent}`, - content.includes(marker) ? 'pass' : 'warn', + content.includes(marker) ? 'pass' : absentStatus, content.includes(marker) ? `Managed block present in ${targetPath}` : `Managed block missing in ${targetPath}` ); } - for (const agent of ['codex', 'gemini', 'claude', 'cursor', 'hermes']) { + // Single-surface agents. Hermes is handled below with surface + // detection (F12 parity with cursor) so it can report `info` when + // its data directory is absent. 
+ for (const agent of ['codex', 'gemini', 'claude']) { try { const entries = listSessions(agent, cwd, 1); if (entries.length > 0) { @@ -2238,6 +2475,151 @@ function runDoctor(inputArgs) { } } + // Cursor has two on-disk surfaces; report each independently. F12: a + // surface whose data directory does not exist at all is intentional + // un-installed state, not broken state — report `info` rather than + // `warn`. `warn` is reserved for "directory exists but zero sessions + // discoverable" (an installed tool that's not producing data, which + // is worth flagging). Mirrors cli/src/doctor.rs::cursor_surface_checks. + try { + const cursorAdapter = getAdapter('cursor'); + const cursorEntries = cursorAdapter.list(null, 50); + const cliCount = cursorEntries.filter((e) => e && e.source === 'cli').length; + const appCount = cursorEntries.filter((e) => e && e.source === 'app').length; + const cliBase = normalizePath(process.env.CHORUS_CURSOR_DATA_DIR || '~/.cursor/projects'); + const cliBaseExists = fs.existsSync(cliBase); + if (!cliBaseExists) { + addCheck( + 'sessions_cursor_cli', + 'info', + `cursor-agent CLI not configured (data directory absent: ${cliBase})`, + ); + } else if (cliCount > 0) { + addCheck('sessions_cursor_cli', 'pass', 'At least one cursor-agent CLI transcript discovered'); + } else { + addCheck('sessions_cursor_cli', 'warn', `No cursor-agent CLI transcripts discovered at ${cliBase}`); + } + const cursorApp = require('./adapters/cursor_app.cjs'); + const appBase = cursorApp.cursorAppBaseDir(); + const appBaseExists = fs.existsSync(appBase); + if (!appBaseExists) { + addCheck( + 'sessions_cursor_app', + 'info', + `Cursor IDE not configured (data directory absent: ${appBase})`, + ); + } else if (appCount > 0) { + addCheck('sessions_cursor_app', 'pass', 'At least one Cursor IDE store.db discovered'); + } else { + addCheck( + 'sessions_cursor_app', + 'warn', + cursorApp.isSqliteAvailable() + ? 
`No Cursor IDE store.db sessions discovered at ${appBase}` + : 'Cursor IDE SQLite reader unavailable (requires Node >= 22.5 with node:sqlite)', + ); + } + } catch (error) { + addCheck('sessions_cursor_cli', 'fail', error.message || String(error)); + addCheck('sessions_cursor_app', 'fail', error.message || String(error)); + } + + // Hermes is provisional (F12 parity with cursor surfaces). Report + // `info` when the data directory is absent. + try { + const hermesBase = normalizePath(process.env.CHORUS_HERMES_DATA_DIR || process.env.BRIDGE_HERMES_DATA_DIR || '~/.hermes/sessions'); + if (!fs.existsSync(hermesBase)) { + addCheck( + 'sessions_hermes', + 'info', + `Hermes not configured (data directory absent: ${hermesBase})`, + ); + } else { + // Presence-only check (matches Rust + cursor surface checks): is + // the surface reachable from this host, not "does it have sessions + // matching this specific cwd". + const entries = listSessions('hermes', null, 1); + if (entries.length > 0) { + addCheck('sessions_hermes', 'pass', 'At least one hermes session discovered'); + } else { + addCheck('sessions_hermes', 'warn', `No hermes sessions discovered at ${hermesBase}`); + } + } + } catch (error) { + addCheck('sessions_hermes', 'fail', error.message || String(error)); + } + + // R2-fix: stale-snippet sentinel. A provider snippet or managed block + // that exists but predates the v0.16.0 history contract silently + // leaves consumer agents without the on-demand rule. `chorus setup + // --force` refreshes them; doctor surfaces the gap. Mirrors Rust + // stale_snippet_checks. 
+ const stalenessProbe = 'History contract'; + const providersDir = path.join(setupRoot, 'providers'); + for (const agent of ['codex', 'claude', 'gemini']) { + const snippetPath = path.join(providersDir, `${agent}.md`); + if (!fs.existsSync(snippetPath)) continue; + let content = ''; + try { content = fs.readFileSync(snippetPath, 'utf-8'); } catch (_) { continue; } + if (!content.includes(stalenessProbe)) { + addCheck( + `snippet_${agent}_stale`, + 'warn', + `${snippetPath} predates the v0.16.0 history contract. Run \`chorus setup --force\` to refresh.`, + ); + } + } + const integrationFiles = [ + ['codex', path.join(cwd, 'AGENTS.md')], + ['claude', path.join(cwd, 'CLAUDE.md')], + ['gemini', path.join(cwd, 'GEMINI.md')], + ]; + for (const [agent, intgPath] of integrationFiles) { + if (!fs.existsSync(intgPath)) continue; + let content = ''; + try { content = fs.readFileSync(intgPath, 'utf-8'); } catch (_) { continue; } + const marker = `agent-chorus:${agent}:start`; + if (!content.includes(marker)) continue; + if (!content.includes(stalenessProbe)) { + addCheck( + `integration_${agent}_stale`, + 'warn', + `Managed block in ${intgPath} predates the v0.16.0 history contract. Run \`chorus setup --force\` to refresh.`, + ); + } + } + + // F2: env-var overrides pointing at non-existent directories produce + // silent partial coverage that looks identical to a working install. + // Flag these as `warn` so users know their env is misconfigured. + // Mirrors cli/src/doctor.rs::env_override_checks. 
+ const envOverrides = [ + ['CHORUS_CODEX_SESSIONS_DIR', 'codex'], + ['BRIDGE_CODEX_SESSIONS_DIR', 'codex (legacy)'], + ['CHORUS_CLAUDE_PROJECTS_DIR', 'claude'], + ['BRIDGE_CLAUDE_PROJECTS_DIR', 'claude (legacy)'], + ['CHORUS_GEMINI_TMP_DIR', 'gemini'], + ['BRIDGE_GEMINI_TMP_DIR', 'gemini (legacy)'], + ['CHORUS_CURSOR_DATA_DIR', 'cursor-agent CLI'], + ['BRIDGE_CURSOR_DATA_DIR', 'cursor-agent CLI (legacy)'], + ['CHORUS_CURSOR_APP_DATA_DIR', 'Cursor IDE'], + ['BRIDGE_CURSOR_APP_DATA_DIR', 'Cursor IDE (legacy)'], + ['CHORUS_HERMES_DATA_DIR', 'hermes'], + ['BRIDGE_HERMES_DATA_DIR', 'hermes (legacy)'], + ]; + for (const [varName, label] of envOverrides) { + const value = process.env[varName]; + if (!value) continue; + const expanded = normalizePath(value); + if (!fs.existsSync(expanded)) { + addCheck( + 'env_override_dangling', + 'warn', + `${varName} (${label}) points at non-existent directory: ${expanded}. Sessions from this adapter will be invisible until the env var is cleared or the directory exists.`, + ); + } + } + const packDir = path.join(cwd, '.agent-context', 'current'); const packManifestPath = path.join(packDir, 'manifest.json'); let packState = 'UNINITIALIZED'; @@ -2309,38 +2691,63 @@ function runDoctor(inputArgs) { addCheck('claude_plugin', 'warn', 'claude CLI not found — Claude Code plugin status unknown'); } - let hooksPath = null; + // Git hooks path + pre-push. F3: doctor reports the LOCAL health of + // this install in this cwd. If the cwd is not a git repo, neither + // check is meaningful — `git config core.hooksPath` would resolve to + // a global value and we'd truthfully report a hook as installed even + // though the cwd has no `.git/`. Gate both checks on the cwd being a + // git repo. Mirrors Rust doctor. 
+ let cwdIsGitRepo = false; try { - hooksPath = execFileSync('git', ['config', '--get', 'core.hooksPath'], { + execFileSync('git', ['rev-parse', '--git-dir'], { cwd, - encoding: 'utf8', - stdio: ['ignore', 'pipe', 'pipe'], - }).trim() || null; + stdio: ['ignore', 'ignore', 'ignore'], + }); + cwdIsGitRepo = true; } catch (_error) { - hooksPath = null; + cwdIsGitRepo = false; } - if (hooksPath) { + if (cwdIsGitRepo) { + let configuredHooksPath = null; + try { + configuredHooksPath = execFileSync('git', ['config', '--get', 'core.hooksPath'], { + cwd, + encoding: 'utf8', + stdio: ['ignore', 'pipe', 'pipe'], + }).trim() || null; + } catch (_error) { + configuredHooksPath = null; + } + const effectiveHooksPath = configuredHooksPath || '.git/hooks'; + const hooksPathSource = configuredHooksPath ? 'configured' : 'default'; addCheck( 'context_pack_hooks_path', - hooksPath === '.githooks' ? 'pass' : 'warn', - hooksPath === '.githooks' - ? 'Git hooks path set to .githooks' - : `Git hooks path is ${hooksPath} (expected .githooks for context-pack pre-push automation)` + 'info', + `Effective git hooks path: ${effectiveHooksPath} (${hooksPathSource})` ); - const prePushPath = path.isAbsolute(hooksPath) - ? path.join(hooksPath, 'pre-push') - : path.join(cwd, hooksPath, 'pre-push'); + const prePushPath = path.isAbsolute(effectiveHooksPath) + ? path.join(effectiveHooksPath, 'pre-push') + : path.join(cwd, effectiveHooksPath, 'pre-push'); const prePushExists = fs.existsSync(prePushPath); addCheck( 'context_pack_pre_push', prePushExists ? 'pass' : 'warn', prePushExists ? 
`Found: ${prePushPath}` - : `Missing: ${prePushPath} (run: chorus context-pack install-hooks)` + : `Missing: ${prePushPath} (run: chorus agent-context install-hooks)` ); } else { - addCheck('context_pack_hooks_path', 'warn', 'Git hooks path not configured'); + addCheck( + 'context_pack_hooks_path', + 'info', + 'cwd is not a git repository; git hooks checks skipped' + ); + addCheck( + 'context_pack_pre_push', + 'info', + 'cwd is not a git repository; pre-push hook check skipped' + ); } const hasFail = checks.some(c => c.status === 'fail'); @@ -3238,7 +3645,55 @@ function runTimeline(inputArgs) { } } +// F11: reject unknown flags at dispatch. The hand-rolled parser +// previously silently ignored typos like `--Json` or `--limt 3`, which +// then quietly behaved as if the flag were absent. The Rust CLI (clap) +// already fails closed on unknown flags; this brings Node to parity. +// +// agent-context has its own nested subcommand parser, so its entry +// below is a superset of the flags its subcommands take. trash-talk +// takes a freeform agent list; only its `--` arguments are checked. 
+const ALLOWED_FLAGS = { + read: ['--agent', '--id', '--cwd', '--chats-dir', '--last', '--include-user', '--tool-calls', '--history', '--format', '--metadata-only', '--audit-redactions', '--json'], + list: ['--agent', '--cwd', '--limit', '--json'], + search: ['--agent', '--cwd', '--limit', '--json'], + compare: ['--source', '--cwd', '--normalize', '--last', '--json'], + diff: ['--agent', '--from', '--to', '--last', '--cwd', '--json'], + report: ['--handoff', '--cwd', '--json'], + send: ['--from', '--to', '--message', '--cwd', '--json'], + messages: ['--agent', '--clear', '--cwd', '--json'], + checkpoint: ['--from', '--message', '--cwd', '--json'], + setup: ['--cwd', '--dry-run', '--force', '--context-pack', '--json'], + teardown: ['--cwd', '--dry-run', '--global', '--json'], + doctor: ['--cwd', '--json'], + summary: ['--agent', '--id', '--cwd', '--chats-dir', '--format', '--json'], + timeline: ['--agent', '--cwd', '--limit', '--format', '--json'], + relevance: ['--list', '--test', '--suggest', '--cwd', '--json'], + // trash-talk is an easter egg with no flags of its own beyond --cwd + // and --json. Validating here ensures users don't accidentally pass + // a real command's flag and get silent garbage. + 'trash-talk': ['--cwd', '--json'], + // agent-context dispatches into its own subparser. Besides --cwd + // (passed through) and --json, this list is a superset of the nested + // subcommands' flags (build/seal/etc.); they validate their own sets. 
+ 'agent-context': ['--cwd', '--json', '--reason', '--base', '--head', '--force-snapshot', '--force', '--pack-dir', '--local-ref', '--local-sha', '--remote-ref', '--remote-sha', '--snapshot', '--latest-good', '--ci', '--enforce-separate-commits'], + 'context-pack': ['--cwd', '--json', '--reason', '--base', '--head', '--force-snapshot', '--force', '--pack-dir', '--local-ref', '--local-sha', '--remote-ref', '--remote-sha', '--snapshot', '--latest-good', '--ci', '--enforce-separate-commits'], +}; +function validateFlags(cmd, inputArgs) { + const allowed = ALLOWED_FLAGS[cmd]; + if (!allowed) return; + const set = new Set(allowed); + for (const arg of inputArgs) { + if (typeof arg !== 'string' || !arg.startsWith('--')) continue; + const name = arg.split('=')[0]; + if (!set.has(name)) { + throw new Error(`Unknown flag for '${cmd}': ${name}. Run \`chorus ${cmd} --help\` to see allowed flags.`); + } + } +} + try { + validateFlags(command, args); if (command === 'read') { runRead(args); } else if (command === 'compare') {