Skip to content

feat(core): atomic capability chaining with two-phase GraphQL batching#59

Merged
aryeko merged 30 commits intomainfrom
feat/atomic-chaining
Feb 20, 2026
Merged

feat(core): atomic capability chaining with two-phase GraphQL batching#59
aryeko merged 30 commits intomainfrom
feat/atomic-chaining

Conversation

@aryeko
Copy link
Owner

@aryeko aryeko commented Feb 19, 2026

Summary

  • Adds executeTasks() — executes N capabilities in a single call with at most 2 HTTP round-trips (Phase 1: batch resolution queries, Phase 2: batch mutations)
  • Adds graphql.resolution config to YAML operation cards so cards declare their own ID-lookup requirements; 6 cards updated (issue.labels.update, issue.labels.add, issue.assignees.update, issue.milestone.set, issue.parent.remove, issue.create)
  • Adds ghx chain --steps '<json-array>' CLI subcommand
  • Deletes composite capability system (pr.threads.composite, issue.triage.composite, issue.update.composite — never published, safe to remove)
  • Exports executeTasks, ChainResultEnvelope, ChainStepResult, ChainStatus from public API

Architecture

Two-phase batch execution:

  1. Pre-flight: validate all steps (card exists, input schema, GraphQL route required)
  2. Phase 1: batch all resolution queries into a single query BatchChain(...) call
  3. Phase 2: batch all mutations into a single mutation BatchComposite(...) call

Single-item executeTasks delegates to existing executeTask — no change to existing callsites.

Card-defined resolution (graphql.resolution):

graphql:
  resolution:
    lookup:
      operationName: IssueLabelsLookup
      vars:
        issueId: issueId
    inject:
      - target: labelIds
        source: map_array
        from_input: labels
        nodes_path: node.repository.labels.nodes
        match_field: name
        extract_field: id

Test plan

  • pnpm run ci --outputStyle=static passes (920 tests, 97%+ coverage)
  • Code reviewed — no critical issues, 100% CLAUDE.md compliance
  • Security reviewed — no vulnerabilities, OWASP Top 10 checked, all attack surfaces cleared
  • Changeset added (minor bump for @ghx-dev/core)
  • Composite cards deleted (never published; 0.1.2 is last release with composites)

New public exports

executeTasks(requests: Array<{ task: string; input: Record<string, unknown> }>, deps: ExecutionDeps): Promise<ChainResultEnvelope>
type ChainResultEnvelope = { status: "success" | "partial" | "failed"; results: ChainStepResult[]; meta: { route_used: "graphql"; total: number; succeeded: number; failed: number } }
type ChainStepResult = { task: string; ok: boolean; data?: unknown; error?: ResultError }
type ChainStatus = "success" | "partial" | "failed"

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Atomic capability chaining: executeTasks() API and ghx chain CLI to run multi-step mutations atomically (≤2 API round-trips)
    • New chain result types: ChainResultEnvelope, ChainStepResult, ChainStatus
  • Refactor

    • Composite capability subsystem removed; capability catalog adjusted (69 → 66) and several capabilities renamed (e.g., issue.labels.set)
  • Documentation

    • New guides, examples, and CLI docs for chaining, resolution, and result envelopes
  • Tests

    • Added unit tests for batching, resolution, chaining, registry, and CLI behavior

aryeko and others added 14 commits February 20, 2026 00:04
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…interface

- Replace single-document GQL batching with concurrent Promise.all via
  getGraphqlHandler registry (handles multi-step caps naturally)
- Add ghx chain --steps CLI interface
- Remove buildBatchQuery/parseOperation from additions (no longer needed)
- Add composite schema + preferredOrder cleanup to deletions table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the unused composite capability system (issue.triage.composite,
issue.update.composite, pr.threads.composite) and update all references.
Composite capabilities were an early approach to multi-step execution that
has been superseded by the new atomic chaining feature (executeTasks).

This includes:
- Delete composite.ts executor module
- Delete 3 composite YAML capability cards
- Remove composite fields from operation card schema and types
- Remove composite routing logic from engine
- Delete 5 composite test files
- Update registry and CLI tests to remove composite IDs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add public API exports for the new atomic chaining feature:
- executeTasks() function for executing multiple capabilities in one call
- ChainResultEnvelope, ChainStepResult, ChainStatus types for chain results

This enables agents to use the new chaining capability via the public API.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add changeset documenting the new atomic capability chaining feature and
removal of the unused composite system.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@changeset-bot
Copy link

changeset-bot bot commented Feb 19, 2026

🦋 Changeset detected

Latest commit: aff0bd2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@ghx-dev/core Minor
@ghx-dev/benchmark Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@coderabbitai
Copy link

coderabbitai bot commented Feb 19, 2026

Warning

Rate limit exceeded

@aryeko has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 46 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Implements atomic capability chaining: adds executeTasks() with two‑phase GraphQL batching (pre‑flight validation, resolution lookups, mutations), removes the composite capability subsystem, introduces ghx chain --steps, extends operation cards with graphql.resolution, and updates docs, registry schemas, tests, and exports.

Changes

Cohort / File(s) Summary
Core Engine & Contracts
packages/core/src/core/routing/engine.ts, packages/core/src/core/contracts/envelope.ts
Added executeTasks() with pre‑flight validation and two‑phase orchestration (Phase 1 lookups, Phase 2 mutations). Introduced ChainStatus, ChainStepResult, and ChainResultEnvelope.
CLI
packages/core/src/cli/commands/chain.ts, packages/core/src/cli/index.ts
New ghx chain command supporting --steps (inline or stdin) and --check-gh-preflight; flag parsing, token handling, fetch wiring, exit codes, and tests added.
GraphQL batching & resolve helpers
packages/core/src/gql/batch.ts, packages/core/src/gql/document-registry.ts, packages/core/src/gql/resolve.ts
Added batch query builder + enhanced batch mutation support, centralized document registry (lookup & mutation maps), and resolution helpers applyInject / buildMutationVars.
Operation cards & registry types/schema
packages/core/src/core/registry/operation-card-schema.ts, packages/core/src/core/registry/types.ts, packages/core/src/core/registry/cards/*
Added graphql.resolution schema, ResolutionConfig and InjectSpec variants (scalar, map_array, input); updated multiple YAML cards to include resolution blocks.
Public exports
packages/core/src/index.ts
Re-exported executeTasks and chain-related types (ChainResultEnvelope, ChainStatus, ChainStepResult).
Docs & Plans
docs/guides/*, docs/getting-started/*, docs/architecture/*, docs/plans/*, docs/capabilities/*
Major documentation additions and updates: chaining guides, result-envelope docs, architecture diagrams, design/plans, CLI examples; capability counts adjusted (69 → 66).
Tests
packages/core/test/unit/*
New/updated tests: engine/executeTasks flows, batch builders, resolve utils, document registry, CLI chain command, envelope types, operation-card schema, and registry validation reflecting resolution semantics.
Changesets
.changeset/atomic-capability-chaining.md, .changeset/composite-capabilities-gql-integration.md
Added atomic-chaining changeset; removed/updated composite capability changelog entries.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Engine as executeTasks Engine
    participant Validator as Preflight Validator
    participant BatchBuilder as Batch Builder
    participant GraphQL as GitHub GraphQL API
    participant Resolver as Injection Resolver
    participant Envelope as Result Envelope

    Client->>Engine: executeTasks([stepA, stepB])
    Engine->>Validator: validate cards & GraphQL eligibility
    alt validation fails
        Validator-->>Engine: errors
        Engine->>Envelope: assemble failed ChainResultEnvelope
        Envelope-->>Client: status = "failed"
    else validation passes
        Note over Engine: Phase 1 — Resolution lookups
        Engine->>BatchBuilder: buildBatchQuery(lookups)
        BatchBuilder-->>Engine: lookup document + vars
        Engine->>GraphQL: execute lookup query
        GraphQL-->>Engine: lookup results
        Engine->>Resolver: applyInject(specs, lookupResults, inputs)
        Resolver-->>Engine: resolved mutation vars
        Note over Engine: Phase 2 — Mutations
        Engine->>BatchBuilder: buildBatchMutation(mutations)
        BatchBuilder-->>Engine: mutation document + vars
        Engine->>GraphQL: execute mutation batch
        GraphQL-->>Engine: mutation responses
        Engine->>Envelope: aggregate per-step results (success/partial/failed)
        Envelope-->>Client: ChainResultEnvelope(status)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Poem

🐰 In my burrow I plan and play,
Lookups hop, then mutations sway.
Two phases, one chain — tidy and spry,
No composites — we leap and fly.
Tiny paws, big wins — chaining, hi!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title clearly describes the main change: adding atomic capability chaining with two-phase GraphQL batching, which is the primary feature introduced in this changeset.
Description check ✅ Passed The pull request description comprehensively covers the summary, architecture, test plan, and new public exports. All major sections are well-documented and follow the template structure effectively.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/atomic-chaining

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@aryeko
Copy link
Owner Author

aryeko commented Feb 19, 2026

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7bf0504ce3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

aryeko and others added 2 commits February 20, 2026 01:51
- Remove all references to deleted composite capabilities
  (pr.threads.composite, issue.triage.composite, issue.update.composite)
- Update capability count from 69 → 66 throughout (Issues 21→19, PRs 22→21)
- Add Card-Defined Resolution section to operation-cards.md explaining
  the graphql.resolution block (lookup + inject, scalar/map_array variants)
- Add new guide: docs/guides/chaining-capabilities.md covering executeTasks()
  API, ghx chain CLI, ChainResultEnvelope/ChainStatus/ChainStepResult types,
  two-phase execution model with mermaid sequence diagram, and error handling
- Update library-api.md: add executeTasks to imports, replace batch section
  with atomic chain + parallel queries sub-sections
- Update cli-usage.md: replace batch ops section with ghx chain examples
- Update result-envelope.md: add Chain Result Envelope section
- Update agent-integration.md: add executeTasks as available tool
- Update getting-started/first-task.md: add Chain Mutations Atomically section
- Update getting-started/how-it-works.md: add chaining sequence diagram
- Update repository-structure.md: remove composite.ts, add document-registry.ts
  and resolve.ts rows

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d injection

Bug 1 (document-registry.ts / batch.ts): IssueCreateDocument contains a
fragment spread (...IssueCoreFields) with the fragment definition appended
as a template literal. buildBatchMutation's parseOperation only extracted
the operation body, dropping the trailing fragment definition. GitHub would
reject the batched mutation with an unknown-fragment error.

Fix: parseOperation now extracts fragment definitions that appear after the
operation's closing brace. buildBatchMutation collects unique fragments
across all operations and appends them to the composed BatchComposite document.

Bug 2 (issue.labels.add.yaml): IssueLabelsAdd requires $labelableId but
the card input uses issueId. buildMutationVars passes through input fields
by name-matching mutation variables, so labelableId was never populated,
causing all chains containing issue.labels.add to fail at batch execution.

Fix: Add InputPassthroughInject (source: "input") to the InjectSpec union,
which reads a value directly from the step input rather than from the Phase 1
lookup result. Add corresponding handling in applyInject and a new oneOf
variant in the operation card JSON schema. Update issue.labels.add.yaml to
inject labelableId from the issueId input field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aryeko and others added 2 commits February 20, 2026 02:20
Resolved conflicts from feat(core): capability namespace standardization:
- registry/index.ts: use renamed relation capability IDs
- registry/types.ts: keep InjectSpec/ResolutionConfig types (needed by chaining)
- routing/engine.ts: keep mapErrorToCode + batch/resolve imports for executeTasks
- test files: adopt renamed IDs, fix issue.labels.update → issue.labels.set in
  resolution tests, keep executeTasks chaining test suite

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ename

Update docs/capabilities/README.md to reflect all renames and additions
from the capability namespace standardization (PR #58):

- Issues: 19 → 23; fix 8 stale IDs (labels.update→labels.set,
  assignees.update→assignees.set, parent.*→relations.parent.*,
  blocked_by.*→relations.blocked_by.*, linked_prs.list→relations.prs.list,
  relations.get→relations.view); add 4 missing atomics (labels.remove,
  assignees.add, assignees.remove, milestone.clear)
- Releases: create_draft→create, get→view, publish_draft→publish
- Workflows: get→view, dispatch.run→dispatch, rerun_all→rerun.all,
  rerun_failed→rerun.failed, job.logs.get→job.logs.view
- Projects V2: 6 → 7; org.get→org.view, user.get→user.view,
  item.add_issue→items.issue.add, item.field.update→items.field.update;
  add project_v2.items.issue.remove
- Mermaid graph: update counts, remove retired Check Runs node
- Domain docs: remove Check Runs entry (0 capabilities remaining)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
docs/getting-started/README.md (1)

87-98: ⚠️ Potential issue | 🟡 Minor

Stale per-domain counts in example CLI output.

This PR removes 2 composite Issue capabilities and 1 composite PR capability, making the sample ghx capabilities list output inconsistent with the updated totals.

📝 Proposed fix
-Issues (21)
+Issues (19)
 ├─ issue.view
 ├─ issue.create
 ├─ issue.update
 ...
 
-Pull Requests (22)
+Pull Requests (21)
 ├─ pr.view
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/getting-started/README.md` around lines 87 - 98, The sample CLI output
under the ghx capabilities list is stale: it still shows "Issues (21)" and "Pull
Requests (22)" while the PR removed 2 composite Issue capabilities and 1
composite PR capability; update the example block so the per-domain counts
reflect the removals (e.g., decrement Issues by 2 and Pull Requests by 1) and
adjust the listed entries under the "Issues (...)" and "Pull Requests (...)"
headers to match the new totals and removed items so the sample output is
consistent with the current capabilities set.
docs/architecture/repository-structure.md (1)

164-173: ⚠️ Potential issue | 🟡 Minor

New cli/commands/chain.ts module missing from the CLI & Agent Interface table.

chain.ts is a new CLI command module added in this PR. The table at these lines documents all CLI command modules but doesn't list it. Based on learnings, docs/architecture/repository-structure.md should be updated whenever module layout changes.

📝 Suggested addition to the CLI & Agent Interface table
 | `cli/commands/run.ts` | Task parsing & execution | `runCommand()` |
+| `cli/commands/chain.ts` | Chain parsing & execution | `chainCommand()` |
 | `cli/commands/setup.ts` | Skill installation | `setupCommand()` |

Based on learnings: "Update docs/architecture/repository-structure.md if architecture, module, or file layout changes."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/repository-structure.md` around lines 164 - 173, Add the
new CLI module to the CLI & Agent Interface table in
docs/architecture/repository-structure.md: insert a row for
`cli/commands/chain.ts` describing it as the chain/agent orchestration command
and reference the exported command handler (e.g., `chainCommand()` or the actual
exported function name from cli/commands/chain.ts); ensure the table format
matches surrounding rows and update any summary text that lists CLI commands so
the new module is included.
docs/guides/agent-integration.md (1)

26-28: ⚠️ Potential issue | 🟡 Minor

Two documentation inaccuracies in the updated agent guide.

  1. Lines 26–28 — stale section heading: "The Three Agent Tools" / "ghx provides three tools for agents" — there are now four numbered subsections (1. listCapabilities, 2. explainCapability, 3. executeTasks, 4. createExecuteTool). Line 172 was correctly updated to remove "three", but the heading and intro text were not.

  2. Line 82–83 — wrong round-trip count: "multi-step mutations that must share a single HTTP round-trip" contradicts the core design. Phase 1 (resolution query) + Phase 2 (mutation) = ≤ 2 HTTP round-trips, not 1. The main chaining guide (line 5) correctly says "at most 2 HTTP round-trips."

📝 Proposed fix
-## The Three Agent Tools
-
-ghx provides three tools for agents:
+## The Agent Tools
+
+ghx provides four tools for agents:
-Execute a chain of mutations atomically. For multi-step mutations that must share a single
-HTTP round-trip (e.g., update labels and assignees together), use `executeTasks` instead of
+Execute a chain of mutations atomically. For multi-step mutations that need at most 2 HTTP
+round-trips (e.g., update labels and assignees together), use `executeTasks` instead of

Also applies to: 80-85

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/agent-integration.md` around lines 26 - 28, Update the stale
heading and intro and correct the round-trip claim: replace the heading "The
Three Agent Tools" and the intro sentence "ghx provides three tools for agents"
with wording reflecting four tools (e.g., "The Agent Tools" or "The Four Agent
Tools") and list all four numbered subsections (listCapabilities,
explainCapability, executeTasks, createExecuteTool); also change the phrase
"multi-step mutations that must share a single HTTP round-trip" to state they
require at most two HTTP round-trips (e.g., "at most 2 HTTP round-trips") so it
matches the main chaining guide and the content elsewhere.
🟡 Minor comments (16)
docs/capabilities/README.md-35-35 (1)

35-35: ⚠️ Potential issue | 🟡 Minor

Table section headers are missing empty cells — MD056 violation on changed lines.

Lines 35 and 55 are section-separator rows with only 1 cell in a 3-column table. markdownlint-cli2 flags this as MD056.

🔧 Proposed fix
-| **Issues (19)** |
+| **Issues (19)** | | |
-| **Pull Requests (21)** |
+| **Pull Requests (21)** | | |

Also applies to: 55-55

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/capabilities/README.md` at line 35, The table section separator rows
(e.g. the row currently showing "| **Issues (19)** |") only have one cell and
violate MD056; update those rows to include the correct number of pipe-separated
empty cells so the table has three columns (for example change "| **Issues
(19)** |" to "| **Issues (19)** |  |  |"), and apply the same fix to the similar
row at the other location referenced.
docs/architecture/operation-cards.md-120-135 (1)

120-135: ⚠️ Potential issue | 🟡 Minor

Schema example omits documentPath from the lookup block, inconsistent with the full example below.

Lines 120–135 (the "Resolution Block Schema" snippet) show a lookup without a documentPath field, while the full issue.assignees.update.yaml example at lines 146–163 includes it. This ambiguity could confuse contributors about whether documentPath is required in a lookup spec.

📝 Suggested fix — add `documentPath` to the schema snippet
     lookup:
       operationName: IssueAssigneesLookup   # registered lookup query name
+      documentPath: src/gql/operations/issue-assignees-lookup.graphql
       vars:
         issueId: issueId                    # lookup variable → input field
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/operation-cards.md` around lines 120 - 135, The "Resolution
Block Schema" snippet for the IssueAssigneesUpdate operation omits the
lookup.documentPath field which is present in the concrete example; update the
YAML schema under operationName IssueAssigneesUpdate so the lookup block
includes a documentPath entry (e.g., documentPath: node.repository) matching the
full example, ensuring the lookup specification (lookup, vars, documentPath,
inject) is consistent with the issue.assignees.update.yaml illustration.
docs/architecture/operation-cards.md-86-86 (1)

86-86: ⚠️ Potential issue | 🟡 Minor

Update capability names on line 86: issue.parent.set and issue.parent.remove do not match the actual capability IDs.

The card files define these capabilities as issue.relations.parent.set and issue.relations.parent.remove, but the capability list documents them as issue.parent.set and issue.parent.remove. These incorrect names will confuse users calling ghx capabilities explain.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/operation-cards.md` at line 86, The documented capability
names in the capability list are incorrect: replace `issue.parent.set` and
`issue.parent.remove` with the actual capability IDs
`issue.relations.parent.set` and `issue.relations.parent.remove` so they match
the card definitions (ensure the entries in the capabilities list match the
capability symbols used in the card files, e.g. `issue.relations.parent.set` and
`issue.relations.parent.remove`).
packages/core/test/unit/chain-command.test.ts-113-122 (1)

113-122: ⚠️ Potential issue | 🟡 Minor

Test title says "returns 1" but the test asserts rejects.toThrow

The function throws when the token is missing (via resolveGithubToken()), it does not return 1. The test assertion is correct (rejects.toThrow), but the title is misleading — consider:

-  it("chainCommand returns 1 when GITHUB_TOKEN missing", async () => {
+  it("chainCommand throws when GITHUB_TOKEN or GH_TOKEN missing", async () => {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/chain-command.test.ts` around lines 113 - 122, Update
the test title to reflect that chainCommand throws when no token is present
rather than returning 1: change the description for the test in
chain-command.test.ts to indicate it throws/missing token behavior (e.g.,
"throws when GITHUB_TOKEN missing") and ensure it references the behavior coming
from resolveGithubToken() invoked inside chainCommand; keep the existing
assertion that expects chainCommand(...) to reject with "Missing GITHUB_TOKEN or
GH_TOKEN".
docs/guides/cli-usage.md-215-215 (1)

215-215: ⚠️ Potential issue | 🟡 Minor

The general "Exit Codes" section (lines 247–250) is now inconsistent with ghx chain's partial → 0 semantics

The existing exit codes section states "0 — Command succeeded", but ghx chain exits 0 for both success and partial statuses. A script author reading only that section will incorrectly assume exit 0 means full success. Consider adding a note there:

 ## Exit Codes
 
 - `0` — Command succeeded
+   - For `ghx chain`: also returned when `status` is `"partial"` (some steps failed; inspect JSON output)
 - `1` — Error (invalid capability, parsing error, execution failure)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/cli-usage.md` at line 215, Update the "Exit Codes" section to
note that the general mapping of 0 = success is nuanced for the ghx chain
command: ghx chain returns exit code 0 for both status "success" and "partial"
(only "failed" returns 1), so add a brief sentence clarifying that scripts
should treat a "partial" status as a non-fatal but partial success case and, if
needed, inspect ghx chain's output/status field rather than relying solely on
the numeric exit code.
docs/getting-started/how-it-works.md-257-291 (1)

257-291: ⚠️ Potential issue | 🟡 Minor

Mermaid diagram shows Phase 1 as unconditional, but the description says it is optional

Lines 284–285 correctly state that Phase 1 is optional — only issued when steps need name → ID resolution. The sequence diagram, however, always shows executeTasks → GitHub: query BatchChain(...) as if Phase 1 always happens. This contradicts the text and implies two HTTP round-trips are always required.

Consider adding a guard in the diagram, e.g.:

   executeTasks->>GitHub: query BatchChain(...) { step0: ..., step1: ... }
-  Note right of GitHub: Phase 1 — resolve names to IDs<br/>(labels, assignees, milestones)
+  Note right of GitHub: Phase 1 (optional) — resolve names to IDs<br/>skipped if all inputs are already node IDs

Or use a Mermaid opt block to make the conditional explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/getting-started/how-it-works.md` around lines 257 - 291, The Mermaid
sequence diagram currently shows Phase 1 (executeTasks -> GitHub: query
BatchChain(...)) as unconditional; update the diagram to indicate Phase 1 is
conditional/optional (only run when name→ID resolution is needed). Edit the
diagram around the BatchChain query and Phase 1 note (references: executeTasks,
BatchChain, BatchComposite, "Phase 1") to either wrap the query and its notes in
a Mermaid opt block or add a guard/conditional label (e.g., "opt if resolution
needed" or "alt resolution needed / no resolution needed") so the diagram
matches the text stating Phase 1 is optional.
docs/guides/result-envelope.md-393-413 (1)

393-413: ⚠️ Potential issue | 🟡 Minor

ChainStepResult.error omits the details field present in ResultEnvelope.error

The base ResultEnvelope.error includes an optional details?: Record<string, unknown> (line 16), but ChainStepResult.error drops it. For VALIDATION-coded step failures, callers inspecting per-step errors won't have access to field-level details. If this is intentional (stripped-down per-step error), it should be noted; otherwise, consider adding details? for parity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/result-envelope.md` around lines 393 - 413, ChainStepResult.error
currently lacks the optional details field present on ResultEnvelope.error;
update the ChainStepResult.error shape to include details?: Record<string,
unknown> (or, if omission was intentional, add a clear comment in the
ChainStepResult/interface block documenting that per-step errors intentionally
omit details) so per-step failures (e.g., VALIDATION) expose field-level
details; modify the ChainStepResult interface definition where error is declared
to add the details?: Record<string, unknown> property (or add the explanatory
comment) and ensure any code that constructs ChainStepResult.error
populates/pass-through details accordingly.
packages/core/test/unit/document-registry.test.ts-1-2 (1)

1-2: ⚠️ Potential issue | 🟡 Minor

Remove unused vi import

vi is imported from vitest but not referenced anywhere in the test file — there are no vi.fn(), vi.mock(), or vi.stubEnv() calls. Biome will flag this.

-import { describe, expect, it, vi } from "vitest"
+import { describe, expect, it } from "vitest"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/document-registry.test.ts` around lines 1 - 2, The
test file imports the unused symbol `vi` from `vitest`; remove `vi` from the
import specifier so only used identifiers remain (e.g., keep `describe`,
`expect`, `it`) in the import statement that brings in `vitest` for the file
that tests `getLookupDocument` and `getMutationDocument`. Ensure there are no
other references to `vi` in this file after the change and run the linter/tests
to confirm the unused-import warning is resolved.
docs/guides/chaining-capabilities.md-3-5 (1)

3-5: ⚠️ Potential issue | 🟡 Minor

"Atomic" claim conflicts with the documented "partial" status.

The feature is labeled "Atomic chaining" and "mutations in a single logical operation," but the error-handling section (line 213–215) explicitly describes per-step failures yielding status: "partial" — meaning some mutations may succeed while others fail. This is not atomic in the transactional sense (no rollback). Consider replacing "atomic" with "batched" or "single-round-trip" to accurately reflect the behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/chaining-capabilities.md` around lines 3 - 5, The docs currently
call the feature "Atomic chaining" and describe "mutations in a single logical
operation" but the error-handling section documents per-step failures returning
status: "partial", which contradicts transactional atomicity; update the wording
in docs/guides/chaining-capabilities.md by replacing "Atomic chaining" and
related "atomic"/"single logical operation" phrasing with terms like "Batched
chaining", "single-round-trip chaining", or "chained mutations (single
round-trip)" and add a brief note near the status: "partial" explanation
clarifying there is no automatic rollback—some steps may succeed while others
fail.
docs/guides/chaining-capabilities.md-182-189 (1)

182-189: ⚠️ Potential issue | 🟡 Minor

issue.milestone.set Phase 1 column contradicts test assertions.

The table lists issue.milestone.set with Phase 1 (resolution) = "No", but both operation-card-schema.test.ts (lines 11–30) and registry-validation.test.ts (lines 57–64) explicitly assert that this card has a graphql.resolution block with a scalar lookup. The table entry should be "Yes — resolves milestone number to ID".

📝 Proposed fix
-| `issue.milestone.set` | No | Yes |
+| `issue.milestone.set` | Yes — resolves milestone number to ID | Yes |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/chaining-capabilities.md` around lines 182 - 189, The table entry
for capability 'issue.milestone.set' incorrectly marks Phase 1 as "No"; update
the Phase 1 column to "Yes — resolves milestone number to ID" to match the tests
that assert a graphql.resolution scalar lookup (see references to
operation-card-schema.test.ts and registry-validation.test.ts) so the docs
reflect actual behavior of issue.milestone.set.
packages/core/test/unit/operation-card-schema.test.ts-5-6 (1)

5-6: ⚠️ Potential issue | 🟡 Minor

Use the shared Ajv instance and apply UPPER_SNAKE_CASE to constants.

Import and reuse the shared ajv instance from ajv-instance.js instead of creating a new one. Additionally, rename the module-level constants ajv and validate to AJV and VALIDATE per coding guidelines (UPPER_SNAKE_CASE for constants).

Proposed fix
+import { ajv } from "@core/core/registry/ajv-instance.js"
 import { operationCardSchema } from "@core/core/registry/operation-card-schema.js"
-import { Ajv } from "ajv"
 import { describe, expect, it } from "vitest"

-const ajv = new Ajv()
-const validate = ajv.compile(operationCardSchema)
+const VALIDATE = ajv.compile(operationCardSchema)

Update all validate(card) references to VALIDATE(card).

Also add a test case for the input source type, which is defined in the schema's inject array oneOf options but currently untested.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/operation-card-schema.test.ts` around lines 5 - 6,
The test currently creates a new Ajv instance and uses lowercase module
constants; replace creation of a new Ajv with importing and reusing the shared
instance from ajv-instance.js and rename the module-level constants to
UPPER_SNAKE_CASE (AJV and VALIDATE) so that the compiled schema uses
AJV.compile(operationCardSchema) assigned to VALIDATE, then update all calls
validate(card) → VALIDATE(card). Also add a new unit test in this file that
asserts the schema accepts an operation card whose source.type is "input"
(covering the schema's injected oneOf option for input).
packages/core/test/unit/engine.test.ts-384-436 (1)

384-436: ⚠️ Potential issue | 🟡 Minor

Misleading test name: "status is partial when one step fails" but status is actually "failed".

The test rejects the entire batch mutation via mockRejectedValueOnce. When the batch query throws, engine.ts marks all pending steps as failed, producing succeeded === 0status = "failed". The assertion at line 433 correctly expects "failed", but the test name implies a partial outcome (one step fails, the other succeeds). These should either match, or a separate test covering the true "partial" path (one step pre-result error + one successful mutation) should be added.

🔧 Minimal rename
-  it("status is partial when one step fails", async () => {
+  it("status is failed when entire batch mutation throws", async () => {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/engine.test.ts` around lines 384 - 436, The test name
is misleading: executeTasks (in engine.ts) currently marks the whole batch as
failed when the batch query rejects via
githubClient.query.mockRejectedValueOnce, so update the test to reflect that
behavior by renaming the it(...) description from "status is partial when one
step fails" to something like "status is failed when batch query rejects" (or
alternatively modify the mock to return a resolved batch response with one step
error to test the true partial path); locate the test around executeTasks usage
and the githubClient.query mock and either change the it(...) string or change
the mock behavior to a resolved value containing per-step results to exercise
the partial-status logic.
packages/core/src/gql/resolve.ts-67-73 (1)

67-73: ⚠️ Potential issue | 🟡 Minor

null extract values silently pass through the undefined check.

If a lookup node has a matching match_field but the extract_field is null, idByName stores null. Line 71 only checks id === undefined, so null IDs will be included in the resolved array and passed to the mutation.

🛡️ Proposed fix: also guard against null
     const id = idByName.get(name.toLowerCase())
-    if (id === undefined) throw new Error(`Resolution: '${name}' not found in lookup result`)
+    if (id === undefined || id === null)
+      throw new Error(`Resolution: '${name}' not found in lookup result`)
     return id
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/gql/resolve.ts` around lines 67 - 73, The mapping in
resolved currently only treats undefined as missing, but idByName can contain
null values (from extract_field being null) which then get passed through;
update the check inside the map in resolve.ts (the resolved creation block
referencing idByName and spec.from_input) to treat null the same as undefined
(e.g., guard with id == null or id === undefined || id === null) and throw the
same Resolution error when id is null so null IDs are not included in the
resolved array.
docs/guides/library-api.md-190-196 (1)

190-196: ⚠️ Potential issue | 🟡 Minor

Documentation says "single HTTP round-trip" but the feature uses up to two.

Lines 192–193 and the callout on line 196 state "mutations that must share a single HTTP round-trip," but the actual design (and line 333) correctly describes ≤2 round-trips (Phase 1: resolution queries, Phase 2: mutations). This wording could mislead consumers into expecting a single network call in all cases.

📝 Suggested wording
-### Atomic chain (mutations)
+### Atomic chain

-For mutations that must share a single HTTP round-trip — e.g., updating labels and assignees
-on the same issue — use `executeTasks()`. All steps are pre-flight validated before any HTTP
-call is made, and the chain executes in at most 2 network round-trips.
+For mutations that should execute atomically — e.g., updating labels and assignees
+on the same issue — use `executeTasks()`. All steps are pre-flight validated before any HTTP
+call is made, and the chain executes in at most **2** network round-trips.

-> **For mutations that must share a single HTTP round-trip, use `executeTasks`.**
+> **For batched mutations with at most 2 HTTP round-trips, use `executeTasks`.**
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/library-api.md` around lines 190 - 196, The docs incorrectly
claim executeTasks provides a "single HTTP round-trip"; update the phrasing
around the "Atomic chain (mutations)" section and the callout that references
executeTasks so it accurately states that chains are atomic but may use up to
two network round-trips (phase 1: resolution queries, phase 2: mutations).
Locate the references to "single HTTP round-trip" and replace with wording such
as "atomic across at most two HTTP round-trips" or "atomic (performed in up to
two network round-trips)" and ensure the callout and any nearby explanatory
sentence mention the two-phase behavior tied to executeTasks.
packages/core/src/core/routing/engine.ts-213-216 (1)

213-216: ⚠️ Potential issue | 🟡 Minor

No guard for an empty requests array.

If executeTasks([]) is called, the function skips the 1-item branch and falls through to preflight with an empty loop, then returns status: "success" with 0 results and succeeded: 0. Consider an explicit guard at the top:

🛡️ Proposed fix
 export async function executeTasks(
   requests: Array<{ task: string; input: Record<string, unknown> }>,
   deps: ExecutionDeps,
 ): Promise<ChainResultEnvelope> {
+  if (requests.length === 0) {
+    return {
+      status: "failed",
+      results: [],
+      meta: { route_used: "graphql", total: 0, succeeded: 0, failed: 0 },
+    }
+  }
+
   // 1-item: delegate to existing routing engine
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/routing/engine.ts` around lines 213 - 216, The
function executeTasks currently lacks a guard for an empty requests array; add
an explicit top-level check inside executeTasks that if requests.length === 0
immediately returns a ChainResultEnvelope with status "success" (or the
appropriate neutral status used elsewhere), results as an empty array, and
succeeded: 0 so the preflight/loop is skipped and callers receive a clear,
consistent response; reference the executeTasks function and the
ChainResultEnvelope return shape when implementing this early-return guard.
packages/core/src/core/routing/engine.ts-213-251 (1)

213-251: ⚠️ Potential issue | 🟡 Minor

meta.route_used is hardcoded to "graphql" even when the 1-item path may use CLI.

The single-item delegation to executeTask goes through full routing, which can select CLI. The returned ChainResultEnvelope always reports route_used: "graphql". Consider propagating the actual route from result.meta.route_used:

       meta: {
-        route_used: "graphql",
+        route_used: (result.meta.route_used as "graphql") ?? "graphql",
         total: 1,

Or, since ChainResultEnvelope.meta.route_used is typed as "graphql" literal, acknowledge this is by-design and add a brief comment explaining the 1-item path may actually use CLI but the chain envelope always reports "graphql".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/routing/engine.ts` around lines 213 - 251, In
executeTasks, the 1-item fast-path always sets
ChainResultEnvelope.meta.route_used to the literal "graphql" even though
executeTask may route via CLI; update the returned meta to propagate the actual
route by using result.meta?.route_used (falling back to "graphql" if absent) so
route_used reflects the real path, and ensure this change touches the
executeTasks function and uses the executeTask result's meta.route_used; if the
ChainResultEnvelope type truly restricts meta.route_used to the "graphql"
literal and cannot be changed, instead add a concise comment inside executeTasks
explaining that the single-item delegation may invoke non-graphql routing (e.g.,
CLI) but the envelope type currently reports "graphql" by design.
🧹 Nitpick comments (8)
docs/architecture/README.md (1)

19-19: LGTM! Count update is correct.

Optionally, since executeTasks() is now a first-class surface entry point alongside execute(), you may want to add it to the Agent Surface subgraph of the mermaid diagram for discoverability — or add a callout linking to system-design.md where the two-phase flow is already diagrammed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/README.md` at line 19, The diagram mention should include
the new entry point executeTasks() alongside execute() for discoverability:
update the Agent Surface subgraph in the mermaid diagram to add an explicit node
or edge for executeTasks(), or alternatively add a short callout in README
linking to system-design.md where the two-phase flow is already diagrammed,
referencing executeTasks() and execute() by name so readers can find the
implementation easily.
packages/core/test/unit/envelope.test.ts (1)

9-28: Tests are compile-time type guards with trivially true runtime assertions.

All three expect calls verify properties that are structurally guaranteed by the assignments immediately above them (e.g., an array built from three literals always has length 3; r.ok always equals what was just assigned). The real guard here is TypeScript's type-checker, not the runtime assertions.

Consider replacing or augmenting with tests that verify actual narrowing or rejection behaviour — e.g., confirming "invalid" is not assignable, or that executeTasks (when testable) returns a ChainResultEnvelope with the expected status/meta under known inputs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/envelope.test.ts` around lines 9 - 28, The tests in
envelope.test.ts currently only exercise TypeScript compile-time checks
(ChainStatus, ChainStepResult, ChainResultEnvelope) but use trivial runtime
expects; replace these with meaningful checks: add `@ts-expect-error` assertions
to verify invalid literals (e.g., assigning "invalid" to ChainStatus) to test
compile-time rejection, and replace the trivial runtime expects with actual
runtime behavior tests — for example call the function that produces a
ChainResultEnvelope (e.g., executeTasks or the module function that
builds/validates envelopes) and assert the returned object's status and meta
values under known input scenarios; update tests referencing ChainStatus,
ChainStepResult, and ChainResultEnvelope accordingly so they validate real
narrowing/rejection and runtime output instead of tautological assertions.
packages/core/test/unit/chain-command.test.ts (1)

43-122: Add afterEach(() => vi.unstubAllEnvs()) to prevent environment stub leakage between tests

The vitest config does not enable unstubGlobals: true, so vi.stubEnv() stubs are not automatically restored between tests. Tests 1–3 stub GITHUB_TOKEN to "test-token" and test 4 stubs both GITHUB_TOKEN and GH_TOKEN to undefined; without explicit cleanup via vi.unstubAllEnvs(), these stubs persist into subsequent tests and can cause false failures in watch mode.

+import { afterEach, describe, expect, it, vi } from "vitest"
+
 describe("chainCommand parsing", () => {
+  afterEach(() => {
+    vi.unstubAllEnvs()
+  })
+
   it("parseChainFlags extracts inline --steps", ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/chain-command.test.ts` around lines 43 - 122, Add an
afterEach hook to call vi.unstubAllEnvs() so env stubs created by vi.stubEnv()
in tests like the ones invoking chainCommand are cleared between tests; locate
the test suite in chain-command.test.ts (the describe/it block containing the
chainCommand tests) and add afterEach(() => vi.unstubAllEnvs()) so tests that
set GITHUB_TOKEN/GH_TOKEN do not leak into subsequent cases.
packages/core/src/core/contracts/envelope.ts (1)

56-65: Consider extracting the inline meta type as a named interface.

The meta type inside ChainResultEnvelope is anonymous, inconsistent with the existing named ResultMeta. Extracting it improves discoverability and allows independent reference (e.g., in tests or engine return types).

♻️ Proposed refactor
+export interface ChainResultMeta {
+  route_used: "graphql"
+  total: number
+  succeeded: number
+  failed: number
+}
+
 export interface ChainResultEnvelope {
   status: ChainStatus
   results: ChainStepResult[]
-  meta: {
-    route_used: "graphql"
-    total: number
-    succeeded: number
-    failed: number
-  }
+  meta: ChainResultMeta
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/contracts/envelope.ts` around lines 56 - 65, The
inline anonymous meta object in ChainResultEnvelope should be extracted into a
named interface (e.g., ChainResultMeta) or reuse the existing ResultMeta if they
are equivalent; update the ChainResultEnvelope definition to reference that
named interface for the meta property and adjust any imports/exports and usages
(tests, return types, etc.) accordingly so the meta shape is discoverable and
reusable across ResultMeta, ChainResultEnvelope, and other callers.
packages/core/test/unit/cli-index.test.ts (1)

5-19: Missing delegation test for the chain command.

Every other command has a corresponding delegation test (e.g., "delegates run command to runCommand", "delegates setup command to setupCommand"), but chainCommandMock is set up without a matching test that verifies main(["chain", "--steps", "..."]) routes to chainCommandMock. Add a test for consistency and coverage.

🧪 Suggested test to add
+  it("delegates chain command to chainCommand", async () => {
+    chainCommandMock.mockResolvedValue(4)
+
+    const code = await main(["chain", "--steps", "[{}]"])
+
+    expect(code).toBe(4)
+    expect(chainCommandMock).toHaveBeenCalledWith(["--steps", "[{}]"])
+  })

Also applies to: 36-36

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-index.test.ts` around lines 5 - 19, Add a unit
test that verifies the CLI delegates the "chain" subcommand to the chain command
mock: create a test similar to the existing delegation tests that calls
main(["chain", "--steps", "a,b"] or the same args pattern used elsewhere, then
assert that chainCommandMock was called with the expected parsed args; place the
test alongside the other delegation tests in
packages/core/test/unit/cli-index.test.ts and reference the existing
chainCommandMock and main() symbols to locate where to add it.
packages/core/src/gql/batch.ts (1)

151-189: buildBatchQuery duplicates the variable-aliasing and body-rewriting logic from buildBatchMutation.

The two functions share identical loops for: variable declaration prefixing, sorted replaceAll substitutions, top-level alias injection, and variable object merging. The only differences are the wrapper keyword (query BatchChain vs mutation BatchComposite) and fragment collection. Extracting a shared buildBatchOperation helper would halve the surface for future bugs like the fragment regex above.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/gql/batch.ts` around lines 151 - 189, buildBatchQuery
duplicates the variable-aliasing and body-rewriting logic from
buildBatchMutation (prefixing variable declarations, sorted replaceAll
substitutions, top-level alias injection, and merged variable mapping), which
risks divergence and bugs; create a shared helper (e.g., buildBatchOperation)
that accepts the operations array plus the wrapper type/name and fragment
handling strategy, move the common loops (parseOperation, variable declaration
prefixing using op.alias, sorted variable replaceAll, top-level alias injection,
and mergedVariables population) into that helper, and have buildBatchQuery and
buildBatchMutation call it with their respective wrapper strings ("query
BatchChain" vs "mutation BatchComposite") and fragment collection differences to
eliminate duplication and centralize the fragment regex behavior.
packages/core/test/unit/batch.test.ts (1)

73-109: Fragment tests should include nested selection set coverage to prevent regressions.

The regex in parseOperation (batch.ts line 140, /fragment\s+(\w+)\s+on\s+\w+\s*\{[^}]*\}/g) uses [^}]* and cannot handle fragments with nested braces. While all current production fragments (IssueCoreFieldsFragmentDoc, PrCoreFieldsFragmentDoc, PageInfoFieldsFragmentDoc) contain only scalar fields and work fine, adding a test case with nested selections would guard against regressions if fragments evolve to include nested selection sets.

➕ Suggested additional test
+  it("preserves fragment with nested selection sets", () => {
+    const mutWithNestedFragment = `mutation CreateIssue($repositoryId: ID!, $title: String!) {
+  createIssue(input: {repositoryId: $repositoryId, title: $title}) {
+    issue { ...IssueFullFields }
+  }
+}
+fragment IssueFullFields on Issue {
+  id
+  labels(first: 10) {
+    nodes {
+      id
+      name
+    }
+  }
+}`
+    const result = buildBatchMutation([
+      { alias: "step0", mutation: mutWithNestedFragment, variables: { repositoryId: "R_1", title: "T" } },
+    ])
+    expect(result.document).toContain("fragment IssueFullFields on Issue")
+    expect(result.document).toContain("nodes {")
+  })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/batch.test.ts` around lines 73 - 109, Update the unit
tests in packages/core/test/unit/batch.test.ts to add a fragment that contains a
nested selection set (i.e., fields with their own {...} sub-selections) and
assert that buildBatchMutation still preserves the fragment
(expect(result.document).toContain("...FragmentName") and the full "fragment
FragmentName on Type" text) and that shared nested fragments are still
deduplicated (count of fragment declarations is 1); this targets parseOperation
(the regex in parseOperation in batch.ts) to catch regressions from its current
`/fragment\s+(\w+)\s+on\s+\w+\s*\{[^}]*\}/g` pattern when fragments include
nested braces.
packages/core/src/core/routing/engine.ts (1)

253-287: Preflight pushes cards only on success — index misalignment with requests array.

When a preflight validation fails for step i, cards.push(card) is skipped, so cards[i] no longer corresponds to requests[i] for subsequent steps. Phase 1 and Phase 2 then use cards[i] / requests[i] together, causing wrong card–request pairing for all steps after the first failure.

This is currently masked because any preflight error causes an early return (lines 289–307), so phases 1 & 2 are never reached with a misaligned cards array. However, if the early-return logic ever changes (e.g., to allow partial preflight failures), this would become a data-corruption bug. Consider using a Map<number, OperationCard> indexed by step position, consistent with the suggested fix above.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/routing/engine.ts` around lines 253 - 287, The
preflight loop builds a dense cards array (cards.push(card)) which can
desynchronize indexes with requests when a step fails; change to record cards by
request index so later phases can reliably pair each request with its card:
instead of pushing to cards, use a Map<number, OperationCard> (or an array
initialized to requests.length with nullable slots) keyed by the loop index so
getOperationCard results are stored at the same position as the corresponding
request; update uses of cards elsewhere (phases 1 & 2) to fetch by index (e.g.,
cardsMap.get(i) or cards[i]) and preserve existing error handling
(preflightErrors, validateInput, mapErrorToCode) behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/src/cli/commands/chain.ts`:
- Around line 73-111: executeGraphqlRequest currently uses a bare fetch and
directly calls response.json(), which can hang and/or throw on non-JSON
responses; update executeGraphqlRequest to pass an AbortSignal with a 10_000ms
timeout (use AbortSignal.timeout(10000)) to the fetch call to avoid indefinite
hangs, and wrap response.json() in a try/catch: if JSON parse fails, read
response.text() and include that text (or a truncated version) in the thrown
Error along with response.status; also preserve the existing payload.errors and
payload.message checks when JSON parse succeeds so meaningful GitHub error
messages are still surfaced.
- Around line 113-139: chainCommand currently lets errors from parseChainFlags,
parseJsonSteps, resolveGithubToken (and generally async helpers) bubble as
unhandled rejections; wrap the main logic in a try/catch so any thrown Error is
caught, log the error message (use process.stderr.write or process.stdout.write)
and return exit code 1. Specifically, surround the calls to parseChainFlags,
parseJsonSteps (and readStdin path), resolveGithubToken,
createGithubClient/executeTasks invocation with a single try block and in the
catch write the error.message (and optional error.stack for debug) and return 1
so chainCommand always resolves to a numeric exit code.

In `@packages/core/src/core/routing/engine.ts`:
- Around line 289-307: The preflight error lookup incorrectly matches by
capability name (task) causing duplicate-capability steps to inherit the same
error; change the mapping to use the request index instead: in the requests.map
call use requests.map((req, i) => ...) and look up errors by a step index (e.g.,
preflightErrors.find(e => e.stepIndex === i) or simply preflightErrors[i] if
preflightErrors is an array keyed by step), and ensure preflight error objects
include that step index when they are created earlier; fall back to the same
default error object when no index match is found but preserve req.task in the
returned object.

In `@packages/core/src/gql/batch.ts`:
- Around line 137-148: The current fragment extraction uses a regex
(/fragment\s+(\w+)\s+on\s+\w+\s*\{[^}]*\}/g) that fails on nested braces;
replace it with a brace-depth parser: scan the remainder string for the
"fragment" keyword and capture the fragment name (fragName) and the index of the
opening '{', then walk forward counting brace depth until the matching closing
'}' is found to slice the full fragment text; for each found fragment, trim and
store it in the fragments Map (same checks as before to avoid duplicates using
fragments.has(fragName)); mirror the brace-depth logic used for extracting
body/variableDeclarations so nested selections like "labels { nodes { id } }"
are captured completely.

In `@packages/core/test/unit/engine.test.ts`:
- Around line 320-382: The test is incorrectly using vi.doMock after the engine
module (executeTasks) has already been imported so its internal bindings to
getLookupDocument, getMutationDocument, buildBatchMutation and applyInject
remain the real implementations; fix by calling vi.resetModules() in the
describe block's beforeEach and re-importing the engine (executeTasks) inside
each test after setting up vi.doMock for "@core/gql/document-registry.js" and
"@core/gql/batch.js" (and "@core/gql/resolve.js" only where needed), and remove
the unused applyInjectMock and its vi.doMock from the 2-item pure-mutation test
since cardWithGql has no resolution field and applyInject is never invoked.

---

Outside diff comments:
In `@docs/architecture/repository-structure.md`:
- Around line 164-173: Add the new CLI module to the CLI & Agent Interface table
in docs/architecture/repository-structure.md: insert a row for
`cli/commands/chain.ts` describing it as the chain/agent orchestration command
and reference the exported command handler (e.g., `chainCommand()` or the actual
exported function name from cli/commands/chain.ts); ensure the table format
matches surrounding rows and update any summary text that lists CLI commands so
the new module is included.

In `@docs/getting-started/README.md`:
- Around line 87-98: The sample CLI output under the ghx capabilities list is
stale: it still shows "Issues (21)" and "Pull Requests (22)" while the PR
removed 2 composite Issue capabilities and 1 composite PR capability; update the
example block so the per-domain counts reflect the removals (e.g., decrement
Issues by 2 and Pull Requests by 1) and adjust the listed entries under the
"Issues (...)" and "Pull Requests (...)" headers to match the new totals and
removed items so the sample output is consistent with the current capabilities
set.

In `@docs/guides/agent-integration.md`:
- Around line 26-28: Update the stale heading and intro and correct the
round-trip claim: replace the heading "The Three Agent Tools" and the intro
sentence "ghx provides three tools for agents" with wording reflecting four
tools (e.g., "The Agent Tools" or "The Four Agent Tools") and list all four
numbered subsections (listCapabilities, explainCapability, executeTasks,
createExecuteTool); also change the phrase "multi-step mutations that must share
a single HTTP round-trip" to state they require at most two HTTP round-trips
(e.g., "at most 2 HTTP round-trips") so it matches the main chaining guide and
the content elsewhere.

---

Nitpick comments:
In `@docs/architecture/README.md`:
- Line 19: The diagram mention should include the new entry point executeTasks()
alongside execute() for discoverability: update the Agent Surface subgraph in
the mermaid diagram to add an explicit node or edge for executeTasks(), or
alternatively add a short callout in README linking to system-design.md where
the two-phase flow is already diagrammed, referencing executeTasks() and
execute() by name so readers can find the implementation easily.

In `@packages/core/src/core/contracts/envelope.ts`:
- Around line 56-65: The inline anonymous meta object in ChainResultEnvelope
should be extracted into a named interface (e.g., ChainResultMeta) or reuse the
existing ResultMeta if they are equivalent; update the ChainResultEnvelope
definition to reference that named interface for the meta property and adjust
any imports/exports and usages (tests, return types, etc.) accordingly so the
meta shape is discoverable and reusable across ResultMeta, ChainResultEnvelope,
and other callers.

In `@packages/core/src/core/routing/engine.ts`:
- Around line 253-287: The preflight loop builds a dense cards array
(cards.push(card)) which can desynchronize indexes with requests when a step
fails; change to record cards by request index so later phases can reliably pair
each request with its card: instead of pushing to cards, use a Map<number,
OperationCard> (or an array initialized to requests.length with nullable slots)
keyed by the loop index so getOperationCard results are stored at the same
position as the corresponding request; update uses of cards elsewhere (phases 1
& 2) to fetch by index (e.g., cardsMap.get(i) or cards[i]) and preserve existing
error handling (preflightErrors, validateInput, mapErrorToCode) behavior.

In `@packages/core/src/gql/batch.ts`:
- Around line 151-189: buildBatchQuery duplicates the variable-aliasing and
body-rewriting logic from buildBatchMutation (prefixing variable declarations,
sorted replaceAll substitutions, top-level alias injection, and merged variable
mapping), which risks divergence and bugs; create a shared helper (e.g.,
buildBatchOperation) that accepts the operations array plus the wrapper
type/name and fragment handling strategy, move the common loops (parseOperation,
variable declaration prefixing using op.alias, sorted variable replaceAll,
top-level alias injection, and mergedVariables population) into that helper, and
have buildBatchQuery and buildBatchMutation call it with their respective
wrapper strings ("query BatchChain" vs "mutation BatchComposite") and fragment
collection differences to eliminate duplication and centralize the fragment
regex behavior.

In `@packages/core/test/unit/batch.test.ts`:
- Around line 73-109: Update the unit tests in
packages/core/test/unit/batch.test.ts to add a fragment that contains a nested
selection set (i.e., fields with their own {...} sub-selections) and assert that
buildBatchMutation still preserves the fragment
(expect(result.document).toContain("...FragmentName") and the full "fragment
FragmentName on Type" text) and that shared nested fragments are still
deduplicated (count of fragment declarations is 1); this targets parseOperation
(the regex in parseOperation in batch.ts) to catch regressions from its current
`/fragment\s+(\w+)\s+on\s+\w+\s*\{[^}]*\}/g` pattern when fragments include
nested braces.

In `@packages/core/test/unit/chain-command.test.ts`:
- Around line 43-122: Add an afterEach hook to call vi.unstubAllEnvs() so env
stubs created by vi.stubEnv() in tests like the ones invoking chainCommand are
cleared between tests; locate the test suite in chain-command.test.ts (the
describe/it block containing the chainCommand tests) and add afterEach(() =>
vi.unstubAllEnvs()) so tests that set GITHUB_TOKEN/GH_TOKEN do not leak into
subsequent cases.

In `@packages/core/test/unit/cli-index.test.ts`:
- Around line 5-19: Add a unit test that verifies the CLI delegates the "chain"
subcommand to the chain command mock: create a test similar to the existing
delegation tests that calls main(["chain", "--steps", "a,b"] or the same args
pattern used elsewhere, then assert that chainCommandMock was called with the
expected parsed args; place the test alongside the other delegation tests in
packages/core/test/unit/cli-index.test.ts and reference the existing
chainCommandMock and main() symbols to locate where to add it.

In `@packages/core/test/unit/envelope.test.ts`:
- Around line 9-28: The tests in envelope.test.ts currently only exercise
TypeScript compile-time checks (ChainStatus, ChainStepResult,
ChainResultEnvelope) but use trivial runtime expects; replace these with
meaningful checks: add `@ts-expect-error` assertions to verify invalid literals
(e.g., assigning "invalid" to ChainStatus) to test compile-time rejection, and
replace the trivial runtime expects with actual runtime behavior tests — for
example call the function that produces a ChainResultEnvelope (e.g.,
executeTasks or the module function that builds/validates envelopes) and assert
the returned object's status and meta values under known input scenarios; update
tests referencing ChainStatus, ChainStepResult, and ChainResultEnvelope
accordingly so they validate real narrowing/rejection and runtime output instead
of tautological assertions.

aryeko and others added 6 commits February 20, 2026 02:35
…solution

- Add resolution block to pr.reviews.submit.yaml: looks up pullRequestId
  via PrNodeId query (owner/name/prNumber → repository.pullRequest.id),
  injects pullRequestId into PrReviewSubmit mutation
- Register PrNodeId in LOOKUP_DOCUMENTS
- Register PrReviewSubmit in MUTATION_DOCUMENTS

pr.reviews.submit is now the 7th fully-wired batchable capability.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Captures 8 areas of follow-up work identified after implementation:
SKILL.md updates, benchmark scenarios, GQL partial error handling,
cross-step data passing, CLI-only mutation expansion, resolution caching,
test coverage gaps, and pr.reviews.submit path validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…olution tests

- Replace ineffective vi.doMock tests with vi.resetModules() + dynamic import
  pattern so mocks actually intercept module-level singletons
- Add assertion that getMutationDocumentMock and buildBatchMutationMock are
  called (proves mocks are effective, not just that the function returns)
- Add CR-3 test: same capability twice, only one fails pre-flight → error
  attributed to the correct step index (not matched by task name)
- Add CL-4 test: lookup var missing from input → pre-flight rejects that step
- Add CL-7 test: mixed-resolution chain (step 0 has no resolution, step 1
  requires Phase 1 lookup) → both phases called, both steps succeed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add JSDoc comments to ScalarInject, MapArrayInject, and InputPassthroughInject
interfaces explaining when each is used, with YAML examples showing the
required fields. Expand the inject source variants table in operation-cards.md
to cover all three source types (including the previously undocumented `input`
passthrough) and add a tip recommending `source: input` when the caller already
has the node ID.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (4)
docs/capabilities/README.md (1)

53-54: Consider whether blocked_by underscore aligns with the dotted-naming convention.

issue.relations.blocked_by.add and issue.relations.blocked_by.remove use an underscore within a capability-ID segment. The established naming guideline prefers dotted capability names, and the blocked_by compound can read as a domain-separator underscore to new contributors. An alternative like issue.relations.blocking.add / issue.relations.blocking.remove (using the active form) would be fully dot-separated and symmetric with issue.relations.parent.*.

If blocked_by is intentional (e.g., to mirror GitHub's own terminology), a brief inline comment in a naming-conventions doc would help prevent future drift.

Based on learnings: "Adopt dotted capability names (domain.resource.action) consistently in both docs and code."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/capabilities/README.md` around lines 53 - 54, The capability IDs
`issue.relations.blocked_by.add` and `issue.relations.blocked_by.remove` use an
underscore inside a segment which breaks the dotted naming convention; update
these to a fully dot-separated form such as `issue.relations.blocking.add` and
`issue.relations.blocking.remove` (and update any references/tests) OR, if
`blocked_by` is intentionally mirroring existing external terminology, add a
brief inline note to the naming-conventions documentation explaining why
`blocked_by` is allowed to prevent future drift; ensure you update the README
entries for `issue.relations.blocked_by.add`/`remove` (or replace them with
`issue.relations.blocking.add`/`remove`) and any code/tests that reference those
capability IDs.
packages/core/src/core/routing/engine.ts (1)

360-381: Phase 1 and Phase 2 batch failures unconditionally set retryable: true, even for auth/validation errors.

Lines 371 and 449: when the entire batch query/mutation fails, the error is mapped via mapErrorToCode (which correctly identifies AUTH, VALIDATION, etc.), but retryable is hardcoded to true. Auth or validation failures should not be retryable. Consider deriving retryability from the error code.

♻️ Derive retryability from error code
+const RETRYABLE_CODES = new Set([errorCodes.RateLimit, errorCodes.Server, errorCodes.Network, errorCodes.Unknown])
+
 // In Phase 1 catch:
+  const code = mapErrorToCode(err)
   return {
     status: "failed",
     results: requests.map((req) => ({
       task: req.task,
       ok: false,
       error: {
-        code: mapErrorToCode(err),
+        code,
         message: `Phase 1 (resolution) failed: ${errorMsg}`,
-        retryable: true,
+        retryable: RETRYABLE_CODES.has(code),
       },
     })),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/routing/engine.ts` around lines 360 - 381, The
batch-failure paths in engine.ts currently hardcode retryable: true; change both
Phase 1 (catch block around mapErrorToCode) and the Phase 2 batch-failure block
to derive retryability from the mapped error code instead of hardcoding it: call
mapErrorToCode(err) once into a const (e.g., const code = mapErrorToCode(err)),
compute retryable based on that code (e.g., treat AUTH and VALIDATION codes as
non-retryable, or use a small helper isRetryableErrorCode(code)), and set the
error.retryable field to that computed boolean for each result; update both the
catch block that returns status "failed" and the corresponding Phase 2 failure
response to use the same logic.
packages/core/test/unit/engine.test.ts (1)

512-627: Thorough mixed-resolution chain test covering both Phase 1 and Phase 2.

This test exercises the full two-phase flow: step 0 has no resolution (pure mutation), step 1 requires a Phase 1 lookup with map_array injection. The queryMock is sequenced to return lookup data first, then mutation data. Both buildBatchQueryMock and buildBatchMutationMock are verified as called.

One suggestion: consider adding an assertion that queryMock was called exactly twice (once for Phase 1, once for Phase 2), to confirm the expected two-round-trip contract.

🧪 Optional: verify round-trip count
     expect(buildBatchQueryMock).toHaveBeenCalled()
     expect(buildBatchMutationMock).toHaveBeenCalled()
+    expect(queryMock).toHaveBeenCalledTimes(2)
     expect(result.status).toBe("success")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/engine.test.ts` around lines 512 - 627, Add an
assertion verifying the GraphQL client was invoked twice to assert the two-phase
round-trip: after calling executeTasks in the "executeTasks — mixed resolution
chain" test, add expect(queryMock).toHaveBeenCalledTimes(2); locate the
queryMock variable in this test (the mocked github client's query) and place the
assertion alongside the existing expectations that check
buildBatchQueryMock/buildBatchMutationMock and result contents.
packages/core/src/core/registry/types.ts (1)

11-85: Well-structured injection types with clear documentation.

The three inject variants (ScalarInject, MapArrayInject, InputPassthroughInject) and their union InjectSpec are cleanly modelled with helpful JSDoc examples. The ResolutionConfig type correctly ties lookup specs to inject specs.

One minor observation: the docs (operation-cards.md) reference a named LookupSpec type, but here the lookup shape is inlined inside ResolutionConfig. Consider extracting it as a named interface for consistency with the documentation and for reuse in validation or test code.

♻️ Optional: extract LookupSpec as a named interface
+export interface LookupSpec {
+  operationName: string
+  documentPath: string
+  vars: Record<string, string>
+}
+
 export interface ResolutionConfig {
-  lookup: {
-    operationName: string
-    documentPath: string
-    vars: Record<string, string>
-  }
+  lookup: LookupSpec
   inject: InjectSpec[]
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/types.ts` around lines 11 - 85,
ResolutionConfig currently inlines the lookup shape; extract that inline object
into a named interface (e.g., LookupSpec) and reference it from ResolutionConfig
to match docs and enable reuse. Define interface LookupSpec { operationName:
string; documentPath: string; vars: Record<string,string> } and replace the
inline lookup type in ResolutionConfig with lookup: LookupSpec; update any
references to ResolutionConfig if needed (symbols: ResolutionConfig, LookupSpec,
operationName, documentPath, vars).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/architecture/operation-cards.md`:
- Around line 38-41: Docs and code disagree on naming: the docs and diagram use
ResolutionBlock and LookupSpec while the TypeScript type is named
ResolutionConfig with an inline lookup; update either the docs or the code so
names match. Fix options: rename the TypeScript type ResolutionConfig to
ResolutionBlock (and export it) and extract the inline lookup into a LookupSpec
type, keeping the inject: InjectSpec[] property, or conversely rename the
docs/diagram to ResolutionConfig/inline lookup; ensure the symbols
ResolutionBlock (or ResolutionConfig), LookupSpec, lookup and InjectSpec in the
registry types (e.g., in the type declarations currently in
packages/core/src/core/registry/types.ts) are consistent and exported so the
documentation and code align.

In `@docs/capabilities/README.md`:
- Line 43: Search and replace all occurrences of the old capability IDs
`issue.labels.update` → `issue.labels.set` and `issue.assignees.update` →
`issue.assignees.set` across the listed documentation files
(packages/core/README.md, docs/capabilities/issues.md,
docs/architecture/operation-cards.md, docs/guides/library-api.md,
docs/guides/result-envelope.md, docs/guides/cli-usage.md,
docs/guides/chaining-capabilities.md, docs/getting-started/README.md,
docs/getting-started/first-task.md, docs/benchmark/workflow-roadmap.md),
ensuring table entries, examples, code snippets, and any capability registry
references use the new IDs exactly (`issue.labels.set`, `issue.assignees.set`)
and run a quick grep to confirm no remaining references to the old IDs.

In `@docs/plans/2026-02-20-atomic-chaining-followup.md`:
- Around line 39-41: The benchmark references outdated capability names: replace
`issue.labels.set` and `issue.assignees.set` with the updated capability names
`issue.labels.update` and `issue.assignees.update` in the
`issue-triage-atomic-wf-001` scenario so the chain uses `issue.labels.update`,
`issue.assignees.update` and `issue.milestone.set` (if milestone remains `set`)
to avoid capability-not-found errors.
- Around line 104-110: The table lists "CLI-only mutations that cannot be
chained" but includes `issue.comments.create` which is already batchable; remove
the `issue.comments.create` row from the table (the "| `issue.comments.create`
(already batchable) | ✓ done |" line) so the candidate list only contains actual
non-batchable mutations, and update any surrounding text or summary to reflect
that comment creation is already addressed.
- Around line 52-68: The GraphQL batch path must stop throwing on HTTP-200
partial errors and instead return raw {data, errors} so we can mark per-step
outcomes: change GithubClient.query (or add a lower-level githubRawQuery) to
return the response object rather than throwing on errors[], update chain.ts
executeGraphqlRequest to consume that raw response and map each error.path (e.g.
["step1","createIssue"]) back to the aliased step id to mark that step ok: false
while copying successful step data into rawMutResult, and update engine.ts Phase
2 logic that sets ChainStatus so that if at least one step succeeded you set
ChainStatus = "partial" (only set "failed" when all steps failed) and preserve
successful step data instead of discarding it.

In `@packages/core/src/cli/commands/chain.ts`:
- Around line 1-3: The imports in this file use relative paths across multiple
directory levels; change the cross-directory imports to use the project path
alias so they map to packages/core/src/*: replace the relative imports for
executeTasks (currently from "../../core/routing/engine.js") and
createGithubClient (currently from "../../gql/github-client.js") with the
corresponding alias imports (e.g., "@core/routing/engine.js" and
"@core/gql/github-client.js") while leaving local imports like readStdin
unchanged; ensure the import specifiers match existing alias configuration and
retain the same exported symbols (executeTasks, createGithubClient).

In `@packages/core/src/core/routing/engine.ts`:
- Around line 457-472: The current assembly in the requests.map can mark a step
ok: true when rawMutResult[mutInput.alias] is undefined; update the logic in the
results construction (the requests.map block that references stepPreResults,
mutationInputs, mutInput.alias and rawMutResult) to detect a missing alias key
(use `!(mutInput.alias in rawMutResult)` or
Object.prototype.hasOwnProperty.call(rawMutResult, mutInput.alias)) and treat
that as a failure — return { task: req.task, ok: false, error: { code:
errorCodes.Unknown, message: "missing mutation result for alias <alias>",
retryable: false } } — otherwise set data = rawMutResult[mutInput.alias] (which
may be null) and return ok: true.

---

Nitpick comments:
In `@docs/capabilities/README.md`:
- Around line 53-54: The capability IDs `issue.relations.blocked_by.add` and
`issue.relations.blocked_by.remove` use an underscore inside a segment which
breaks the dotted naming convention; update these to a fully dot-separated form
such as `issue.relations.blocking.add` and `issue.relations.blocking.remove`
(and update any references/tests) OR, if `blocked_by` is intentionally mirroring
existing external terminology, add a brief inline note to the naming-conventions
documentation explaining why `blocked_by` is allowed to prevent future drift;
ensure you update the README entries for
`issue.relations.blocked_by.add`/`remove` (or replace them with
`issue.relations.blocking.add`/`remove`) and any code/tests that reference those
capability IDs.

In `@packages/core/src/core/registry/types.ts`:
- Around line 11-85: ResolutionConfig currently inlines the lookup shape;
extract that inline object into a named interface (e.g., LookupSpec) and
reference it from ResolutionConfig to match docs and enable reuse. Define
interface LookupSpec { operationName: string; documentPath: string; vars:
Record<string,string> } and replace the inline lookup type in ResolutionConfig
with lookup: LookupSpec; update any references to ResolutionConfig if needed
(symbols: ResolutionConfig, LookupSpec, operationName, documentPath, vars).

In `@packages/core/src/core/routing/engine.ts`:
- Around line 360-381: The batch-failure paths in engine.ts currently hardcode
retryable: true; change both Phase 1 (catch block around mapErrorToCode) and the
Phase 2 batch-failure block to derive retryability from the mapped error code
instead of hardcoding it: call mapErrorToCode(err) once into a const (e.g.,
const code = mapErrorToCode(err)), compute retryable based on that code (e.g.,
treat AUTH and VALIDATION codes as non-retryable, or use a small helper
isRetryableErrorCode(code)), and set the error.retryable field to that computed
boolean for each result; update both the catch block that returns status
"failed" and the corresponding Phase 2 failure response to use the same logic.

In `@packages/core/test/unit/engine.test.ts`:
- Around line 512-627: Add an assertion verifying the GraphQL client was invoked
twice to assert the two-phase round-trip: after calling executeTasks in the
"executeTasks — mixed resolution chain" test, add
expect(queryMock).toHaveBeenCalledTimes(2); locate the queryMock variable in
this test (the mocked github client's query) and place the assertion alongside
the existing expectations that check buildBatchQueryMock/buildBatchMutationMock
and result contents.

aryeko and others added 4 commits February 20, 2026 03:19
… plan table

- Replace ResolutionBlock with ResolutionConfig in operation-cards.md diagram to match actual TypeScript type name
- Update stale capability IDs across docs (except docs/plans/): issue.labels.update → issue.labels.set and issue.assignees.update → issue.assignees.set
- Remove issue.comments.create row from 2026-02-20-atomic-chaining-followup.md (already batchable, shouldn't be in non-batchable candidates table)

Files modified:
- docs/architecture/operation-cards.md: ResolutionBlock → ResolutionConfig in diagram
- docs/benchmark/workflow-roadmap.md: capability ID updates
- docs/capabilities/issues.md: capability ID updates
- docs/getting-started/README.md: capability ID updates
- docs/getting-started/first-task.md: capability ID updates
- docs/guides/chaining-capabilities.md: capability ID updates
- docs/guides/cli-usage.md: capability ID updates
- docs/guides/library-api.md: capability ID updates
- docs/guides/result-envelope.md: capability ID updates
- docs/plans/2026-02-20-atomic-chaining-followup.md: removed issue.comments.create row

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… retryable from error code

Task 1: Add queryMock.toHaveBeenCalledTimes(2) assertion in mixed-resolution chain test
Task 7: Use 'in' operator to safely detect missing mutation result alias keys
Task 8: Add isRetryableCode() helper and derive retryable from error code in Phase 1 and Phase 2 catch blocks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous `[^}]*` regex for fragment extraction truncated bodies at
the first `}`, breaking nested selections like `labels { nodes { id } }`.
Switch to the same brace-depth counting approach already used for the
operation body extraction above it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/plans/2026-02-20-atomic-chaining-followup.md`:
- Line 131: The current plan incorrectly suggests using a WeakMap for the
composite string/primitive cache key (operationName, variables), which will
throw a TypeError because WeakMap requires object keys; replace the
WeakMap-based design with a plain Map keyed by a string (e.g., serialize the
composite key) and store values as an object with { value: T, expiresAt: number
} (or a CacheEntry type), and implement explicit TTL eviction similar to the
existing CLI environment cache pattern so string-keyed lookups and time-based
eviction work correctly.

In `@docs/plans/2026-02-20-pr-review-fixes-round2.md`:
- Around line 350-362: The guard uses the `in` operator on `rawMutResult` which
will throw if `rawMutResult` is null/undefined; update the check in the mutation
result handling to first narrow `rawMutResult` (e.g., `rawMutResult == null` or
`typeof rawMutResult !== 'object'`) before using `in` or switch to a safe
property check such as `Object.prototype.hasOwnProperty.call(rawMutResult,
mutInput.alias)`; ensure the function that returns `{ task: req.task, ok: false,
error: { code: errorCodes.Unknown, ... } }` is still returned when the payload
is nullish or missing so callers of this path get a clean `ok: false` result
instead of a TypeError.
- Around line 166-174: The sed invocation using sed -i '' is macOS-only and will
fail on GNU sed; update the find/replace command so it's cross-platform: either
use a conditional that sets a SED_INPLACE variable based on uname (e.g. on
Darwin use -i '' and on Linux use -i), or use sed -i.bak (portable) and then
delete the .bak files afterward; locate the find ... -exec sed -i '' ... {} +
line in the plan and replace the sed invocation accordingly (or switch to perl
-pi -e for in-place edits) so the mass-replace works on both macOS and Linux CI.

In `@packages/core/src/cli/commands/chain.ts`:
- Line 5: The constant GITHUB_GRAPHQL_ENDPOINT is hardcoded in the chain command
and causes the custom fetch transport to ignore GITHUB_GRAPHQL_URL and GH_HOST;
remove the hardcoded GITHUB_GRAPHQL_ENDPOINT and instead obtain the GraphQL
endpoint via the existing transport/endpoint resolution used elsewhere (i.e.,
call or import the shared resolver or transport factory that considers
GITHUB_GRAPHQL_URL and GH_HOST), then pass that resolved endpoint into the
custom fetch logic in chain.ts (also update the same pattern in the block around
lines 73-88) so the CLI respects enterprise environment variables.

In `@packages/core/src/core/routing/engine.ts`:
- Around line 245-254: ChainResultEnvelope.meta.route_used is currently
hardcoded to the literal "graphql", but executeTask can route via CLI, REST, or
GraphQL; update the type and value to reflect the actual route: broaden
ChainResultEnvelope.meta.route_used from the literal to the RouteSource union
(or a string union that includes "cli"|"rest"|"graphql"), then change the return
in the 1-item path to forward the actual route value returned/used by
executeTask (e.g., use the route variable or result.route) instead of the
hardcoded "graphql"; if the literal was intentional, add a clear comment on
ChainResultEnvelope/engine.ts explaining why metadata may be inaccurate and
leave the type as-is.

---

Duplicate comments:
In `@docs/plans/2026-02-20-atomic-chaining-followup.md`:
- Around line 39-41: The benchmark scenario `issue-triage-atomic-wf-001`
references stale capability names `issue.labels.set` and `issue.assignees.set`;
update those references to the shipped names `issue.labels.update` and
`issue.assignees.update` (leave `issue.milestone.set` as-is if it is correct) so
the scenario matches the actual capability names used in code/tests.
- Around line 52-68: The chain/engine currently treats any GraphQL `errors[]` as
a hard failure, losing partial `data`; update the flow so GithubClient.query (or
add a new low-level method used by chain.ts/engine.ts) returns the raw `{ data,
errors }` instead of throwing, then in chain.ts (executeGraphqlRequest / chain
command) map each error.path back to the aliased step id (the step alias like
"step1" or "createIssue") and mark only those steps `ok: false` while preserving
successful step `data`; set ChainStatus to "partial" whenever at least one step
succeeded and at least one failed, and ensure the exported executeTasks surface
now conveys partial results rather than returning a full "failed" on any errors.

Comment on lines 245 to 254
return {
status: result.ok ? "success" : "failed",
results: [step],
meta: {
route_used: "graphql",
total: 1,
succeeded: result.ok ? 1 : 0,
failed: result.ok ? 0 : 1,
},
}
Copy link

@coderabbitai coderabbitai bot Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for ChainResultEnvelope interface definition
rg -n "interface ChainResultEnvelope|type ChainResultEnvelope" --type=ts -A 10

Repository: aryeko/ghx

Length of output: 724


🏁 Script executed:

#!/bin/bash
# Check the context around lines 245-254 in engine.ts
cat -n packages/core/src/core/routing/engine.ts | sed -n '235,260p'

Repository: aryeko/ghx

Length of output: 975


🏁 Script executed:

#!/bin/bash
# Find executeTask function to understand how it routes
rg -n "function executeTask|const executeTask|export.*executeTask" --type=ts -A 3 packages/core/src/core/routing/

Repository: aryeko/ghx

Length of output: 689


🏁 Script executed:

#!/bin/bash
# Check the full 1-item delegation logic to understand the routing context
rg -n "items.length.*1|if.*items.*length" --type=ts -B 5 -A 15 packages/core/src/core/routing/engine.ts

Repository: aryeko/ghx

Length of output: 36


🏁 Script executed:

#!/bin/bash
# Find ResultEnvelope type definition
rg -n "interface ResultEnvelope|type ResultEnvelope" --type=ts -A 10 packages/core/src/

Repository: aryeko/ghx

Length of output: 815


🏁 Script executed:

#!/bin/bash
# Get the full executeTask implementation to see what it returns
sed -n '112,220p' packages/core/src/core/routing/engine.ts | head -80

Repository: aryeko/ghx

Length of output: 2488


🏁 Script executed:

#!/bin/bash
# Check what result object contains in the 1-item path context
sed -n '220,255p' packages/core/src/core/routing/engine.ts

Repository: aryeko/ghx

Length of output: 1076


🏁 Script executed:

#!/bin/bash
# Find ResultMeta type definition to see if it has route_used
rg -n "type ResultMeta|interface ResultMeta" --type=ts -A 5 packages/core/src/

Repository: aryeko/ghx

Length of output: 493


🏁 Script executed:

#!/bin/bash
# Look at the full executeTask implementation to see what it returns
# Focus on the execute() call and routes
sed -n '112,180p' packages/core/src/core/routing/engine.ts

Repository: aryeko/ghx

Length of output: 2233


🏁 Script executed:

#!/bin/bash
# Check if there's documentation or comments about 1-item chain routing intent
rg -n "1-item|single.*item|delegate" --type=ts -B 3 -A 3 packages/core/src/core/routing/engine.ts | head -40

Repository: aryeko/ghx

Length of output: 343


🏁 Script executed:

#!/bin/bash
# Find RouteSource type definition
rg -n "type RouteSource|RouteSource =" --type=ts -A 3 packages/core/src/

Repository: aryeko/ghx

Length of output: 344


🏁 Script executed:

#!/bin/bash
# Check context around ChainResultEnvelope definition for any comments about routing
sed -n '45,70p' packages/core/src/core/contracts/envelope.ts

Repository: aryeko/ghx

Length of output: 413


🏁 Script executed:

#!/bin/bash
# Search for any comments or documentation about 1-item chain routing in engine.ts
sed -n '210,230p' packages/core/src/core/routing/engine.ts

Repository: aryeko/ghx

Length of output: 650


Type constraint prevents the proposed fix: ChainResultEnvelope.meta.route_used is a literal "graphql", not a union.

The 1-item path does delegate to executeTask, which can route via CLI, REST, or GraphQL. However, ChainResultEnvelope.meta.route_used is explicitly typed as the literal "graphql", making the proposed fix incompatible with the type signature.

Either the type should be broadened to RouteSource to reflect actual routing:

export interface ChainResultEnvelope {
  status: ChainStatus
  results: ChainStepResult[]
  meta: {
-   route_used: "graphql"
+   route_used: RouteSource
    total: number
    succeeded: number
    failed: number
  }
}

Then forward the actual route in the 1-item path:

return {
  status: result.ok ? "success" : "failed",
  results: [step],
  meta: {
-   route_used: "graphql",
+   route_used: result.meta.route_used ?? "graphql",
    total: 1,
    succeeded: result.ok ? 1 : 0,
    failed: result.ok ? 0 : 1,
  },
}

Or the 1-item path should enforce GraphQL routing. If the literal type is intentional, document why the metadata may be inaccurate for this path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/routing/engine.ts` around lines 245 - 254,
ChainResultEnvelope.meta.route_used is currently hardcoded to the literal
"graphql", but executeTask can route via CLI, REST, or GraphQL; update the type
and value to reflect the actual route: broaden
ChainResultEnvelope.meta.route_used from the literal to the RouteSource union
(or a string union that includes "cli"|"rest"|"graphql"), then change the return
in the 1-item path to forward the actual route value returned/used by
executeTask (e.g., use the route variable or result.route) instead of the
hardcoded "graphql"; if the literal was intentional, add a clear comment on
ChainResultEnvelope/engine.ts explaining why metadata may be inaccurate and
leave the type as-is.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in aff0bd2. ChainResultEnvelope.meta.route_used is now typed as RouteSource (the "cli" | "rest" | "graphql" union). The 1-item delegation path forwards result.meta?.route_used ?? "graphql" so the metadata reflects the actual route used by executeTask.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rate Limit Exceeded

@aryeko have exceeded the limit for the number of chat messages per hour. Please wait 1 minutes and 30 seconds before sending another message.

…velope type; null-guard in operator

- chain.ts: replace hardcoded GITHUB_GRAPHQL_ENDPOINT with resolveGraphqlUrl()
  from transport.ts so GITHUB_GRAPHQL_URL and GH_HOST are respected
- transport.ts: export resolveGraphqlUrl() for reuse
- envelope.ts: broaden ChainResultEnvelope.meta.route_used from "graphql" literal
  to RouteSource union (chains can route via cli/rest/graphql)
- engine.ts: forward result.meta.route_used in 1-item path; add null/non-object
  guard before `in` operator on rawMutResult
- docs: fix WeakMap → Map note in followup plan; fix sed -i '' → perl -pi cross-platform

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@aryeko aryeko merged commit a38e812 into main Feb 20, 2026
5 of 6 checks passed
@aryeko aryeko deleted the feat/atomic-chaining branch February 20, 2026 01:37
@codecov
Copy link

codecov bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 79.00302% with 139 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
packages/core/src/core/routing/engine.ts 64.28% 94 Missing and 1 partial ⚠️
packages/core/src/cli/commands/chain.ts 75.42% 29 Missing ⚠️
packages/core/src/gql/resolve.ts 87.77% 11 Missing ⚠️
packages/core/src/cli/index.ts 50.00% 2 Missing ⚠️
packages/core/src/gql/batch.ts 97.33% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments