Skip to content

feat(core): capability namespace standardization#58

Merged
aryeko merged 13 commits intomainfrom
feat/benchmark-scenario-composite-update
Feb 20, 2026
Merged

feat(core): capability namespace standardization#58
aryeko merged 13 commits intomainfrom
feat/benchmark-scenario-composite-update

Conversation

@aryeko
Copy link
Owner

@aryeko aryeko commented Feb 19, 2026

Summary

  • Standardize all capability IDs to consistent naming conventions: view/list for reads, set/add/remove for mutations, dots-only plural sub-domains (threads, reviews, relations)
  • Rename 26 capabilities, add 7 new atomics, retire 5 (including all composite infrastructure)
  • Fix runId type mismatch across all 4 rerun capabilities (input and output now both integer)
  • Remove dead NON_JSON_STDOUT_CAPABILITIES export; add integration tests for all 7 new capabilities

Key changes

Renamed: pr.thread.*pr.threads.*, pr.review.*pr.reviews.*, pr.checks.rerun_failedpr.checks.rerun.failed, workflow.getworkflow.view, issue.labels.updateissue.labels.set, issue.relations.getissue.relations.view, issue.linked_prs.listissue.relations.prs.list, and more (see changeset)

New capabilities: issue.labels.remove, issue.assignees.add, issue.assignees.remove, issue.milestone.clear, pr.assignees.add, pr.assignees.remove, project_v2.items.issue.remove

Retired: pr.checks.failed, check_run.annotations.list, pr.assignees.update, pr.threads.composite, issue.triage.composite, issue.update.composite

Output schema fixes: All rerun capabilities return { runId: integer, queued: boolean } (normalized); pr.threads.reply now includes commentId/commentUrl; issue.relations.parent.set includes updated: boolean; issue.relations.blocked_by.add includes added: boolean; issue.milestone.set no longer accepts null (use issue.milestone.clear)

Test Plan

  • pnpm run ci --outputStyle=static passes (113 files, 888 tests, 95.36% coverage)
  • All renamed capabilities resolve correctly
  • All 7 new capabilities have integration tests
  • Changeset added for @ghx-dev/core minor bump

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added granular issue operations: assignee add/remove, label removal, milestone clearing.
    • Added PR assignee add/remove operations for finer control.
    • Enhanced PR thread operations with comment metadata in responses.
  • Changes

    • Standardized capability naming conventions for consistency.
    • Removed composite operations; replaced with focused atomic operations.
    • Improved issue relationship management capabilities.
    • Updated project management operations for clarity.

aryeko and others added 9 commits February 19, 2026 09:47
Documents the planned adjustments to workflow benchmark scenarios
following today's composite capabilities merge (#55):
- pr.threads.composite replaces atomic thread atomics in pr-fix-review-comments
- issue.triage.composite replaces atomic label+comment in issue-triage-comment
- ci-diagnose-run expanded to require workflow.job.logs.get

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4-task plan covering JSON edits to 3 workflow scenarios:
- pr-fix-review-comments: atomic threads → pr.threads.composite
- issue-triage-comment: atomic label+comment → issue.triage.composite
- ci-diagnose-run: expanded prompt + workflow.job.logs.get

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…f-001

Reflects #55 composite capabilities merge — pr.threads.composite
replaces separate pr.thread.reply + pr.thread.resolve atomic calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…f-001

Reflects #55 composite capabilities merge — issue.triage.composite
replaces separate issue.labels.update + issue.comments.create atomic calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…gnosis

Expand prompt to require fetching job logs and add workflow.job.logs.get
to expected_capabilities. Reflects workflow domain module extraction in #57.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…idance

- Reword pr-fix-review-comments-wf-001 prompt to focus on unresolved
  threads and combined reply+resolve action
- Restructure SKILL.md: move Discovery before Execute, add Composite
  subsection with CRITICAL enforcement referencing discovered caps

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rename all capability IDs to consistent conventions, remove composite
infrastructure, and add new atomic capabilities.

Naming principles applied:
- view = single-item read, list = collection (retire get)
- set = replace-all, add/remove = additive mutations
- Sub-domains use dots only, never underscores (rerun.failed not rerun_failed)
- Sub-domain names reflect feature area in plural (threads, reviews, checks, relations)

Renames:
- pr.thread.* → pr.threads.*, pr.review.* → pr.reviews.*
- pr.checks.rerun_all/failed → pr.checks.rerun.all/failed
- workflow.get → workflow.view, workflow.dispatch.run → workflow.dispatch
- workflow.run.rerun_all/failed → workflow.run.rerun.all/failed
- workflow.job.logs.get → workflow.job.logs.view
- project_v2.org/user.get → project_v2.org/user.view
- project_v2.item.* → project_v2.items.*
- release.get → release.view, release.create_draft → release.create, release.publish_draft → release.publish
- issue.relations.get → issue.relations.view, issue.linked_prs.list → issue.relations.prs.list
- issue.parent.* → issue.relations.parent.*, issue.blocked_by.* → issue.relations.blocked_by.*
- issue.labels/assignees.update → issue.labels/assignees.set

New capabilities:
- issue.labels.remove, issue.assignees.add/remove, issue.milestone.clear
- pr.assignees.add/remove (replaces retired pr.assignees.update)
- project_v2.items.issue.remove

Retired: pr.checks.failed (merged into pr.checks.list), check_run.annotations.list,
pr.assignees.update, pr.threads.composite, issue.triage.composite, issue.update.composite

Schema normalizations:
- All rerun ops: { runId: string, queued: boolean }
- pr.threads.reply: adds commentId + commentUrl to output
- issue.relations.parent.set: adds updated: boolean
- issue.relations.blocked_by.add: adds added: boolean
- issue.milestone split into set (integer only) + clear (separate capability)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove dead NON_JSON_STDOUT_CAPABILITIES export from helpers.ts (stale
  IDs, never imported anywhere)
- Fix runId type mismatch: all 4 rerun YAML cards now have integer output
  (was inconsistently string); handlers return integer directly without
  String() coercion; update test assertions accordingly
- Add integration tests for 7 new capabilities: issue.labels.remove,
  issue.assignees.add, issue.assignees.remove, issue.milestone.clear,
  pr.assignees.add, pr.assignees.remove, project_v2.items.issue.remove

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@changeset-bot
Copy link

changeset-bot bot commented Feb 19, 2026

🦋 Changeset detected

Latest commit: 6560a6c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@ghx-dev/core Minor
@ghx-dev/benchmark Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@aryeko
Copy link
Owner Author

aryeko commented Feb 19, 2026

@codex review

@codecov
Copy link

codecov bot commented Feb 19, 2026

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 82105da0c3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

output_schema:
type: object
required: [id, isResolved]
required: [id, isResolved, commentId, commentUrl]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep pr.threads.reply schema compatible with GraphQL response

This capability now requires commentId and commentUrl, but the GraphQL path still returns only { id, isResolved } (runReplyToReviewThread in packages/core/src/gql/domains/pr-mutations.ts, lines 234-237). Because execute validates successful route output against this schema, every pr.threads.reply request will be converted into an OUTPUT_VALIDATION error even when GitHub accepted the reply.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 9f151bf. Updated runReplyToReviewThread in pr-mutations.ts to return commentId and commentUrl (the comment object was already being fetched from the mutation result — just not included in the return value). The function now returns ReplyToReviewThreadData which extends ReviewThreadMutationData with the two new required fields.

output_schema:
type: object
required: [issueId, parentIssueId]
required: [issueId, parentIssueId, updated]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove new required flag or return it for parent-set

The output schema now requires updated, but the GraphQL implementation for this capability (runIssueParentSet in packages/core/src/gql/domains/issue-mutations.ts, lines 538-541) still returns only issueId and parentIssueId. As a result, successful mutations will fail output-schema validation at runtime and issue.relations.parent.set cannot complete successfully.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 9f151bf. Updated runIssueParentSet in issue-mutations.ts to return updated: true alongside issueId and parentIssueId. The GraphQL mutation already confirmed success before returning, so true is always accurate at that point.

output_schema:
type: object
required: [issueId, blockedByIssueId]
required: [issueId, blockedByIssueId, added]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove new required flag or return it for blocked-by add

This schema now requires an added field, but the GraphQL implementation (runIssueBlockedByAdd in packages/core/src/gql/domains/issue-mutations.ts, lines 598-601) still returns only issueId and blockedByIssueId. With output validation enabled in execute, issue.relations.blocked_by.add responses will fail validation even when the underlying mutation succeeds.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 9f151bf. Updated runIssueBlockedByAdd in issue-mutations.ts to return added: true. Mirrors the existing pattern used by runIssueBlockedByRemove which already returns removed: boolean.

Comment on lines 469 to 471
const projectId = parseNonEmptyString(params.projectId)
const itemId = parseNonEmptyString(params.itemId)
if (projectId === null || itemId === null) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Accept documented project_v2 remove inputs in CLI handler

The new project_v2.items.issue.remove card validates inputs as projectNumber + itemId, but this handler reads params.projectId and errors when it is absent. That means valid, schema-compliant requests will always fail in the CLI route, and callers cannot supply projectId instead because additionalProperties: false rejects it during input validation.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 9f151bf. Root cause: the YAML card was the wrong side of the mismatch. The gh project item-delete CLI command takes --project-id <nodeId> (a string node ID), not a numeric project number. Updated the YAML input_schema to use projectId: { type: string } matching what the handler already reads from params. Integration tests updated accordingly.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: My previous reply was wrong. The fix in 9f151bf introduced --project-id <nodeId> which is not a valid flag for gh project item-delete — that flag does not exist in the GitHub CLI.

The actual correct syntax is:

gh project item-delete <projectNumber> --owner <login> --id <itemId>

This was caught by CodeRabbit and properly fixed in ba90db1:

  • YAML input_schema updated to { owner: string, projectNumber: integer, itemId: string }
  • Handler rewritten to use ["project", "item-delete", String(projectNumber), "--owner", owner, "--id", itemId]
  • Integration and unit tests updated to use the new param shape

…90% patch coverage

- cli-domains-issue.test.ts: add describe blocks for handleIssueLabelsRemove,
  handleIssueAssigneesAdd, handleIssueAssigneesRemove, handleIssueMilestoneClear
  (success, invalid params, empty arrays, non-zero exit, runner throws paths)
- cli-domains-project-v2.test.ts: add describe block for project_v2.items.issue.remove
  (success, correct args, missing projectId/itemId, non-zero exit, runner throws)
- cli-domains-pr.test.ts: add runner-throws path for pr.assignees.add and pr.assignees.remove
- gql-builders.test.ts: cover replyBuilder.mapResponse() commentUrl truthy branch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link

coderabbitai bot commented Feb 19, 2026

Warning

Rate limit exceeded

@aryeko has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 56 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Comprehensive capability ID standardization refactoring that renames identifiers across all domains (PR, issue, workflow, release, project-v2), pluralizes nested entities (pr.thread → pr.threads), consolidates actions (get → view, create_draft → create), splits combined operations (pr.assignees.update → add/remove), removes composite infrastructure entirely, and updates all related handlers, registry cards, adapters, and tests.

Changes

Cohort / File(s) Summary
Capability ID Standardization – Registry Cards
packages/core/src/core/registry/cards/issue.*.yaml, pr.*.yaml, workflow.*.yaml, release.*.yaml, project_v2.*.yaml
Renamed capability_id fields across 40+ YAML registry cards to reflect new naming conventions (e.g., pr.thread.* → pr.threads., pr.review. → pr.reviews.*, *.get → .view, release.create_draft → release.create, workflow.dispatch.run → workflow.dispatch). Updated input/output schemas for affected cards (e.g., pr.checks.rerun. now use runId/queued instead of prNumber/mode).
Composite Infrastructure Removal
packages/core/src/core/execute/composite.ts, src/core/registry/types.ts, src/core/registry/operation-card-schema.ts, src/core/routing/engine.ts
Deleted all composite capability support: removed expandCompositeSteps function, eliminated ExpandedOperation type, removed CompositeStep/CompositeConfig interfaces, deleted composite property from OperationCard and schema, and removed GraphQL composite execution pathway from routing engine.
Composite Registry Cards Removal
packages/core/src/core/registry/cards/issue.triage.composite.yaml, issue.update.composite.yaml, pr.threads.composite.yaml, check_run.annotations.list.yaml
Deleted four composite capability registry definitions and the check_run.annotations.list capability card entirely, removing multi-step orchestration configurations.
Issue Domain – New Handlers & Cards
packages/core/src/core/execution/adapters/cli/domains/issue.ts, packages/core/src/core/registry/cards/issue.assignees.*.yaml, issue.labels.*.yaml, issue.milestone.*.yaml
Implemented four new CLI handlers (handleIssueLabelsRemove, handleIssueAssigneesAdd, handleIssueAssigneesRemove, handleIssueMilestoneClear) with corresponding registry cards. Updated capability IDs (issue.labels.update → issue.labels.set, issue.assignees.update → issue.assignees.set/add/remove, added issue.milestone.clear).
Issue Relations – Card Renames
packages/core/src/core/registry/cards/issue.relations.*.yaml
Reorganized issue relation capability identifiers under issue.relations namespace (issue.parent.* → issue.relations.parent., issue.blocked_by. → issue.relations.blocked_by.*, issue.linked_prs.list → issue.relations.prs.list, issue.relations.get → issue.relations.view). Added updated/added flags to output schemas.
PR Domain – Handlers & Cards Update
packages/core/src/core/execution/adapters/cli/domains/pr.ts, packages/core/src/core/registry/cards/pr.*.yaml
Implemented pr.assignees.add/remove handlers, renamed pr.checks handlers (pr.checks.rerun_* → pr.checks.rerun.), updated pr.reviews. and pr.threads.* handlers/cards with new input/output shapes (commentId/commentUrl added to reply output). Removed pr.checks.failed card.
Workflow/Release Domain Updates
packages/core/src/core/execution/adapters/cli/domains/workflow.ts, release.ts, packages/core/src/core/registry/cards/workflow.*.yaml, release.*.yaml
Renamed workflow capabilities (workflow.get → workflow.view, workflow.job.logs.get → workflow.job.logs.view, workflow.dispatch.run → workflow.dispatch, workflow.run.rerun_* → workflow.run.rerun.*) and release capabilities (release.get → release.view, release.create_draft → release.create, release.publish_draft → release.publish). Updated handler mappings and CLI error messages accordingly.
Project v2 Domain Updates
packages/core/src/core/execution/adapters/cli/domains/project-v2.ts, packages/core/src/core/registry/cards/project_v2.*.yaml
Renamed project_v2 capabilities (get → view, item.add_issue → items.issue.add, item.field.update → items.field.update). Implemented new handleProjectV2ItemsIssueRemove handler for issue removal with corresponding registry card.
CLI Capability Adapter & Registry
packages/core/src/core/execution/adapters/cli-capability-adapter.ts, packages/core/src/core/registry/index.ts, src/core/execution/adapters/cli/helpers.ts
Updated CliCapabilityId union type to reflect all renamed/new capabilities across domains. Removed NON_JSON_STDOUT_CAPABILITIES constant. Reorganized preferredOrder list in registry to reflect new capability identifiers.
GraphQL Capability Registry & Types
packages/core/src/gql/capability-registry.ts, src/gql/builders.ts, src/gql/types.ts, src/gql/domains/pr-mutations.ts, issue-mutations.ts
Updated GraphQL builder operation keys to match renamed capabilities (pr.thread.* → pr.threads.*, issue.labels.update → issue.labels.set, etc.). Extended reply builder mapResponse to include commentId/commentUrl. Added new types ReplyToReviewThreadData, updated IssueParentSetData/IssueBlockedByData with updated/added flags.
Integration Tests – Capability ID Renames
packages/core/test/integration/engine-*.integration.test.ts
Updated 50+ integration tests to use new capability IDs in describe blocks and TaskRequest payloads (pr.threads., pr.reviews., pr.checks.rerun., pr.assignees., workflow., release., project_v2., issue.). Removed tests for removed capabilities (check_run.annotations.list, pr.assignees.update, pr.checks.failed).
Integration Tests – New Capabilities
packages/core/test/integration/engine-issue-assignees-*.test.ts, engine-issue-labels-remove.test.ts, engine-issue-milestone-clear.test.ts, engine-pr-assignees-*.test.ts, engine-project-v2-items-issue-remove.test.ts
Added new integration test files for newly implemented handlers, validating input validation and adapter error handling for issue/pr/project-v2 domain additions.
Unit Tests – Composite Removal
packages/core/test/unit/composite-*.test.ts, engine.test.ts
Deleted all composite-related unit tests (composite-engine.test.ts, composite-expand.test.ts, composite-types.test.ts) and removed composite test cases from engine.test.ts (224 lines removed).
Unit Tests – CLI & GraphQL Domains
packages/core/test/unit/cli-domains-*.test.ts, gql-*.test.ts
Updated CLI domain tests to reflect new handler keys, capability IDs, and data shapes across issue/pr/workflow/release/project-v2 domains. Updated GraphQL builder/registry tests to use renamed capability IDs. Comprehensive refactoring of test expectations and mock setups.
Unit Tests – Operation Cards & Registry
packages/core/test/unit/operation-cards.test.ts, capability-registry.test.ts, cli-capability-registry.test.ts
Updated registry validation tests to reflect new capability IDs in preferredOrder, removed composite capability references, adjusted test expectations for renamed domain operations. Updated ALL_CLI_CAPABILITY_IDS constant to include all renamed/new capabilities.
Documentation Updates
docs/capabilities/README.md, docs/capabilities/pull-requests.md, .changeset/capability-namespace-standardization.md, packages/core/README.md, packages/core/skills/using-ghx/SKILL.md
Updated capability documentation to reflect renamed IDs, updated capability counts (66 → 69), reorganized PR/issue/workflow domain documentation sections with new naming conventions. Updated ghx CLI guidance to adopt domain-first discovery approach.
Benchmark Scenarios
docs/plans/2026-02-19-benchmark-scenario-*.md, packages/benchmark/scenarios/workflows/*.json
Updated benchmark workflow scenario definitions to use new composite capability IDs (pr.threads.composite, issue.triage.composite) and add new workflow.job.logs.view capability. Updated expected_capabilities arrays and prompts in JSON scenario files to reflect standardized naming.
Configuration Updates
packages/core/package.json
Added new npm script "ghx" that runs "tsx src/cli/index.ts".

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Possibly related PRs

  • aryeko/ghx#55: Directly opposes this PR by adding composite capability support that this PR removes entirely; modifies the same composite.ts, registry types, operation-card-schema, and engine routing files.
  • aryeko/ghx#53: Related through overlapping GraphQL capability registry and handler mapping refactors; both modify gql/capability-registry.ts, builders.ts, and capability dispatch logic.
  • aryeko/ghx#57: Related through CLI capability adapter surface updates and domain handler mappings; both modify adapters/cli/domains/* and CliCapabilityId registry entries.

Suggested labels

refactor, breaking-change, codex


🐰 The Rabbit's Tale

With threads now plural and composites gone,
The ghx capabilities march ever on.
From get to view, from draft to create—
Standardization's shaped this grand restate! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately summarizes the main change: standardizing capability namespace conventions across the codebase.
Description check ✅ Passed The PR description comprehensively covers the scope, key changes, and validation with detailed sections on renamed, new, and retired capabilities, plus test plan confirmation.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/benchmark-scenario-composite-update

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Four P1 issues identified by Codex review:

1. pr.threads.reply: runReplyToReviewThread now returns commentId/commentUrl
   alongside id/isResolved, matching the output_schema. Adds
   ReplyToReviewThreadData type extending ReviewThreadMutationData.

2. issue.relations.parent.set: runIssueParentSet now returns updated: true,
   satisfying the required output field. IssueParentSetData type updated.

3. issue.relations.blocked_by.add: runIssueBlockedByAdd now returns added: true,
   satisfying the required output field. IssueBlockedByData type updated.

4. project_v2.items.issue.remove: YAML input schema corrected to require
   projectId (node ID string) instead of projectNumber, matching the CLI
   handler which calls gh project item-delete --project-id. Integration
   tests updated accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
coderabbitai[bot]
coderabbitai bot previously requested changes Feb 19, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (12)
packages/core/test/integration/engine-project-v2-user-get.integration.test.ts (1)

1-29: ⚠️ Potential issue | 🟡 Minor

File name should be updated to match the renamed capability.

The file is still named engine-project-v2-user-get.integration.test.ts, but the capability was renamed from project_v2.user.getproject_v2.user.view. The AI summary notes that the analogous engine-project-v2-org-get.integration.test.ts was already updated as part of this PR. This file should be renamed to engine-project-v2-user-view.integration.test.ts for consistency.

📝 Rename the file
git mv packages/core/test/integration/engine-project-v2-user-get.integration.test.ts \
        packages/core/test/integration/engine-project-v2-user-view.integration.test.ts
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-project-v2-user-get.integration.test.ts`
around lines 1 - 29, Rename the test file named
engine-project-v2-user-get.integration.test.ts to
engine-project-v2-user-view.integration.test.ts so the filename matches the
updated capability; update any references/imports if present, and ensure the
test suite string "executeTask project_v2.user.view" (inside the describe)
remains unchanged.
packages/core/test/integration/engine-project-v2-org-get.integration.test.ts (1)

1-29: ⚠️ Potential issue | 🟡 Minor

Rename file to match the updated capability (getview).

The file is still named engine-project-v2-org-get.integration.test.ts while its content — the describe label (Line 6) and task value (Line 15) — both reference the renamed capability project_v2.org.view. All sibling integration test files follow the convention engine-{capability-name-hyphenated}.integration.test.ts (e.g., engine-issue-assignees-add.integration.test.ts), so this file should be renamed to engine-project-v2-org-view.integration.test.ts for consistency. As per coding guidelines, kebab-case file names should accurately reflect their content.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/integration/engine-project-v2-org-get.integration.test.ts`
around lines 1 - 29, Rename the test file from
engine-project-v2-org-get.integration.test.ts to
engine-project-v2-org-view.integration.test.ts so the filename matches the
capability referenced in the test (describe("executeTask project_v2.org.view")
and the TaskRequest task value "project_v2.org.view"); ensure the new kebab-case
filename is used in the repo and any test runner config so the test remains
discoverable.
packages/core/test/unit/operation-cards.test.ts (1)

102-126: ⚠️ Potential issue | 🟡 Minor

Add project_v2.items.issue.remove to the CLI-preferred assertion block.

The test at lines 102–126 validates that all project_v2.* cards are CLI-preferred with no fallbacks, but project_v2.items.issue.remove (listed at line 70 in the registry test) is missing from this assertion block. The card's routing configuration confirms it is CLI-preferred with empty fallbacks, matching the pattern of other project_v2 cards being tested.

Proposed fix
     const projectItemAddIssue = getOperationCard("project_v2.items.issue.add")
+    const projectItemRemoveIssue = getOperationCard("project_v2.items.issue.remove")
     const projectItemFieldUpdate = getOperationCard("project_v2.items.field.update")
     const issueTypes = getOperationCard("repo.issue_types.list")
 
     expect(projectOrg?.routing.preferred).toBe("cli")
     expect(projectUser?.routing.preferred).toBe("cli")
     expect(projectFields?.routing.preferred).toBe("cli")
     expect(projectItems?.routing.preferred).toBe("cli")
     expect(projectItemAddIssue?.routing.preferred).toBe("cli")
+    expect(projectItemRemoveIssue?.routing.preferred).toBe("cli")
     expect(projectItemFieldUpdate?.routing.preferred).toBe("cli")
     expect(issueTypes?.routing.preferred).toBe("cli")
 
     expect(projectOrg?.routing.fallbacks).toEqual([])
     expect(projectUser?.routing.fallbacks).toEqual([])
     expect(projectFields?.routing.fallbacks).toEqual([])
     expect(projectItems?.routing.fallbacks).toEqual([])
     expect(projectItemAddIssue?.routing.fallbacks).toEqual([])
+    expect(projectItemRemoveIssue?.routing.fallbacks).toEqual([])
     expect(projectItemFieldUpdate?.routing.fallbacks).toEqual([])
     expect(issueTypes?.routing.fallbacks).toEqual([])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/operation-cards.test.ts` around lines 102 - 126, The
test "marks Projects v2 and repo issue types as CLI-preferred with no fallbacks"
is missing an assertion for project_v2.items.issue.remove; update the block that
uses getOperationCard to include const projectItemRemoveIssue =
getOperationCard("project_v2.items.issue.remove") and add assertions
expect(projectItemRemoveIssue?.routing.preferred).toBe("cli") and
expect(projectItemRemoveIssue?.routing.fallbacks).toEqual([]) so the remove
action is validated alongside project_v2.items.issue.add and others.
packages/core/test/unit/github-client.test.ts (1)

1078-1131: ⚠️ Potential issue | 🟠 Major

Update replyToReviewThread implementation and test to include commentId and commentUrl output fields.

The capability card pr.threads.reply requires output fields [id, isResolved, commentId, commentUrl], but the implementation in packages/core/src/gql/domains/pr-mutations.ts only returns { id, isResolved }.

The test mock at line 1082–1084 returns { comment: { id: "comment-1" } } without a url field, and the assertion at lines 1120–1123 expects only { id: "thread-1", isResolved: true }, missing commentId and commentUrl.

The replyBuilder in builders.ts shows the correct shape: extract commentId from comment.id and commentUrl from comment.url (with empty string fallback). Update:

  1. runReplyToReviewThread to extract and return these fields from the comment object already available at line 221.
  2. Extend ReviewThreadMutationData type to include commentId: string and commentUrl: string.
  3. Update the test mock to include url: "http://..." on the comment and update the assertion to expect all four fields.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/github-client.test.ts` around lines 1078 - 1131, The
reply flow currently omits commentId/commentUrl: update the
runReplyToReviewThread function to extract comment.id and comment.url (falling
back to empty string for url) from the GraphQL response and include them in the
returned payload, update the ReviewThreadMutationData type to add commentId:
string and commentUrl: string, and adjust the unit test mock in
github-client.test.ts so the mocked addPullRequestReviewThreadReply returns
comment: { id: "comment-1", url: "http://..." } and the expectation asserts {
id, isResolved, commentId: "comment-1", commentUrl: "http://..." } to match the
replyBuilder shape.
packages/core/src/core/registry/cards/issue.milestone.set.yaml (1)

3-3: ⚠️ Potential issue | 🟡 Minor

Stale description contradicts the schema change — "or clear with null" is no longer valid.

The description says "Set issue milestone number or clear with null." but the input schema now rejects null (type changed to integer, minimum 1). Consumers reading the card description will incorrectly believe null is still accepted. The PR deliberately split clearing into issue.milestone.clear.

✏️ Proposed fix
-description: Set issue milestone number or clear with null.
+description: Set issue milestone by milestone number. Use issue.milestone.clear to remove a milestone.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/issue.milestone.set.yaml` at line 3,
Update the card description for issue.milestone.set to remove the phrase "or
clear with null" since the input schema now requires an integer >=1; make the
description state only that it sets the issue milestone number (and, if helpful,
reference that clearing is handled by issue.milestone.clear) so consumers are
not misled about accepting null.
packages/core/test/integration/engine-issue-relations-get.integration.test.ts (1)

1-33: ⚠️ Potential issue | 🟡 Minor

File name should be updated to match the renamed capability

The file is still named engine-issue-relations-get.integration.test.ts while the capability was renamed to issue.relations.view. The test body (lines 6 and 15) correctly uses the new name. The file name should be updated to engine-issue-relations-view.integration.test.ts for consistency and discoverability. As per coding guidelines, kebab-case file names should reflect the capability they test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-issue-relations-get.integration.test.ts`
around lines 1 - 33, The test file name is inconsistent with the renamed
capability: rename the test file from
engine-issue-relations-get.integration.test.ts to
engine-issue-relations-view.integration.test.ts so it matches the capability
used in the test (see describe("executeTask issue.relations.view") and the
TaskRequest.task "issue.relations.view"); update any test runner references if
present and ensure the new kebab-case filename is used in the repository so
discoverability and naming conventions are consistent.
packages/core/README.md (1)

155-172: ⚠️ Potential issue | 🟠 Major

README lists stale/deprecated capability names that this PR renames.

The PR read and execute sections (lines 161–163) were properly updated, but several other sections still reference old names that this PR explicitly renames or deprecates:

Issues (line 159): issue.labels.updateissue.labels.set, issue.assignees.update → should reflect the new issue.assignees.add/issue.assignees.remove/issue.assignees.set split, issue.relations.getissue.relations.view, issue.parent.setissue.relations.parent.set, issue.parent.removeissue.relations.parent.remove, issue.blocked_by.addissue.relations.blocked_by.add, issue.blocked_by.removeissue.relations.blocked_by.remove. Also missing new capabilities: issue.labels.remove, issue.milestone.clear.

CI Diagnostics (line 165): workflow.job.logs.getworkflow.job.logs.view.

Releases (line 167): release.getrelease.view per the PR objectives.

Workflow Controls (line 169): Uses deprecated underscore forms (workflow_dispatch.run, workflow_run.view, workflow_run.rerun_failed, workflow_run.rerun_all, workflow_run.cancel, workflow_run.artifacts.list, workflow.get). These should be the dotted forms (workflow.dispatch, workflow.run.view, workflow.run.rerun.failed, workflow.run.rerun.all, workflow.run.cancel, workflow.run.artifacts.list, workflow.view).

This README is the primary public-facing capability reference for users — having it list old names defeats the purpose of the namespace standardization and will break any consumer who copies these IDs. Based on learnings, dotted capability names (domain.resource.action) should be used consistently in both docs and code, and deprecated underscore forms should not be used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/README.md` around lines 155 - 172, The README's capability
lists still contain deprecated/old names; update the capability identifiers to
the new dotted namespace forms and add missing new capabilities: replace
issue.labels.update with issue.labels.set and add issue.labels.remove; replace
issue.assignees.update with the new split (issue.assignees.add,
issue.assignees.remove, issue.assignees.set); change issue.relations.get to
issue.relations.view and rename issue.parent.set/remove to
issue.relations.parent.set and issue.relations.parent.remove; rename
issue.blocked_by.add/remove to issue.relations.blocked_by.add and
issue.relations.blocked_by.remove and add issue.milestone.clear; change
workflow.job.logs.get to workflow.job.logs.view; change release.get to
release.view; and convert all underscore workflow entries
(workflow_dispatch.run, workflow_run.view, workflow_run.rerun_failed,
workflow_run.rerun_all, workflow_run.cancel, workflow_run.artifacts.list,
workflow.get) to dotted forms (workflow.dispatch, workflow.run.view,
workflow.run.rerun.failed, workflow.run.rerun.all, workflow.run.cancel,
workflow.run.artifacts.list, workflow.view) so the README matches the PR's
namespace standardization.
packages/core/test/unit/cli-domains-release.test.ts (1)

417-417: ⚠️ Potential issue | 🟡 Minor

Stale test descriptions still reference old capability name release.get.

Two it() descriptions were not updated with the rest of the renames:

  • Line 417: "release.get returns error for missing name" → should be "release.view returns error for missing name"
  • Line 469: "release.get returns error on malformed JSON" → should be "release.view returns error on malformed JSON"
✏️ Proposed fix
-    it("release.get returns error for missing name", async () => {
+    it("release.view returns error for missing name", async () => {
-    it("release.get returns error on malformed JSON", async () => {
+    it("release.view returns error on malformed JSON", async () => {

Also applies to: 469-469

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-domains-release.test.ts` at line 417, Update the
two stale test descriptions that still say "release.get" to the new capability
name "release.view": change the it() string "release.get returns error for
missing name" to "release.view returns error for missing name" and change
"release.get returns error on malformed JSON" to "release.view returns error on
malformed JSON" in the test file (look for the it(...) blocks containing those
exact strings in cli-domains-release.test.ts).
packages/core/skills/using-ghx/SKILL.md (1)

2-2: ⚠️ Potential issue | 🟡 Minor

Update capability count from 69 to 70 in the description.

The registry currently contains 70 capabilities (verified by counting YAML cards in packages/core/src/core/registry/cards/), but the frontmatter lists "69 capabilities".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/skills/using-ghx/SKILL.md` at line 2, Update the frontmatter
description in SKILL.md to reflect the correct capability count: change "69
capabilities" to "70 capabilities" in the description field so it matches the
registry (the description frontmatter line that currently reads 'description:
Execute GitHub operations via ghx — deterministic routing, normalized output, 69
capabilities').
docs/capabilities/README.md (1)

35-57: ⚠️ Potential issue | 🟠 Major

Summary Table is severely out of sync with the registry — multiple domains need capability ID updates and Mermaid counts corrected.

The PR updated PR rows but left four other domains with stale/retired capability IDs. Current vs. registry:

Issues (docs: 21, registry: 23):

  • Replace issue.labels.update with issue.labels.set
  • Replace issue.assignees.update with issue.assignees.set
  • Replace issue.parent.set / issue.parent.remove with issue.relations.parent.set / issue.relations.parent.remove
  • Replace issue.blocked_by.add / issue.blocked_by.remove with issue.relations.blocked_by.add / issue.relations.blocked_by.remove
  • Replace issue.linked_prs.list with issue.relations.prs.list
  • Replace issue.relations.get with issue.relations.view
  • Remove issue.triage.composite (retired)
  • Remove issue.update.composite (retired)

Releases (docs: 5, registry: 5, but IDs stale):

  • release.create_draftrelease.create
  • release.getrelease.view
  • release.publish_draftrelease.publish

Workflows (docs: 11, registry: 11, but IDs stale):

  • workflow.getworkflow.view
  • workflow.dispatch.runworkflow.dispatch
  • workflow.run.rerun_allworkflow.run.rerun.all
  • workflow.run.rerun_failedworkflow.run.rerun.failed
  • workflow.job.logs.getworkflow.job.logs.view

Projects V2 (docs: 6, registry: 7):

  • Replace project_v2.org.get with project_v2.org.view
  • Replace project_v2.user.get with project_v2.user.view
  • Replace project_v2.item.add_issue with project_v2.items.issue.add
  • Replace project_v2.item.field.update with project_v2.items.field.update
  • Add project_v2.items.issue.remove (missing)

Check Runs (docs: 1, registry: 0):

  • Remove entire Check Runs section and Mermaid node (no capabilities exist)

Update Mermaid counts accordingly: Issues (23), PRs (21), Projects V2 (7), remove Check Runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/capabilities/README.md` around lines 35 - 57, The summary table in
README.md is out of sync: update the Issues rows by renaming capability IDs
(replace issue.labels.update → issue.labels.set, issue.assignees.update →
issue.assignees.set, issue.parent.set/remove →
issue.relations.parent.set/remove, issue.blocked_by.add/remove →
issue.relations.blocked_by.add/remove, issue.linked_prs.list →
issue.relations.prs.list, issue.relations.get → issue.relations.view), remove
retired entries issue.triage.composite and issue.update.composite, update
Releases IDs (release.create_draft → release.create, release.get → release.view,
release.publish_draft → release.publish), update Workflows IDs (workflow.get →
workflow.view, workflow.dispatch.run → workflow.dispatch, workflow.run.rerun_all
→ workflow.run.rerun.all, workflow.run.rerun_failed → workflow.run.rerun.failed,
workflow.job.logs.get → workflow.job.logs.view), fix Projects V2 rows
(project_v2.org.get → project_v2.org.view, project_v2.user.get →
project_v2.user.view, project_v2.item.add_issue → project_v2.items.issue.add,
project_v2.item.field.update → project_v2.items.field.update and add
project_v2.items.issue.remove), remove the Check Runs section and corresponding
Mermaid node, and update the Mermaid counts to Issues: 23, Pull Requests: 21,
Projects V2: 7 (and adjust any other diagram counts to match the registry).
docs/capabilities/pull-requests.md (1)

361-367: ⚠️ Potential issue | 🟡 Minor

pr.threads.reply output table does not match the implementation.

The doc lists {id, isResolved} but replyBuilder.mapResponse (in packages/core/src/gql/builders.ts, lines 63–73) actually returns {id, commentId, commentUrl}. This will mislead consumers.

Proposed fix
 | Field | Type | Description |
 |-------|------|-------------|
 | id | string | Thread ID |
-| isResolved | boolean | Current resolution state |
+| commentId | string | Created comment ID |
+| commentUrl | string | URL of the created comment |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/capabilities/pull-requests.md` around lines 361 - 367, The documentation
for pr.threads.reply is out of sync with the implementation: update the Output
table to match replyBuilder.mapResponse's actual return shape ({id, commentId,
commentUrl}) instead of {id, isResolved}; locate the doc entry for
pr.threads.reply and replace the fields/types/descriptions accordingly so
consumers see id (string), commentId (string) and commentUrl (string) as
returned by replyBuilder.mapResponse.
packages/core/src/core/execution/adapters/cli/domains/pr.ts (1)

459-459: ⚠️ Potential issue | 🟡 Minor

Inconsistent capability name in error message: pr.review.submit (singular) vs. pr.reviews.submit (plural).

All other references in handlePrReviewSubmit use "pr.reviews.submit", but this thrown error still uses the old singular form.

Proposed fix
-      if (body === null) throw new Error("Missing or invalid body for pr.review.submit")
+      if (body === null) throw new Error("Missing or invalid body for pr.reviews.submit")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/pr.ts` at line 459, The
thrown Error message uses the outdated singular capability name
"pr.review.submit"; update the message to the correct plural "pr.reviews.submit"
inside the handlePrReviewSubmit flow so the thrown error and all other checks
consistently reference "pr.reviews.submit" (locate the throw new Error("Missing
or invalid body for pr.review.submit") in handlePrReviewSubmit and change its
message to use "pr.reviews.submit").
🧹 Nitpick comments (19)
packages/core/test/integration/engine-pr-checks-rerun-failed.integration.test.ts (1)

6-33: LGTM — rename is correct and test isolation is sound.

The pr.checks.rerun_failedpr.checks.rerun.failed rename is consistently applied across the describe label (Line 6) and the task field (Line 15). The stub client and disabled CLI flags properly isolate this test from external systems.

One optional gap: the test only exercises the validation-error path. The PR description notes that the normalized output shape for rerun capabilities is now { runId: integer, queued: boolean }. A success-path case with a valid prNumber and a mock returning a typed response would cover that schema contract as well.

✅ Optional: add a success-path assertion for the normalized output shape
 describe("executeTask pr.checks.rerun.failed", () => {
   it("returns validation error envelope for invalid prNumber", async () => {
     // ... existing test unchanged
   })
+
+  it("returns normalized runId integer and queued boolean on success", async () => {
+    const githubClient = createGithubClient({
+      async execute<TData>(): Promise<TData> {
+        return {} as TData
+      },
+    })
+
+    const request: TaskRequest = {
+      task: "pr.checks.rerun.failed",
+      input: {
+        owner: "go-modkit",
+        name: "modkit",
+        prNumber: 1,
+      },
+    }
+
+    const result = await executeTask(request, {
+      githubClient,
+      githubToken: "test-token",
+      ghCliAvailable: true,
+      ghAuthenticated: true,
+    })
+
+    if (result.ok) {
+      expect(typeof result.data.runId).toBe("number")
+      expect(typeof result.data.queued).toBe("boolean")
+    }
+  })
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-pr-checks-rerun-failed.integration.test.ts`
around lines 6 - 33, Add a success-path integration test for the rerun endpoint:
create a test similar to the existing one in the describe "executeTask
pr.checks.rerun.failed" that calls executeTask with a valid prNumber (e.g., 123)
and a githubClient stub whose execute<TData>() returns an object matching the
normalized rerun shape { runId: number, queued: boolean }; assert result.ok is
true and verify result.data (or result.value) contains runId as a number and
queued as a boolean to cover the new schema contract for pr.checks.rerun.failed.
packages/core/test/integration/engine-pr-assignees-remove.integration.test.ts (1)

14-53: Add test coverage for empty assignees array.

The pr.assignees.remove schema enforces minItems: 1 on the assignees field, but the current tests do not cover this constraint. The two existing tests validate missing prNumber and missing assignees, leaving an empty array case untested. This gap should be addressed:

Suggested test case
+  it("returns validation error envelope for empty assignees", async () => {
+    const githubClient = createGithubClient({
+      async execute<TData>(): Promise<TData> {
+        return {} as TData
+      },
+    })
+
+    const request: TaskRequest = {
+      task: "pr.assignees.remove",
+      input: { owner: "acme", name: "modkit", prNumber: 10, assignees: [] },
+    }
+
+    const result = await executeTask(request, {
+      githubClient,
+      githubToken: "test-token",
+      ghCliAvailable: false,
+      ghAuthenticated: false,
+    })
+
+    expect(result.ok).toBe(false)
+    expect(result.error?.code).toBe("VALIDATION")
+    expect(result.meta.reason).toBe("INPUT_VALIDATION")
+  })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-pr-assignees-remove.integration.test.ts`
around lines 14 - 53, Add a new integration test for the "pr.assignees.remove"
task that submits an input with owner/name/prNumber present but assignees: [] to
assert schema minItems:1 triggers a validation error; reuse the existing pattern
(createGithubClient stub, build a TaskRequest for "pr.assignees.remove", call
executeTask with the githubClient and test token, then assert result.ok is
false, result.error?.code === "VALIDATION" and result.meta.reason ===
"INPUT_VALIDATION") so the empty-array case is covered alongside the existing
missing-field tests.
packages/core/src/core/registry/cards/pr.assignees.remove.yaml (1)

21-23: removed items missing minLength: 1 constraint (inconsistency with input).

Input assignees.items enforces minLength: 1; the output removed.items has no such constraint. GitHub never returns empty-string logins, so this is purely a schema consistency nit.

♻️ Proposed fix
     removed:
       type: array
-      items: { type: string }
+      items: { type: string, minLength: 1 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/pr.assignees.remove.yaml` around lines
21 - 23, The output schema for the "removed" array is missing the minLength
constraint on its items; update the YAML so removed.items includes minLength: 1
to match the input schema (assignees.items) and ensure consistency between
assignees.items and removed.items.
packages/core/src/core/registry/cards/project_v2.org.view.yaml (1)

3-3: Description uses "Get" while the capability ID now uses "view".

Minor wording inconsistency after the rename.

🔧 Proposed fix
-description: Get an organization Projects v2 project.
+description: View an organization Projects v2 project.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/project_v2.org.view.yaml` at line 3,
The YAML description text is inconsistent with the capability ID (uses "view");
update the description field in project_v2.org.view.yaml to match by replacing
"Get an organization Projects v2 project." with wording that uses "View" (e.g.,
"View an organization Projects v2 project.") so the description aligns with the
capability ID and intent.
packages/core/src/core/registry/cards/pr.threads.reply.yaml (1)

17-18: commentId and commentUrl are missing minLength: 1 — inconsistent with id.

The required id field enforces minLength: 1, but the two new required fields do not. Both fields will always be non-empty strings from GitHub's GraphQL response, so adding the constraint would make validation tighter and consistent.

🔧 Proposed fix
-    commentId: { type: string }
-    commentUrl: { type: string }
+    commentId: { type: string, minLength: 1 }
+    commentUrl: { type: string, minLength: 1 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/pr.threads.reply.yaml` around lines 17
- 18, The schema adds required fields commentId and commentUrl but they lack the
minLength constraint used by id; update the YAML for the card (fields commentId
and commentUrl) to include minLength: 1 (matching id) so both are validated as
non-empty strings — locate the definitions for commentId and commentUrl in the
pr.threads.reply.yaml and add the same minLength: 1 constraint used by the id
field.
packages/core/src/core/registry/cards/release.view.yaml (1)

1-3: Stale description — "Get" should be updated to match the view rename.

✏️ Proposed fix
 capability_id: release.view
 version: "1.0.0"
-description: Get release details by tag name.
+description: View release details by tag name.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/release.view.yaml` around lines 1 - 3,
The description field in the release.view capability is stale: replace the
present-tense "Get release details by tag name." with wording that reflects the
renamed operation (e.g., "View release details by tag name." or similar). Update
the description value in the YAML for capability_id release.view so it uses
"View" (or another appropriate verb matching the view rename) while keeping the
rest of the text intact.
packages/core/test/integration/engine-release-publish-draft.integration.test.ts (1)

1-33: File name retains the retired publish-draft suffix — consider renaming to match the new capability.

The capability was intentionally renamed from release.publish_draft to release.publish, but the test file is still engine-release-publish-draft.integration.test.ts. Developers searching for release.publish tests won't naturally find this file. Renaming to engine-release-publish.integration.test.ts removes the stale reference.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-release-publish-draft.integration.test.ts`
around lines 1 - 33, The test file still uses the retired "publish-draft"
suffix; rename the test file to match the capability rename (e.g., change
engine-release-publish-draft.integration.test.ts to
engine-release-publish.integration.test.ts) and update any references/imports to
the file if present; verify the test's describe string ("executeTask
release.publish") and the TaskRequest.task value ("release.publish") remain
consistent after renaming.
packages/core/test/integration/engine-pr-review-submit-approve.integration.test.ts (1)

6-61: Test content LGTM — capability ID strings correctly updated.

Optional: consider renaming the file from engine-pr-review-submit-approve.integration.test.tsengine-pr-reviews-submit-approve.integration.test.ts to match the pluralized capability surface (pr.reviews.*), keeping it consistent with the broader rename pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-pr-review-submit-approve.integration.test.ts`
around lines 6 - 61, Rename the test file to use the pluralized capability name
(engine-pr-reviews-submit-approve.integration.test.ts) to match the pr.reviews.*
surface; update any references/imports or test-runner globs that point to the
old filename so CI picks up the test, and verify the describe/it strings (e.g.,
"executeTask pr.reviews.submit") remain unchanged.
packages/core/test/integration/engine-release-create-draft.integration.test.ts (1)

6-7: File name engine-release-create-draft.integration.test.ts is now misleading.

"draft" in the file name directly contradicts the renamed capability (release.create, not release.create_draft). Rename to engine-release-create.integration.test.ts to keep the test suite navigable. This is the clearest file-naming mismatch in the set of renames.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-release-create-draft.integration.test.ts`
around lines 6 - 7, The test filename is misleading because it references
"draft" while the test suite uses the capability name "release.create"; rename
the test file from engine-release-create-draft.integration.test.ts to
engine-release-create.integration.test.ts and update any imports or test-runner
references accordingly so the describe("executeTask release.create", ...) and
its it("returns cli envelope for release.create", ...) remain intact but are
stored under the new filename.
packages/core/test/integration/engine-project-v2-item-add-issue.integration.test.ts (1)

6-15: Consider renaming the test file to match the new capability surface.

The file name engine-project-v2-item-add-issue.integration.test.ts still reflects the old project_v2.item.add_issue form; engine-project-v2-items-issue-add.integration.test.ts would be consistent with the pluralized project_v2.items.issue.add rename.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-project-v2-item-add-issue.integration.test.ts`
around lines 6 - 15, Rename the test file to match the new capability surface by
changing the filename from engine-project-v2-item-add-issue.integration.test.ts
to engine-project-v2-items-issue-add.integration.test.ts, and update any
references (imports, test runner globs, or CI config) that reference the old
name; also ensure the describe/test strings (e.g., "executeTask
project_v2.items.issue.add") remain accurate after the rename so the test name
and filename are consistent.
packages/core/src/core/registry/cards/issue.assignees.set.yaml (1)

25-27: Optional: stale IssueAssigneesUpdate operationName / document path after update → set rename.

The GQL internals still reference the old "update" naming. Since the underlying operation is unchanged this has no functional impact, but it creates terminology drift between the public capability ID (issue.assignees.set) and the implementation layer. Worth aligning in a follow-up cleanup.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/issue.assignees.set.yaml` around lines
25 - 27, The YAML still references the old GraphQL names: change the
graphql.operationName value from "IssueAssigneesUpdate" to match the capability
rename (e.g., "IssueAssigneesSet") and update graphql.documentPath from
"src/gql/operations/issue-assignees-update.graphql" to the corresponding file
name (e.g., "src/gql/operations/issue-assignees-set.graphql"); ensure the new
names match the actual operation name and file created for the renamed
capability (referencing IssueAssigneesUpdate, IssueAssigneesSet and documentPath
to locate the entries).
packages/core/src/core/registry/cards/project_v2.user.view.yaml (1)

3-3: Stale description — "Get" should be "View".

After the get → view rename the description still reads "Get a user Projects v2 project." For consistency with other view capabilities, update it.

📝 Proposed fix
-description: Get a user Projects v2 project.
+description: View a user Projects v2 project.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/project_v2.user.view.yaml` at line 3,
Update the YAML description for the capability to match the rename from "get" to
"view": locate the description key in project_v2.user.view.yaml
(capability/project name project_v2.user.view) and change the text from "Get a
user Projects v2 project." to "View a user Projects v2 project." so it is
consistent with other view capabilities.
packages/benchmark/scenarios/workflows/ci-diagnose-run-wf-001.json (1)

18-27: No checkpoint for the newly added workflow.job.logs capability

expected_capabilities now includes workflow.job.logs.get (or .view), but assertions.checkpoints only verifies workflow.run.view. If log retrieval correctness matters for this scenario's pass/fail, a corresponding checkpoint is missing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/benchmark/scenarios/workflows/ci-diagnose-run-wf-001.json` around
lines 18 - 27, The assertions currently only verify "run_viewed" via
assertions.checkpoints but the new capability expected_capabilities includes
workflow.job.logs.get (or .view), so add a new checkpoint to
assertions.checkpoints that verifies log retrieval; create a checkpoint (e.g.,
name "job_logs_retrieved") with verification_task set to the appropriate log
verification (e.g., "workflow.job.logs.get" or "workflow.job.logs.view"), an
empty verification_input as needed, and a condition such as "non_empty" to
ensure logs are present.
packages/core/src/core/registry/cards/issue.relations.view.yaml (2)

3-3: Nit: description still says "Get" after rename to .view

description: Get issue parent/children/blocking relations. — consider updating to "View issue…" to be consistent with the .view convention adopted by this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/issue.relations.view.yaml` at line 3,
Update the YAML `description` field to reflect the `.view` naming convention:
change the string currently reading "Get issue parent/children/blocking
relations." to "View issue parent/children/blocking relations." so the
`description` in issue.relations.view.yaml matches the `.view` semantics.

52-54: Inconsistent naming between capability and GraphQL operation

The capability ID uses view as the action semantic, but the underlying GraphQL operation is named IssueRelationsGet. While the references are correct and functional, consider renaming the operation to IssueRelationsView (including the GraphQL file and generated types) to maintain naming coherence across the system, or add a comment explaining this intentional semantic distinction.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/issue.relations.view.yaml` around lines
52 - 54, Capability/action naming is inconsistent: the capability ID uses "view"
while the GraphQL operation is named IssueRelationsGet; to fix, rename the
GraphQL operation to IssueRelationsView (update the operationName in the GraphQL
document from IssueRelationsGet to IssueRelationsView), rename the file
src/gql/operations/issue-relations-get.graphql to issue-relations-view.graphql,
regenerate the GraphQL types/clients so generated symbols match (replace any
generated types/names referencing IssueRelationsGet), and update all call sites
and imports to the new operation name; alternatively, if the distinction is
intentional add a brief comment in issue.relations.view.yaml explaining why the
operation uses "Get" instead of "View".
packages/core/src/core/execution/adapters/cli/domains/issue.ts (1)

338-396: Consider extracting shared issue-mutation preamble.

All four new handlers duplicate the owner/name → repo string construction + parseStrictPositiveInt(params.issueNumber) validation (Lines 340-347, 400-407, 460-467, 520-527). A small helper like parseIssueMutationParams(params, capabilityId) returning { repo, issueNumber } would reduce repetition. Not urgent — the existing read handlers use the same inline pattern too.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/issue.ts` around lines
338 - 396, Multiple issue-mutation CLI handlers (e.g., handleIssueLabelsRemove)
duplicate owner/name → repo construction and issueNumber validation with
parseStrictPositiveInt; extract this into a small helper (suggested name
parseIssueMutationParams) that accepts params and capabilityId and returns {
repo, issueNumber } (throwing a descriptive Error when issueNumber is
missing/invalid), then replace the repeated blocks in handleIssueLabelsRemove
and the three other issue-mutation handlers to call
parseIssueMutationParams(params, "issue.labels.remove") (or the appropriate
capabilityId) and use the returned repo and issueNumber values.
packages/core/test/integration/engine-issue-labels-update.integration.test.ts (1)

6-29: Consider renaming the test file to match the updated capability ID.

The internal test now targets issue.labels.set, but the file is still named engine-issue-labels-update.integration.test.ts. The newly added sibling engine-issue-labels-remove.integration.test.ts follows the naming-by-capability convention. Renaming to engine-issue-labels-set.integration.test.ts keeps the suite consistent and searchable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-issue-labels-update.integration.test.ts`
around lines 6 - 29, The test file name is inconsistent with the capability
being tested ("issue.labels.set"); rename the test file from
engine-issue-labels-update.integration.test.ts to
engine-issue-labels-set.integration.test.ts so it matches the describe block
"executeTask issue.labels.set" and the sibling test naming convention (e.g.,
engine-issue-labels-remove.integration.test.ts), ensuring consistency and
discoverability across the test suite.
packages/core/src/core/execution/adapters/cli/domains/pr.ts (1)

327-341: Client-side state filtering for pr.checks.list — verify upstream contract.

The new state parameter (Line 328) enables client-side filtering by "failed", "pending", or "passed", and each filtered item gets an empty annotations array appended. The summary counts still reflect the full (unfiltered) set, which is a useful design. However, note that any unexpected state value (e.g., a typo like "fail") silently returns all checks — consider whether that's acceptable or if a validation/warning is warranted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/pr.ts` around lines 327
- 341, The code currently does client-side filtering based on params.state
(using failed, pending, passed, checks) and silently returns all checks for
unexpected values; update the logic in the pr.checks.list handling to validate
params.state against allowed values ("failed","pending","passed") before
applying the filteredItems mapping, and if the value is invalid either return a
validation error or emit a clear warning (e.g., via logger) rather than silently
defaulting to the full checks list; ensure itemsWithAnnotations and
normalizeResult behavior remains the same for valid states and include the
invalid-state handling adjacent to the existing filteredItems computation.
packages/core/test/unit/cli-domains-project-v2.test.ts (1)

938-1023: project_v2.items.issue.remove tests are nested inside the "missing params paths" describe block.

The new describe("project_v2.items.issue.remove", …) at Line 938 is a child of describe("missing params paths", …) (Line 843), but it contains full happy-path, arg-verification, and error-path tests — not just missing-param scenarios. This should be a sibling of the other top-level capability describe blocks (e.g., project_v2.items.issue.add, project_v2.items.field.update).

♻️ Move the describe block up one level

Move the describe("project_v2.items.issue.remove", …) block (lines 938–1023) out of the "missing params paths" describe and place it as a direct child of "project-v2 domain handlers", after the "missing params paths" block closes at the current line 1024.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-domains-project-v2.test.ts` around lines 938 -
1023, The describe block for "project_v2.items.issue.remove" is currently nested
inside the "missing params paths" describe but contains full happy-path and
other tests; extract the entire describe("project_v2.items.issue.remove", …)
block and move it one level up so it is a sibling of the other capability
describes (e.g., project_v2.items.issue.add, project_v2.items.field.update)
under the "project-v2 domain handlers" top-level describe, placing it
immediately after the closing of the "missing params paths" block; ensure the
block retains its contents (all its it(...) tests) and update indentation only.

Comment on lines +59 to +61
```
"prompt": "Workflow run {{runId}} in {{owner}}/{{name}} has failed. Get the run details to confirm its status and conclusion, identify which job failed, and fetch that job's logs to surface the top error lines."
```
Copy link

@coderabbitai coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing language specifier on fenced code block.

📝 Proposed fix
-```
+```json
 "prompt": "Workflow run {{runId}} in..."
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 59-59: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/2026-02-19-benchmark-scenario-composite-update-design.md` around
lines 59 - 61, The fenced code block containing the JSON-like snippet with the
"prompt" key is missing a language specifier; update the opening fence from ```
to ```json so the block (which includes the "prompt": "Workflow run {{runId}} in
{{owner}}/{{name}} has failed...") is marked as JSON for proper syntax
highlighting and tooling.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. This is a historical design doc — not updating it as it is a planning artifact, not live documentation.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aryeko, understood! That makes perfect sense — historical design docs serve as a record of decisions made at a specific point in time, and minor formatting updates aren't necessary for that purpose. I appreciate the clarification on the distinction between planning artifacts and live documentation.


✏️ Learnings added
Learnt from: aryeko
Repo: aryeko/ghx PR: 58
File: docs/plans/2026-02-19-benchmark-scenario-composite-update-design.md:59-61
Timestamp: 2026-02-19T23:54:40.378Z
Learning: In the aryeko/ghx repository, files in `docs/plans/` are historical design documents and planning artifacts. Minor formatting issues (such as missing language specifiers on fenced code blocks) in these historical documents do not need to be fixed, as they serve as a historical record rather than live documentation.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/core/src/core/registry/cards/pr.review.submit.yaml:6-14
Timestamp: 2026-02-19T00:35:58.894Z
Learning: Preference: For packages/core/src/core/registry/cards/pr.review.submit.yaml, avoid JSON Schema if/then right now; rely on GitHub 422 for missing body on COMMENT/REQUEST_CHANGES and consider enforcement later as a follow-up enhancement.

Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/core/src/core/execution/adapters/cli-capability-adapter.ts:68-68
Timestamp: 2026-02-19T00:35:24.707Z
Learning: Repo: aryeko/ghx PR: 46
In packages/core/src/core/execution/adapters/cli-capability-adapter.ts, pr.create is intentionally included in NON_JSON_STDOUT_CAPABILITIES because `gh pr create` emits a plain-text URL (not JSON). As of PR `#46`, normalizeCliData("pr.create") returns a stub (number: 0, url: ""), and improving this (e.g., capturing stdout URL or parsing PR number) is a known limitation deferred to a future PR.

Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/core/src/core/execution/adapters/cli-capability-adapter.ts:522-540
Timestamp: 2026-02-19T00:36:21.068Z
Learning: In packages/core/src/core/execution/adapters/cli-capability-adapter.ts, some CLI fallback branches intentionally validate inputs that are primarily consumed by the GraphQL-preferred route to keep a consistent, route-agnostic input contract. Example: for capability "pr.diff.files", the CLI path validates `first` even though `gh pr view --json files` cannot limit results.

Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/benchmark/test/unit/scenario-loader.test.ts:29-31
Timestamp: 2026-02-19T00:38:03.159Z
Learning: In aryeko/ghx, the pr.view capability card expects input fields { owner, name, prNumber }. Tests and checkpoints must use prNumber (not number); the GraphQL adapter handles mapping prNumber → $number internally.

Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/benchmark/test/unit/scenario-loader.test.ts:29-31
Timestamp: 2026-02-19T00:38:03.159Z
Learning: In aryeko/ghx, the pr.view capability input is { owner, name, prNumber } and pr-view.graphql uses $prNumber as well. Tests/checkpoints must use prNumber (not number) in verification_input to match the card/engine.

Comment on lines +13 to +29
### Task 1: Update `pr-fix-review-comments-wf-001`

Replace atomic thread capabilities with `pr.threads.composite`.

**Files:**
- Modify: `packages/benchmark/scenarios/workflows/pr-fix-review-comments-wf-001.json`

**Step 1: Edit `expected_capabilities`**

Open the file. Change:
```json
"expected_capabilities": ["pr.view", "pr.thread.list", "pr.thread.reply", "pr.thread.resolve"]
```
To:
```json
"expected_capabilities": ["pr.view", "pr.thread.list", "pr.threads.composite"]
```
Copy link

@coderabbitai coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Plan doc is stale — diverges from actual implementation.

Task 1 proposes using pr.threads.composite, but the actual pr-fix-review-comments-wf-001.json (reviewed above) uses atomic plural capabilities ["pr.view", "pr.threads.list", "pr.threads.reply", "pr.threads.resolve"]. The PR description also states that composite infrastructure is retired. Similarly, Task 2 references issue.triage.composite which is listed as retired in the PR objectives.

Consider either updating this plan doc to reflect the final implementation or adding a note at the top indicating it's superseded.

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 13-13: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3

(MD001, heading-increment)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/2026-02-19-benchmark-scenario-composite-update.md` around lines 13
- 29, The plan doc is stale: it instructs swapping atomic capabilities to
pr.threads.composite and references issue.triage.composite even though the
actual workflow pr-fix-review-comments-wf-001.json uses atomic plural
capabilities (pr.threads.list, pr.threads.reply, pr.threads.resolve) and the
composite infra is retired; update the plan to match the implementation by
restoring Task 1 to list the atomic capabilities ["pr.view", "pr.threads.list",
"pr.threads.reply", "pr.threads.resolve"] (not pr.threads.composite), remove or
revise any Task 2 references to issue.triage.composite, and add a top-note
stating the composite capability approach is retired and this doc is superseded
by the current workflow implementation.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. This is a historical planning document committed as part of the design process — the implementation diverged from the initial plan intentionally. Not updating it as it is a planning artifact, not live documentation.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aryeko, thank you for the clarification! That makes perfect sense — keeping the planning document as a historical artifact to show how the design evolved is valuable. I appreciate you explaining that the implementation diverged intentionally.


✏️ Learnings added
Learnt from: aryeko
Repo: aryeko/ghx PR: 58
File: docs/plans/2026-02-19-benchmark-scenario-composite-update.md:13-29
Timestamp: 2026-02-19T23:54:52.854Z
Learning: In the aryeko/ghx repository, documents in the docs/plans/ directory are historical planning artifacts that may intentionally diverge from the final implementation. These documents capture the design process and should not be updated to match the actual code, as they serve as a record of how decisions evolved.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/core/src/core/execution/adapters/cli-capability-adapter.ts:68-68
Timestamp: 2026-02-19T00:35:24.707Z
Learning: Repo: aryeko/ghx PR: 46
In packages/core/src/core/execution/adapters/cli-capability-adapter.ts, pr.create is intentionally included in NON_JSON_STDOUT_CAPABILITIES because `gh pr create` emits a plain-text URL (not JSON). As of PR `#46`, normalizeCliData("pr.create") returns a stub (number: 0, url: ""), and improving this (e.g., capturing stdout URL or parsing PR number) is a known limitation deferred to a future PR.

Learnt from: aryeko
Repo: aryeko/ghx PR: 9
File: packages/core/src/gql/client.ts:771-874
Timestamp: 2026-02-14T06:03:07.595Z
Learning: In packages/core/src/gql/client.ts, the runPrCommentsList function's `includeOutdated` parameter is intentionally only applied when `unresolvedOnly` is true. When `unresolvedOnly` is false, outdated threads are always included regardless of the `includeOutdated` setting. This is by design per the capability design doc semantics for the pr.comments.list task.

Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/core/src/core/execution/adapters/cli-capability-adapter.ts:522-540
Timestamp: 2026-02-19T00:36:21.068Z
Learning: In packages/core/src/core/execution/adapters/cli-capability-adapter.ts, some CLI fallback branches intentionally validate inputs that are primarily consumed by the GraphQL-preferred route to keep a consistent, route-agnostic input contract. Example: for capability "pr.diff.files", the CLI path validates `first` even though `gh pr view --json files` cannot limit results.

Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/benchmark/test/unit/scenario-loader.test.ts:29-31
Timestamp: 2026-02-19T00:38:03.159Z
Learning: In aryeko/ghx, the pr.view capability card expects input fields { owner, name, prNumber }. Tests and checkpoints must use prNumber (not number); the GraphQL adapter handles mapping prNumber → $number internally.

Learnt from: aryeko
Repo: aryeko/ghx PR: 46
File: packages/benchmark/test/unit/scenario-loader.test.ts:29-31
Timestamp: 2026-02-19T00:38:03.159Z
Learning: In aryeko/ghx, the pr.view capability input is { owner, name, prNumber } and pr-view.graphql uses $prNumber as well. Tests/checkpoints must use prNumber (not number) in verification_input to match the card/engine.

Comment on lines 70 to 74
return {
id: comment.id,
commentId: comment.id,
commentUrl: typeof comment.url === "string" ? comment.url : "",
}
Copy link

@coderabbitai coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

fd "pr.threads.reply" --type f

Repository: aryeko/ghx

Length of output: 113


🏁 Script executed:

cat packages/core/src/core/registry/cards/pr.threads.reply.yaml

Repository: aryeko/ghx

Length of output: 753


🏁 Script executed:

cat -n packages/core/src/gql/builders.ts | grep -A 20 "replyBuilder"

Repository: aryeko/ghx

Length of output: 1272


🏁 Script executed:

sed -n '63,80p' packages/core/src/gql/builders.ts

Repository: aryeko/ghx

Length of output: 545


Add missing isResolved field to the response object.

The mapResponse function returns only {id, commentId, commentUrl}, but the pr.threads.reply output schema requires [id, isResolved, commentId, commentUrl]. The missing isResolved field will cause schema validation to fail at runtime since additionalProperties: false. Determine the correct value for isResolved from the mutation response and include it in the return object.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/gql/builders.ts` around lines 70 - 74, The returned object
from mapResponse is missing the required isResolved field; update the
mapResponse function to include isResolved by reading it from the mutation
response (e.g., use comment.isResolved or the exact boolean property provided by
the mutation payload) and return it (fallback to false only if the response
property is absent), so the returned shape matches [id, isResolved, commentId,
commentUrl] required by pr.threads.reply and satisfies the schema with
additionalProperties: false.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ba90db1. Added isResolved: false to the replyBuilder.mapResponse return value in builders.ts. While this code path is not currently used by production routing (the live path goes through capability-registry.tsrunReplyToReviewThread), keeping the builder consistent avoids confusion and future bugs if it is ever activated. Updated the corresponding test assertion in gql-builders.test.ts.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rate Limit Exceeded

@aryeko have exceeded the limit for the number of chat messages per hour. Please wait 2 minutes and 5 seconds before sending another message.

@@ -3,7 +3,7 @@ import { executeTask } from "@core/core/routing/engine.js"
import { createGithubClient } from "@core/gql/github-client.js"
import { describe, expect, it } from "vitest"

describe("executeTask issue.assignees.update", () => {
describe("executeTask issue.assignees.set", () => {
Copy link

@coderabbitai coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for engine-issue-assignees-set integration test
fd "engine-issue-assignees-set.integration.test.ts" 2>/dev/null || echo "No set variant found"

# List all assignees-related integration test files
echo "=== All assignees-related test files ==="
fd "engine-issue-assignees.*test.ts" --type f

# Show the actual content of the update test to confirm it tests 'set'
echo ""
echo "=== Content of update test file (lines 1-20) ==="
head -20 "packages/core/test/integration/engine-issue-assignees-update.integration.test.ts" 2>/dev/null || echo "File not found"

Repository: aryeko/ghx

Length of output: 1064


Rename file to engine-issue-assignees-set.integration.test.ts.

File name still references the retired issue.assignees.update capability, but the test content now exercises issue.assignees.set (as shown in the describe block and task field). Align the file name with the tested capability and the naming convention used by sibling tests (add, remove).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-issue-assignees-update.integration.test.ts`
at line 6, The test file is misnamed for the capability it exercises; rename the
file from engine-issue-assignees-update.integration.test.ts to
engine-issue-assignees-set.integration.test.ts so the filename matches the
describe block "executeTask issue.assignees.set" and the tested task; after
renaming, update any references/imports or test-suite patterns that reference
the old filename (if any) so test discovery continues to work.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ba90db1. Renamed engine-issue-assignees-update.integration.test.tsengine-issue-assignees-set.integration.test.ts via git mv to match the renamed capability ID. The task reference inside the test was already correct.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rate Limit Exceeded

@aryeko have exceeded the limit for the number of chat messages per hour. Please wait 1 minutes and 56 seconds before sending another message.

Comment on lines +6 to +7
describe("executeTask issue.relations.prs.list", () => {
it("returns graphql envelope for issue.relations.prs.list", async () => {
Copy link

@coderabbitai coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# List all integration test files to see naming patterns
fd '\.integration\.test\.ts$' packages/core/test/integration/ -x basename {} | sort

Repository: aryeko/ghx

Length of output: 3080


🏁 Script executed:

#!/bin/bash
# Check if there are other similar files with dot-notation capabilities
# Search for files containing "issue" or related names
fd 'issue.*integration.*test\.ts$|.*prs.*integration.*test\.ts$' packages/core/test/ -x ls -la {}

Repository: aryeko/ghx

Length of output: 3318


🏁 Script executed:

#!/bin/bash
# Look at the content of the file being reviewed to confirm the capability ID
head -20 packages/core/test/integration/engine-issue-linked-prs-list.integration.test.ts

Repository: aryeko/ghx

Length of output: 899


File name does not match the renamed capability.

The file is named engine-issue-linked-prs-list.integration.test.ts but now tests the issue.relations.prs.list capability. This creates a discoverability mismatch — searching for issue.relations.prs.list integration tests by file name will not surface this file.

Rename to engine-issue-relations-prs-list.integration.test.ts to align with the kebab-case naming pattern used throughout the directory and match the sibling file engine-issue-relations-get.integration.test.ts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-issue-linked-prs-list.integration.test.ts`
around lines 6 - 7, The test file name doesn't match the capability name; rename
the test file from engine-issue-linked-prs-list.integration.test.ts to
engine-issue-relations-prs-list.integration.test.ts so it follows the kebab-case
pattern and matches the capability string in the describe("executeTask
issue.relations.prs.list", ...) line; after renaming, update any imports or test
runner references that pointed to the old filename (and run the test suite to
verify everything still resolves).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ba90db1. Renamed engine-issue-linked-prs-list.integration.test.tsengine-issue-relations-prs-list.integration.test.ts via git mv to match the renamed capability ID.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rate Limit Exceeded

@aryeko have exceeded the limit for the number of chat messages per hour. Please wait 1 minutes and 51 seconds before sending another message.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/core/src/gql/types.ts (1)

62-65: ⚠️ Potential issue | 🟡 Minor

Remove | null from the n input type — it contradicts the YAML schema.

The issue.milestone.set YAML card's input_schema requires milestoneNumber to be { type: integer, minimum: 1 }, which rejects null at the boundary. The separate issue.milestone.clear capability handles clearing, so null should never reach the GraphQL layer. However, the TypeScript type n still declares milestoneNumber: number | null, which is misleading. Tighten it to milestoneNumber: number to match the actual contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/gql/types.ts` around lines 62 - 65, Update the
IssueMilestoneSetInput type to remove the nullable union so it matches the YAML
contract: change the declaration of milestoneNumber on IssueMilestoneSetInput
from "number | null" to "number" (the clearing behavior is handled by the
separate issue.milestone.clear capability), ensuring the GraphQL input type
reflects the required integer minimum defined in the issue.milestone.set
input_schema.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@packages/core/src/gql/types.ts`:
- Around line 62-65: Update the IssueMilestoneSetInput type to remove the
nullable union so it matches the YAML contract: change the declaration of
milestoneNumber on IssueMilestoneSetInput from "number | null" to "number" (the
clearing behavior is handled by the separate issue.milestone.clear capability),
ensuring the GraphQL input type reflects the required integer minimum defined in
the issue.milestone.set input_schema.

---

Duplicate comments:
In
`@packages/core/test/integration/engine-project-v2-items-issue-remove.integration.test.ts`:
- Around line 6-53: Add a positive "happy path" test for the task
"project_v2.items.issue.remove" that supplies both projectId and itemId, uses
createGithubClient and executeTask, sets ghCliAvailable: true (and
ghAuthenticated if needed), and mocks the CLI executor used by the test harness
to return the expected JSON payload so executeTask resolves successfully; assert
result.ok === true and that result.value equals { itemId: "<same id>", removed:
true } to validate end-to-end wiring and schema.

- Fix project_v2.items.issue.remove: rewrite handler to use correct
  gh project item-delete syntax (positional projectNumber + --owner/--id)
  and update YAML schema to match (owner, projectNumber, itemId)
- Fix pr.checks.list: strip internal `bucket` field from output and
  include `annotations: []` so output validates against schema
  (additionalProperties: false was failing at runtime)
- Add isResolved: false to builders.ts replyBuilder.mapResponse for
  consistency with live runReplyToReviewThread path
- Add minLength/minimum constraints to YAML output schemas for
  issue.assignees.remove, pr.checks.rerun.all, workflow.run.rerun.*
- Fix workflow.view description (Get → View)
- Remove retired Composite Capabilities section from SKILL.md
- Update benchmark scenarios: fix stale capability IDs in
  ci-diagnose-run-wf-001 and issue-triage-comment-wf-001
- Fix docs/capabilities/pull-requests.md output field names for
  pr.assignees.add/remove
- Fix changeset: runId type integer (not string) in rerun outputs
- Rename integration test files to match new capability IDs
- Update tests for new project_v2.items.issue.remove param shape
  and remove bucket from pr.checks.list assertions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
packages/benchmark/scenarios/workflows/ci-diagnose-run-wf-001.json (1)

20-27: ⚠️ Potential issue | 🟡 Minor

Missing checkpoint for the newly added workflow.job.logs.view capability.

workflow.job.logs.view is declared in expected_capabilities and the prompt explicitly requires fetching logs, but assertions.checkpoints only validates workflow.run.view. The log-fetch step is therefore never exercised/asserted by the test harness, making coverage incomplete for the new capability.

Consider adding a second checkpoint, e.g.:

🛠️ Proposed addition
       {
         "name": "run_viewed",
         "verification_task": "workflow.run.view",
         "verification_input": {},
         "condition": "non_empty"
-      }
+      },
+      {
+        "name": "job_logs_fetched",
+        "verification_task": "workflow.job.logs.view",
+        "verification_input": {},
+        "condition": "non_empty"
+      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/benchmark/scenarios/workflows/ci-diagnose-run-wf-001.json` around
lines 20 - 27, The checkpoints array currently only asserts the
"workflow.run.view" capability (checkpoint "run_viewed"); add a second
checkpoint object that exercises the newly declared "workflow.job.logs.view"
capability so the log-fetch step is validated—e.g., add a checkpoint with name
"job_logs_viewed", verification_task "workflow.job.logs.view",
verification_input {} and condition "non_empty" into the same "checkpoints"
array to ensure the logs fetch is executed and asserted by the test harness.
docs/capabilities/pull-requests.md (2)

348-365: ⚠️ Potential issue | 🟡 Minor

Docs for pr.threads.reply are missing the new output fields.
commentId and commentUrl were added to the reply output, but the table only lists id and isResolved. Please add the new fields to prevent confusion for consumers.

✏️ Suggested doc tweak
 | Field | Type | Description |
 |-------|------|-------------|
 | id | string | Thread ID |
 | isResolved | boolean | Current resolution state |
+| commentId | string | Reply comment ID |
+| commentUrl | string | Reply comment URL |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/capabilities/pull-requests.md` around lines 348 - 365, Update the docs
for pr.threads.reply to reflect the new output fields: add commentId (string)
and commentUrl (string) to the Output table alongside the existing id and
isResolved entries; ensure the descriptions clarify that commentId is the
created comment's ID and commentUrl is a permalink to the created comment so
consumers know what to expect from pr.threads.reply.

158-181: ⚠️ Potential issue | 🟡 Minor

Update pr.update output field from isDraft to draft.

The registry card and CLI adapter return draft as the output field name, not isDraft. The docs are out of sync.

Evidence
  • Registry: packages/core/src/core/registry/cards/pr.update.yaml defines output_schema with draft: { type: boolean }
  • CLI Handler: packages/core/src/core/execution/adapters/cli/domains/pr.ts line 276 returns draft: ...
  • Tests: packages/core/test/unit/cli-capability-adapter.test.ts line 789 asserts result.data?.draft
-| isDraft | boolean | Current draft state |
+| draft | boolean | Current draft state |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/capabilities/pull-requests.md` around lines 158 - 181, The docs for the
capability pr.update are out-of-sync: change the Output table to return "draft"
(boolean) instead of "isDraft" so it matches the registry card's output_schema
(draft) and the CLI adapter that returns draft (and the tests asserting
result.data?.draft); update the pr.update docs' Output field name and
description accordingly to use "draft" consistently.
packages/core/src/core/execution/adapters/cli/domains/pr.ts (1)

460-462: ⚠️ Potential issue | 🟡 Minor

Stale capability name in error message: pr.review.submitpr.reviews.submit.

Line 462 still references the old singular form while every other reference in this handler was updated to pr.reviews.submit.

Proposed fix
-      if (body === null) throw new Error("Missing or invalid body for pr.review.submit")
+      if (body === null) throw new Error("Missing or invalid body for pr.reviews.submit")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/pr.ts` around lines 460
- 462, The error message uses the stale capability name "pr.review.submit";
update the thrown Error to reference the new capability name "pr.reviews.submit"
so it matches the rest of the handler — locate the branch where
parseNonEmptyString(params.body) is validated (the body variable check) and
replace the string inside the Error(...) from "Missing or invalid body for
pr.review.submit" to "Missing or invalid body for pr.reviews.submit".
🧹 Nitpick comments (6)
packages/core/test/integration/engine-issue-assignees-set.integration.test.ts (1)

7-29: Consider adding a happy-path test case for issue.assignees.set.

The suite currently only asserts the INPUT_VALIDATION path (missing issueId). Since this is a new capability introduced in this PR, a second it block covering a successful execution (stubbing the GitHub client to return a valid response and asserting result.ok === true) would give the surface meaningful end-to-end coverage and catch regressions in the handler/registry wiring.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/core/test/integration/engine-issue-assignees-set.integration.test.ts`
around lines 7 - 29, Add a happy-path integration test for the
"issue.assignees.set" task: create a second it block that constructs a
TaskRequest with task: "issue.assignees.set", input including issueId and
logins, stub createGithubClient().execute to return a successful GitHub response
for assigning assignees, call executeTask with that githubClient (and the same
options as the existing test), and assert result.ok === true and any expected
success shape in result.value; use the same helper types (TaskRequest) and
functions (createGithubClient, executeTask) so the test verifies
handler/registry wiring.
packages/core/src/core/registry/cards/workflow.run.rerun.failed.yaml (1)

1-1: Filename uses dots instead of kebab-case (guideline conflict).

The coding guideline requires kebab-case for all *.yaml files, which would make the correct filename workflow-run-rerun-failed.yaml. However, every registry card in this directory uses dots to mirror its capability_id, so this is a project-wide deliberate convention rather than an isolated oversight. No action needed within this PR, but worth noting if a future audit aligns the guideline with the convention (or vice-versa).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/workflow.run.rerun.failed.yaml` at line
1, The filename uses dots
(packages/core/src/core/registry/cards/workflow.run.rerun.failed.yaml) but the
guideline mandates kebab-case; rename the file to
workflow-run-rerun-failed.yaml, ensure the capability_id entry stays as
"workflow.run.rerun.failed", and update any references/imports or registry
indexes that point to the old filename so the registry continues to load the
card.
packages/core/src/core/registry/cards/issue.assignees.remove.yaml (1)

11-14: GitHub's assignee limit is confirmed at 10 per request, but this validation pattern is not applied schema-wide.

The API does enforce a 10-assignee maximum, and all sibling assignee cards (issue.assignees.add, pr.assignees.add, pr.assignees.remove) follow the same pattern of omitting maxItems. Based on the team's preference to rely on GitHub's API for limit enforcement rather than schema-level constraints, adding maxItems is not aligned with the current design approach. The suggestion is technically valid, but if applied, should be done consistently across all assignee operation cards as a follow-up refactor, not in isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/registry/cards/issue.assignees.remove.yaml` around
lines 11 - 14, Do not add a maxItems constraint to the assignees array in
issue.assignees.remove.yaml; leave the current schema (assignees: type: array,
items: string, minItems: 1) unchanged to remain consistent with the design
decision to rely on GitHub's API for limits. If you decide to enforce a
10-assignee max later, apply that change consistently across all sibling cards
(issue.assignees.add, pr.assignees.add, pr.assignees.remove) in a single
refactor rather than modifying only issue.assignees.remove.yaml.
packages/core/src/core/execution/adapters/cli/domains/project-v2.ts (2)

16-16: Optional: stale Get suffix in internal function names after capability rename to .view.

handleProjectV2OrgGet and handleProjectV2UserGet are now registered under project_v2.org.view / project_v2.user.view. The internal names still carry the old Get suffix, creating a minor gap between the capability ID and its handler name.

♻️ Suggested rename
-const handleProjectV2OrgGet: CliHandler = async (runner, params, _card) => {
+const handleProjectV2OrgView: CliHandler = async (runner, params, _card) => {
-const handleProjectV2UserGet: CliHandler = async (runner, params, _card) => {
+const handleProjectV2UserView: CliHandler = async (runner, params, _card) => {
-  "project_v2.org.view": handleProjectV2OrgGet,
-  "project_v2.user.view": handleProjectV2UserGet,
+  "project_v2.org.view": handleProjectV2OrgView,
+  "project_v2.user.view": handleProjectV2UserView,

Also applies to: 74-74

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/project-v2.ts` at line
16, Rename the stale handler names to match the new capability `.view`: change
handleProjectV2OrgGet to handleProjectV2OrgView and handleProjectV2UserGet to
handleProjectV2UserView, and update any places that reference these symbols
(registrations, exports, or internal calls) so the function identifiers and
their usages stay in sync with the capability IDs project_v2.org.view and
project_v2.user.view.

304-304: Optional: plural/singular inconsistency between handleProjectV2ItemAddIssue and handleProjectV2ItemsIssueRemove.

The add handler uses singular Item while the new remove handler uses plural Items. Both are registered under project_v2.items.* capability IDs.

♻️ Suggested rename
-const handleProjectV2ItemAddIssue: CliHandler = async (runner, params, _card) => {
+const handleProjectV2ItemsIssueAdd: CliHandler = async (runner, params, _card) => {
-  "project_v2.items.issue.add": handleProjectV2ItemAddIssue,
+  "project_v2.items.issue.add": handleProjectV2ItemsIssueAdd,

Also applies to: 467-523

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/project-v2.ts` at line
304, The handler naming is inconsistent: keep plurality consistent between
handleProjectV2ItemAddIssue and handleProjectV2ItemsIssueRemove (both are
registered under the project_v2.items.* capability IDs). Rename one so both use
the same singular/plural form (e.g., rename handleProjectV2ItemAddIssue →
handleProjectV2ItemsAddIssue or handleProjectV2ItemsIssueRemove →
handleProjectV2ItemIssueRemove), and update any references/registrations where
the handler name is used to match the new name (check the registration code
around the project_v2.items.* handlers and the implementations at both locations
referenced in the diff).
packages/core/test/unit/cli-domains-pr.test.ts (1)

273-281: Consider asserting annotations: [] on output items.

The handler now explicitly adds annotations: [] to each item and strips bucket, but this test only asserts name and state via toMatchObject. Since annotations was explicitly added as a schema fix in this PR, verifying its presence would prevent regressions.

Suggested assertion addition
       expect(result.data).toMatchObject({
         items: [
-          { name: "test", state: "PASS" },
-          { name: "lint", state: "FAIL" },
-          { name: "build", state: "PENDING" },
+          { name: "test", state: "PASS", annotations: [] },
+          { name: "lint", state: "FAIL", annotations: [] },
+          { name: "build", state: "PENDING", annotations: [] },
         ],
         summary: { total: 3, failed: 1, pending: 1, passed: 1 },
       })

You could also add an assertion that bucket is absent from items:

for (const item of (result.data as { items: Record<string, unknown>[] }).items) {
  expect(item).not.toHaveProperty("bucket")
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-domains-pr.test.ts` around lines 273 - 281, The
test currently only verifies name and state for items in result.data but the
handler now explicitly adds annotations: [] and strips bucket; update the
assertion in packages/core/test/unit/cli-domains-pr.test.ts (the expectations
around result.data.items) to also assert each item has annotations: [] and does
not have a bucket property (e.g., iterate over (result.data as { items:
Record<string, unknown>[] }).items and expect item.annotations toEqual [] and
expect(item).not.toHaveProperty('bucket')).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@docs/capabilities/pull-requests.md`:
- Around line 348-365: Update the docs for pr.threads.reply to reflect the new
output fields: add commentId (string) and commentUrl (string) to the Output
table alongside the existing id and isResolved entries; ensure the descriptions
clarify that commentId is the created comment's ID and commentUrl is a permalink
to the created comment so consumers know what to expect from pr.threads.reply.
- Around line 158-181: The docs for the capability pr.update are out-of-sync:
change the Output table to return "draft" (boolean) instead of "isDraft" so it
matches the registry card's output_schema (draft) and the CLI adapter that
returns draft (and the tests asserting result.data?.draft); update the pr.update
docs' Output field name and description accordingly to use "draft" consistently.

In `@packages/benchmark/scenarios/workflows/ci-diagnose-run-wf-001.json`:
- Around line 20-27: The checkpoints array currently only asserts the
"workflow.run.view" capability (checkpoint "run_viewed"); add a second
checkpoint object that exercises the newly declared "workflow.job.logs.view"
capability so the log-fetch step is validated—e.g., add a checkpoint with name
"job_logs_viewed", verification_task "workflow.job.logs.view",
verification_input {} and condition "non_empty" into the same "checkpoints"
array to ensure the logs fetch is executed and asserted by the test harness.

In `@packages/core/src/core/execution/adapters/cli/domains/pr.ts`:
- Around line 460-462: The error message uses the stale capability name
"pr.review.submit"; update the thrown Error to reference the new capability name
"pr.reviews.submit" so it matches the rest of the handler — locate the branch
where parseNonEmptyString(params.body) is validated (the body variable check)
and replace the string inside the Error(...) from "Missing or invalid body for
pr.review.submit" to "Missing or invalid body for pr.reviews.submit".

---

Duplicate comments:
In `@packages/core/src/core/registry/cards/issue.assignees.remove.yaml`:
- Around line 21-23: The output schema for the "removed" field already enforces
non-empty strings via items: { type: string, minLength: 1 } in the
issue.assignees.remove card, so no code change is required; leave the "removed"
array definition (items with minLength: 1) as-is to ensure empty strings are
rejected.

In
`@packages/core/test/integration/engine-issue-assignees-set.integration.test.ts`:
- Line 6: Describe block and filename already match the capability ID; no
changes required. Leave the describe("executeTask issue.assignees.set", ...)
as-is in engine-issue-assignees-set.integration.test.ts and proceed with the
approval/merge.

In
`@packages/core/test/integration/engine-issue-relations-prs-list.integration.test.ts`:
- Around line 6-7: No changes required — the test file and task identifiers have
been correctly renamed to match the capability ID; confirm that the describe/it
labels and the `task` fields inside the test (references to
"issue.relations.prs.list") are consistent and leave the file
`engine-issue-relations-prs-list.integration.test.ts` as-is.

---

Nitpick comments:
In `@packages/core/src/core/execution/adapters/cli/domains/project-v2.ts`:
- Line 16: Rename the stale handler names to match the new capability `.view`:
change handleProjectV2OrgGet to handleProjectV2OrgView and
handleProjectV2UserGet to handleProjectV2UserView, and update any places that
reference these symbols (registrations, exports, or internal calls) so the
function identifiers and their usages stay in sync with the capability IDs
project_v2.org.view and project_v2.user.view.
- Line 304: The handler naming is inconsistent: keep plurality consistent
between handleProjectV2ItemAddIssue and handleProjectV2ItemsIssueRemove (both
are registered under the project_v2.items.* capability IDs). Rename one so both
use the same singular/plural form (e.g., rename handleProjectV2ItemAddIssue →
handleProjectV2ItemsAddIssue or handleProjectV2ItemsIssueRemove →
handleProjectV2ItemIssueRemove), and update any references/registrations where
the handler name is used to match the new name (check the registration code
around the project_v2.items.* handlers and the implementations at both locations
referenced in the diff).

In `@packages/core/src/core/registry/cards/issue.assignees.remove.yaml`:
- Around line 11-14: Do not add a maxItems constraint to the assignees array in
issue.assignees.remove.yaml; leave the current schema (assignees: type: array,
items: string, minItems: 1) unchanged to remain consistent with the design
decision to rely on GitHub's API for limits. If you decide to enforce a
10-assignee max later, apply that change consistently across all sibling cards
(issue.assignees.add, pr.assignees.add, pr.assignees.remove) in a single
refactor rather than modifying only issue.assignees.remove.yaml.

In `@packages/core/src/core/registry/cards/workflow.run.rerun.failed.yaml`:
- Line 1: The filename uses dots
(packages/core/src/core/registry/cards/workflow.run.rerun.failed.yaml) but the
guideline mandates kebab-case; rename the file to
workflow-run-rerun-failed.yaml, ensure the capability_id entry stays as
"workflow.run.rerun.failed", and update any references/imports or registry
indexes that point to the old filename so the registry continues to load the
card.

In
`@packages/core/test/integration/engine-issue-assignees-set.integration.test.ts`:
- Around line 7-29: Add a happy-path integration test for the
"issue.assignees.set" task: create a second it block that constructs a
TaskRequest with task: "issue.assignees.set", input including issueId and
logins, stub createGithubClient().execute to return a successful GitHub response
for assigning assignees, call executeTask with that githubClient (and the same
options as the existing test), and assert result.ok === true and any expected
success shape in result.value; use the same helper types (TaskRequest) and
functions (createGithubClient, executeTask) so the test verifies
handler/registry wiring.

In `@packages/core/test/unit/cli-domains-pr.test.ts`:
- Around line 273-281: The test currently only verifies name and state for items
in result.data but the handler now explicitly adds annotations: [] and strips
bucket; update the assertion in packages/core/test/unit/cli-domains-pr.test.ts
(the expectations around result.data.items) to also assert each item has
annotations: [] and does not have a bucket property (e.g., iterate over
(result.data as { items: Record<string, unknown>[] }).items and expect
item.annotations toEqual [] and expect(item).not.toHaveProperty('bucket')).

…n test

- Remove `| null` from IssueMilestoneSetInput.milestoneNumber in types.ts
  to match the YAML schema (integer minimum:1, no null); issue.milestone.clear
  owns the clearing concern
- Remove dead null branch from runIssueMilestoneSet and tighten
  assertIssueMilestoneSetInput to reject null unconditionally
- Add ADAPTER_UNSUPPORTED integration test for project_v2.items.issue.remove
  to match the standard two-test pattern used by other CLI-only capabilities

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@aryeko
Copy link
Owner Author

aryeko commented Feb 20, 2026

Follow-up fix (6560a6c): Addressed the outside-diff CodeRabbit finding on packages/core/src/gql/types.ts:62-65.

  • IssueMilestoneSetInput.milestoneNumber changed from number | null to number — the YAML schema for issue.milestone.set requires { type: integer, minimum: 1 } and issue.milestone.clear owns the null/clearing path, so null should never reach the GraphQL layer.
  • Removed the dead \!== null guard in assertIssueMilestoneSetInput and tightened the error message.
  • Removed the dead if (input.milestoneNumber \!== null) branch in runIssueMilestoneSet — the lookup now runs unconditionally.

@aryeko aryeko merged commit d3564e2 into main Feb 20, 2026
5 of 6 checks passed
@aryeko aryeko deleted the feat/benchmark-scenario-composite-update branch February 20, 2026 00:13
aryeko added a commit that referenced this pull request Feb 20, 2026
Resolved conflicts from feat(core): capability namespace standardization:
- registry/index.ts: use renamed relation capability IDs
- registry/types.ts: keep InjectSpec/ResolutionConfig types (needed by chaining)
- routing/engine.ts: keep mapErrorToCode + batch/resolve imports for executeTasks
- test files: adopt renamed IDs, fix issue.labels.update → issue.labels.set in
  resolution tests, keep executeTasks chaining test suite

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aryeko added a commit that referenced this pull request Feb 20, 2026
…ename

Update docs/capabilities/README.md to reflect all renames and additions
from the capability namespace standardization (PR #58):

- Issues: 19 → 23; fix 8 stale IDs (labels.update→labels.set,
  assignees.update→assignees.set, parent.*→relations.parent.*,
  blocked_by.*→relations.blocked_by.*, linked_prs.list→relations.prs.list,
  relations.get→relations.view); add 4 missing atomics (labels.remove,
  assignees.add, assignees.remove, milestone.clear)
- Releases: create_draft→create, get→view, publish_draft→publish
- Workflows: get→view, dispatch.run→dispatch, rerun_all→rerun.all,
  rerun_failed→rerun.failed, job.logs.get→job.logs.view
- Projects V2: 6 → 7; org.get→org.view, user.get→user.view,
  item.add_issue→items.issue.add, item.field.update→items.field.update;
  add project_v2.items.issue.remove
- Mermaid graph: update counts, remove retired Check Runs node
- Domain docs: remove Check Runs entry (0 capabilities remaining)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aryeko added a commit that referenced this pull request Feb 20, 2026
#59)

* docs: add atomic capability chaining design

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify validation scope in chaining design

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: update chaining design — concurrent Promise.all approach + CLI interface

- Replace single-document GQL batching with concurrent Promise.all via
  getGraphqlHandler registry (handles multi-step caps naturally)
- Add ghx chain --steps CLI interface
- Remove buildBatchQuery/parseOperation from additions (no longer needed)
- Add composite schema + preferredOrder cleanup to deletions table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: update design doc + write implementation plan for card-driven two-phase batch chaining

* feat(core): add graphql.resolution schema and types to OperationCard

* feat(core): add graphql.resolution blocks to 6 capability cards

* feat(core): add buildBatchQuery and generalise parseOperation in batch.ts

* feat(core): add document-registry for lookup and mutation GQL documents

* feat(core): add ChainResultEnvelope, ChainStepResult, ChainStatus types

* feat(core): add resolution engine helpers (applyInject, buildMutationVars)

* fix(core): revert error message to 'mutation' for backward compatibility

* refactor(core): delete composite capability system

Remove the unused composite capability system (issue.triage.composite,
issue.update.composite, pr.threads.composite) and update all references.
Composite capabilities were an early approach to multi-step execution that
has been superseded by the new atomic chaining feature (executeTasks).

This includes:
- Delete composite.ts executor module
- Delete 3 composite YAML capability cards
- Remove composite fields from operation card schema and types
- Remove composite routing logic from engine
- Delete 5 composite test files
- Update registry and CLI tests to remove composite IDs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(core): export executeTasks and chain result types

Add public API exports for the new atomic chaining feature:
- executeTasks() function for executing multiple capabilities in one call
- ChainResultEnvelope, ChainStepResult, ChainStatus types for chain results

This enables agents to use the new chaining capability via the public API.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add changeset for atomic capability chaining feature

Add changeset documenting the new atomic capability chaining feature and
removal of the unused composite system.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: update documentation for atomic capability chaining

- Remove all references to deleted composite capabilities
  (pr.threads.composite, issue.triage.composite, issue.update.composite)
- Update capability count from 69 → 66 throughout (Issues 21→19, PRs 22→21)
- Add Card-Defined Resolution section to operation-cards.md explaining
  the graphql.resolution block (lookup + inject, scalar/map_array variants)
- Add new guide: docs/guides/chaining-capabilities.md covering executeTasks()
  API, ghx chain CLI, ChainResultEnvelope/ChainStatus/ChainStepResult types,
  two-phase execution model with mermaid sequence diagram, and error handling
- Update library-api.md: add executeTasks to imports, replace batch section
  with atomic chain + parallel queries sub-sections
- Update cli-usage.md: replace batch ops section with ghx chain examples
- Update result-envelope.md: add Chain Result Envelope section
- Update agent-integration.md: add executeTasks as available tool
- Update getting-started/first-task.md: add Chain Mutations Atomically section
- Update getting-started/how-it-works.md: add chaining sequence diagram
- Update repository-structure.md: remove composite.ts, add document-registry.ts
  and resolve.ts rows

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: preserve GraphQL fragments in batch mutations and fix labelableId injection

Bug 1 (document-registry.ts / batch.ts): IssueCreateDocument contains a
fragment spread (...IssueCoreFields) with the fragment definition appended
as a template literal. buildBatchMutation's parseOperation only extracted
the operation body, dropping the trailing fragment definition. GitHub would
reject the batched mutation with an unknown-fragment error.

Fix: parseOperation now extracts fragment definitions that appear after the
operation's closing brace. buildBatchMutation collects unique fragments
across all operations and appends them to the composed BatchComposite document.

Bug 2 (issue.labels.add.yaml): IssueLabelsAdd requires $labelableId but
the card input uses issueId. buildMutationVars passes through input fields
by name-matching mutation variables, so labelableId was never populated,
causing all chains containing issue.labels.add to fail at batch execution.

Fix: Add InputPassthroughInject (source: "input") to the InjectSpec union,
which reads a value directly from the step input rather than from the Phase 1
lookup result. Add corresponding handling in applyInject and a new oneOf
variant in the operation card JSON schema. Update issue.labels.add.yaml to
inject labelableId from the issueId input field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(capabilities): fix stale capability IDs after PR #58 namespace rename

Update docs/capabilities/README.md to reflect all renames and additions
from the capability namespace standardization (PR #58):

- Issues: 19 → 23; fix 8 stale IDs (labels.update→labels.set,
  assignees.update→assignees.set, parent.*→relations.parent.*,
  blocked_by.*→relations.blocked_by.*, linked_prs.list→relations.prs.list,
  relations.get→relations.view); add 4 missing atomics (labels.remove,
  assignees.add, assignees.remove, milestone.clear)
- Releases: create_draft→create, get→view, publish_draft→publish
- Workflows: get→view, dispatch.run→dispatch, rerun_all→rerun.all,
  rerun_failed→rerun.failed, job.logs.get→job.logs.view
- Projects V2: 6 → 7; org.get→org.view, user.get→user.view,
  item.add_issue→items.issue.add, item.field.update→items.field.update;
  add project_v2.items.issue.remove
- Mermaid graph: update counts, remove retired Check Runs node
- Domain docs: remove Check Runs entry (0 capabilities remaining)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(core): register 11 graphql mutations in document registry for atomic chaining

* feat(core): add pr.reviews.submit to atomic chaining with PrNodeId resolution

- Add resolution block to pr.reviews.submit.yaml: looks up pullRequestId
  via PrNodeId query (owner/name/prNumber → repository.pullRequest.id),
  injects pullRequestId into PrReviewSubmit mutation
- Register PrNodeId in LOOKUP_DOCUMENTS
- Register PrReviewSubmit in MUTATION_DOCUMENTS

pr.reviews.submit is now the 7th fully-wired batchable capability.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add atomic chaining follow-up work tracker

Captures 8 areas of follow-up work identified after implementation:
SKILL.md updates, benchmark scenarios, GQL partial error handling,
cross-step data passing, CLI-only mutation expansion, resolution caching,
test coverage gaps, and pr.reviews.submit path validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chain): safe json parse, 30s abort timeout, catch sync throws in chainCommand

* fix(engine): track preflight errors by step index; validate lookup vars in input

* test(engine): fix vi.doMock, add attribution + lookup-var + mixed-resolution tests

- Replace ineffective vi.doMock tests with vi.resetModules() + dynamic import
  pattern so mocks actually intercept module-level singletons
- Add assertion that getMutationDocumentMock and buildBatchMutationMock are
  called (proves mocks are effective, not just that the function returns)
- Add CR-3 test: same capability twice, only one fails pre-flight → error
  attributed to the correct step index (not matched by task name)
- Add CL-4 test: lookup var missing from input → pre-flight rejects that step
- Add CL-7 test: mixed-resolution chain (step 0 has no resolution, step 1
  requires Phase 1 lookup) → both phases called, both steps succeed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(types): document scalar, map_array, and input inject source types

Add JSDoc comments to ScalarInject, MapArrayInject, and InputPassthroughInject
interfaces explaining when each is used, with YAML examples showing the
required fields. Expand the inject source variants table in operation-cards.md
to cover all three source types (including the previously undocumented `input`
passthrough) and add a tip recommending `source: input` when the caller already
has the node ID.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract LookupSpec interface and use @core/ alias imports in chain.ts

* docs: fix ResolutionBlock→ResolutionConfig, stale capability IDs, and plan table

- Replace ResolutionBlock with ResolutionConfig in operation-cards.md diagram to match actual TypeScript type name
- Update stale capability IDs across docs (except docs/plans/): issue.labels.update → issue.labels.set and issue.assignees.update → issue.assignees.set
- Remove issue.comments.create row from 2026-02-20-atomic-chaining-followup.md (already batchable, shouldn't be in non-batchable candidates table)

Files modified:
- docs/architecture/operation-cards.md: ResolutionBlock → ResolutionConfig in diagram
- docs/benchmark/workflow-roadmap.md: capability ID updates
- docs/capabilities/issues.md: capability ID updates
- docs/getting-started/README.md: capability ID updates
- docs/getting-started/first-task.md: capability ID updates
- docs/guides/chaining-capabilities.md: capability ID updates
- docs/guides/cli-usage.md: capability ID updates
- docs/guides/library-api.md: capability ID updates
- docs/guides/result-envelope.md: capability ID updates
- docs/plans/2026-02-20-atomic-chaining-followup.md: removed issue.comments.create row

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(engine): queryMock assertion, 'in' operator for alias key, derive retryable from error code

Task 1: Add queryMock.toHaveBeenCalledTimes(2) assertion in mixed-resolution chain test
Task 7: Use 'in' operator to safely detect missing mutation result alias keys
Task 8: Add isRetryableCode() helper and derive retryable from error code in Phase 1 and Phase 2 catch blocks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(gql): replace fragment regex with brace-depth counting in batch.ts

The previous `[^}]*` regex for fragment extraction truncated bodies at
the first `}`, breaking nested selections like `labels { nodes { id } }`.
Switch to the same brace-depth counting approach already used for the
operation body extraction above it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chain): resolve GQL endpoint from env vars; broaden ChainResultEnvelope type; null-guard in operator

- chain.ts: replace hardcoded GITHUB_GRAPHQL_ENDPOINT with resolveGraphqlUrl()
  from transport.ts so GITHUB_GRAPHQL_URL and GH_HOST are respected
- transport.ts: export resolveGraphqlUrl() for reuse
- envelope.ts: broaden ChainResultEnvelope.meta.route_used from "graphql" literal
  to RouteSource union (chains can route via cli/rest/graphql)
- engine.ts: forward result.meta.route_used in 1-item path; add null/non-object
  guard before `in` operator on rawMutResult
- docs: fix WeakMap → Map note in followup plan; fix sed -i '' → perl -pi cross-platform

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments