feat(agents): consolidate code-review agents and add orientation-first review loop#2100
feat(agents): consolidate code-review agents and add orientation-first review loop#2100WilliamBerryiii wants to merge 6 commits into
Conversation
…orchestrator - replace per-perspective code-review agents/prompts with one orchestrator plus five perspective subagents (functional, standards, accessibility, security, pr) - remove hve-core/pr-review agent and code-review-full/functional prompts - add supply-chain-reviewer agent and skill-assessor subagent - update docs, collections, plugins, and refresh ms.date to 2026-06-19 🔄 - Generated by Copilot
…onsolidation # Conflicts: # .github/CUSTOM-AGENTS.md # .github/agents/coding-standards/code-review-accessibility.agent.md # .github/agents/coding-standards/code-review-functional.agent.md # .github/agents/coding-standards/code-review-standards.agent.md # collections/coding-standards.collection.md # collections/coding-standards.collection.yml # collections/hve-core-all.collection.md # collections/hve-core-all.collection.yml # collections/hve-core.collection.md # collections/hve-core.collection.yml # collections/security.collection.md # docs/hve-guide/README.md # docs/hve-guide/roles/tech-lead.md # plugins/coding-standards/README.md # plugins/hve-core-all/README.md # plugins/hve-core/README.md # plugins/security/README.md
Reshape the code-review orchestrator around an orientation floor and human-steered walk-back loop. Add code-review-explainer and code-review-walkback subagents, convert pr-walkthrough into the code-review-pr detailer, and remove the standalone pr-walkthrough agent. Add dispatch-loop, emission-modes, cross-skill-forks, and walkthrough-protocol skill references. Register subagent folders in VS Code settings and regenerate plugins/extension outputs. Refs #2090
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2100 +/- ##
==========================================
- Coverage 80.52% 80.52% -0.01%
==========================================
Files 128 128
Lines 19260 19260
Branches 12 12
==========================================
- Hits 15510 15509 -1
- Misses 3747 3748 +1
Partials 3 3
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
… structure - standardize response field definitions in tables - improve readability of required fields for each subagent - clarify usage descriptions for report generation and verification processes 🔧 - Generated by Copilot
There was a problem hiding this comment.
The collection still references four agent files and two prompt files that this PR deletes, and is missing the new orchestrator, its subagents, and the new skill.
Dead references (files deleted by this PR):
.github/agents/coding-standards/code-review-accessibility.agent.md.github/agents/coding-standards/code-review-full.agent.md.github/agents/coding-standards/code-review-functional.agent.md.github/agents/coding-standards/code-review-standards.agent.md.github/prompts/coding-standards/code-review-full.prompt.md.github/prompts/coding-standards/code-review-functional.prompt.md
Missing entries (new files introduced by this PR):
.github/agents/coding-standards/code-review.agent.md(new orchestrator)- All seven subagents under
.github/agents/coding-standards/subagents/ .github/skills/coding-standards/code-review(new skill)
The same issue applies to collections/hve-core.collection.yml (still lists the deleted hve-core/pr-review.agent.md and hve-core/pr-walkthrough.agent.md) and collections/hve-core-all.collection.yml (inherits all of the above dead references).
npm run lint:collections-metadata was not listed in the Testing section; running it against this branch would surface all three files as validation failures. After updating the three collection YAMLs, re-run npm run plugin:generate to keep the plugin outputs in sync.
|
|
||
| imports: | ||
| - ../agents/hve-core/pr-review.agent.md | ||
| - ../agents/coding-standards/code-review.agent.md |
There was a problem hiding this comment.
This import still points to the old agent path that this PR deletes.
imports:
- ../agents/hve-core/pr-review.agent.md # file no longer exists after this PRThe agent was moved to .github/agents/pr-review.agent.md (repo root). The import should be updated to match:
imports:
- ../agents/pr-review.agent.mdAfter updating the source, run gh aw compile .github/workflows/pr-review.md to regenerate pr-review.lock.yml. The lock file currently embeds the stale path in three places (GH_AW_AGENT_FILE, GH_AW_AGENT_IMPORT_SPEC, and the {{#runtime-import}} directive), so recompilation is required. Without this fix, the /review slash command workflow will fail at runtime after merge.
Headline concern: we lose the general-purpose
|
katriendg
left a comment
There was a problem hiding this comment.
PR review feedback (after testing)
After running the consolidated code-review agent's PR-review path a few times, here is some general feedback grouped by theme. The diff/instruction-level notes are attached as inline comments on code-review.agent.md (lines 87–88), since their real subject — diff-computation.instructions.md — is unchanged in this PR and can't take inline anchors directly.
The behavior of the agent is really good from a normal flow perspective! I'm focusing here on the sense of lost workflow steps I may typically make, so my feedbackis really focused on the "human-in-the-loop" aspects of the previous pr-review flow that are no longer present in the consolidated code-review flow. The agent is doing a great job of automating the review process, but there are some areas where human judgment and control were previously integrated into the workflow that are now missing.
Also happy to take these out of this PR, and keep a post-PR follow-up where I come up with some of the changes?
What I'm losing: the human-in-the-loop workflow
The consolidation drops three pieces of the previous PR-review discipline I relied on: a human-editable draft before emission, a human gate on posting, and PR-state validation. Together these change the flow from "agent memory + human judgment, then post" to "agent memory only, then post."
No pre-post editable draft (in-progress-review lost) — design question. The old pr-review flow produced an in-progress-review.md living document plus a consolidated handoff.md that a human could edit before anything went out. The new code-review flow persists diff-state.json, dispatch-manifest.json, per-perspective *-findings.json, review.md, and metadata.json — which is good for memory hygiene — but none of these is a human-editable draft staged before emission. review.md is written in Step 7.6, but emission isn't gated on the human editing it. For reviewers who like to hand-tune comment wording before posting, that staging step is structurally gone. Could we add an explicit "draft for review" artifact (or treat review.md as a stop-and-edit gate) before emission?
Here's a bit of context: in my own workflow today, I like to use the handoff.md file in .copilot-tracking/pr/review/**/ as a staging area to hand-edit the findings before posting. The new flow doesn't have a human-editable draft before emission, so that step is gone. It would be nice to have a draft artifact that the human can edit before the agent posts comments. It's all about combining the agent's memory with the human's judgment before anything goes out. The current flow is more like "agent memory only, then post," which is a change in workflow. After testing the PR review function of the code-review agent a few times today, one thing I noticed is that handoff.md/memory is no longer that optimal. One of the examples. it builds a ready to submit review in a /tmp/review.py script, not editable, just for the agent to work with and post using gh. And then after posting it asks Nothing was committed. Want me to clean up the temp payload files in tmp?, which means it knows about temp files. Maybe it's mental but I miss the previous flow...
Emission has no human gate; it posts when a poster is available — blocking. The agent describes itself as "human-gated," but every human gate (Steps 2, 3, 5: orientation, perspective/depth, walk-back) sits before the merge. Emission is not gated on the human at all: emission-modes.md → Gating rules gate only on poster capability ("Detect the available poster capability before emission… Only emit in a native format when the target and capability are both available"), and Step 7.4 says "Prefer native PR/MR/ADO/GitHub comments when a suitable poster is available." So once the findings merge and a poster (for example, an authenticated gh) is present, the agent posts to the PR with no confirmation step. Posting a PR comment is an irreversible, shared-system action; in practice this means a user who asks the agent to "draft a handoff before posting" can still get comments posted without approval. Recommend an explicit human confirmation gate before emission (or a draft-only / dry-run default that requires an opt-in to post), and align the "human-gated" self-description with that behavior.
Lost PR validation (mergeable state, checkboxes, issue alignment). The PR review request no longer seems to validate Pull request (search for it in repo, or even align to the associated issue without requesting). It does not seem to validate all checkboxes are done as expected, and validate mergeable state. The previous PR review did this.
Recovering a discoverable PR-review entrypoint
Keep a /pr-review prompt as the PR-review entrypoint — suggestion. Consider shipping a thin /pr-review prompt that routes to the code-review agent with PR-framed pre-fill (PR/base inputs, a default perspective set such as pr + functional/security, and a sensible depth). It would recover a discoverable PR-review entrypoint and it's the natural place to encode the "draft first, then post on my go-ahead" discipline that the agent itself doesn't enforce (see the emission-gate note above). There is no code-review/pr-review prompt today (no .github/prompts/coding-standards/ dir; the old pr-review was an agent, not a prompt), so this is net-new. Note the "switch to the right agent" behavior is host-dependent — in CLI/plugin a prompt maps to a command — so phrase it to invoke the agent with PR framing rather than assuming an agent-switch UI.
| ### Step 1: Tier 0 Context Bootstrap | ||
|
|
||
| 1. Read the Skill Reference Contract files (above) in one parallel block. | ||
| 2. Compute the diff once. Use the Decision Tree in #file:../../instructions/coding-standards/code-review/diff-computation.instructions.md to determine the diff type, then generate the structured diff via the `pr-reference` skill (`generate.sh --base-branch auto --merge-base --exclude-ext min.js,min.css,map`) and the changed-file list (`list-changed-files.sh --exclude-type deleted --format plain`). Apply the Non-Source Artifact Skip List and Large Diff Handling rules. Capture the base branch, branch name, changed-file surface, and extensions. |
There was a problem hiding this comment.
Restore the explicit diff output path (non-blocking, recommended). The old pr-review agent generated the reference with an explicit --output "{{tracking_dir}}/pr-reference.xml", while the new flow (diff-computation.instructions.md → Feature Branch Diff) runs generate.sh --base-branch auto --merge-base ... with no --output and then hardcodes diffPatchPath: .copilot-tracking/pr/pr-reference.xml in diff-state.json. It works today only because that path happens to equal the skill's default output. I'd suggest passing --output explicitly (and deriving diffPatchPath from it) so the value is set in one place. That keeps the path overridable for downstream users and automation hosts without silently desyncing the generator output from what the perspective subagents read.
(Anchored here on the import site; the underlying file diff-computation.instructions.md is unchanged in this PR.)
| ### Step 1: Tier 0 Context Bootstrap | ||
|
|
||
| 1. Read the Skill Reference Contract files (above) in one parallel block. | ||
| 2. Compute the diff once. Use the Decision Tree in #file:../../instructions/coding-standards/code-review/diff-computation.instructions.md to determine the diff type, then generate the structured diff via the `pr-reference` skill (`generate.sh --base-branch auto --merge-base --exclude-ext min.js,min.css,map`) and the changed-file list (`list-changed-files.sh --exclude-type deleted --format plain`). Apply the Non-Source Artifact Skip List and Large Diff Handling rules. Capture the base branch, branch name, changed-file surface, and extensions. |
There was a problem hiding this comment.
Diff guidance leans bash-only (non-blocking). diff-computation.instructions.md shows only the bash script names (generate.sh, list-changed-files.sh, read-diff.sh) and bash-only fallback commands, even though the pr-reference skill ships PowerShell equivalents (generate.ps1 -OutputPath, list-changed-files.ps1, read-diff.ps1). On Windows hosts this reads as the only option. Either add the PowerShell variants alongside the bash ones, or have the agent/instructions point the reader to the pr-reference skill for the PowerShell counterpart (the agent's own Step 4 already provides both shells for the folder cleanup, so the instructions are the odd one out).
(Anchored here on the import site; the underlying file diff-computation.instructions.md is unchanged in this PR.)
|
|
||
| 1. Read the Skill Reference Contract files (above) in one parallel block. | ||
| 2. Compute the diff once. Use the Decision Tree in #file:../../instructions/coding-standards/code-review/diff-computation.instructions.md to determine the diff type, then generate the structured diff via the `pr-reference` skill (`generate.sh --base-branch auto --merge-base --exclude-ext min.js,min.css,map`) and the changed-file list (`list-changed-files.sh --exclude-type deleted --format plain`). Apply the Non-Source Artifact Skip List and Large Diff Handling rules. Capture the base branch, branch name, changed-file surface, and extensions. | ||
| 3. Apply the working-tree supplement from the Feature Branch Diff case in diff-computation.instructions.md to capture untracked, unstaged, and staged files. Merge surviving paths into the changed-file list, deduplicating against the committed diff. |
There was a problem hiding this comment.
The applyTo glob never fires in plugin/extension distributions (validate). diff-computation.instructions.md declares applyTo: "**/.github/agents/coding-standards/**, **/.github/prompts/coding-standards/**", but that auto-attach can only match in-repo. In the CLI plugin and VS Code extension distributions the .github/ segment is stripped (agents become agents/coding-standards/..., prompts commands/coding-standards/...), so the glob matches nothing and the instruction never auto-attaches there. The prompts arm also matches nothing today — there are no coding-standards prompts. The reason the workflow still works in distribution is that code-review.agent.md explicitly #file:-imports this file (these lines, and again in Step 7), and the file is bundled in the coding-standards collection. So the guidance is delivered by the #file: import, not by applyTo. Two follow-ups worth confirming: (1) the applyTo is misleading and could be narrowed or dropped to avoid implying auto-attach coverage that doesn't exist in distribution; (2) any future coding-standards agent or prompt that relies on auto-attach rather than an explicit #file: import would silently lose this guidance once distributed — so the convention should be "import it, don't rely on applyTo."
Description
This PR consolidates the code-review agent surface into a single human-gated orchestrator and layers an orientation-first ("Scenario D") review loop on top of it.
Code-review agent consolidation. The four standalone code-review agents (
code-review-full,code-review-functional,code-review-standards,code-review-accessibility) were retired as user-facing entrypoints and re-homed undersubagents/. A single orchestrator,.github/agents/coding-standards/code-review.agent.md, became the lone human-facing entrypoint. It dispatches the perspective subagents (functional, standards, security, accessibility) and gates all findings behind human review. The code-review skill was reorganized intoSKILL.mdplus areferences/set (context-bootstrap,depth-tiers,lens-checklists,output-formats,severity-taxonomy). The standalonepr-reviewagent was removed; a thin/pr-reviewprompt (.github/prompts/hve-core/pr-review.prompt.md) was added to the flagshiphve-corecollection as the PR-review entrypoint, routing to the consolidatedcode-revieworchestrator. The orchestrator, its subagents, the two code-review instructions, and the code-review skill remain physically homed undercoding-standardsand are surfaced through thehve-core,coding-standards, andhve-core-allcollections. A new securitysupply-chain-revieweragent plus itssupply-chain-skill-assessorsubagent were added.Orientation-first review loop. The orchestrator was reshaped into a three-stage loop: an orientation floor (Register 1, factual walkthrough), a dispatch board (bookmark → dispatch → walk-back), and a human-steered Register 2 deep-research phase. Two new subagents support the loop:
code-review-explainer(Register 1 factual symbol/function answers) andcode-review-walkback(Register 2 deep-research wrapper over the Researcher Subagent). The standalonepr-walkthroughagent was removed and its role absorbed intocode-review-practing as the orientation detailer.Persistence contract. The
review-artifacts.instructions.mdfile was extended with the orientation-first artifact set (diff-state.json,dispatch-manifest.json,dispatch-board.md,walkthrough.md,emission-record.json,explanations/,walkback/, and anartifactsmetadata object), reusing the existingdiff-state.jsonconventions rather than introducing parallel state.Configuration, collections, packaging, and docs.
.vscode/settings.jsonregistered thesubagents/folders underchat.agentFilesLocationsfor discovery. Thecoding-standards,hve-core,hve-core-all, andsecuritycollections were updated, andplugins/**andextension/**outputs were regenerated. Documentation acrossdocs/agents/**,docs/contributing/,docs/architecture/,docs/hve-guide/**, and related files was updated, and the cspell word lists gainedASEC,hotspot, andwalkback.Related Issue(s)
Closes #2070
Closes #2071
Closes #2090
Type of Change
Select all that apply:
Code & Documentation:
Infrastructure & Configuration:
AI Artifacts:
prompt-builderagent and addressed all feedback.github/instructions/*.instructions.md).github/prompts/*.prompt.md).github/agents/*.agent.md).github/skills/*/SKILL.md)Other:
.ps1,.sh,.py)Testing
Validation was run locally against the changed files:
npm run plugin:generateΓÇö 13 plugins regenerated (Agents 147, Commands 152, Instructions 179, Skills 88), 0 errors.npm run lint:mdΓÇö 235 files, 0 errors.npm run lint:frontmatterΓÇö 715 files, 0 errors.npm run validate:skillsΓÇö 37 skills, 0 errors.npm run spell-checkΓÇö 901 files, 0 issues.npm run lint:psΓÇö no PowerShell files in the diff.Checklist
Required Checks
AI Artifact Contributions
/prompt-analyzeto review contributionprompt-builderreviewRequired Automated Checks
The following validation commands must pass before merging:
npm run lint:mdnpm run spell-checknpm run lint:frontmatternpm run validate:skillsnpm run lint:md-linksnpm run lint:psnpm run plugin:generatenpm run docs:testSecurity Considerations
Additional Notes
code-review-full,code-review-functional,code-review-standards,code-review-accessibility) and the standalonepr-walkthroughagent, replacing them with a singlecode-revieworchestrator. It is flagged as a breaking change for that reason.plugins/**symlinks/manifests andextension/**READMEs, not hand-authored logic.npm run lint:md-linksreports failures across 14 files, but these are pre-existing repo-wide link issues unrelated to this PR (git diff --name-only origin/main...HEADdoes not include any of the affected files).npm run docs:testhas not been run locally and should be validated in CI.