Skip to content

feat(agents): consolidate code-review agents and add orientation-first review loop#2100

Open
WilliamBerryiii wants to merge 6 commits into
mainfrom
feature/code-review-consolidation
Open

feat(agents): consolidate code-review agents and add orientation-first review loop#2100
WilliamBerryiii wants to merge 6 commits into
mainfrom
feature/code-review-consolidation

Conversation

@WilliamBerryiii

@WilliamBerryiii WilliamBerryiii commented Jun 21, 2026

Copy link
Copy Markdown
Member

Description

This PR consolidates the code-review agent surface into a single human-gated orchestrator and layers an orientation-first ("Scenario D") review loop on top of it.

Code-review agent consolidation. The four standalone code-review agents (code-review-full, code-review-functional, code-review-standards, code-review-accessibility) were retired as user-facing entrypoints and re-homed under subagents/. A single orchestrator, .github/agents/coding-standards/code-review.agent.md, became the lone human-facing entrypoint. It dispatches the perspective subagents (functional, standards, security, accessibility) and gates all findings behind human review. The code-review skill was reorganized into SKILL.md plus a references/ set (context-bootstrap, depth-tiers, lens-checklists, output-formats, severity-taxonomy). The standalone pr-review agent was removed; a thin /pr-review prompt (.github/prompts/hve-core/pr-review.prompt.md) was added to the flagship hve-core collection as the PR-review entrypoint, routing to the consolidated code-review orchestrator. The orchestrator, its subagents, the two code-review instructions, and the code-review skill remain physically homed under coding-standards and are surfaced through the hve-core, coding-standards, and hve-core-all collections. A new security supply-chain-reviewer agent plus its supply-chain-skill-assessor subagent were added.

Orientation-first review loop. The orchestrator was reshaped into a three-stage loop: an orientation floor (Register 1, factual walkthrough), a dispatch board (bookmark → dispatch → walk-back), and a human-steered Register 2 deep-research phase. Two new subagents support the loop: code-review-explainer (Register 1 factual symbol/function answers) and code-review-walkback (Register 2 deep-research wrapper over the Researcher Subagent). The standalone pr-walkthrough agent was removed and its role absorbed into code-review-pr acting as the orientation detailer.

Persistence contract. The review-artifacts.instructions.md file was extended with the orientation-first artifact set (diff-state.json, dispatch-manifest.json, dispatch-board.md, walkthrough.md, emission-record.json, explanations/, walkback/, and an artifacts metadata object), reusing the existing diff-state.json conventions rather than introducing parallel state.

Configuration, collections, packaging, and docs. .vscode/settings.json registered the subagents/ folders under chat.agentFilesLocations for discovery. The coding-standards, hve-core, hve-core-all, and security collections were updated, and plugins/** and extension/** outputs were regenerated. Documentation across docs/agents/**, docs/contributing/, docs/architecture/, docs/hve-guide/**, and related files was updated, and the cspell word lists gained ASEC, hotspot, and walkback.

Related Issue(s)

Closes #2070
Closes #2071
Closes #2090

Type of Change

Select all that apply:

Code & Documentation:

  • Bug fix (non-breaking change fixing an issue)
  • New feature (non-breaking change adding functionality)
  • Breaking change (fix or feature causing existing functionality to change)
  • Documentation update

Infrastructure & Configuration:

  • GitHub Actions workflow
  • Linting configuration (markdown, PowerShell, etc.)
  • Security configuration
  • DevContainer configuration
  • Dependency update

AI Artifacts:

  • Reviewed contribution with prompt-builder agent and addressed all feedback
  • Copilot instructions (.github/instructions/*.instructions.md)
  • Copilot prompt (.github/prompts/*.prompt.md)
  • Copilot agent (.github/agents/*.agent.md)
  • Copilot skill (.github/skills/*/SKILL.md)

Note for AI Artifact Contributors:

  • Agents: Research, indexing/referencing other project (using standard VS Code GitHub Copilot/MCP tools), planning, and general implementation agents likely already exist. Review .github/agents/ before creating new ones.
  • Skills: Must include both bash and PowerShell scripts. See Skills.
  • Model Versions: Only contributions targeting the latest Anthropic and OpenAI models will be accepted. Older model versions (e.g., GPT-3.5, Claude 3) will be rejected.
  • See Agents Not Accepted and Model Version Requirements.

Other:

  • Script/automation (.ps1, .sh, .py)
  • Other (please describe):

Testing

Validation was run locally against the changed files:

  • npm run plugin:generate ΓÇö 13 plugins regenerated (Agents 147, Commands 152, Instructions 179, Skills 88), 0 errors.
  • npm run lint:md ΓÇö 235 files, 0 errors.
  • npm run lint:frontmatter ΓÇö 715 files, 0 errors.
  • npm run validate:skills ΓÇö 37 skills, 0 errors.
  • npm run spell-check ΓÇö 901 files, 0 issues.
  • npm run lint:ps ΓÇö no PowerShell files in the diff.

Checklist

Required Checks

  • Documentation is updated (if applicable)
  • Files follow existing naming conventions
  • Changes are backwards compatible (if applicable)
  • Tests added for new functionality (if applicable)

AI Artifact Contributions

  • Used /prompt-analyze to review contribution
  • Addressed all feedback from prompt-builder review
  • Verified contribution follows common standards and type-specific requirements

Required Automated Checks

The following validation commands must pass before merging:

  • Markdown linting: npm run lint:md
  • Spell checking: npm run spell-check
  • Frontmatter validation: npm run lint:frontmatter
  • Skill structure validation: npm run validate:skills
  • Link validation: npm run lint:md-links
  • PowerShell analysis: npm run lint:ps
  • Plugin freshness: npm run plugin:generate
  • Docusaurus tests: npm run docs:test

Security Considerations

  • This PR does not contain any sensitive or NDA information
  • Any new dependencies have been reviewed for security issues
  • Security-related scripts follow the principle of least privilege

Additional Notes

  • This PR removes user-facing code-review entrypoints (code-review-full, code-review-functional, code-review-standards, code-review-accessibility) and the standalone pr-walkthrough agent, replacing them with a single code-review orchestrator. It is flagged as a breaking change for that reason.
  • Most of the large diff is regenerated plugins/** symlinks/manifests and extension/** READMEs, not hand-authored logic.
  • npm run lint:md-links reports failures across 14 files, but these are pre-existing repo-wide link issues unrelated to this PR (git diff --name-only origin/main...HEAD does not include any of the affected files).
  • npm run docs:test has not been run locally and should be validated in CI.

…orchestrator

- replace per-perspective code-review agents/prompts with one orchestrator plus five perspective subagents (functional, standards, accessibility, security, pr)
- remove hve-core/pr-review agent and code-review-full/functional prompts
- add supply-chain-reviewer agent and skill-assessor subagent
- update docs, collections, plugins, and refresh ms.date to 2026-06-19

🔄 - Generated by Copilot
…onsolidation

# Conflicts:
#	.github/CUSTOM-AGENTS.md
#	.github/agents/coding-standards/code-review-accessibility.agent.md
#	.github/agents/coding-standards/code-review-functional.agent.md
#	.github/agents/coding-standards/code-review-standards.agent.md
#	collections/coding-standards.collection.md
#	collections/coding-standards.collection.yml
#	collections/hve-core-all.collection.md
#	collections/hve-core-all.collection.yml
#	collections/hve-core.collection.md
#	collections/hve-core.collection.yml
#	collections/security.collection.md
#	docs/hve-guide/README.md
#	docs/hve-guide/roles/tech-lead.md
#	plugins/coding-standards/README.md
#	plugins/hve-core-all/README.md
#	plugins/hve-core/README.md
#	plugins/security/README.md
Reshape the code-review orchestrator around an orientation floor and human-steered walk-back loop. Add code-review-explainer and code-review-walkback subagents, convert pr-walkthrough into the code-review-pr detailer, and remove the standalone pr-walkthrough agent. Add dispatch-loop, emission-modes, cross-skill-forks, and walkthrough-protocol skill references. Register subagent folders in VS Code settings and regenerate plugins/extension outputs.

Refs #2090
@WilliamBerryiii WilliamBerryiii requested a review from a team as a code owner June 21, 2026 16:30
@codecov-commenter

codecov-commenter commented Jun 21, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.52%. Comparing base (b69e34a) to head (0041303).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2100      +/-   ##
==========================================
- Coverage   80.52%   80.52%   -0.01%     
==========================================
  Files         128      128              
  Lines       19260    19260              
  Branches       12       12              
==========================================
- Hits        15510    15509       -1     
- Misses       3747     3748       +1     
  Partials        3        3              
Flag Coverage Δ
docusaurus 61.84% <ø> (ø)
pester 84.23% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

… structure

- standardize response field definitions in tables
- improve readability of required fields for each subagent
- clarify usage descriptions for report generation and verification processes

🔧 - Generated by Copilot

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The collection still references four agent files and two prompt files that this PR deletes, and is missing the new orchestrator, its subagents, and the new skill.

Dead references (files deleted by this PR):

  • .github/agents/coding-standards/code-review-accessibility.agent.md
  • .github/agents/coding-standards/code-review-full.agent.md
  • .github/agents/coding-standards/code-review-functional.agent.md
  • .github/agents/coding-standards/code-review-standards.agent.md
  • .github/prompts/coding-standards/code-review-full.prompt.md
  • .github/prompts/coding-standards/code-review-functional.prompt.md

Missing entries (new files introduced by this PR):

  • .github/agents/coding-standards/code-review.agent.md (new orchestrator)
  • All seven subagents under .github/agents/coding-standards/subagents/
  • .github/skills/coding-standards/code-review (new skill)

The same issue applies to collections/hve-core.collection.yml (still lists the deleted hve-core/pr-review.agent.md and hve-core/pr-walkthrough.agent.md) and collections/hve-core-all.collection.yml (inherits all of the above dead references).

npm run lint:collections-metadata was not listed in the Testing section; running it against this branch would surface all three files as validation failures. After updating the three collection YAMLs, re-run npm run plugin:generate to keep the plugin outputs in sync.


imports:
- ../agents/hve-core/pr-review.agent.md
- ../agents/coding-standards/code-review.agent.md

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import still points to the old agent path that this PR deletes.

imports:
  - ../agents/hve-core/pr-review.agent.md   # file no longer exists after this PR

The agent was moved to .github/agents/pr-review.agent.md (repo root). The import should be updated to match:

imports:
  - ../agents/pr-review.agent.md

After updating the source, run gh aw compile .github/workflows/pr-review.md to regenerate pr-review.lock.yml. The lock file currently embeds the stale path in three places (GH_AW_AGENT_FILE, GH_AW_AGENT_IMPORT_SPEC, and the {{#runtime-import}} directive), so recompilation is required. Without this fix, the /review slash command workflow will fail at runtime after merge.

@katriendg

Copy link
Copy Markdown
Contributor

Headline concern: we lose the general-purpose pr-review agent and do not say so

1. The PR description is factually wrong about where pr-review went

The PR body states:

The pr-review agent moved to the repo root as a repo-specific agent...

This is not true on this branch. pr-review.agent.md is deleted and not re-created anywhere:

  • .github/agents/hve-core/pr-review.agent.md — deleted (diff shows deleted file mode 100644, content → /dev/null).
  • No .github/agents/pr-review.agent.md at the repo root. Root .github/agents/ contains only agentic-workflows, dependency-reviewer, doc-update-checker, issue-triage.
  • Full-tree search (find ... -name '*.agent.md' | grep pr-review) and git ls-tree -r HEAD | grep pr-review both return nothing.

So the claimed relocation did not happen. Either the move was dropped from the final commit, or the intent changed to a full removal and the description was not updated. This needs to be reconciled before merge.

2. pr-review was broader than code review — that scope is now unowned

pr-review (name: PR Review) was a general-purpose, collaborative, multi-turn reviewer. Its mindset and dimensions explicitly cover non-code artifacts: "Documentation and operations," "READMEs, runbooks, migration guides, API references," product requirements, acceptance criteria, and edge-case expectations. PMs and architects used it on docs, specs, PRDs/BRDs — not just code.

Its replacement, code-review (.github/agents/coding-standards/code-review.agent.md), is a code-diff orchestrator: every perspective (functional, standards, accessibility, security, pr) is anchored to code changes and diff hotspots (auth, crypto, parsing, deserialization). The pr lens is "PR-level summary, scope hygiene, validation evidence" — still code-framed. There is no successor for general (non-code) PR review.

3. The orientation-first loop fits pr-walkthrough, not the general pr-review job

The new orientation-first / dispatch-board / walk-back loop is an excellent evolution of PR Walkthrough (narrative orientation, design-fork surfacing) which was always the narrower, code-shaped role. Folding the broad, general-purpose pr-review into that same code-centric loop changes its job rather than preserving it. The consolidation reads as "walkthrough + code review merged," with general PR review quietly dropped.

4. Flagship hve-core collection now has no PR/code review entrypoint at all

  • collections/hve-core.collection.yml removes both pr-review and pr-walkthrough.
  • code-review is added only to coding-standards and hve-core-allnot to flagship hve-core.
  • A user who installs only the flagship collection loses all PR/code review; task-reviewer covers plan-vs-implementation only.

This is the most consequential structural change and it is not called out in the PR description or docs.


Secondary findings

  • Stale doc referencedocs/getting-started/methods/cli-plugins.md line 167 still lists PR Review - pull request analysis and feedback under the hve-core plugin agents. The agent is gone, and its successor lives in the coding-standards plugin, not hve-core. The line should be removed (not renamed).
  • Overstated paritydocs/hve-guide/lifecycle/review.md swaps pr-reviewcode-review while keeping the purpose "Evaluate pull requests for quality" and a general PR-review example. This implies a 1:1 replacement the new agent does not provide.
  • Migration otherwise clean — no orphaned docs/agents/pr-review/ or docs/agents/pr-walkthrough/ folders, workflow import updated, TRANSPARENCY-NOTE.md, CUSTOM-AGENTS.md, and other collections consistent.
  • Agentic quality is goodcode-review is prompt-first, human-gated, thin (delegates to the code-review skill + references), with clear steps, autonomy modes, and error recovery. Path style (.github/skills/..., .github/agents/**/, #file:../../instructions/...) matches repo convention and the hve-core-location fallback; not a defect.

@katriendg katriendg left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR review feedback (after testing)

After running the consolidated code-review agent's PR-review path a few times, here is some general feedback grouped by theme. The diff/instruction-level notes are attached as inline comments on code-review.agent.md (lines 87–88), since their real subject — diff-computation.instructions.md — is unchanged in this PR and can't take inline anchors directly.

The behavior of the agent is really good from a normal flow perspective! I'm focusing here on the sense of lost workflow steps I may typically make, so my feedbackis really focused on the "human-in-the-loop" aspects of the previous pr-review flow that are no longer present in the consolidated code-review flow. The agent is doing a great job of automating the review process, but there are some areas where human judgment and control were previously integrated into the workflow that are now missing.

Also happy to take these out of this PR, and keep a post-PR follow-up where I come up with some of the changes?


What I'm losing: the human-in-the-loop workflow

The consolidation drops three pieces of the previous PR-review discipline I relied on: a human-editable draft before emission, a human gate on posting, and PR-state validation. Together these change the flow from "agent memory + human judgment, then post" to "agent memory only, then post."

No pre-post editable draft (in-progress-review lost) — design question. The old pr-review flow produced an in-progress-review.md living document plus a consolidated handoff.md that a human could edit before anything went out. The new code-review flow persists diff-state.json, dispatch-manifest.json, per-perspective *-findings.json, review.md, and metadata.json — which is good for memory hygiene — but none of these is a human-editable draft staged before emission. review.md is written in Step 7.6, but emission isn't gated on the human editing it. For reviewers who like to hand-tune comment wording before posting, that staging step is structurally gone. Could we add an explicit "draft for review" artifact (or treat review.md as a stop-and-edit gate) before emission?

Here's a bit of context: in my own workflow today, I like to use the handoff.md file in .copilot-tracking/pr/review/**/ as a staging area to hand-edit the findings before posting. The new flow doesn't have a human-editable draft before emission, so that step is gone. It would be nice to have a draft artifact that the human can edit before the agent posts comments. It's all about combining the agent's memory with the human's judgment before anything goes out. The current flow is more like "agent memory only, then post," which is a change in workflow. After testing the PR review function of the code-review agent a few times today, one thing I noticed is that handoff.md/memory is no longer that optimal. One of the examples. it builds a ready to submit review in a /tmp/review.py script, not editable, just for the agent to work with and post using gh. And then after posting it asks Nothing was committed. Want me to clean up the temp payload files in tmp?, which means it knows about temp files. Maybe it's mental but I miss the previous flow...

Emission has no human gate; it posts when a poster is available — blocking. The agent describes itself as "human-gated," but every human gate (Steps 2, 3, 5: orientation, perspective/depth, walk-back) sits before the merge. Emission is not gated on the human at all: emission-modes.md → Gating rules gate only on poster capability ("Detect the available poster capability before emission… Only emit in a native format when the target and capability are both available"), and Step 7.4 says "Prefer native PR/MR/ADO/GitHub comments when a suitable poster is available." So once the findings merge and a poster (for example, an authenticated gh) is present, the agent posts to the PR with no confirmation step. Posting a PR comment is an irreversible, shared-system action; in practice this means a user who asks the agent to "draft a handoff before posting" can still get comments posted without approval. Recommend an explicit human confirmation gate before emission (or a draft-only / dry-run default that requires an opt-in to post), and align the "human-gated" self-description with that behavior.

Lost PR validation (mergeable state, checkboxes, issue alignment). The PR review request no longer seems to validate Pull request (search for it in repo, or even align to the associated issue without requesting). It does not seem to validate all checkboxes are done as expected, and validate mergeable state. The previous PR review did this.


Recovering a discoverable PR-review entrypoint

Keep a /pr-review prompt as the PR-review entrypoint — suggestion. Consider shipping a thin /pr-review prompt that routes to the code-review agent with PR-framed pre-fill (PR/base inputs, a default perspective set such as pr + functional/security, and a sensible depth). It would recover a discoverable PR-review entrypoint and it's the natural place to encode the "draft first, then post on my go-ahead" discipline that the agent itself doesn't enforce (see the emission-gate note above). There is no code-review/pr-review prompt today (no .github/prompts/coding-standards/ dir; the old pr-review was an agent, not a prompt), so this is net-new. Note the "switch to the right agent" behavior is host-dependent — in CLI/plugin a prompt maps to a command — so phrase it to invoke the agent with PR framing rather than assuming an agent-switch UI.

### Step 1: Tier 0 Context Bootstrap

1. Read the Skill Reference Contract files (above) in one parallel block.
2. Compute the diff once. Use the Decision Tree in #file:../../instructions/coding-standards/code-review/diff-computation.instructions.md to determine the diff type, then generate the structured diff via the `pr-reference` skill (`generate.sh --base-branch auto --merge-base --exclude-ext min.js,min.css,map`) and the changed-file list (`list-changed-files.sh --exclude-type deleted --format plain`). Apply the Non-Source Artifact Skip List and Large Diff Handling rules. Capture the base branch, branch name, changed-file surface, and extensions.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore the explicit diff output path (non-blocking, recommended). The old pr-review agent generated the reference with an explicit --output "{{tracking_dir}}/pr-reference.xml", while the new flow (diff-computation.instructions.md → Feature Branch Diff) runs generate.sh --base-branch auto --merge-base ... with no --output and then hardcodes diffPatchPath: .copilot-tracking/pr/pr-reference.xml in diff-state.json. It works today only because that path happens to equal the skill's default output. I'd suggest passing --output explicitly (and deriving diffPatchPath from it) so the value is set in one place. That keeps the path overridable for downstream users and automation hosts without silently desyncing the generator output from what the perspective subagents read.

(Anchored here on the import site; the underlying file diff-computation.instructions.md is unchanged in this PR.)

### Step 1: Tier 0 Context Bootstrap

1. Read the Skill Reference Contract files (above) in one parallel block.
2. Compute the diff once. Use the Decision Tree in #file:../../instructions/coding-standards/code-review/diff-computation.instructions.md to determine the diff type, then generate the structured diff via the `pr-reference` skill (`generate.sh --base-branch auto --merge-base --exclude-ext min.js,min.css,map`) and the changed-file list (`list-changed-files.sh --exclude-type deleted --format plain`). Apply the Non-Source Artifact Skip List and Large Diff Handling rules. Capture the base branch, branch name, changed-file surface, and extensions.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diff guidance leans bash-only (non-blocking). diff-computation.instructions.md shows only the bash script names (generate.sh, list-changed-files.sh, read-diff.sh) and bash-only fallback commands, even though the pr-reference skill ships PowerShell equivalents (generate.ps1 -OutputPath, list-changed-files.ps1, read-diff.ps1). On Windows hosts this reads as the only option. Either add the PowerShell variants alongside the bash ones, or have the agent/instructions point the reader to the pr-reference skill for the PowerShell counterpart (the agent's own Step 4 already provides both shells for the folder cleanup, so the instructions are the odd one out).

(Anchored here on the import site; the underlying file diff-computation.instructions.md is unchanged in this PR.)


1. Read the Skill Reference Contract files (above) in one parallel block.
2. Compute the diff once. Use the Decision Tree in #file:../../instructions/coding-standards/code-review/diff-computation.instructions.md to determine the diff type, then generate the structured diff via the `pr-reference` skill (`generate.sh --base-branch auto --merge-base --exclude-ext min.js,min.css,map`) and the changed-file list (`list-changed-files.sh --exclude-type deleted --format plain`). Apply the Non-Source Artifact Skip List and Large Diff Handling rules. Capture the base branch, branch name, changed-file surface, and extensions.
3. Apply the working-tree supplement from the Feature Branch Diff case in diff-computation.instructions.md to capture untracked, unstaged, and staged files. Merge surviving paths into the changed-file list, deduplicating against the committed diff.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The applyTo glob never fires in plugin/extension distributions (validate). diff-computation.instructions.md declares applyTo: "**/.github/agents/coding-standards/**, **/.github/prompts/coding-standards/**", but that auto-attach can only match in-repo. In the CLI plugin and VS Code extension distributions the .github/ segment is stripped (agents become agents/coding-standards/..., prompts commands/coding-standards/...), so the glob matches nothing and the instruction never auto-attaches there. The prompts arm also matches nothing today — there are no coding-standards prompts. The reason the workflow still works in distribution is that code-review.agent.md explicitly #file:-imports this file (these lines, and again in Step 7), and the file is bundled in the coding-standards collection. So the guidance is delivered by the #file: import, not by applyTo. Two follow-ups worth confirming: (1) the applyTo is misleading and could be narrowed or dropped to avoid implying auto-attach coverage that doesn't exist in distribution; (2) any future coding-standards agent or prompt that relies on auto-attach rather than an explicit #file: import would silently lose this guidance once distributed — so the convention should be "import it, don't rely on applyTo."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants