Summary
Analysis of the last 24 hours (2026-06-02, 60 runs) found 5 schema errors, all on the submit_pull_request_review safe-output tool, across 2 reviewer workflows. The previously-filed fix is now fully deployed and live — yet the error persists and has mutated into a new variant. This is a tool-description problem (the reviewer prompts are correct), and it is not a duplicate of the closed #35579 (which added the warning) or #36000 (which fixed the deployment gap).
The key new evidence: the warning currently names a single forbidden field (pull_request_number). Agents either ignore it or route around it to the next plausible identifier (item_number). Blocking one field name without enumerating all of them just shifts the mistake.
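What changed since the last report
- actions/setup/js/safe_outputs_tools.json is in sync with pkg/workflow/js/safe_outputs_tools.json.
- The deployed warning is live (workflow-logs/3_agent.txt of the affected runs).
- submit_pull_request_review still received a stray PR-targeting field 5 times today.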
🔍 Error analysis details
Variant 1 — pull_request_number (known field, persists despite the live warning)
Occurrences: 3 — workflow Matt Pocock Skills Reviewer (copilot)
The agent passes pull_request_number on its inner create_pull_request_review_comment payloads — which is valid for that tool — then carries the same field onto the final submit_pull_request_review payload, where it is silently stripped:
{ "pull_request_number": 36527, "event": "COMMENT", "body": "### Skills-Based Review ..." }
Variant 2 — item_number (NEW substitution, not covered by the warning)
Occurrences: 2 — workflow Test Quality Sentinel (copilot)
Actual invocation (run 26849261660):
safeoutputs submit_pull_request_review --item_number 36527 --event APPROVE --body "✅ Test Quality Sentinel: 70/100 ..."
The agent correctly avoided pull_request_number (it heeded the warning) but substituted item_number — the field used by add_comment / update_issue. item_number is not in the submit_pull_request_review schema, so it is silently stripped too. The review still auto-targets the triggering PR and posts, so no error surfaces.
Current tool description
submit_pull_request_review description (pkg/ and actions/setup/js — identical)
Submit a pull request review with a status decision. This tool auto-targets the pull request that triggered the workflow — do NOT pass pull_request_number (unlike create_pull_request_review_comment and reply_to_pull_request_review_comment, which accept it; this tool will silently strip it). REQUIRED: ...
inputSchema accepts only: body, event, secrecy, integrity (additionalProperties: false).
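To make the silent-strip behavior concrete, here is a minimal sketch of what happens to the two observed payloads. It assumes the validator simply drops any key outside the four accepted fields; the function name strip_unknown_fields is hypothetical and is not the project's actual code.

```python
# Hypothetical sketch: the field names come from this report, the function is illustrative only.
ALLOWED_FIELDS = {"body", "event", "secrecy", "integrity"}


def strip_unknown_fields(payload):
    """Return (accepted, stripped): the keys the schema allows, and the names of dropped keys."""
    accepted = {key: value for key, value in payload.items() if key in ALLOWED_FIELDS}
    stripped = sorted(key for key in payload if key not in ALLOWED_FIELDS)
    return accepted, stripped


# The two variants observed in this window (bodies shortened).
variant_1 = {"pull_request_number": 36527, "event": "COMMENT", "body": "review text"}
variant_2 = {"item_number": 36527, "event": "APPROVE", "body": "review text"}

accepted_1, stripped_1 = strip_unknown_fields(variant_1)
accepted_2, stripped_2 = strip_unknown_fields(variant_2)

print(stripped_1)  # ['pull_request_number']
print(stripped_2)  # ['item_number']
```

Both variants reduce to a valid event/body payload, which is why the review still posts and the agent sees nothing wrong.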
Root cause
- The prohibition names exactly one field (pull_request_number). It does not tell the agent that no targeting parameter is accepted, so the agent reaches for the next plausible one (item_number).
- The prohibition is buried mid-sentence after the auto-target clause, where it is easy to overlook: pull_request_number still recurs even though the warning is present.
- Sibling tools (create_pull_request_review_comment, reply_to_pull_request_review_comment, add_comment) all accept a PR/item identifier, so the parity assumption is strong; agents also carry the field over from their valid review-comment payloads.
Recommended description improvements
In pkg/workflow/js/safe_outputs_tools.json (and propagate to the runtime actions/setup/js/safe_outputs_tools.json copy in the same change — that copy is what runs):
- Lead with the prohibition and generalize it to all identifiers. Suggested opening:
"Submit a pull request review. This tool ALWAYS auto-targets the pull request that triggered the workflow and accepts NO targeting parameter — do not pass pull_request_number, item_number, pr_number, or issue_number; any such field is silently stripped. Only body and event are accepted (plus optional secrecy/integrity)."
- Explicitly contrast the sibling tools so the parity assumption is addressed: state that create_pull_request_review_comment / reply_to_pull_request_review_comment DO accept pull_request_number, but this tool does not.
- Keep the existing REQUIRED-body / inline-comment guidance.
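Because the runtime copy is the one that actually runs, a description change only helps if both files carry it. The sketch below shows one way to detect drift between the two copies. It assumes each file is a JSON array of tool objects with name and description keys, which this report does not confirm; description_drift is a hypothetical helper, not the existing TestSafeOutputsToolsJSONInSync test.

```python
import json

# Sentinel so that a tool missing from one copy never compares equal to anything.
_MISSING = object()


def description_drift(source_json, runtime_json):
    """Return the sorted names of tools whose description differs between the two copies."""
    # Assumed layout: a JSON array of {"name": ..., "description": ...} objects.
    source = {tool["name"]: tool.get("description") for tool in json.loads(source_json)}
    runtime = {tool["name"]: tool.get("description") for tool in json.loads(runtime_json)}
    names = source.keys() | runtime.keys()
    return sorted(
        name for name in names
        if source.get(name, _MISSING) != runtime.get(name, _MISSING)
    )


source_copy = json.dumps([
    {"name": "submit_pull_request_review", "description": "new wording"},
    {"name": "add_comment", "description": "unchanged"},
])
runtime_copy = json.dumps([
    {"name": "submit_pull_request_review", "description": "old wording"},
    {"name": "add_comment", "description": "unchanged"},
])

print(description_drift(source_copy, runtime_copy))  # ['submit_pull_request_review']
```

In practice the two strings would be read from the pkg/ and actions/setup/js/ files; an empty result means the descriptions match.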
Secondary observation (out of scope for description-only fix)
The validator silently strips the unknown field rather than returning ERR_VALIDATION. Because the review still posts, the agent never receives corrective feedback, which is why a deployed description fix has now twice failed to stop the pattern. If description changes alone do not move the needle, consider surfacing a soft validation warning when a known-stray identifier is stripped from submit_pull_request_review. (Validator change, not a description change — noted for whoever picks this up.)
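A soft warning of this kind could look roughly like the following sketch. It assumes the validator can return messages alongside the sanitized payload; sanitize_with_warnings and the message wording are hypothetical, and the identifier list is taken from the suggested description text above.

```python
# Hypothetical sketch of a soft validation warning; not the project's validator.
ALLOWED_FIELDS = {"body", "event", "secrecy", "integrity"}
KNOWN_STRAY_IDENTIFIERS = {"pull_request_number", "item_number", "pr_number", "issue_number"}


def sanitize_with_warnings(payload):
    """Drop unknown keys as before, but report known-stray identifiers back to the caller."""
    accepted = {}
    warnings = []
    for key, value in payload.items():
        if key in ALLOWED_FIELDS:
            accepted[key] = value
        elif key in KNOWN_STRAY_IDENTIFIERS:
            warnings.append(
                f"'{key}' was ignored: submit_pull_request_review always targets "
                "the triggering pull request and accepts no targeting parameter."
            )
        # Any other unknown key is still dropped silently, matching current behavior.
    return accepted, warnings


accepted, warnings = sanitize_with_warnings(
    {"item_number": 36527, "event": "APPROVE", "body": "review text"}
)
print(accepted)       # {'event': 'APPROVE', 'body': 'review text'}
print(len(warnings))  # 1
```

The review would still post, but the agent receives feedback naming the stripped field, which the current behavior never provides.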
Affected workflows
Matt Pocock Skills Reviewer — 3 (pull_request_number)
Test Quality Sentinel — 2 (item_number)
All other write safe-outputs in the window were schema-clean (add_comment with item_number/pr_number, create_issue, create_pull_request_review_comment with valid pull_request_number, smoke-test outputs).
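Implementation checklist
- Generalize the submit_pull_request_review prohibition in pkg/workflow/js/safe_outputs_tools.json
- Propagate the change to actions/setup/js/safe_outputs_tools.json in the same PR
- Confirm TestSafeOutputsToolsJSONInSync compares full description content (added per #36000) so the two copies cannot drift again
- Run make build / make recompile / make test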
References: §26849261660, §26849261648, §26844204022 · related: #35579, #36000
Generated by ⚡ Daily Safe Output Tool Optimizer · opus48 3.1M · ◷