fix(agent-os): harden replay, tool-poisoning, audit accounting, health, signed CloudEvents by liamcrumm · Pull Request #3247 · microsoft/agent-governance-toolkit

liamcrumm · 2026-07-02T22:32:57Z

Summary

Close five agent_os security and integrity gaps so the kernel can be trusted to gate and record untrusted activity. Every defect ships a test that fails before the change and passes after, and each evidence repro flips.

Problem

agent_os had a cluster of governance holes: replay protection that expired early, a memory guard blind to markup-wrapped tool poisoning, an audit event processor that lost events without counting them, a health probe that reported HEALTHY while verifying nothing, and audit events that were plain unsigned dicts rather than signed CloudEvents.

Changes

File	What changed
`mcp_protocols.py`	`InMemoryNonceStore.add` evicts expired-first and raises `NonceStoreCapacityError` instead of evicting an in-window nonce; new error type; docstrings updated
`mcp_message_signer.py`	`verify_message` surfaces capacity saturation as a fail-closed verification failure; cache-size docstring corrected
`memory_guard.py`	New markup-instruction, destructive-command and exfil pattern sets plus `AlertType.TOOL_POISONING`; MEDIUM alone, HIGH when paired
`event_sink.py`	Lock-guarded `submitted`/`delivered`/`failed` counters (reconcile with `dropped`); `GovernanceEvent.to_cloudevent()`; `GovernanceEventSigner`; `StdoutGovernanceSink` and `OTLPGovernanceSink`
`integrations/health.py`	Auto-register built-ins (`register_builtins` flag), empty report aggregates to UNHEALTHY, audit check reports DEGRADED with no backend
`__init__.py`	Export `NonceStoreCapacityError`, `GovernanceEventSigner`, `StdoutGovernanceSink`, `OTLPGovernanceSink`
tests (6 files)	New regression tests per defect; rewrote tests that encoded the vulnerable behavior

Defects and repro flips

Nonce-cache replay count-eviction dropped in-window nonces, so a replay verified valid again. Now fail-closed at capacity. cache=2, verify 4 msgs, re-verify msg0 -> is_valid=False.
Tool-poisoning gap <important>...rm -rf / and curl evil.com</important> was allowed=True, 0 alerts. Now allowed=False with a HIGH TOOL_POISONING alert; benign prose stays allowed.
Silent audit loss events dispatched to a FAILURE or circuit-open sink were drained but counted nowhere. Now submitted == delivered + failed + dropped; a FAILURE sink yields failed=50, not silent loss.
False-healthy probe HealthChecker().check_health() was empty and HEALTHY. Now built-ins auto-register, empty aggregates UNHEALTHY, and the audit check is DEGRADED without a backend.
Unsigned audit to_dict() lacked id/source/type/specversion and a signature. New to_cloudevent() (ADR-0021, AUDIT-COMPLIANCE-1.0 section 20.4) plus HMAC signer (signature covers the algorithm attribute); tamper of data or algorithm fails verification.

Design notes and scope

Nonce store fails closed at capacity rather than growing unbounded (a pure soft cap would make max_entries a non-bound). Availability tradeoff under saturation is documented; operators raise max_nonce_cache_size or shorten the replay window.
to_dict() is intentionally unchanged for AuditBackendSinkAdapter back-compat; the CloudEvents form is the export path per the documented method name to_cloudevent().
Health behavior change: the built-in server /health now reports degraded (audit backend not wired into the checker) instead of a false healthy. Readiness stays green because DEGRADED is still ready.
Signing key management is out of scope; sinks take a caller-provided signer. The key is never logged.

Testing

ruff check --select E,F,W --ignore E501 clean on all changed files.
Targeted pytest for the four touched suites plus consumers (test_server, test_optional_deps_integration, test_core_features, test_spec_audit_compliance_conformance) passes; +21 new tests.
All five evidence repros flip before vs after.
Full agent-os suite: the 20 pre-existing test_autogen_hooks failures and the agent_sre-missing collection error reproduce on a clean tree and are unrelated to this change.

…h, signed CloudEvents Close five agent_os security and integrity gaps so the kernel can be trusted to gate and record untrusted activity. Each defect ships with a test that fails before the change and passes after. 1. Nonce-cache replay (mcp_protocols / mcp_message_signer). InMemoryNonceStore evicted by count via OrderedDict.popitem, dropping nonces still inside the replay window and letting messages replay as valid. Eviction is now expired-first and fails closed at capacity (NonceStoreCapacityError) rather than evicting an in-window nonce, keeping max_entries a real bound. 2. Tool-poisoning coverage (memory_guard). validate_write missed markup-wrapped standing instructions carrying destructive shell commands and exfil. Added markup-tag, destructive-command and exfil pattern sets and a TOOL_POISONING alert. Tags are MEDIUM alone and HIGH when paired with a dangerous payload, so benign prose (including a bare curl mention) stays allowed. 3. Event accounting (event_sink.GovernanceEventProcessor). The processor had no delivered counter and lost events dispatched to a failing or circuit-open sink without counting them. Added lock-guarded submitted, delivered and failed counters so submitted == delivered + failed + dropped after shutdown. 4. False-healthy probe (integrations/health). A fresh HealthChecker reported HEALTHY with zero components and the audit built-in was an unconditional HEALTHY stub. Built-ins now auto-register (register_builtins flag), an empty report aggregates to UNHEALTHY, and the audit check reports DEGRADED when no backend is configured. 5. Signed CloudEvents (event_sink). GovernanceEvent gains to_cloudevent() (CloudEvents 1.0 per ADR-0021 and AUDIT-COMPLIANCE-1.0 section 20.4) plus an HMAC GovernanceEventSigner whose signature covers the algorithm attribute, and StdoutGovernanceSink and OTLPGovernanceSink. to_dict() is unchanged for AuditBackendSinkAdapter back-compat. Validation: ruff check --select E,F,W --ignore E501 clean on changed files; targeted pytest for the four touched suites plus consumers passes; all five repros flip. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: liamcrumm <liamcrumm@microsoft.com>

github-actions · 2026-07-02T22:33:11Z

PR Review Summary

Check	Status	Details
🔍 Code Review	⚠️ Missing	No current-run comment
🛡️ Security Scan	⚠️ Missing	No current-run comment
🔄 Breaking Changes	⚠️ Missing	No current-run comment
📝 Docs Sync	⚠️ Missing	No current-run comment
🧪 Test Coverage	⚠️ Missing	No current-run comment

Verdict: ⚠️ AI review incomplete; ready for human review

AI review comments are untrusted advisory output. The summary reports workflow-generated completion status only, not model-authored pass/fail claims.

github-actions · 2026-07-02T22:33:15Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

-        checker = HealthChecker(version="test")
+
+        class _AuditBackend:
+            def write(self, entry) -> None: ...


+
+        class _AuditBackend:
+            def write(self, entry) -> None: ...
+            def flush(self) -> None: ...


…, health, docs) Remediations from a multi-lens deep code review of the security hardening: - event_sink: spread GovernanceEvent.attributes BEFORE authoritative fields in to_cloudevent() so a free-form attribute key can no longer shadow the real agent_id/decision/action inside a validly-signed CloudEvent (audit integrity). - event_sink: CloudEvents type namespace changed from ai.agentmesh.* to ai.agentos.* — Agent OS is a distinct producer from Agent Mesh (ADR-0021 reserves ai.agentmesh.* for the mesh producer); stop inventing mesh-namespaced types not in AUDIT-COMPLIANCE 20.4. - event_sink: signer/sinks serialize with default=str (matches AuditEntry.to_json) so a non-JSON-native attribute value no longer aborts the whole batch; derived CloudEvents source is percent-encoded. - event_sink: accounting distinguishes intentional sink DROPPED (bucketed as dropped, not failed); OTLPGovernanceSink returns DROPPED (not SUCCESS) when OpenTelemetry is absent, and promotes severity/resource/policy_name/session_id as searchable OTel metadata alongside the CloudEvent envelope. - mcp_protocols: nonce retention is inclusive of the exact expiry instant (has/cleanup use strict >), matching the verifier's inclusive replay-window check, closing a boundary replay at now == expires_at. - memory_guard: markup tag + a BARE curl/wget is MEDIUM (allowed) not HIGH, to stop false-positive blocks of benign tooling runbooks; fork-bomb regex tolerates whitespace. Destructive/exfil pairings still block. - health: removed the unused policy_engine constructor parameter (silently-ignored input); documented that the server /health is DEGRADED until a durable audit backend is wired. - docs/spec: CHANGELOG [Unreleased] Security entry; MCP-SECURITY-GATEWAY 7.7/7.9 now specify capacity fail-closed + inclusive retention; api-reference HealthChecker signature and example updated. Tests: +new regressions for each finding; targeted + conformance suites pass (381), full agent-os suite unchanged vs baseline (4804 passed; pre-existing agent_sre/autogen env failures only). ruff clean on changed files. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: liamcrumm <liamcrumm@microsoft.com>

github-actions Bot added the tests label Jul 2, 2026

github-actions Bot added the size/XL Extra large PR (500+ lines) label Jul 2, 2026

github-code-quality Bot found potential problems Jul 2, 2026

View reviewed changes

github-actions Bot added the documentation Improvements or additions to documentation label Jul 2, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(agent-os): harden replay, tool-poisoning, audit accounting, health, signed CloudEvents#3247

fix(agent-os): harden replay, tool-poisoning, audit accounting, health, signed CloudEvents#3247
liamcrumm wants to merge 2 commits into
mainfrom
fix/agent-os-security-integrity

liamcrumm commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

liamcrumm commented Jul 2, 2026

Summary

Problem

Changes

Defects and repro flips

Design notes and scope

Testing

Uh oh!

github-actions Bot commented Jul 2, 2026

PR Review Summary

Uh oh!

github-actions Bot commented Jul 2, 2026

Dependency Review

Scanned Files

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant