Bind reasoning-failure risk and receipts to agent authority #19

Description

@mdheller

Parent

SocioProphet/sociosphere#271

Purpose

Make reasoning-failure risk a direct input to agent authority decisions: tool grants, memory access, autonomous execution, model route eligibility, and revocation posture.

Scope

Extend agent authority semantics to consume reasoning-failure and TrustOps posture records for:

  • exactness risk;
  • stale-memory contamination;
  • unsupported causal claim risk;
  • temporal inconsistency;
  • modality contradiction;
  • social-pressure/sycophancy risk;
  • evaluator-bias risk;
  • malicious/obfuscated-agent risk;
  • premature termination or agreement without independent evidence;
  • failed perturbation robustness gates.
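
As an illustration only, the risk categories above could be carried as an enumeration on each posture record. Every name here (`ReasoningRisk`, `PostureRecord`, `open_risks`, and the field names) is hypothetical and not taken from any existing schema in this repository.

```python
from dataclasses import dataclass
from enum import Enum


class ReasoningRisk(Enum):
    """Hypothetical identifiers for the risk categories listed in Scope."""
    EXACTNESS = "exactness"
    STALE_MEMORY = "stale_memory_contamination"
    UNSUPPORTED_CAUSAL_CLAIM = "unsupported_causal_claim"
    TEMPORAL_INCONSISTENCY = "temporal_inconsistency"
    MODALITY_CONTRADICTION = "modality_contradiction"
    SOCIAL_PRESSURE = "social_pressure_sycophancy"
    EVALUATOR_BIAS = "evaluator_bias"
    MALICIOUS_AGENT = "malicious_obfuscated_agent"
    PREMATURE_TERMINATION = "premature_termination"
    FAILED_PERTURBATION = "failed_perturbation_robustness"


@dataclass(frozen=True)
class PostureRecord:
    """Assumed minimal shape of a reasoning-failure or TrustOps posture record."""
    receipt_id: str
    agent_id: str
    risk: ReasoningRisk
    passed: bool


def open_risks(records):
    """Return the set of risk categories with at least one failing record."""
    return {r.risk for r in records if not r.passed}
```

An authority layer could then consume `open_risks(...)` for an agent and decide which effects to apply.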

Authority effects

A reasoning-risk posture should be able to trigger any of the following authority effects:

  • no authority change / record only;
  • require review before autonomous execution;
  • require deterministic tool verifier;
  • restrict memory writeback;
  • restrict retrieval hydration;
  • restrict side-effecting tools;
  • downgrade route eligibility;
  • quarantine agent profile;
  • revoke or suspend grant;
  • restore authority only with a passing follow-up receipt and policy approval.
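
One possible way to model these effects is as combinable flags, since several restrictions can apply to the same agent at once. The sketch below is an assumption about representation, not a proposed API: `AuthorityEffect`, `BLOCKING`, `combine`, `may_run_autonomously`, and `restore` are hypothetical names, and which effects block autonomous execution is a guess made for the example.

```python
from enum import Flag, auto
from functools import reduce
from operator import or_


class AuthorityEffect(Flag):
    """Hypothetical flags mirroring the authority effects listed above."""
    RECORD_ONLY = 0
    REQUIRE_REVIEW = auto()
    REQUIRE_VERIFIER = auto()
    RESTRICT_MEMORY_WRITEBACK = auto()
    RESTRICT_RETRIEVAL_HYDRATION = auto()
    RESTRICT_SIDE_EFFECT_TOOLS = auto()
    DOWNGRADE_ROUTE = auto()
    QUARANTINE_PROFILE = auto()
    SUSPEND_GRANT = auto()


# Assumption: these effects prevent unattended execution.
BLOCKING = (
    AuthorityEffect.REQUIRE_REVIEW
    | AuthorityEffect.QUARANTINE_PROFILE
    | AuthorityEffect.SUSPEND_GRANT
)


def combine(effects):
    """Union all effects triggered by the governing receipts."""
    return reduce(or_, effects, AuthorityEffect.RECORD_ONLY)


def may_run_autonomously(effects):
    """True when no blocking effect is currently active."""
    return not (effects & BLOCKING)


def restore(current, to_clear, followup_passed, policy_approved):
    """Clear effects only with a passing follow-up receipt AND policy approval."""
    if followup_passed and policy_approved:
        return current & ~to_clear
    return current
```

Note that `restore` leaves the posture unchanged unless both conditions hold, matching the last item in the list.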

Acceptance criteria

  • Agent records can cite governing reasoning-failure receipt ids, TrustOps receipt ids, and guardrail action ids.
  • A high-uncertainty receipt or a failed perturbation receipt can require human approval before autonomous execution.
  • A failed exactness receipt or a modality-conflict receipt can reduce tool and memory authority until the result is verified.
  • Passing follow-up receipts can restore authority only when policy allows it.
  • No protocol-local alias can bypass canonical identity authority.
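
The criteria above could be exercised with a small sketch along these lines. It assumes a receipt is a simple pass/fail record (a high-uncertainty receipt is treated as not passed), that restrictions are plain strings, and that aliases are resolved through a lookup table; `Receipt`, `AgentRecord`, `RESTRICTIONS_BY_KIND`, `apply_receipt`, and `authority_for` are all hypothetical names.

```python
from dataclasses import dataclass, field

# Assumed mapping from receipt kind to the restrictions a failure triggers.
RESTRICTIONS_BY_KIND = {
    "high_uncertainty": {"human_approval"},
    "perturbation": {"human_approval"},
    "exactness": {"reduced_tool_authority", "reduced_memory_authority"},
    "modality_conflict": {"reduced_tool_authority", "reduced_memory_authority"},
}


@dataclass
class Receipt:
    receipt_id: str
    kind: str
    passed: bool


@dataclass
class AgentRecord:
    canonical_id: str
    restrictions: set = field(default_factory=set)
    governing_receipt_ids: list = field(default_factory=list)


def apply_receipt(agent, receipt, policy_allows_restore=False):
    """Record the receipt id, then tighten or (if policy allows) relax authority."""
    agent.governing_receipt_ids.append(receipt.receipt_id)
    targets = RESTRICTIONS_BY_KIND.get(receipt.kind, set())
    if not receipt.passed:
        agent.restrictions |= targets
    elif policy_allows_restore:
        agent.restrictions -= targets
    return agent


def authority_for(identifier, aliases, agents):
    """Resolve any protocol-local alias to the canonical id before lookup,
    so restrictions on the canonical record always apply."""
    canonical = aliases.get(identifier, identifier)
    return agents[canonical]
```

Because every lookup goes through `authority_for`, an alias never has an authority record of its own; it can only see the canonical agent's current restrictions.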
