diff --git a/index.yaml b/index.yaml index f038f59a..9640eb69 100644 --- a/index.yaml +++ b/index.yaml @@ -6,7 +6,7 @@ meta: version: "1.0.0" last_updated: "2026-03-05" - skill_count: 45 + skill_count: 46 role_count: 5 tag_vocabulary: @@ -321,6 +321,18 @@ skills: file: skills/ai-security/agent-security/SKILL.md compatible_tools: [claude-code, gemini-cli, cursor, codex-cli, openclaw, kiro] + - id: tool-authorization-drift + name: "Tool Authorization Drift Review" + tags: [ai-security, agents, authorization, tool-use] + role: [security-engineer, appsec-engineer, architect] + phase: [design, build, review, operate] + activity: [review, assess, test] + frameworks: [OWASP-Agentic-AI, OWASP-LLM06-2025, CWE] + difficulty: advanced + time_estimate: "45-90min" + file: skills/ai-security/tool-authorization-drift/SKILL.md + compatible_tools: [claude-code, gemini-cli, cursor, codex-cli, openclaw, kiro] + # -- Incident Response ---------------------------------------------------- - id: ir-playbook name: "Incident Response Playbook" diff --git a/skills/ai-security/tool-authorization-drift/README.md b/skills/ai-security/tool-authorization-drift/README.md new file mode 100644 index 00000000..6666f056 --- /dev/null +++ b/skills/ai-security/tool-authorization-drift/README.md @@ -0,0 +1,46 @@ +# Tool Authorization Drift Review + +This skill reviews agentic systems for drift between declared tool permissions +and effective runtime authorization. It focuses on policy/runtime mismatches, +preview/execute confusion, delegated calls, stale approval caches, alias mapping, +and audit evidence. 
+ +## Included Fixtures + +Vulnerable examples: + +- `fixtures/vulnerable/preview_execute_confusion.yaml` +- `fixtures/vulnerable/python_stale_approval_cache.py` +- `fixtures/vulnerable/delegated_worker_bypass.yaml` + +Benign examples: + +- `fixtures/benign/separate_preview_execute_policy.yaml` +- `fixtures/benign/python_bound_approval_token.py` +- `fixtures/benign/delegated_worker_recheck.yaml` + +## Review Targets + +- tool policy and generated manifests; +- function calling, MCP, or plugin registration code; +- tool routers and handler resolution; +- approval token generation and cache keys; +- queue workers and delegated execution paths; +- audit records for policy and runtime decisions. + +## Validation + +Run syntax checks for the included fixtures: + +```bash +python -m py_compile fixtures/vulnerable/python_stale_approval_cache.py \ + fixtures/benign/python_bound_approval_token.py + +python - <<'PY' +import pathlib, yaml +for path in pathlib.Path("fixtures").rglob("*.yaml"): + yaml.safe_load(path.read_text()) +PY +``` + +Use `SKILL.md` for the review checklist and reporting template. diff --git a/skills/ai-security/tool-authorization-drift/SKILL.md b/skills/ai-security/tool-authorization-drift/SKILL.md new file mode 100644 index 00000000..4eb91386 --- /dev/null +++ b/skills/ai-security/tool-authorization-drift/SKILL.md @@ -0,0 +1,234 @@ +--- +name: tool-authorization-drift +description: > + Reviews agentic systems for drift between declared tool permissions and + effective runtime authorization. Covers preview/execute confusion, delegated + tool calls, approval cache scope and TTL, tool alias mapping, indirect + state-changing workflows, and audit evidence. Use when an LLM or agent can + select tools, call functions, request approvals, or pass tool results to other + executors. 
+tags: [ai-security, agents, authorization, tool-use, policy-drift] +role: [security-engineer, appsec-engineer, architect] +phase: [design, build, review, operate] +frameworks: [OWASP-Agentic-AI, OWASP-LLM06-2025, CWE-285] +difficulty: advanced +time_estimate: "45-90min" +version: "1.0.0" +author: unitoneai +license: MIT +allowed-tools: Read, Grep, Glob +injection-hardened: true +argument-hint: "[agent-tool-policy-or-runtime-code]" +--- + +# Tool Authorization Drift Review + +If a target is provided via arguments, focus the review on: $ARGUMENTS + +> **This skill is strictly for defensive review.** It compares tool policy, +> approval, runtime enforcement, and audit evidence in systems you own or are +> authorized to assess. Do not execute tools, approve actions, transfer funds, +> change data, or call external services while reviewing. + +## Security Outcome + +Ensure an agent cannot perform an action that its declared policy, user approval, +tenant scope, or runtime authorization should deny. + +## When to Use + +Use this skill when reviewing: + +- LLM tools, function calling, MCP connectors, plugin systems, or internal + workflow tools; +- policy files that declare tool names, scopes, roles, tenants, or risk levels; +- approval flows for tool execution; +- preview/dry-run and execute tool pairs; +- delegated calls from one agent, workflow, or service to another; +- caches, queues, retries, or background workers that execute tool requests; +- audit logs that claim to prove who authorized and executed a tool call. + +Pair it with `agent-security` for broader architecture reviews. Use this skill +when the central question is: "Does runtime enforcement exactly match the +declared tool authorization model?" + +## Review Workflow + +### 1. 
Build the Tool Authorization Matrix + +Create a matrix from policy, code, and logs: + +| Tool | Declared action | Runtime handler | Scope key | Approval required | Side effect | +|---|---|---|---|---|---| +| `invoice.preview` | read | preview handler | tenant, user | no | none | +| `invoice.send` | write | send handler | tenant, user, invoice | yes | sends invoice | + +Every runtime handler must map to exactly one declared capability. If handlers +are invoked through aliases, queues, routers, or dynamic dispatch, include those +paths explicitly. + +### 2. Compare Declared Policy to Runtime Enforcement + +Check whether enforcement happens at the moment of execution, not only at prompt +construction or UI display. + +Required evidence: + +- declared tool name, action, resource type, tenant, and risk level; +- code that resolves the tool handler; +- code that checks actor, target resource, tenant, approval, and scope; +- tests or logs showing denied tools are denied at runtime; +- default behavior for missing, unknown, aliased, or renamed tools. + +### 3. Review Preview/Execute Separation + +Preview tools and execute tools must be separate capabilities. A preview result +must not be accepted as an execute instruction. + +Look for: + +- shared handler for preview and execute with a weak `mode` parameter; +- preview result object forwarded to a worker that performs the action; +- cached approval for `*.preview` reused for `*.execute`; +- UI text or model output converted into an execute payload; +- tool aliases that map a safe preview name to a state-changing handler. + +### 4. Inspect Approval Tokens and Caches + +Approval must be bound to the exact action that will execute. + +Verify approval tokens include: + +- actor and requester; +- target tenant and resource; +- tool name and action; +- risk level and allowed parameters; +- delegation chain, if any; +- expiry and single-use or replay controls; +- reason, ticket, or user confirmation evidence. 
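The binding fields listed above can be sketched as follows. This is a minimal illustration, not the skill's reference implementation: the `Approval` record, the `params_digest` helper, the in-memory `_USED` set, and the `consume` function are all hypothetical names introduced here, and a real system would keep replay state in a durable store.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


def params_digest(params: dict) -> str:
    """Stable SHA-256 digest of the approved parameters (hypothetical helper)."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Approval:
    """Hypothetical approval record bound to one exact action."""
    approval_id: str
    actor_id: str
    tenant_id: str
    resource_id: str
    tool_name: str
    action: str
    policy_version: str
    params_sha256: str
    expires_at: datetime


# Illustrative replay control only; assumed to be durable in a real deployment.
_USED = set()


def consume(approval: Approval, request: dict, now: datetime) -> bool:
    """Return True at most once, and only for the exact approved request."""
    bound = (
        approval.actor_id == request.get("actor_id")
        and approval.tenant_id == request.get("tenant_id")
        and approval.resource_id == request.get("resource_id")
        and approval.tool_name == request.get("tool_name")
        and approval.action == request.get("action")
        and approval.policy_version == request.get("policy_version")
        and approval.params_sha256 == params_digest(request.get("params", {}))
    )
    # Fail closed on any mismatch, on expiry, and on reuse of the same approval.
    if not bound or now >= approval.expires_at or approval.approval_id in _USED:
        return False
    _USED.add(approval.approval_id)
    return True
```

Binding a parameter digest means an approval for one amount or recipient cannot be replayed with different arguments, and the single-use check covers retries that would otherwise reuse a still-valid token.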
+ +Cache keys must include the same fields. Any cache key that omits tool, action, +tenant, resource, actor, or approval version is a drift risk. + +### 5. Trace Delegation and Background Execution + +Tool authorization often drifts when an allowed tool hands work to another +system. + +Review: + +- queue payloads, background workers, retries, and schedulers; +- service-to-service calls made after the original agent request; +- delegated sub-agent policies and inherited scopes; +- webhook callbacks and async continuations; +- "safe" tools that return signed URLs, job IDs, or command payloads consumed by + privileged workers. + +Runtime workers must re-check authorization using durable context, not trust an +agent-supplied payload. + +### 6. Validate Audit Evidence + +Audit records must show both the declared authorization decision and the runtime +execution decision. + +Each event should include: + +- policy version and matched rule; +- requested tool and resolved handler; +- actor, tenant, target resource, and approval ID; +- delegation chain and background job ID; +- runtime decision, denial reason, or execution result; +- parameter digest or safe redacted parameter summary. + +## High-Signal Findings + +Report a finding when: + +- a denied tool can be reached through alias, router, queue, or delegated call; +- preview approval or preview output can trigger execute behavior; +- approval cache keys omit tool, action, tenant, actor, resource, or policy + version; +- runtime handlers trust model-generated or client-provided authorization + fields; +- background workers execute tool payloads without re-checking scope; +- policy denies a capability but logs show the handler executes; +- audit logs record requested tool names but not resolved runtime handlers; +- missing policy entries default to allow. + +## Remediation Guidance + +- Treat each tool/action pair as a separate capability. +- Enforce authorization in the runtime handler and background worker. 
+- Bind approvals to exact tool, action, tenant, resource, actor, policy version, + parameters, and expiry. +- Make preview outputs non-executable and structurally different from execute + payloads. +- Resolve aliases before policy checks and log both alias and canonical handler. +- Fail closed for unknown tools, missing policy, stale approvals, and mismatched + tenant/resource scope. +- Re-check delegated calls and queue workers using server-side context. +- Add tests that prove denied tools remain denied through aliases, retries, and + delegation. + +## Verification Checklist + +- [ ] every runtime handler maps to a declared capability; +- [ ] unknown or missing policy entries fail closed; +- [ ] preview and execute capabilities have separate approval requirements; +- [ ] approval tokens include tool, action, tenant, resource, actor, policy + version, and expiry; +- [ ] cache keys include the same fields as approval tokens; +- [ ] background workers re-check authorization before side effects; +- [ ] delegated calls preserve and constrain scope; +- [ ] audit records include requested tool, resolved handler, policy rule, and + runtime decision; +- [ ] tests cover alias bypass, stale approval, preview/execute confusion, and + cross-tenant delegation. + +## Evidence to Request + +- tool policy files and generated tool manifests; +- function-calling or MCP registration code; +- tool router and handler resolution logic; +- approval token generation and validation; +- cache keys for approvals and authorization decisions; +- queue payload schemas and worker code; +- audit event schema and logs; +- tests for denied tools, preview/execute separation, and delegated execution. 
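The last item above asks for tests covering denied tools. As a rough sketch of what such a test might exercise, assuming a hypothetical `POLICY` table, `ALIASES` map, and `authorize` function (none of which appear in the skill or its fixtures), alias resolution before the policy check with a fail-closed default could look like this:

```python
# Hypothetical policy and alias tables; names are illustrative only.
POLICY = {
    ("invoice.preview", "read"): {"approval_required": False},
    ("invoice.send", "write"): {"approval_required": True},
}
ALIASES = {
    "invoice.draft": "invoice.preview",
    "invoice.dispatch": "invoice.send",
}


def authorize(requested_tool, action, allowed_tools, has_bound_approval=False):
    """Resolve the alias first, then check policy; anything unknown is denied."""
    canonical = ALIASES.get(requested_tool, requested_tool)
    rule = POLICY.get((canonical, action))
    if rule is None:
        decision, reason = "deny", "no_policy_entry"
    elif canonical not in allowed_tools:
        decision, reason = "deny", "tool_not_allowed"
    elif rule["approval_required"] and not has_bound_approval:
        decision, reason = "deny", "approval_missing"
    else:
        decision, reason = "allow", "policy_match"
    # The audit record keeps both the requested name and the resolved tool.
    return {
        "requested_tool": requested_tool,
        "canonical_tool": canonical,
        "action": action,
        "decision": decision,
        "reason": reason,
    }
```

With an agent allowed only `invoice.preview`, a request for the alias `invoice.dispatch` resolves to `invoice.send` and is denied, and a `write` action against `invoice.preview` has no policy entry and is denied as well.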
+ +## Reporting Template + +```markdown +### Finding: tool authorization drift in + +Severity: High +Affected flow: runtime handler -> side effect> +Evidence: +- +- +Impact: +Required fix: +- +Verification: +- +``` + +## Common False Positives + +- A preview tool returns text that a human must manually copy into a separate + approved workflow. +- A background worker executes only after re-validating a server-side approval + token bound to the exact action and resource. +- A broad tool name is safe because the handler dispatches to narrower + server-side capabilities and logs the resolved handler. +- Test fixtures intentionally include denied tools but are not reachable in + production routes. + +## References + +- OWASP Agentic AI security patterns for tool misuse and excessive agency +- OWASP LLM06:2025 Excessive Agency +- CWE-285 Improper Authorization +- CWE-863 Incorrect Authorization +- NIST AI RMF Map and Manage functions diff --git a/skills/ai-security/tool-authorization-drift/fixtures/benign/delegated_worker_recheck.yaml b/skills/ai-security/tool-authorization-drift/fixtures/benign/delegated_worker_recheck.yaml new file mode 100644 index 00000000..0fdbb017 --- /dev/null +++ b/skills/ai-security/tool-authorization-drift/fixtures/benign/delegated_worker_recheck.yaml @@ -0,0 +1,35 @@ +declared_policy: + refunds.preview: + action: read + approval_required: false + refunds.execute: + action: write + approval_required: true + required_scope: refunds.execute + +agent_runtime: + allowed_tools: + - refunds.preview + queue_payload: + job_type: refunds.execute + tenant_id: tenant-a + refund_id: refund-123 + approval_id: approval-789 + source: server_side_workflow + +worker: + trust_queue_authorized_flag: false + recheck_policy_before_execute: true + approval_token_bound_to: + - actor_id + - tenant_id + - refund_id + - tool + - action + - policy_version + unknown_tool_default: deny + +audit: + log_requested_tool: refunds.execute + log_resolved_handler: 
refund_execute_handler + log_policy_version: v4 diff --git a/skills/ai-security/tool-authorization-drift/fixtures/benign/python_bound_approval_token.py b/skills/ai-security/tool-authorization-drift/fixtures/benign/python_bound_approval_token.py new file mode 100644 index 00000000..706fea48 --- /dev/null +++ b/skills/ai-security/tool-authorization-drift/fixtures/benign/python_bound_approval_token.py @@ -0,0 +1,44 @@ +"""Benign approval token binding for exact tool, action, tenant, and resource.""" + +from dataclasses import dataclass +from datetime import datetime, timedelta, timezone + + +@dataclass(frozen=True) +class ApprovalToken: + actor_id: str + tenant_id: str + resource_id: str + tool_name: str + action: str + policy_version: str + expires_at: datetime + + +def mint_approval(actor_id, tenant_id, resource_id, tool_name, action, policy_version): + return ApprovalToken( + actor_id=actor_id, + tenant_id=tenant_id, + resource_id=resource_id, + tool_name=tool_name, + action=action, + policy_version=policy_version, + expires_at=datetime.now(timezone.utc) + timedelta(minutes=10), + ) + + +def can_execute(token, actor_id, tenant_id, resource_id, tool_name, action, policy_version): + expected = ( + token.actor_id == actor_id + and token.tenant_id == tenant_id + and token.resource_id == resource_id + and token.tool_name == tool_name + and token.action == action + and token.policy_version == policy_version + ) + return expected and datetime.now(timezone.utc) < token.expires_at + + +if __name__ == "__main__": + approval = mint_approval("agent-1", "tenant-a", "invoice-7", "invoice.send", "write", "v3") + print(can_execute(approval, "agent-1", "tenant-a", "invoice-7", "invoice.send", "write", "v3")) diff --git a/skills/ai-security/tool-authorization-drift/fixtures/benign/separate_preview_execute_policy.yaml b/skills/ai-security/tool-authorization-drift/fixtures/benign/separate_preview_execute_policy.yaml new file mode 100644 index 00000000..a895a32d --- /dev/null +++ 
b/skills/ai-security/tool-authorization-drift/fixtures/benign/separate_preview_execute_policy.yaml @@ -0,0 +1,30 @@ +tool_policy: + invoice.preview: + action: read + approval_required: false + side_effect: none + invoice.send: + action: write + approval_required: true + required_scope: invoice.send + side_effect: sends_invoice + +workflow: + name: invoice_assistant + steps: + - tool: invoice.preview + output: non_executable_summary + - tool: invoice.send + requires: + approval_token_bound_to: + - actor_id + - tenant_id + - invoice_id + - tool + - action + - policy_version + input_from: server_side_invoice_record + +controls: + preview_output_is_not_execute_payload: true + unknown_tool_default: deny diff --git a/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/delegated_worker_bypass.yaml b/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/delegated_worker_bypass.yaml new file mode 100644 index 00000000..9e1be85e --- /dev/null +++ b/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/delegated_worker_bypass.yaml @@ -0,0 +1,24 @@ +declared_policy: + refunds.preview: + action: read + approval_required: false + refunds.execute: + action: write + approval_required: true + +agent_runtime: + allowed_tools: + - refunds.preview + queue_payload: + job_type: refunds.execute + tenant_id: tenant-a + refund_id: refund-123 + authorized: true + source: model_generated_payload + +worker: + trust_queue_authorized_flag: true + recheck_policy_before_execute: false + +risk: + finding: delegated worker executes a denied action based on payload-provided authorization diff --git a/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/preview_execute_confusion.yaml b/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/preview_execute_confusion.yaml new file mode 100644 index 00000000..ca81bbd7 --- /dev/null +++ b/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/preview_execute_confusion.yaml @@ -0,0 +1,20 @@ 
+tool_policy: + invoice.preview: + action: read + approval_required: false + invoice.send: + action: write + approval_required: true + allowed_roles: [] + +workflow: + name: invoice_assistant + steps: + - tool: invoice.preview + output: preview_payload + - tool: invoice.send + input_from: preview_payload + skip_approval_if_preview_exists: true + +risk: + finding: preview output can be forwarded into an execute-capable tool without approval diff --git a/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/python_stale_approval_cache.py b/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/python_stale_approval_cache.py new file mode 100644 index 00000000..7c0b2314 --- /dev/null +++ b/skills/ai-security/tool-authorization-drift/fixtures/vulnerable/python_stale_approval_cache.py @@ -0,0 +1,29 @@ +"""Vulnerable approval cache: approval is not bound to action or resource.""" + +from datetime import datetime, timedelta, timezone + + +APPROVAL_CACHE = {} + + +def cache_key(actor_id, tenant_id): + # Vulnerable: missing tool name, action, resource, policy version, and scope. 
+ return f"{actor_id}:{tenant_id}" + + +def approve(actor_id, tenant_id): + APPROVAL_CACHE[cache_key(actor_id, tenant_id)] = { + "expires_at": datetime.now(timezone.utc) + timedelta(hours=4), + } + + +def can_execute(actor_id, tenant_id, tool_name, resource_id): + approval = APPROVAL_CACHE.get(cache_key(actor_id, tenant_id)) + if not approval: + return False + return datetime.now(timezone.utc) < approval["expires_at"] + + +if __name__ == "__main__": + approve("agent-1", "tenant-a") + print(can_execute("agent-1", "tenant-a", "payment.execute", "payment-9")) diff --git a/skills/ai-security/tool-authorization-drift/references/patterns.md b/skills/ai-security/tool-authorization-drift/references/patterns.md new file mode 100644 index 00000000..18b4c0d2 --- /dev/null +++ b/skills/ai-security/tool-authorization-drift/references/patterns.md @@ -0,0 +1,45 @@ +# Tool Authorization Drift Patterns + +## Vulnerable Patterns + +| Pattern | Why it matters | Review signal | +|---|---|---| +| Preview result forwarded to executor | A read-only tool becomes a write path | `preview_output` used as execute payload | +| Tool alias checked after dispatch | Denied tools reachable by alternate names | `aliases`, `router`, `handler_name` mismatch | +| Approval cache missing action | Preview approval reused for execute | cache key lacks `tool` or `action` | +| Delegated worker trusts payload | Background side effects bypass original policy | queue job contains `authorized: true` from request | +| Missing policy defaults to allow | New tools ship without review | `policy.get(name, allow)` or permissive fallback | +| Model-supplied scope | Authorization fields come from LLM output | `tenant`, `role`, `approval_id` copied from tool args | +| Runtime logs only requested tool | Audit hides resolved handler | no canonical handler or policy version in audit | + +## Safe Patterns + +| Control | Expected evidence | +|---|---| +| Capability-per-action model | `invoice.preview` and `invoice.send` 
checked separately | +| Runtime enforcement | handler re-checks policy immediately before side effects | +| Bound approval token | actor, tenant, resource, tool, action, policy version, and expiry | +| Complete cache key | cache includes actor, tenant, resource, tool, action, policy version | +| Alias canonicalization before check | requested alias resolves before policy evaluation | +| Worker re-check | queue worker validates server-side approval before execute | +| Dual audit events | requested tool and resolved handler logged with decision | + +## Suggested Search Terms + +- `tool_policy`, `tool_manifest`, `allowed_tools`, `denied_tools` +- `preview`, `dry_run`, `execute`, `apply`, `commit` +- `approval_id`, `approval_token`, `consent`, `confirmation` +- `cache_key`, `authorization_cache`, `ttl` +- `handler`, `router`, `alias`, `canonical_tool` +- `queue`, `worker`, `job`, `delegated`, `sub_agent` + +## Review Questions + +1. Does every runtime handler map to a declared capability? +2. Is authorization enforced in the handler and worker, not only before model + selection? +3. Can preview approval or output reach execute behavior? +4. Are approval tokens and caches bound to exact action and resource? +5. Do delegated jobs re-check scope using server-side context? +6. Do audit logs show requested tool, resolved handler, policy rule, and runtime + decision?
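
Question 5 can be made concrete with a small sketch. It assumes a hypothetical server-side `APPROVALS` store, a `CURRENT_POLICY_VERSION` constant, and a `worker_may_execute` function, all introduced here for illustration. The job field names are also assumptions and differ from those used in the fixtures.

```python
# Hypothetical server-side state; a real worker would query durable storage.
APPROVALS = {
    "approval-789": {
        "actor_id": "agent-1",
        "tenant_id": "tenant-a",
        "resource_id": "refund-123",
        "tool": "refunds.execute",
        "action": "write",
        "policy_version": "v4",
    }
}
CURRENT_POLICY_VERSION = "v4"
BOUND_FIELDS = ("tenant_id", "resource_id", "tool", "action")


def worker_may_execute(job: dict) -> bool:
    """Re-check a queued job against server-side approval state before side effects."""
    # Payload-supplied flags such as "authorized" are deliberately never read.
    approval = APPROVALS.get(job.get("approval_id"))
    if approval is None:
        return False
    # An approval minted under an older policy version is treated as stale.
    if approval["policy_version"] != CURRENT_POLICY_VERSION:
        return False
    # Every bound field in the job must match the stored approval exactly.
    return all(approval[field] == job.get(field) for field in BOUND_FIELDS)
```

A job that carries only `authorized: true` from a model-generated payload is rejected because it references no server-side approval, while a job whose tenant, resource, tool, and action all match a stored approval is allowed to proceed.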