diff --git a/CHANGELOG.md b/CHANGELOG.md index eb38eebaa..0de7b929f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,11 +11,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Theme: Application-Layer Permission Engine -A layered, **fully opt-in** permission system for agent tool calls — the application-layer companion to v0.1.96's OS sandbox. When a `permissions:` block is present, every tool call flows through a hardline catastrophic-command floor, a declarative `allow/ask/deny` rule layer, and a blast-radius risk classifier, resolving to allow / **ask** / deny. An `ask` routes through a pluggable approval provider: an interactive TUI modal (allow once / session / always · reject), an automation policy (`risk-based` / `deny-all` / `allow-all`), or a file request/response handshake for headless/remote approval. Approvals are recorded in an append-only audit ledger; per-agent **role presets** (e.g. `read-only`) scope each agent and also empty the SRT writable set as an OS backstop. A channel-based **guardrail system prompt** tells the model to follow blocks and surface-and-ask rather than circumvent — while keeping `ask` a sanctioned path. **Presence-gated**: a config with no `permissions:` block is 100% unchanged. All items landed under TDD (tests first, confirmed red, then green), with live verification across automation runs. **Honest scope**: the prompt + regex classifier are best-effort *alignment*, not enforcement — the OS sandbox (v0.1.96) remains the load-bearing control (see `docs/dev_notes/permissions_p2_followups.md`). +A layered, **fully opt-in** permission system for agent tool calls — the application-layer companion to v0.1.96's OS sandbox. When a `permissions:` block is present, every tool call flows through a hardline catastrophic-command floor, a declarative `allow/ask/deny` rule layer, and a blast-radius risk classifier, resolving to allow / **ask** / deny. 
An `ask` routes through a pluggable approval provider: an automation policy (`risk-based` / `deny-all` / `allow-all`) or a file request/response handshake for headless/remote approval. Approvals are recorded in an append-only audit ledger; per-agent **role presets** (e.g. `read-only`) scope each agent and also empty the SRT writable set as an OS backstop. A channel-based **guardrail system prompt** tells the model to follow blocks and surface-and-ask rather than circumvent — while keeping `ask` a sanctioned path. **Presence-gated**: a config with no `permissions:` block is 100% unchanged. All items landed under TDD (tests first, confirmed red, then green), with live verification across automation runs. **Honest scope**: the prompt + regex classifier are best-effort *alignment*, not enforcement — the OS sandbox (v0.1.96) remains the load-bearing control (see `docs/dev_notes/permissions_p2_followups.md`). ### Added - **Permission engine (opt-in `permissions:` block)**: composite `PreToolUse` pipeline in `massgen/permissions/` — a non-overridable hardline blocklist (`hardline.py`, catastrophic patterns like `rm -rf /`, fork bombs, raw-disk `dd`), a declarative `action(target)` rule layer (`rules.py`: `command`/`read_file`/`write_file`/`read_url`/`mcp`/`*`, **deny-wins** across scopes), and a blast-radius `RiskClassifier` (`risk_classifier.py`: tiers by what the call *does* — egress/force-push/publish/privilege → high, reads/in-workspace edits → low). An explicit rule suppresses the risk-ask, so rules + risk live in one hook. 
-- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — `CallbackApprovalProvider` → interactive **TUI modal** (`ToolApprovalModal`: allow once/session/always · reject), `PolicyApprovalProvider` → automation default (`risk-based` ships default; high denied with reason, low/medium allowed), and `FileApprovalProvider` → `req_*.json`/`resp_*.json` handshake for headless/remote (fail-closed on timeout). +- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — `PolicyApprovalProvider` → automation default (`risk-based` ships default; high denied with reason, low/medium allowed) and `FileApprovalProvider` → `req_*.json`/`resp_*.json` handshake for headless/remote approval (Slack bot, `/approve `, …). Both are live-verified and fail-closed on timeout. - **Per-agent role scoping**: `permissions.role` presets (`read-only`/`researcher` deny writes+shell; `read-write`/`implementer` fall through to rules+risk), merged with user rules deny-wins. A `read-only` role also empties the agent's SRT writable set (OS-layer backstop to the engine's write denials). - **Audit ledger + runaway guard (`ledger.py`)**: `ApprovalLedger` writes one append-only JSONL line per approval decision (who/what/why/outcome, crash-safe). `ApprovalBudget` caps consecutive auto-approvals per agent (opt-in `max_consecutive_auto`; fail-closed past the cap, reset by any human decision). - **`always`-grant persistence**: an operator's "Always" approval persists as a deduped `allow(...)` rule in `settings.local.json` and loads back as a merged scope next run (opt-out `persist_approvals: false`). 
@@ -28,11 +28,11 @@ A layered, **fully opt-in** permission system for agent tool calls — the appli - **Backend parity guard**: native backends (`claude_code`, `codex`) don't run the framework `PreToolUse` chokepoint, so a `permissions:` block there is reported **INACTIVE** at startup (loud warning) and inert hooks are skipped — preventing a false promise of enforcement. ### Tests -- New deterministic suites: `test_permissions_core.py`, `test_permission_rules.py`, `test_permission_hooks.py`, `test_permission_coordinator.py`, `test_approval_provider.py` / `test_file_approval_provider.py`, `test_approval_ledger.py`, `test_tool_approval_modal.py`, `test_permissions_optional.py` (opt-in/presence gate + parity guard), `test_permission_persistence.py` (write↔load roundtrip + dedup), `test_permission_guardrail_prompt.py` (gating + content incl. ask-is-sanctioned), `test_permission_denied_tool_visibility.py` (start→error-complete events + command preview), plus SRT read-only backstop in `test_srt_manager.py` / `test_srt_filesystem_integration.py`. +- New deterministic suites: `test_permissions_core.py`, `test_permission_rules.py`, `test_permission_hooks.py`, `test_permission_coordinator.py`, `test_approval_provider.py` / `test_file_approval_provider.py`, `test_approval_ledger.py`, `test_permissions_optional.py` (opt-in/presence gate + parity guard), `test_permission_persistence.py` (write↔load roundtrip + dedup), `test_permission_guardrail_prompt.py` (gating + content incl. ask-is-sanctioned), `test_permission_denied_tool_visibility.py` (start→error-complete events + command preview), plus SRT read-only backstop in `test_srt_manager.py` / `test_srt_filesystem_integration.py`. - Live-verified (automation, `gemini-3-flash-preview`): all three chokepoint branches end-to-end (allow / deny-rule / ask→policy-deny + ledger), guardrail policy present in the real system message, denied calls emitting real `tool_start`/`tool_complete(error)` events with the command. 
Documented honest limitation: the model evaded the regex egress classifier via `\c\u\r\l` / `python urllib`, confirming the OS sandbox is the load-bearing control. ### Documentations, Configurations and Resources -- **New Configs**: `massgen/configs/tools/permissions/permission_engine.yaml` (risk-tiered approval + rule algebra), `per_agent_roles.yaml` (role scoping), `permission_modal_interactive.yaml` (interactive approval-modal demo + automation deny path). +- **New Configs**: `massgen/configs/tools/permissions/permission_engine.yaml` (risk-tiered approval + rule algebra), `per_agent_roles.yaml` (role scoping). - **Design Notes**: `docs/dev_notes/permission_systems_research.md` (three-layer model), `docs/dev_notes/permissions_p2_followups.md` (limitations, manual-test gaps, OS-enforcement follow-up). ## [0.1.96] - 2026-06-10 diff --git a/README.md b/README.md index 88964eaac..0ef6082dd 100644 --- a/README.md +++ b/README.md @@ -161,7 +161,7 @@ This project started with the "threads of thought" and "iterative refinement" id **What's New in v0.1.97** (Application-Layer Permission Engine): - **🛡️ Layered Permission Engine** - Opt-in `permissions:` block routes every tool call through a non-overridable **hardline** floor (`rm -rf /`, fork bombs), declarative **`allow/ask/deny` rules** over a small `action(target)` algebra (deny-wins), and a **blast-radius risk classifier** — auto-allowing reads/in-workspace edits and asking only for the dangerous tail (egress, force-push, publish, privilege). The app-layer companion to v0.1.96's OS sandbox. -- **✋ Approval That Fits the Run** - An `ask` pops an interactive **modal** (allow once / session / always · reject) when a human is present, or resolves via an automation **policy** (`risk-based` / `deny-all` / `allow-all`) or a **file** request/response handshake for headless/remote approval. Fail-closed by design. 
+- **✋ Approval That Fits the Run** - An `ask` resolves via an automation **policy** (`risk-based` / `deny-all` / `allow-all`) or a **file** request/response handshake for headless/remote approval (Slack bot, `/approve `, …) — fail-closed by design. - **🧑‍🤝‍🧑 Roles, Audit & Guards** - Per-agent `role` presets (e.g. `read-only`, which also empties the agent's OS-sandbox writable set), an append-only JSONL **audit ledger** of every decision, a runaway-loop **budget**, `always`-grant persistence, and a channel-based **guardrail prompt** that nudges the model to surface blocks rather than circumvent them while keeping `ask` sanctioned. *(Honest scope: the prompt is best-effort alignment; the OS sandbox is the enforcement.)* **Install v0.1.97:** @@ -1247,7 +1247,7 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch #### Application-Layer Permission Engine - **Permission engine (opt-in `permissions:` block)**: a composite `PreToolUse` pipeline in `massgen/permissions/` — a non-overridable **hardline** blocklist (`hardline.py`: `rm -rf /`, fork bombs, raw-disk `dd`), a declarative **`action(target)` rule layer** (`rules.py`: `command`/`read_file`/`write_file`/`read_url`/`mcp`/`*`, deny-wins across scopes), and a **blast-radius `RiskClassifier`** that tiers by what the call does (egress/force-push/publish/privilege → high; reads/in-workspace edits → low). 
An explicit rule suppresses the risk-ask, so rules + risk live in one hook -- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — interactive **modal** (`ToolApprovalModal`: allow once/session/always · reject), automation **policy** (`risk-based` default / `deny-all` / `allow-all`), or **file** request/response handshake for headless/remote — fail-closed on timeout +- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — automation **policy** (`risk-based` default / `deny-all` / `allow-all`) and **file** request/response handshake for headless/remote (both live-verified, fail-closed on timeout) - **Roles, audit & guards**: per-agent `role` presets (`read-only`/`researcher` deny writes+shell, also empties the agent's SRT writable set), an append-only JSONL **`ApprovalLedger`**, a runaway-loop **`ApprovalBudget`**, and `always`-grant persistence to `settings.local.json` - **Guardrail system prompt** (`PermissionGuardrailSection`, injected only when the engine is active): follow the guardrails, don't circumvent a denial, surface-and-ask — while keeping `ask` a sanctioned path. Authority is established by channel (only the system prompt is authoritative). Denied tool calls now render as **first-class failed tool events** (with the command) in the TUI/WebUI timeline - **Presence-gated & honest**: a config with no `permissions:` block is 100% unchanged; native backends (claude_code/codex) report **INACTIVE** rather than silently inert. 
All under TDD; live-verified that the prompt is best-effort *alignment* (a model evaded the regex egress classifier via `\c\u\r\l` / `python urllib`), so the OS sandbox remains the load-bearing enforcement diff --git a/README_PYPI.md b/README_PYPI.md index b5745cb7f..5dc9f3ad9 100644 --- a/README_PYPI.md +++ b/README_PYPI.md @@ -160,7 +160,7 @@ This project started with the "threads of thought" and "iterative refinement" id **What's New in v0.1.97** (Application-Layer Permission Engine): - **🛡️ Layered Permission Engine** - Opt-in `permissions:` block routes every tool call through a non-overridable **hardline** floor (`rm -rf /`, fork bombs), declarative **`allow/ask/deny` rules** over a small `action(target)` algebra (deny-wins), and a **blast-radius risk classifier** — auto-allowing reads/in-workspace edits and asking only for the dangerous tail (egress, force-push, publish, privilege). The app-layer companion to v0.1.96's OS sandbox. -- **✋ Approval That Fits the Run** - An `ask` pops an interactive **modal** (allow once / session / always · reject) when a human is present, or resolves via an automation **policy** (`risk-based` / `deny-all` / `allow-all`) or a **file** request/response handshake for headless/remote approval. Fail-closed by design. +- **✋ Approval That Fits the Run** - An `ask` resolves via an automation **policy** (`risk-based` / `deny-all` / `allow-all`) or a **file** request/response handshake for headless/remote approval (Slack bot, `/approve `, …) — fail-closed by design. - **🧑‍🤝‍🧑 Roles, Audit & Guards** - Per-agent `role` presets (e.g. `read-only`, which also empties the agent's OS-sandbox writable set), an append-only JSONL **audit ledger** of every decision, a runaway-loop **budget**, `always`-grant persistence, and a channel-based **guardrail prompt** that nudges the model to surface blocks rather than circumvent them while keeping `ask` sanctioned. 
*(Honest scope: the prompt is best-effort alignment; the OS sandbox is the enforcement.)* **Install v0.1.97:** @@ -1246,7 +1246,7 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch #### Application-Layer Permission Engine - **Permission engine (opt-in `permissions:` block)**: a composite `PreToolUse` pipeline in `massgen/permissions/` — a non-overridable **hardline** blocklist (`hardline.py`: `rm -rf /`, fork bombs, raw-disk `dd`), a declarative **`action(target)` rule layer** (`rules.py`: `command`/`read_file`/`write_file`/`read_url`/`mcp`/`*`, deny-wins across scopes), and a **blast-radius `RiskClassifier`** that tiers by what the call does (egress/force-push/publish/privilege → high; reads/in-workspace edits → low). An explicit rule suppresses the risk-ask, so rules + risk live in one hook -- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — interactive **modal** (`ToolApprovalModal`: allow once/session/always · reject), automation **policy** (`risk-based` default / `deny-all` / `allow-all`), or **file** request/response handshake for headless/remote — fail-closed on timeout +- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — automation **policy** (`risk-based` default / `deny-all` / `allow-all`) and **file** request/response handshake for headless/remote (both live-verified, fail-closed on timeout) - **Roles, audit & guards**: per-agent `role` presets (`read-only`/`researcher` deny writes+shell, also empties the agent's SRT writable set), an append-only JSONL **`ApprovalLedger`**, a runaway-loop **`ApprovalBudget`**, and `always`-grant persistence to `settings.local.json` - **Guardrail system prompt** (`PermissionGuardrailSection`, injected only when the engine is active): follow the guardrails, don't circumvent a denial, surface-and-ask — while keeping `ask` a sanctioned 
path. Authority is established by channel (only the system prompt is authoritative). Denied tool calls now render as **first-class failed tool events** (with the command) in the TUI/WebUI timeline - **Presence-gated & honest**: a config with no `permissions:` block is 100% unchanged; native backends (claude_code/codex) report **INACTIVE** rather than silently inert. All under TDD; live-verified that the prompt is best-effort *alignment* (a model evaded the regex egress classifier via `\c\u\r\l` / `python urllib`), so the OS sandbox remains the load-bearing enforcement diff --git a/RELEASE_NOTES_v0.1.97.md b/RELEASE_NOTES_v0.1.97.md index 921a2ea39..36520d2ad 100644 --- a/RELEASE_NOTES_v0.1.97.md +++ b/RELEASE_NOTES_v0.1.97.md @@ -8,7 +8,6 @@ - **Risk classifier**: tiers a call by blast radius, not name — auto-allows reads and in-workspace edits, asks only for the dangerous tail (network egress, force-push, publish/spend, privilege escalation). ### ✋ Approval that fits the run -- **Interactive modal** (`ToolApprovalModal`): allow once / allow session / always · reject, when a human is present. - **Automation policy**: `risk-based` (default — high denied with a reason, low/medium allowed), `deny-all`, or `allow-all`. - **File handshake** (`FileApprovalProvider`): `req_*.json` / `resp_*.json` for headless/remote approval (Slack bot, `/approve `, …). Fail-closed on timeout throughout. @@ -30,14 +29,6 @@ ### 📖 Getting Started - [**Quick Start Guide**](https://github.com/massgen/MassGen?tab=readme-ov-file#1--installation): upgrade and try the permission engine. 
-- **Try the approval modal (interactive):** - -```bash -# A high-risk command pops the approval modal (allow once/session/always · reject) -uv run massgen --config massgen/configs/tools/permissions/permission_modal_interactive.yaml \ - "Run the shell command: curl -s https://example.com" -``` - - **Try risk-tiered automation (headless deny):** ```bash diff --git a/ROADMAP.md b/ROADMAP.md index c3a9b0ada..533c1fcbf 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -54,7 +54,7 @@ Want to contribute or collaborate on a specific track? Reach out to the track ow ### Features - **Permission engine (opt-in `permissions:` block)**: a composite `PreToolUse` pipeline in `massgen/permissions/` — a non-overridable **hardline** blocklist (`hardline.py`: catastrophic patterns like `rm -rf /`, fork bombs, raw-disk `dd`), a declarative **`action(target)` rule layer** (`rules.py`: `command`/`read_file`/`write_file`/`read_url`/`mcp`/`*`, deny-wins across scopes), and a **blast-radius `RiskClassifier`** that tiers by what the call does (egress/force-push/publish/privilege → high; reads/in-workspace edits → low). 
An explicit rule suppresses the risk-ask, so rules + risk live in one hook -- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — interactive **modal** (`ToolApprovalModal`: allow once/session/always · reject), automation **policy** (`risk-based` default / `deny-all` / `allow-all`), or **file** request/response handshake for headless/remote (fail-closed on timeout) +- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — automation **policy** (`risk-based` default / `deny-all` / `allow-all`) or **file** request/response handshake for headless/remote (fail-closed on timeout) - **Roles, audit & guards**: per-agent `role` presets (`read-only`/`researcher` deny writes+shell, also empty the agent's SRT writable set), an append-only JSONL **`ApprovalLedger`**, a runaway-loop **`ApprovalBudget`** (opt-in `max_consecutive_auto`), and `always`-grant persistence to `settings.local.json` - **Channel-based guardrail system prompt** (`PermissionGuardrailSection`, injected only when the engine is active): follow the guardrails, don't circumvent a denial, surface-and-ask — while keeping `ask` a sanctioned path. Denied tool calls now render as **first-class failed tool events** (with the command) in the TUI/WebUI timeline - **Backend parity guard**: native backends (`claude_code`, `codex`) lack the framework chokepoint, so a `permissions:` block there is reported **INACTIVE** instead of silently inert diff --git a/docs/announcements/current-release.md b/docs/announcements/current-release.md index 478e3ee18..b8bb1f5c9 100644 --- a/docs/announcements/current-release.md +++ b/docs/announcements/current-release.md @@ -7,7 +7,7 @@ After posting, update the social links below. ## Release Summary -MassGen v0.1.97 — an **application-layer permission engine** for agent tool calls! 
🛡️ The companion to v0.1.96's OS sandbox: a fully opt-in pipeline of a hardline catastrophic-command floor, declarative `allow/ask/deny` rules, and a blast-radius risk classifier — resolving to allow / **ask** / deny. An `ask` pops an interactive approval modal (allow once/session/always · reject) or, headless, an automation policy or file handshake. Every decision is audited; per-agent roles scope each agent; a guardrail system prompt nudges the model to surface blocks rather than circumvent them. Presence-gated — no `permissions:` block means nothing changes. +MassGen v0.1.97 — an **application-layer permission engine** for agent tool calls! 🛡️ The companion to v0.1.96's OS sandbox: a fully opt-in pipeline of a hardline catastrophic-command floor, declarative `allow/ask/deny` rules, and a blast-radius risk classifier — resolving to allow / **ask** / deny. An `ask` is resolved by an automation policy (`risk-based`/`deny-all`/`allow-all`) or a file request/response handshake for headless/remote approval. Every decision is audited; per-agent roles scope each agent; a guardrail system prompt nudges the model to surface blocks rather than circumvent them. Presence-gated — no `permissions:` block means nothing changes. ## Install @@ -23,7 +23,7 @@ pip install massgen==0.1.97 ## Posting Notes -- **Suggested image:** A TUI/terminal capture of the approval modal firing on a risky call (`curl …`) with allow once/session/always · reject — or, for an automation run, the denied call rendered as a first-class failed tool row (`🔧 Calling execute_command(curl …) → ❌ Denied by automation policy: high-risk`). Pairs well with the v0.1.96 sandbox image to tell the defense-in-depth story. +- **Suggested image:** A terminal capture of an automation run where a denied call renders as a first-class failed tool row (`🔧 Calling execute_command(curl …) → ❌ Denied by automation policy: high-risk`), alongside an allowed low-risk call. 
Pairs well with the v0.1.96 sandbox image to tell the defense-in-depth story. --- @@ -39,7 +39,7 @@ MassGen v0.1.97 — an application-layer permission engine for agent tool calls! 🧱 **Hardline + rules + risk** — a non-overridable catastrophic-command floor (`rm -rf /`, fork bombs), declarative `allow/ask/deny` rules over a small `action(target)` algebra (deny-wins), and a blast-radius classifier that auto-allows reads/in-workspace edits and asks only for the dangerous tail (egress, force-push, publish, privilege). -✋ **Approval that fits the run** — an `ask` pops an interactive modal (allow once / session / always · reject) when a human is present, or resolves via an automation policy (`risk-based` / `deny-all` / `allow-all`) or a file request/response handshake for headless/remote approval. Fail-closed by design. +✋ **Approval that fits the run** — an `ask` resolves via an automation policy (`risk-based` / `deny-all` / `allow-all`) or a file request/response handshake for headless/remote approval (Slack bot, `/approve `, …). Fail-closed by design. 🧑‍🤝‍🧑 **Per-agent roles + audit** — scope each agent with a `role` (e.g. `read-only`), which also empties its OS-sandbox writable set; every approval decision lands in an append-only JSONL audit ledger; a runaway-loop budget caps consecutive auto-approvals. diff --git a/docs/dev_notes/permissions_p2_followups.md b/docs/dev_notes/permissions_p2_followups.md index dbdc1744a..1bc4f23be 100644 --- a/docs/dev_notes/permissions_p2_followups.md +++ b/docs/dev_notes/permissions_p2_followups.md @@ -19,8 +19,11 @@ The committed work wired three previously dead / advertised-but-unconnected piec This confirms the risk classifier is a porous denylist (CLAUDE.md already frames it as "a denylist, not content categorization"). **The OS sandbox (SRT) is the real egress control** — it blocks at the network/syscall layer regardless of how the command is spelled. 
Implication: never present the regex classifier as sufficient on its own; the sandbox follow-up (issue 2) is the load-bearing layer for egress. A regex arms race (adding `python`, escaped-char normalization, etc.) is whack-a-mole and should not be mistaken for a fix. - **`always` rule matching uses fnmatch on the raw target.** Persisted commands containing glob metachars (`*`, `?`, `[`) could over-match on read-back. Human-gated → low risk, but worth hardening (escape / exact-match mode). +### Known bug — interactive approval modal does NOT fire (live wiring broken) +In a live (non-automation) Textual TUI run, a high-risk `ask` is resolved by the **automation PolicyApprovalProvider (deny)** instead of popping `ToolApprovalModal`. **Root cause (timing):** `CoordinationUI._install_interactive_approval_provider` (coordination_ui.py:956, in `coordinate()` at display-init) swaps each agent's `PolicyApprovalProvider`→`CallbackApprovalProvider(modal)`, but it reads `backend._permission_coordinator`, which is created *later* by `_setup_hook_manager_for_agent`→`_install_permission_hooks` (orchestrator.py:5976) during orchestration — after line 956. So at swap time the coordinator is `None`, the swap `continue`s for every agent (no `[TUI] Installed interactive approval provider` log), and the policy provider denies. Confirmed via run `log_20260612_094241_776192`. **Fix options:** call `_install_interactive_approval_provider` *after* hook setup, or have `_install_permission_hooks` install the modal provider directly when an interactive display is present (pass a callback/flag down so coordinator creation + provider selection happen together). Add a regression test asserting the coordinator ends up with a `CallbackApprovalProvider` given an interactive display + permissions. The modal widget, decision mapping, and all three providers are implemented and unit-tested; only the live auto-swap is broken. 
**v0.1.97 docs intentionally omit the interactive modal** until this lands; the `permission_modal_interactive.yaml` demo config was removed. + ### Manual-test gaps (could not automate) -1. **Interactive approval modal** — needs a live Textual app + real keypresses (Future/`call_from_thread` bridge can't run headless). Decision-mapping is unit-tested; render + round-trip is not. +1. **Interactive approval modal (live render + "Always" round-trip)** — once the swap bug above is fixed, the live modal still needs manual verification (real keypresses; the Future/`call_from_thread` bridge can't run headless). 2. **Cross-run "Always" persistence** — both halves (write + load-back) are unit-tested and load-back is integration-tested at install, but the true 2-run TUI handoff is manual. 3. **ApprovalBudget under a long real run** — trip/reset/per-agent are unit-tested; behaviour under a genuine 25+ consecutive-auto-approval automation run is not exercised live. diff --git a/docs/source/index.rst b/docs/source/index.rst index 457dd1018..740d926a7 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -211,7 +211,7 @@ Recent Releases **v0.1.97 (June 12, 2026)** - Application-Layer Permission Engine -Adds a layered, fully opt-in permission system for agent tool calls — the application-layer companion to v0.1.96's OS sandbox. When a ``permissions:`` block is present, every tool call flows through a non-overridable hardline floor, a declarative ``allow/ask/deny`` rule layer (``action(target)`` algebra, deny-wins), and a blast-radius risk classifier, resolving to allow / ask / deny. An ``ask`` routes through a pluggable approval provider: an interactive TUI modal (allow once/session/always · reject), an automation policy (``risk-based``/``deny-all``/``allow-all``), or a file request/response handshake for headless/remote approval. Every decision is recorded in an append-only audit ledger; per-agent ``role`` presets (e.g. 
``read-only``) scope each agent and empty its SRT writable set; a runaway-loop budget caps consecutive auto-approvals. A channel-based guardrail system prompt nudges the model to surface blocks rather than circumvent them while keeping ``ask`` sanctioned. Presence-gated — a config with no ``permissions:`` block is unchanged. Honest scope: the prompt + regex classifier are best-effort alignment; the OS sandbox remains the load-bearing enforcement. +Adds a layered, fully opt-in permission system for agent tool calls — the application-layer companion to v0.1.96's OS sandbox. When a ``permissions:`` block is present, every tool call flows through a non-overridable hardline floor, a declarative ``allow/ask/deny`` rule layer (``action(target)`` algebra, deny-wins), and a blast-radius risk classifier, resolving to allow / ask / deny. An ``ask`` routes through a pluggable approval provider: an automation policy (``risk-based``/``deny-all``/``allow-all``) or a file request/response handshake for headless/remote approval (both live-verified, fail-closed on timeout). Every decision is recorded in an append-only audit ledger; per-agent ``role`` presets (e.g. ``read-only``) scope each agent and empty its SRT writable set; a runaway-loop budget caps consecutive auto-approvals. A channel-based guardrail system prompt nudges the model to surface blocks rather than circumvent them while keeping ``ask`` sanctioned. Presence-gated — a config with no ``permissions:`` block is unchanged. Honest scope: the prompt + regex classifier are best-effort alignment; the OS sandbox remains the load-bearing enforcement. 
**v0.1.96 (June 10, 2026)** - OS-Level Agent Sandboxing diff --git a/massgen/configs/README.md b/massgen/configs/README.md index 7e5b9da6d..99adfcc45 100644 --- a/massgen/configs/README.md +++ b/massgen/configs/README.md @@ -228,11 +228,11 @@ Most configurations use environment variables for API keys:so ## Release History & Examples ### v0.1.97 - Latest -**Application-Layer Permission Engine:** Opt-in `allow/ask/deny` rules + risk-tiered approval (modal / automation policy / file), audit ledger, per-agent roles, guardrail prompt — the app-layer companion to v0.1.96's OS sandbox +**Application-Layer Permission Engine:** Opt-in `allow/ask/deny` rules + risk-tiered approval (automation policy / file handshake), audit ledger, per-agent roles, guardrail prompt — the app-layer companion to v0.1.96's OS sandbox **Key Features:** - **Permission engine** (opt-in `permissions:` block): hardline catastrophic-command floor → declarative `action(target)` rules (deny-wins) → blast-radius risk classifier → allow / **ask** / deny -- **Approval that fits the run**: interactive **modal** (allow once/session/always · reject), automation **policy** (`risk-based`/`deny-all`/`allow-all`), or **file** request/response handshake for headless/remote — fail-closed by design +- **Approval that fits the run**: automation **policy** (`risk-based`/`deny-all`/`allow-all`) or **file** request/response handshake for headless/remote — fail-closed by design - **Roles, audit & guards**: per-agent `role` presets (e.g. 
`read-only`, also empties the SRT writable set), append-only JSONL **audit ledger**, runaway-loop **budget**, and `always`-grant persistence - **Guardrail-aware prompt**: when active, the system prompt tells the model to surface blocks rather than circumvent them while keeping `ask` sanctioned (best-effort alignment; OS sandbox is enforcement) - **Presence-gated**: a config with no `permissions:` block is 100% unchanged @@ -241,10 +241,6 @@ Most configurations use environment variables for API keys:so ```bash pip install massgen==0.1.97 -# Interactive: a high-risk command pops the approval modal (allow once/session/always · reject) -uv run massgen --config massgen/configs/tools/permissions/permission_modal_interactive.yaml \ - "Run the shell command: curl -s https://example.com" - # Automation (risk-based): git status runs, the force-push is denied with a reason uv run massgen --automation --config massgen/configs/tools/permissions/permission_engine.yaml \ "Run 'git status', then run 'git push --force origin main' and report each result." diff --git a/massgen/configs/tools/permissions/permission_modal_interactive.yaml b/massgen/configs/tools/permissions/permission_modal_interactive.yaml deleted file mode 100644 index e9f5e1973..000000000 --- a/massgen/configs/tools/permissions/permission_modal_interactive.yaml +++ /dev/null @@ -1,52 +0,0 @@ -# Permissions — interactive approval MODAL demo (self-contained, benign). -# -# Same engine as permission_engine.yaml, but display_type: textual_terminal so a -# per-tool `ask` pops the interactive approval modal (Allow once/session/always | -# Reject) instead of being resolved by the automation policy. -# -# Two ways to run (no setup needed — the workspace is created automatically): -# -# INTERACTIVE (modal): a high-risk command pops the approval modal. 
-# uv run massgen --config massgen/configs/tools/permissions/permission_modal_interactive.yaml \ -# "Run the shell command: curl -s https://example.com" -# → expect an approval modal; try Allow once / session / always / Reject. -# → click "Always" then re-run the SAME command to see it NOT re-prompt -# (a persisted allow rule is written to .massgen/settings.local.json). -# -# AUTOMATION (no human): the same ask is resolved by automation_default instead. -# uv run massgen --automation --config massgen/configs/tools/permissions/permission_modal_interactive.yaml \ -# "Run two commands and report each: (1) echo HELLO_OK (2) echo BLOCKED_secret" -# → (1) echoes (low risk → allow); (2) blocked by the deny rule below. -# -# Safety: only benign commands are suggested above. `curl https://example.com` is a -# harmless fetch even if approved; `echo` is inert. No destructive operations. - -agents: - - id: "guarded" - backend: - # Gemini reliably ISSUES the tool call, so the approval modal actually fires. - # (gpt-5-nano tended to answer without attempting the command → no modal.) - type: "gemini" - model: "gemini-3-flash-preview" - cwd: "permission_demo_workspace" # created automatically; relative → portable - enable_mcp_command_line: true - permissions: - enabled: true - automation_default: "risk-based" # used only under --automation (no human) - audit: true # append-only ledger at .massgen/approvals/ledger.jsonl - # max_consecutive_auto: 25 # runaway-loop guard (omit → unlimited) - # persist_approvals: true # 'Always' grants persist + load back (default on) - rules: - deny: - - "command(echo BLOCKED*)" # deterministic deny for the automation demo - - "command(git push --force*)" - ask: - - "read_url(*)" # any network fetch → approval modal - -orchestrator: - snapshot_storage: "snapshots" - agent_temporary_workspace: "temp_workspaces" - -ui: - display_type: "textual_terminal" # the Textual TUI that hosts the approval modal - logging_enabled: true
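
Reviewer sketch (not part of the patch): the `action(target)` rule strings in the removed demo config and the fnmatch caveat recorded in `permissions_p2_followups.md` can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the code in `massgen/permissions/rules.py`. The names `parse_rule`, `rule_matches`, `decide`, `DEMO_RULES`, and the `"no-match"` fallback are hypothetical. The source confirms only that deny wins; the ordering of `ask` above `allow` is an assumption, and so is the extra `allow` rule added to show the conflict. Using `glob.escape` as the hardening step is one possible reading of the "escape / exact-match mode" follow-up.

```python
import glob
import re
from fnmatch import fnmatchcase

# Hypothetical parser for rule strings shaped like "command(echo BLOCKED*)".
RULE_RE = re.compile(r"^(?P<action>[\w*]+)\((?P<target>.*)\)$")

# The deny and ask rules come from the removed demo config; the allow rule is
# an assumption added here so that one command matches both allow and deny.
DEMO_RULES = {
    "deny": ["command(echo BLOCKED*)", "command(git push --force*)"],
    "ask": ["read_url(*)"],
    "allow": ["command(echo *)"],
}


def parse_rule(rule):
    """Split "action(target)" into (action, target_pattern)."""
    match = RULE_RE.match(rule)
    if match is None:
        raise ValueError(f"not an action(target) rule: {rule!r}")
    return match.group("action"), match.group("target")


def rule_matches(rule, action, target):
    """True if the rule covers this action and its glob pattern matches the target."""
    rule_action, pattern = parse_rule(rule)
    if rule_action not in ("*", action):
        return False
    return fnmatchcase(target, pattern)


def decide(rules, action, target):
    """Check deny first so a deny always wins; ask-before-allow is an assumption."""
    for verdict in ("deny", "ask", "allow"):
        if any(rule_matches(r, action, target) for r in rules.get(verdict, [])):
            return verdict
    # In the described engine an unmatched call would fall through to the
    # risk classifier; this sketch only reports that no rule applied.
    return "no-match"


# The fnmatch caveat: a persisted command containing glob metacharacters,
# used raw as a pattern, matches a different command and not itself.
persisted = "cat notes[1].txt"
raw_matches_other = fnmatchcase("cat notes1.txt", persisted)
raw_matches_self = fnmatchcase(persisted, persisted)

# Escaping the metacharacters before storing the pattern gives exact matching.
escaped = glob.escape(persisted)
escaped_matches_other = fnmatchcase("cat notes1.txt", escaped)
escaped_matches_self = fnmatchcase(persisted, escaped)
```

With these rules, `decide(DEMO_RULES, "command", "echo BLOCKED_secret")` resolves to `deny` even though the `allow` pattern also matches, which is the deny-wins behaviour the changelog describes. `glob.escape` is documented for pathnames, so applying it to shell command strings is a convenience assumption rather than an endorsed use.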