8 changes: 4 additions & 4 deletions CHANGELOG.md
@@ -11,11 +11,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Theme: Application-Layer Permission Engine

A layered, **fully opt-in** permission system for agent tool calls — the application-layer companion to v0.1.96's OS sandbox. When a `permissions:` block is present, every tool call flows through a hardline catastrophic-command floor, a declarative `allow/ask/deny` rule layer, and a blast-radius risk classifier, resolving to allow / **ask** / deny. An `ask` routes through a pluggable approval provider: an interactive TUI modal (allow once / session / always · reject), an automation policy (`risk-based` / `deny-all` / `allow-all`), or a file request/response handshake for headless/remote approval. Approvals are recorded in an append-only audit ledger; per-agent **role presets** (e.g. `read-only`) scope each agent and also empty the SRT writable set as an OS backstop. A channel-based **guardrail system prompt** tells the model to follow blocks and surface-and-ask rather than circumvent — while keeping `ask` a sanctioned path. **Presence-gated**: a config with no `permissions:` block is 100% unchanged. All items landed under TDD (tests first, confirmed red, then green), with live verification across automation runs. **Honest scope**: the prompt + regex classifier are best-effort *alignment*, not enforcement — the OS sandbox (v0.1.96) remains the load-bearing control (see `docs/dev_notes/permissions_p2_followups.md`).
A layered, **fully opt-in** permission system for agent tool calls — the application-layer companion to v0.1.96's OS sandbox. When a `permissions:` block is present, every tool call flows through a hardline catastrophic-command floor, a declarative `allow/ask/deny` rule layer, and a blast-radius risk classifier, resolving to allow / **ask** / deny. An `ask` routes through a pluggable approval provider: an automation policy (`risk-based` / `deny-all` / `allow-all`) or a file request/response handshake for headless/remote approval. Approvals are recorded in an append-only audit ledger; per-agent **role presets** (e.g. `read-only`) scope each agent and also empty the SRT writable set as an OS backstop. A channel-based **guardrail system prompt** tells the model to follow blocks and surface-and-ask rather than circumvent — while keeping `ask` a sanctioned path. **Presence-gated**: a config with no `permissions:` block is 100% unchanged. All items landed under TDD (tests first, confirmed red, then green), with live verification across automation runs. **Honest scope**: the prompt + regex classifier are best-effort *alignment*, not enforcement — the OS sandbox (v0.1.96) remains the load-bearing control (see `docs/dev_notes/permissions_p2_followups.md`).

### Added
- **Permission engine (opt-in `permissions:` block)**: composite `PreToolUse` pipeline in `massgen/permissions/` — a non-overridable hardline blocklist (`hardline.py`, catastrophic patterns like `rm -rf /`, fork bombs, raw-disk `dd`), a declarative `action(target)` rule layer (`rules.py`: `command`/`read_file`/`write_file`/`read_url`/`mcp`/`*`, **deny-wins** across scopes), and a blast-radius `RiskClassifier` (`risk_classifier.py`: tiers by what the call *does* — egress/force-push/publish/privilege → high, reads/in-workspace edits → low). An explicit rule suppresses the risk-ask, so rules + risk live in one hook.
- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — `CallbackApprovalProvider` → interactive **TUI modal** (`ToolApprovalModal`: allow once/session/always · reject), `PolicyApprovalProvider` → automation default (`risk-based` ships default; high denied with reason, low/medium allowed), and `FileApprovalProvider` → `req_*.json`/`resp_*.json` handshake for headless/remote (fail-closed on timeout).
- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — `PolicyApprovalProvider` → automation default (`risk-based` ships default; high denied with reason, low/medium allowed) and `FileApprovalProvider` → `req_*.json`/`resp_*.json` handshake for headless/remote approval (Slack bot, `/approve <id>`, …). Both are live-verified and fail-closed on timeout.
- **Per-agent role scoping**: `permissions.role` presets (`read-only`/`researcher` deny writes+shell; `read-write`/`implementer` fall through to rules+risk), merged with user rules deny-wins. A `read-only` role also empties the agent's SRT writable set (OS-layer backstop to the engine's write denials).
- **Audit ledger + runaway guard (`ledger.py`)**: `ApprovalLedger` writes one append-only JSONL line per approval decision (who/what/why/outcome, crash-safe). `ApprovalBudget` caps consecutive auto-approvals per agent (opt-in `max_consecutive_auto`; fail-closed past the cap, reset by any human decision).
- **`always`-grant persistence**: an operator's "Always" approval persists as a deduped `allow(...)` rule in `settings.local.json` and loads back as a merged scope next run (opt-out `persist_approvals: false`).
@@ -28,11 +28,11 @@ A layered, **fully opt-in** permission system for agent tool calls — the appli
- **Backend parity guard**: native backends (`claude_code`, `codex`) don't run the framework `PreToolUse` chokepoint, so a `permissions:` block there is reported **INACTIVE** at startup (loud warning) and inert hooks are skipped — preventing a false promise of enforcement.
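
The resolution order described in the entries above (hardline floor, then deny-wins rules, then the risk classifier only when no explicit rule matched) can be sketched roughly as follows. This is a minimal illustration and not the code in `massgen/permissions/`: the `Rule` dataclass, the `decide` function, both regex lists, glob matching on targets, and the `deny > ask > allow` ordering among matching rules are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatch

# Hypothetical stand-ins for the catastrophic patterns kept in hardline.py.
HARDLINE = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),
    re.compile(r"\bdd\b.*\bof=/dev/"),
]

# Hypothetical high-blast-radius markers (egress, force-push, privilege).
HIGH_RISK = [
    re.compile(r"\b(curl|wget)\b"),
    re.compile(r"\bgit\s+push\b.*--force"),
    re.compile(r"\bsudo\b"),
]


@dataclass(frozen=True)
class Rule:
    effect: str  # "allow", "ask", or "deny"
    action: str  # "command", "read_file", "write_file", "read_url", "mcp", or "*"
    target: str  # glob matched against the call's target (assumed syntax)

    def matches(self, action: str, target: str) -> bool:
        return self.action in ("*", action) and fnmatch(target, self.target)


def decide(action: str, target: str, rules: list) -> str:
    """Resolve one tool call to "allow", "ask", or "deny"."""
    # 1. Hardline floor: checked first, so no rule can override it.
    if action == "command" and any(p.search(target) for p in HARDLINE):
        return "deny"

    # 2. Declarative rules from all scopes: deny wins over ask, ask over allow.
    effects = {r.effect for r in rules if r.matches(action, target)}
    for effect in ("deny", "ask", "allow"):
        if effect in effects:
            return effect  # an explicit rule suppresses the risk-ask

    # 3. No explicit rule: fall back to the blast-radius tier.
    if action == "command" and any(p.search(target) for p in HIGH_RISK):
        return "ask"
    return "allow"
```

With a single `Rule("allow", "command", "git *")`, a force-push resolves to `allow` because the explicit rule suppresses the risk-ask, while an unmatched `curl` call resolves to `ask`.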

### Tests
- New deterministic suites: `test_permissions_core.py`, `test_permission_rules.py`, `test_permission_hooks.py`, `test_permission_coordinator.py`, `test_approval_provider.py` / `test_file_approval_provider.py`, `test_approval_ledger.py`, `test_tool_approval_modal.py`, `test_permissions_optional.py` (opt-in/presence gate + parity guard), `test_permission_persistence.py` (write↔load roundtrip + dedup), `test_permission_guardrail_prompt.py` (gating + content incl. ask-is-sanctioned), `test_permission_denied_tool_visibility.py` (start→error-complete events + command preview), plus SRT read-only backstop in `test_srt_manager.py` / `test_srt_filesystem_integration.py`.
- New deterministic suites: `test_permissions_core.py`, `test_permission_rules.py`, `test_permission_hooks.py`, `test_permission_coordinator.py`, `test_approval_provider.py` / `test_file_approval_provider.py`, `test_approval_ledger.py`, `test_permissions_optional.py` (opt-in/presence gate + parity guard), `test_permission_persistence.py` (write↔load roundtrip + dedup), `test_permission_guardrail_prompt.py` (gating + content incl. ask-is-sanctioned), `test_permission_denied_tool_visibility.py` (start→error-complete events + command preview), plus SRT read-only backstop in `test_srt_manager.py` / `test_srt_filesystem_integration.py`.
- Live-verified (automation, `gemini-3-flash-preview`): all three chokepoint branches end-to-end (allow / deny-rule / ask→policy-deny + ledger), guardrail policy present in the real system message, denied calls emitting real `tool_start`/`tool_complete(error)` events with the command. Documented honest limitation: the model evaded the regex egress classifier via `\c\u\r\l` / `python urllib`, confirming the OS sandbox is the load-bearing control.
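
The evasion noted in the last entry is easy to reproduce in isolation. The sketch below assumes a hypothetical egress pattern (`EGRESS`); the real classifier's regexes are not shown in this changelog. It uses `shlex` only to mimic how a POSIX shell strips backslashes outside quotes.

```python
import re
import shlex

# Hypothetical egress pattern; the shipped classifier may differ.
EGRESS = re.compile(r"\b(curl|wget)\b")


def looks_like_egress(command: str) -> bool:
    """Naive regex check on the raw command string."""
    return bool(EGRESS.search(command))


def shell_view(command: str) -> str:
    """Approximate what the shell executes: POSIX quoting removes the backslashes."""
    return " ".join(shlex.split(command))


PLAIN = "curl -s https://example.com"
ESCAPED = r"\c\u\r\l -s https://example.com"
URLLIB = "python -c \"import urllib.request; urllib.request.urlopen('https://example.com')\""
```

Here `looks_like_egress(ESCAPED)` is false even though `shell_view(ESCAPED)` is the plain `curl` command. Normalizing tokens before matching would close that one gap, but `URLLIB` still passes unflagged, which is consistent with treating the OS sandbox as the load-bearing control.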

### Documentations, Configurations and Resources
- **New Configs**: `massgen/configs/tools/permissions/permission_engine.yaml` (risk-tiered approval + rule algebra), `per_agent_roles.yaml` (role scoping), `permission_modal_interactive.yaml` (interactive approval-modal demo + automation deny path).
- **New Configs**: `massgen/configs/tools/permissions/permission_engine.yaml` (risk-tiered approval + rule algebra), `per_agent_roles.yaml` (role scoping).
- **Design Notes**: `docs/dev_notes/permission_systems_research.md` (three-layer model), `docs/dev_notes/permissions_p2_followups.md` (limitations, manual-test gaps, OS-enforcement follow-up).

## [0.1.96] - 2026-06-10
4 changes: 2 additions & 2 deletions README.md
@@ -161,7 +161,7 @@ This project started with the "threads of thought" and "iterative refinement" id

**What's New in v0.1.97** (Application-Layer Permission Engine):
- **🛡️ Layered Permission Engine** - Opt-in `permissions:` block routes every tool call through a non-overridable **hardline** floor (`rm -rf /`, fork bombs), declarative **`allow/ask/deny` rules** over a small `action(target)` algebra (deny-wins), and a **blast-radius risk classifier** — auto-allowing reads/in-workspace edits and asking only for the dangerous tail (egress, force-push, publish, privilege). The app-layer companion to v0.1.96's OS sandbox.
- **✋ Approval That Fits the Run** - An `ask` pops an interactive **modal** (allow once / session / always · reject) when a human is present, or resolves via an automation **policy** (`risk-based` / `deny-all` / `allow-all`) or a **file** request/response handshake for headless/remote approval. Fail-closed by design.
- **✋ Approval That Fits the Run** - An `ask` resolves via an automation **policy** (`risk-based` / `deny-all` / `allow-all`) or a **file** request/response handshake for headless/remote approval (Slack bot, `/approve <id>`, …) — fail-closed by design.
- **🧑‍🤝‍🧑 Roles, Audit & Guards** - Per-agent `role` presets (e.g. `read-only`, which also empties the agent's OS-sandbox writable set), an append-only JSONL **audit ledger** of every decision, a runaway-loop **budget**, `always`-grant persistence, and a channel-based **guardrail prompt** that nudges the model to surface blocks rather than circumvent them while keeping `ask` sanctioned. *(Honest scope: the prompt is best-effort alignment; the OS sandbox is the enforcement.)*

**Install v0.1.97:**
@@ -1247,7 +1247,7 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

#### Application-Layer Permission Engine
- **Permission engine (opt-in `permissions:` block)**: a composite `PreToolUse` pipeline in `massgen/permissions/` — a non-overridable **hardline** blocklist (`hardline.py`: `rm -rf /`, fork bombs, raw-disk `dd`), a declarative **`action(target)` rule layer** (`rules.py`: `command`/`read_file`/`write_file`/`read_url`/`mcp`/`*`, deny-wins across scopes), and a **blast-radius `RiskClassifier`** that tiers by what the call does (egress/force-push/publish/privilege → high; reads/in-workspace edits → low). An explicit rule suppresses the risk-ask, so rules + risk live in one hook
- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — interactive **modal** (`ToolApprovalModal`: allow once/session/always · reject), automation **policy** (`risk-based` default / `deny-all` / `allow-all`), or **file** request/response handshake for headless/remote fail-closed on timeout
- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — automation **policy** (`risk-based` default / `deny-all` / `allow-all`) and **file** request/response handshake for headless/remote (both live-verified, fail-closed on timeout)
- **Roles, audit & guards**: per-agent `role` presets (`read-only`/`researcher` deny writes+shell, also empties the agent's SRT writable set), an append-only JSONL **`ApprovalLedger`**, a runaway-loop **`ApprovalBudget`**, and `always`-grant persistence to `settings.local.json`
- **Guardrail system prompt** (`PermissionGuardrailSection`, injected only when the engine is active): follow the guardrails, don't circumvent a denial, surface-and-ask — while keeping `ask` a sanctioned path. Authority is established by channel (only the system prompt is authoritative). Denied tool calls now render as **first-class failed tool events** (with the command) in the TUI/WebUI timeline
- **Presence-gated & honest**: a config with no `permissions:` block is 100% unchanged; native backends (claude_code/codex) report **INACTIVE** rather than silently inert. All under TDD; live-verified that the prompt is best-effort *alignment* (a model evaded the regex egress classifier via `\c\u\r\l` / `python urllib`), so the OS sandbox remains the load-bearing enforcement
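
As a rough picture of the audit and runaway-guard pieces listed above, the sketch below appends one JSON line per decision and caps consecutive auto-approvals per agent. The class names (`SketchLedger`, `SketchBudget`), method names, and entry fields are hypothetical; only the JSONL format, the `max_consecutive_auto` option, the fail-closed cap, and the reset on a human decision come from the notes.

```python
import json
import os
import time
from pathlib import Path


class SketchLedger:
    """Append-only JSONL log: one line per approval decision."""

    def __init__(self, path):
        self.path = Path(path)

    def record(self, agent: str, call: str, outcome: str, reason: str) -> None:
        entry = {"ts": time.time(), "agent": agent, "call": call,
                 "outcome": outcome, "reason": reason}
        # Open in append mode and sync, so earlier lines are never rewritten
        # and a crash loses at most the line being written.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())


class SketchBudget:
    """Caps consecutive auto-approvals per agent; fails closed past the cap."""

    def __init__(self, max_consecutive_auto: int):
        self.max_consecutive_auto = max_consecutive_auto
        self._streak = {}

    def allow_auto(self, agent: str) -> bool:
        streak = self._streak.get(agent, 0)
        if streak >= self.max_consecutive_auto:
            return False  # cap reached: deny until a human decides
        self._streak[agent] = streak + 1
        return True

    def human_decision(self, agent: str) -> None:
        # Any human decision resets the agent's streak.
        self._streak[agent] = 0
```

Keeping the budget per agent means one runaway agent cannot exhaust the allowance of the others.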
4 changes: 2 additions & 2 deletions README_PYPI.md
@@ -160,7 +160,7 @@ This project started with the "threads of thought" and "iterative refinement" id

**What's New in v0.1.97** (Application-Layer Permission Engine):
- **🛡️ Layered Permission Engine** - Opt-in `permissions:` block routes every tool call through a non-overridable **hardline** floor (`rm -rf /`, fork bombs), declarative **`allow/ask/deny` rules** over a small `action(target)` algebra (deny-wins), and a **blast-radius risk classifier** — auto-allowing reads/in-workspace edits and asking only for the dangerous tail (egress, force-push, publish, privilege). The app-layer companion to v0.1.96's OS sandbox.
- **✋ Approval That Fits the Run** - An `ask` pops an interactive **modal** (allow once / session / always · reject) when a human is present, or resolves via an automation **policy** (`risk-based` / `deny-all` / `allow-all`) or a **file** request/response handshake for headless/remote approval. Fail-closed by design.
- **✋ Approval That Fits the Run** - An `ask` resolves via an automation **policy** (`risk-based` / `deny-all` / `allow-all`) or a **file** request/response handshake for headless/remote approval (Slack bot, `/approve <id>`, …) — fail-closed by design.
- **🧑‍🤝‍🧑 Roles, Audit & Guards** - Per-agent `role` presets (e.g. `read-only`, which also empties the agent's OS-sandbox writable set), an append-only JSONL **audit ledger** of every decision, a runaway-loop **budget**, `always`-grant persistence, and a channel-based **guardrail prompt** that nudges the model to surface blocks rather than circumvent them while keeping `ask` sanctioned. *(Honest scope: the prompt is best-effort alignment; the OS sandbox is the enforcement.)*

**Install v0.1.97:**
@@ -1246,7 +1246,7 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

#### Application-Layer Permission Engine
- **Permission engine (opt-in `permissions:` block)**: a composite `PreToolUse` pipeline in `massgen/permissions/` — a non-overridable **hardline** blocklist (`hardline.py`: `rm -rf /`, fork bombs, raw-disk `dd`), a declarative **`action(target)` rule layer** (`rules.py`: `command`/`read_file`/`write_file`/`read_url`/`mcp`/`*`, deny-wins across scopes), and a **blast-radius `RiskClassifier`** that tiers by what the call does (egress/force-push/publish/privilege → high; reads/in-workspace edits → low). An explicit rule suppresses the risk-ask, so rules + risk live in one hook
- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — interactive **modal** (`ToolApprovalModal`: allow once/session/always · reject), automation **policy** (`risk-based` default / `deny-all` / `allow-all`), or **file** request/response handshake for headless/remote fail-closed on timeout
- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — automation **policy** (`risk-based` default / `deny-all` / `allow-all`) and **file** request/response handshake for headless/remote (both live-verified, fail-closed on timeout)
- **Roles, audit & guards**: per-agent `role` presets (`read-only`/`researcher` deny writes+shell, also empties the agent's SRT writable set), an append-only JSONL **`ApprovalLedger`**, a runaway-loop **`ApprovalBudget`**, and `always`-grant persistence to `settings.local.json`
- **Guardrail system prompt** (`PermissionGuardrailSection`, injected only when the engine is active): follow the guardrails, don't circumvent a denial, surface-and-ask — while keeping `ask` a sanctioned path. Authority is established by channel (only the system prompt is authoritative). Denied tool calls now render as **first-class failed tool events** (with the command) in the TUI/WebUI timeline
- **Presence-gated & honest**: a config with no `permissions:` block is 100% unchanged; native backends (claude_code/codex) report **INACTIVE** rather than silently inert. All under TDD; live-verified that the prompt is best-effort *alignment* (a model evaded the regex egress classifier via `\c\u\r\l` / `python urllib`), so the OS sandbox remains the load-bearing enforcement
9 changes: 0 additions & 9 deletions RELEASE_NOTES_v0.1.97.md
@@ -8,7 +8,6 @@
- **Risk classifier**: tiers a call by blast radius, not name — auto-allows reads and in-workspace edits, asks only for the dangerous tail (network egress, force-push, publish/spend, privilege escalation).

### ✋ Approval that fits the run
- **Interactive modal** (`ToolApprovalModal`): allow once / allow session / always · reject, when a human is present.
- **Automation policy**: `risk-based` (default — high denied with a reason, low/medium allowed), `deny-all`, or `allow-all`.
- **File handshake** (`FileApprovalProvider`): `req_*.json` / `resp_*.json` for headless/remote approval (Slack bot, `/approve <id>`, …). Fail-closed on timeout throughout.
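
A file handshake of this shape can be sketched as a write-then-poll loop. The function below is an illustration, not `FileApprovalProvider` itself: the `request_approval` name, the JSON fields (`id`, `call`, `decision`), and the polling interval are assumptions. Only the `req_*.json` / `resp_*.json` naming and the fail-closed timeout come from the notes.

```python
import json
import time
import uuid
from pathlib import Path


def request_approval(directory, call: str, timeout: float = 30.0, poll: float = 0.05) -> bool:
    """Write req_<id>.json, wait for resp_<id>.json, and deny if none arrives in time."""
    directory = Path(directory)
    request_id = uuid.uuid4().hex
    request_path = directory / f"req_{request_id}.json"
    response_path = directory / f"resp_{request_id}.json"

    request_path.write_text(json.dumps({"id": request_id, "call": call}), encoding="utf-8")

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if response_path.exists():
            try:
                response = json.loads(response_path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                response = None  # responder may still be writing; poll again
            if isinstance(response, dict):
                # Anything other than an explicit "allow" counts as a denial.
                return response.get("decision") == "allow"
        time.sleep(poll)
    return False  # fail closed on timeout
```

A remote approver (for example a chat bot) would watch the directory for `req_*.json` files and write the matching `resp_<id>.json`. Writing the response to a temporary name and renaming it into place would avoid the partial-read retry shown here.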

@@ -30,14 +29,6 @@

### 📖 Getting Started
- [**Quick Start Guide**](https://github.com/massgen/MassGen?tab=readme-ov-file#1--installation): upgrade and try the permission engine.
- **Try the approval modal (interactive):**

```bash
# A high-risk command pops the approval modal (allow once/session/always · reject)
uv run massgen --config massgen/configs/tools/permissions/permission_modal_interactive.yaml \
"Run the shell command: curl -s https://example.com"
```

- **Try risk-tiered automation (headless deny):**

```bash
2 changes: 1 addition & 1 deletion ROADMAP.md
@@ -54,7 +54,7 @@ Want to contribute or collaborate on a specific track? Reach out to the track ow

### Features
- **Permission engine (opt-in `permissions:` block)**: a composite `PreToolUse` pipeline in `massgen/permissions/` — a non-overridable **hardline** blocklist (`hardline.py`: catastrophic patterns like `rm -rf /`, fork bombs, raw-disk `dd`), a declarative **`action(target)` rule layer** (`rules.py`: `command`/`read_file`/`write_file`/`read_url`/`mcp`/`*`, deny-wins across scopes), and a **blast-radius `RiskClassifier`** that tiers by what the call does (egress/force-push/publish/privilege → high; reads/in-workspace edits → low). An explicit rule suppresses the risk-ask, so rules + risk live in one hook
- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — interactive **modal** (`ToolApprovalModal`: allow once/session/always · reject), automation **policy** (`risk-based` default / `deny-all` / `allow-all`), or **file** request/response handshake for headless/remote (fail-closed on timeout)
- **Approval round-trip**: the `base_with_custom_tool_and_mcp` chokepoint resolves an `ask` via a pluggable `ApprovalProvider` — automation **policy** (`risk-based` default / `deny-all` / `allow-all`) or **file** request/response handshake for headless/remote (fail-closed on timeout)
- **Roles, audit & guards**: per-agent `role` presets (`read-only`/`researcher` deny writes+shell, also empty the agent's SRT writable set), an append-only JSONL **`ApprovalLedger`**, a runaway-loop **`ApprovalBudget`** (opt-in `max_consecutive_auto`), and `always`-grant persistence to `settings.local.json`
- **Channel-based guardrail system prompt** (`PermissionGuardrailSection`, injected only when the engine is active): follow the guardrails, don't circumvent a denial, surface-and-ask — while keeping `ask` a sanctioned path. Denied tool calls now render as **first-class failed tool events** (with the command) in the TUI/WebUI timeline
- **Backend parity guard**: native backends (`claude_code`, `codex`) lack the framework chokepoint, so a `permissions:` block there is reported **INACTIVE** instead of silently inert