Skip to content

feat: v0.1.97#1126

Merged
Henry-811 merged 19 commits into
mainfrom
dev/v0.1.97
Jun 12, 2026
Merged

feat: v0.1.97#1126
Henry-811 merged 19 commits into
mainfrom
dev/v0.1.97

Conversation

@Henry-811

@Henry-811 Henry-811 commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

PR Title Format

Your PR title must follow the format: <type>: <brief description>

Valid types:

  • fix: - Bug fixes
  • feat: - New features
  • breaking: - Breaking changes
  • docs: - Documentation updates
  • refactor: - Code refactoring
  • test: - Test additions/modifications
  • chore: - Maintenance tasks
  • perf: - Performance improvements
  • style: - Code style changes
  • ci: - CI/CD configuration changes

Examples:

  • fix: resolve memory leak in data processing
  • feat: add export to CSV functionality
  • breaking: change API response format
  • docs: update installation guide

Description

Brief description of the changes in this PR

Type of change

  • Bug fix (fix:) - Non-breaking change which fixes an issue
  • New feature (feat:) - Non-breaking change which adds functionality
  • Breaking change (breaking:) - Fix or feature that would cause existing functionality to not work as expected
  • Documentation (docs:) - Documentation updates
  • Code refactoring (refactor:) - Code changes that neither fix a bug nor add a feature
  • Tests (test:) - Adding missing tests or correcting existing tests
  • Chore (chore:) - Maintenance tasks, dependency updates, etc.
  • Performance improvement (perf:) - Code changes that improve performance
  • Code style (style:) - Changes that do not affect the meaning of the code (formatting, missing semi-colons, etc.)
  • CI/CD (ci:) - Changes to CI/CD configuration files and scripts

Checklist

  • I have run pre-commit on my changed files and all checks pass
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Pre-commit status

# Paste the output of running pre-commit on your changed files:
# uv run pre-commit install
# git diff --name-only HEAD~1 | xargs uv run pre-commit run --files # for last commit
# git diff --name-only origin/<base branch>...HEAD | xargs uv run pre-commit run --files # for all commits in PR
# git add <your file> # if any fixes were applied
# git commit -m "chore: apply pre-commit fixes"
# git push origin <branch-name>

How to Test

Add test method for this PR.

Test CLI Command

Write down the test bash command. If there is pre-requests, please emphasize.

Expected Results

Description/screenshots of expected results.

Additional context

Add any other context about the PR here.

Summary by CodeRabbit

  • New Features

    • Opt‑in, application‑layer permission engine for agent tool calls (deny‑wins rules, risk‑tiering, per‑agent roles).
    • Interactive approval modal, headless file‑based approvals, and policy/automation defaults.
    • Append‑only approval audit ledger and persistent “always” approvals.
    • Denied tool calls now surface explicit denied events with a command preview.
    • Read‑only role propagates to OS sandbox behavior to prevent writes.
  • Documentation

    • Updated release notes, changelog, READMEs, roadmap, and docs for v0.1.97.
  • Chores

    • Package version updated to 0.1.97.

ncrispino and others added 6 commits June 11, 2026 11:49
…e + hardline blocklist

The PreToolUse 'ask' decision was a silent no-op (the chokepoint handled deny but
fell through on ask). P0 makes approval real and opt-in via a backend 'permissions:'
block:

- massgen/permissions/ package: AuthorizationObject/ApprovalDecision models;
  RiskClassifier (tier by blast radius / command danger, not tool name);
  SessionApprovalCache (once/session/always grants); hardline blocklist
  (catastrophic commands, immune to bypass); ApprovalProvider ABC with
  PolicyApprovalProvider (configurable automation default: risk-based|deny-all|
  allow-all) and CallbackApprovalProvider (interactive, fail-closed); composite
  PermissionEngineHook (one hook: risk -> allow/ask, since execute_hooks makes ask
  win over allow) + HardlineBlocklistHook; PermissionCoordinator (resolves ask:
  cache -> authz -> provider -> grant).
- Chokepoint (_execute_tool_with_logging): new 'ask' branch routes through the
  coordinator; fails closed when unconfigured (never silently executes).
- Hook installer registers the hooks + coordinator when 'permissions:' opts in.
- Research basis: docs/dev_notes/permission_systems_research.md (12-CLI + HITL).

Live-verified (automation, risk-based): echo (low) runs; curl egress (high) denied
with reason fed back. 46 unit tests; hook-framework + security suites unaffected.
Rule layer (P1.1) + interactive modal + handshake hardening to follow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…on(target))

Adds the Antigravity-style action(target) rule algebra on top of the P0 engine:
  permissions.rules: { deny: [...], ask: [...], allow: [...] }
Actions: command | read_file | write_file | read_url | mcp | * ; targets are
case-sensitive globs. Precedence deny > allow > ask; a matching allow/deny is
authoritative and SUPPRESSES the risk classifier (which is why rules+risk are one
engine hook — ask wins over allow in hook aggregation). No rule match → falls
through to the risk classifier. PermissionRuleSet.merge keeps deny-wins ACROSS
scopes (for the managed>project>user>agent layering in a later pass).

Tool→(action,target) mapping handles MassGen's surface: command_line→command,
filesystem/workspace_tools servers→read_file/write_file, other mcp__server__tool
→mcp(server/tool) (so e.g. linear/create_issue is not mis-typed as a write).

Live-verified: a deny rule blocks a low-risk echo; an allow rule permits a
high-risk curl (suppressing the risk-ask). 68 permission unit tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… round-trip)

Adds the interactive half of the per-tool 'ask' approval (the automation half
shipped in P0). When a real TUI is present, an 'ask' pops a ToolApprovalModal
(allow once/session/always | reject) instead of the automation policy:

- ToolApprovalModal (ModalScreen) + a pure, testable decision_for_action mapping.
- TextualTerminalDisplay.show_tool_approval_modal — mirrors show_change_review_modal
  (asyncio.Future + call_from_thread + call_soon_threadsafe + wait_for); fails CLOSED
  on no-app / error / 5-min timeout (a tool never proceeds without explicit approval).
- CoordinationUI._install_interactive_approval_provider swaps each guarded agent's
  ApprovalProvider to a modal-backed CallbackApprovalProvider; no-op in automation,
  for agents without a coordinator, or when the display lacks the modal method.

74 permission/modal unit tests; full interactive UX requires a real terminal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gate

Make the opt-in gate presence-based and unambiguous: OFF unless a 'permissions'
block is present and not explicitly disabled (absent key or 'permissions: false'
=> 100% unchanged; an empty/enabled dict opts in; 'enabled: false' opts out).
Adds test_permissions_optional.py proving: no permissions block => zero hooks
registered and no coordinator installed (default behavior untouched).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Each agent already gets its own hook manager, so its permission engine is built
from ITS OWN config — naturally per-agent scoping (MassGen's multi-agent
differentiator). Adds role presets layered with per-agent rules (deny-wins):

  permissions: { enabled: true, role: read-only }   # researcher: no writes/shell
  permissions: { enabled: true, role: read-write,   # implementer: full access…
                 rules: { deny: ["command(git push --force*)"] } }  # …no force-push

role_rule_set() expands read-only/researcher → deny write_file(*)+command(*),
allow read_file(*), ask read_url(*); read-write/implementer → no preset. Roles
merge with per-agent rules via PermissionRuleSet.merge (deny-wins), so a role's
write-deny can't be loosened by a per-agent allow. Example config
per_agent_roles.yaml (researcher + implementer).

Live-verified: a read-only agent's write_file / command_line / delete are all
denied (no file written) while reads are allowed. 86 permission unit tests.

Note (follow-ups): a global/managed scope (orchestrator-level rules, deny-wins
over agent) needs orchestrator-config plumbing; flowing read-only to a per-agent
SRT profile (OS backstop) needs base.py→SrtManager plumbing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an opt-in application-layer permission engine (rules, risk classifier, hardline blocklist, approval providers, coordinator, ledger, persistence, budget, cache), TUI approval modal wiring, SRT read-only backstop, orchestrator/backend hook installer, configuration/examples, documentation and release notes, extensive tests, and bumps version to 0.1.97.

Changes

Permissions Engine & Release v0.1.97

Layer / File(s) Summary
Core models, rules, hardline
massgen/permissions/models.py, massgen/permissions/rules.py, massgen/permissions/hardline.py
Defines enums/dataclasses, parses/evaluates deny-wins action(target) rules, and adds an immutable hardline blocklist for catastrophic commands.
Risk classification
massgen/permissions/risk_classifier.py
Deterministic RiskClassifier tiers calls into LOW/MEDIUM/HIGH using tool hints and command regexes.
Approval providers
massgen/permissions/approval_provider.py
Adds PolicyApprovalProvider, FileApprovalProvider (JSON handshake), and CallbackApprovalProvider with fail-closed semantics.
Coordinator, ledger, budget, cache
massgen/permissions/coordinator.py, massgen/permissions/ledger.py, massgen/permissions/session_cache.py
PermissionCoordinator resolves ask→approval, audits to ApprovalLedger (JSONL), enforces ApprovalBudget runaway guard, and uses SessionApprovalCache for ONCE/SESSION grants.
Hooks & installer
massgen/permissions/hooks.py, massgen/orchestrator_collaborators/midstream_injection_hook_installer.py
Adds PRE_TOOL_USE HardlineBlocklistHook and PermissionEngineHook; installer wires per-agent rule sets, provider selection, ledger/budget/persistence, and installs coordinator on backends.
Persistence of ALWAYS grants
massgen/permissions/persistence.py
Persist ALWAYs grants into local settings JSON and load them back as PermissionRuleSet on startup.
Frontend TUI modal
massgen/frontend/displays/textual/..., massgen/frontend/coordination_ui.py
Textual approval modal + TextualTerminalDisplay method; CoordinationUI installs interactive CallbackApprovalProvider when interactive TUI present and not in automation.
Backend & SRT wiring
massgen/backend/_excluded_params.py, massgen/backend/base.py, massgen/filesystem_manager/_filesystem_manager.py, massgen/filesystem_manager/_srt_manager.py, massgen/backend/base_with_custom_tool_and_mcp.py
Excludes permissions config from provider forwards, maps read-only role to SRT read-only, threads command_line_srt_read_only into SrtManager to produce empty allowWrite, and adds coordinator setters and denied-call visibility helpers in backend execution path.
System prompt guardrails
massgen/system_prompt_sections.py, massgen/system_message_builder.py
Adds PermissionGuardrailSection and conditionally injects it into coordination prompts when gated.
Configs, docs, changelog, release notes, roadmap
massgen/configs/tools/permissions/*.yaml, CHANGELOG.md, RELEASE_NOTES_v0.1.97.md, README*.md, docs/*, ROADMAP*.md
New example configs, README/CHANGELOG/RELEASE_NOTES/ROADMAP updates, dev notes, followups, and announcements documenting the feature and release.
Tests
massgen/tests/*
Extensive new tests covering providers, coordinator, ledger, budget, hooks, rules, persistence, guardrail prompt, denied-call visibility, modal wiring, and SRT integration.
Version bump
massgen/__init__.py
Bumps version from 0.1.96 to 0.1.97.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • massgen/MassGen#774: Related work on the tool-execution chokepoint and PreToolUse hook path groundwork.
  • massgen/MassGen#1124: Prior SRT sandbox changes that this PR extends with permission-role wiring.
  • massgen/MassGen#1125: SRT sandbox wiring related to read-only behavior referenced by this PR.

Suggested reviewers

  • ncrispino
  • a5507203
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev/v0.1.97

ncrispino and others added 11 commits June 11, 2026 21:52
…kstop

P1.2 — headless/remote approval transport:
- FileApprovalProvider: request/response JSON handshake (req_<id>/resp_<id>),
  stable sha256 id (idempotent on resume), response consumed on read,
  timeout → fail-closed. Selected via permissions.approval_mode: file.
- Installer wires FileApprovalProvider when approval_mode == "file",
  else PolicyApprovalProvider (automation default unchanged).
- Interactive TUI swap now only overrides a PolicyApprovalProvider, so a
  configured file provider survives in interactive mode.

P1.3 — per-agent read-only role → OS-level SRT backstop:
- SrtManager(read_only=True) empties the EXECUTION profile's allowWrite
  (no writes reach disk even via shell escape), while the fs_tools profile
  keeps workspace writes so snapshots still work.
- FilesystemManager threads command_line_srt_read_only → SrtManager.
- base.py derives command_line_srt_read_only from permissions.role
  (read-only/researcher) so the app-layer rule and OS sandbox can't drift.

All opt-in; default behavior unchanged (no permissions block → no provider,
no read-only, no SRT). TDD: 135 permission+srt tests green; pre-commit clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
P2.1 — make the approval flow auditable and runaway-safe.

ApprovalLedger (massgen/permissions/ledger.py):
- Append-only JSONL, one line per approval-flow decision: run_id, agent_id,
  tool, normalized_pattern, risk, reason, decision (allow/deny), operator,
  scope, source (provider|cache), feedback, seq, timestamp.
- Crash-safe (flush per line), resumes seq numbering across restarts, never
  breaks the run on a write error, skips corrupt lines on read-back.

ApprovalBudget:
- Per-agent consecutive auto-approval counter (runaway-loop circuit breaker);
  trips once an agent exceeds the cap, reset on any human decision.

Wiring (opt-in):
- PermissionCoordinator gains an optional `ledger`; records every resolve_ask
  outcome including cache hits (source="cache"). No ledger → unchanged.
- Installer attaches an ApprovalLedger by default once permissions are enabled
  (under the approvals dir), disable with `permissions.audit: false`.

TDD: test_approval_ledger.py (12) + 2 installer wiring tests; full permission
suite green; pre-commit + all configs clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s, parity guard

Closes the gap between what P2.1 advertised and what was actually wired:

- ApprovalBudget: now connected to PermissionCoordinator (was dead code). Caps
  consecutive AUTO (policy/cache) approvals per agent and fails closed past the
  cap; a human decision resets the streak. Opt-in via `permissions.max_consecutive_auto`
  (absent -> unlimited, so long automation runs are unaffected).

- 'always' grant persistence (full loop): new massgen/permissions/persistence.py
  writes an approved call as a deduped allow(action(target)) rule to
  settings.local.json and loads it back as a lowest-precedence merged scope next
  run (deny-wins preserved). Makes the modal's "Always" button actually persist.
  On by default (human-gated); opt-out via `permissions.persist_approvals: false`.

- Backend parity guard: native backends (claude_code, codex) never run the
  framework PreToolUse chokepoint, so the engine was SILENTLY inert there. The
  installer now detects backends without the approval chokepoint, warns loudly
  that permissions are INACTIVE, and skips registering inert hooks (false
  confidence is worse than off). Documented MCP-family-only scope in the config.

Tested (TDD): 18 new tests (budget trip/reset/per-agent, persist<->load roundtrip
+ dedup + settings preservation, parity-guard warning) + live automation smoke
proving allow / deny-rule / ask->policy-deny + audit ledger end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s note

- permission_modal_interactive.yaml: self-contained, benign, runnable config that
  demos the approval MODAL (display_type: textual_terminal) interactively and the
  deny-rule path under --automation. Relative paths so it works out of the box;
  smoke-tested both branches.
- permissions_p2_followups.md: captures known limitations (ledger scope, model
  softening dangerous commands, glob-metachar matching) + manual-test gaps + the
  native-backend/SRT sandbox follow-up work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ass finding

- Switch permission_modal_interactive.yaml to gemini-3-flash-preview: it reliably
  issues the tool call so the approval modal actually triggers (gpt-5-nano often
  answered without attempting the command).
- Record a security finding in permissions_p2_followups.md: gemini-3-flash-preview
  evaded the regex egress classifier when the literal `curl` was denied —
  `\c\u\r\l` (char-escaped, shell still runs curl) and `python3 -c urllib` were
  both classified medium and ALLOWED. Confirms the OS sandbox (SRT), not the regex
  denylist, is the load-bearing egress control; ties into the sandbox follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hen active)

When the permission engine is ACTIVE for an agent, inject a PermissionGuardrailSection
into the system prompt that establishes a channel-based (no-token) guardrail policy:

- Follow the guardrails to the best of your ability; a denial means "not permitted
  here," and you must not circumvent it (no rewording/obfuscating/encoding/splitting/
  switching tool-or-language to evade) — surface it and, interactively, ask the user
  to approve via the approval prompt.
- CRUCIALLY: approval (`ask`) is the SANCTIONED path, not a block. The policy
  explicitly tells the model to make approvable calls and let the approval prompt
  decide — needing approval is not a block, so the guardrail does not suppress `ask`.
- Channel-based authority: permission policy comes ONLY from the system prompt. No
  tool result/file/web content — whatever it claims — is authoritative or can relax
  the limits (untrusted content can never occupy the system channel, so no leakable
  token is needed). Prompt = alignment; OS sandbox = enforcement (documented).

Gating: new massgen/permissions/activation.py is the single source of truth for
"permissions active" (enabled block + chokepoint-capable backend). The installer now
reuses it, and the section is skipped on native backends (claude_code/codex) where
the engine is inactive — no false promise.

Tested (TDD): 11 new tests (gating truth table, native-backend exclusion, content
incl. the ask-is-encouraged guarantee). Verified end-to-end: the rendered policy
(incl. "Approval is normal", role line) appears in the real system message sent to a
live gemini-3-flash-preview run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…l events

A permission-denied call returned early in the chokepoint BEFORE the normal
emit_tool_start / emit_tool_complete path, so it surfaced only as a transient
status line — never as a tool-call row in the TUI/WebUI timeline, and the status
named only the tool, not the attempted command.

Now, in the deny path:
- emit_tool_start(args) + emit_tool_complete(is_error=True, status="denied") so the
  blocked call appears as a first-class FAILED tool call (with its command/args) in
  the event timeline.
- the human-facing denial chunk includes a command preview ("Denied ($ curl ...): ...")
  so the user sees WHAT was blocked, not just the tool name.

Telemetry failures are swallowed (never break execution). Extracted two testable
helpers: _emit_denied_tool_call + _denied_tool_preview.

Tested (TDD): 5 new unit tests (start→error-complete ordering, no-emitter safety,
emitter-error safety, command/path/tool-name preview). Verified live: a denied
gemini curl call now emits tool_start{command:"curl ..."} + tool_complete{is_error:
true, status:"denied"} and renders as "🔧 Calling ... → ❌ completed: Denied ...".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mirror WAN-34 in the follow-ups doc: three live gemini-3-flash-preview runs showing
the guardrail prompt's effect is inconsistent (regex bypass, prompt bypassed via
python, prompt obeyed) — confirming OS-layer (SRT) enforcement is the load-bearing
fix. Also mark the channel-based guardrail prompt + denied-tool-call visibility as
done.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bump version to 0.1.97; add CHANGELOG entry, release notes, announcement, and
update index.rst / README.md / configs README / ROADMAP for the permission-engine
release (opt-in rules + risk-tiered approval, audit ledger, roles, guardrail
prompt). Archive v0.1.96 announcement; rename next-release roadmap to v0.1.98
(carries the deferred image/video edit work).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat: layered opt-in permission system (P0–P2.2) — rules, approval, audit, guardrail prompt

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (8)
docs/dev_notes/permission_systems_research.md-99-109 (1)

99-109: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Label the fenced block.

The plain fence trips MD040; mark it as text (or another appropriate language) so the docs lint passes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/dev_notes/permission_systems_research.md` around lines 99 - 109, The
fenced block in permission_systems_research.md is unlabeled which triggers
MD040; update the code fence by adding a language label such as text (e.g.,
```text) to the block containing the list of features (PreToolUse hooks /
GeneralHookManager, PatternHook, PPM._validate_command_tool, RiskClassifier,
SessionApprovalCache, ApprovalLedger, SRT OS sandbox, etc.) so the markdown
linter accepts it.

Source: Linters/SAST tools

RELEASE_NOTES_v0.1.97.md-50-50 (1)

50-50: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the PR reference.

This entry points at PR #1127, but the release metadata for this branch is PR #1126. If that link is intentional, ignore this; otherwise it will misattribute the release.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@RELEASE_NOTES_v0.1.97.md` at line 50, The release note line references the
wrong pull request number; update the PR link in the release entry "* feat:
layered opt-in permission system (P0–P2.2) — rules, approval, audit, guardrail
prompt by `@ncrispino` in https://github.com/massgen/MassGen/pull/1127" to the
correct PR `#1126` (i.e., replace /pull/1127 with /pull/1126) so the release
metadata correctly attributes the change.
RELEASE_NOTES_v0.1.97.md-5-49 (1)

5-49: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Flatten the heading hierarchy.

These sections jump from the title straight to ###, which triggers MD001. Use ## for the release subsections under the document title.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@RELEASE_NOTES_v0.1.97.md` around lines 5 - 49, The document uses level-3
headings (###) immediately under the document title which triggers MD001; update
the top-level release subsections to level-2 headings (change the leading "###"
to "##") for each major section such as "🛡️ Layered permission engine (hardline
→ rules → risk)", "✋ Approval that fits the run", "🧑‍🤝‍🧑 Roles, audit &
guards", "🧭 Guardrail-aware system prompt (channel-based)", "🔧 Also in this
release", and "📖 Getting Started" so they are proper subsections under the
document title while leaving deeper headings and code blocks unchanged.

Source: Linters/SAST tools

docs/dev_notes/permissions_p2_followups.md-47-54 (1)

47-54: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add blank lines around the table.

This table needs surrounding blank lines for markdownlint MD058.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/dev_notes/permissions_p2_followups.md` around lines 47 - 54, Add a blank
line before the markdown table header "### Alignment ≠ enforcement — three live
`gemini-3-flash-preview` runs (all with permissions + guardrail prompt active)"
and a blank line after the table end (after the final closing row with `|
`log_20260612_091210` | 🟢 prompt obeyed | ... |`) so the table is separated
from surrounding paragraphs; this satisfies markdownlint MD058 without changing
the table content.

Source: Linters/SAST tools

massgen/tests/test_permissions_optional.py-76-76 (1)

76-76: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Resolve unused unpacked agent variables to satisfy Ruff.

These unpacked variables are never used; replace agent with _agent (or _) at each site to prevent lint noise/failures.

Suggested fix
-    agent, mgr = _install({"permissions": {"enabled": True, "role": "read-only"}})
+    _agent, mgr = _install({"permissions": {"enabled": True, "role": "read-only"}})

-    agent, mgr = _install({"permissions": {"enabled": True}})  # no role/rules
+    _agent, mgr = _install({"permissions": {"enabled": True}})  # no role/rules

-    agent, mgr = _install({"permissions": {"enabled": True, "approval_dir": str(tmp_path), "settings_path": str(settings)}})
+    _agent, mgr = _install({"permissions": {"enabled": True, "approval_dir": str(tmp_path), "settings_path": str(settings)}})

Also applies to: 84-84, 188-188

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/tests/test_permissions_optional.py` at line 76, The test unpacks a
tuple from _install into an unused variable agent and mgr; rename agent to
_agent (or _) in each call site where you do agent, mgr = _install(...) to
satisfy Ruff and avoid unused-variable warnings (search for occurrences of
"agent, mgr = _install" in test_permissions_optional.py and replace agent with
_agent at each occurrence, including the other two similar sites).

Source: Linters/SAST tools

massgen/backend/base_with_custom_tool_and_mcp.py-2783-2789 (1)

2783-2789: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use an explicit approval-pending status.

Line 2785 always emits status=None because ToolExecutionConfig has no status_running field. The text still shows up, but downstream displays/tests cannot treat "awaiting approval" as a structured state.

Proposed fix
                         yield StreamChunk(
                             type=config.chunk_type,
-                            status=getattr(config, "status_running", None),
+                            status="awaiting_approval",
                             content=f"⏸️ Awaiting approval for {tool_name}…",
                             source=f"{config.source_prefix}{tool_name}",
                             tool_call_id=call_id,
                         )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/backend/base_with_custom_tool_and_mcp.py` around lines 2783 - 2789,
The code yields a StreamChunk with status set via getattr(config,
"status_running", None) which always becomes None because ToolExecutionConfig
has no status_running field; change the StreamChunk emission in the approval
branch to use an explicit approval-pending status (e.g., replace getattr(config,
"status_running", None) with a concrete value like "approval_pending" or the
appropriate enum constant such as ToolExecutionStatus.APPROVAL_PENDING) so
downstream consumers/tests can treat awaiting-approval as a structured state;
update the StreamChunk construction at the yield site (referencing StreamChunk,
config, tool_name, call_id, and config.source_prefix) to use that explicit
status.
massgen/frontend/displays/textual/widgets/modals/tool_approval_modal.py-52-57 (1)

52-57: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use an immutable BINDINGS container to avoid class-level mutability drift.

Line 52 defines BINDINGS as a list; this matches Ruff’s RUF012 warning and is safer as a tuple.

Suggested fix
-        BINDINGS = [
+        BINDINGS = (
             ("escape", "reject", "Reject"),
             ("1", "allow_once", "Allow once"),
             ("2", "allow_session", "Allow session"),
             ("a", "allow_always", "Always allow"),
-        ]
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/frontend/displays/textual/widgets/modals/tool_approval_modal.py`
around lines 52 - 57, The class-level BINDINGS is currently a mutable list which
can be accidentally mutated at runtime; change BINDINGS to an immutable tuple to
satisfy RUF012 and prevent class-level mutability drift—replace the list
assigned to BINDINGS in tool_approval_modal.py with a tuple (keep the same
entries and order) so the binding definitions remain immutable.

Source: Linters/SAST tools

massgen/frontend/displays/textual_terminal_display.py-2455-2458 (1)

2455-2458: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Log the exception when pop_screen fails on timeout.

The silent except: pass loses diagnostic information that could help debug edge cases (e.g., app already closed, screen stack empty). A debug-level log preserves fail-closed behavior while aiding troubleshooting.

Proposed fix
             try:
                 self._app.call_from_thread(self._app.pop_screen)
             except Exception:
-                pass
+                logger.debug("[ToolApproval] Failed to pop modal on timeout (may already be dismissed)")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/frontend/displays/textual_terminal_display.py` around lines 2455 -
2458, Replace the silent except in the try calling
self._app.call_from_thread(self._app.pop_screen) with logging of the exception
so failures on timeout are visible; catch Exception as e and log a debug-level
message with exc_info=True (e.g. using logging.getLogger(__name__) or the
module/class logger) that includes context like "Failed to pop_screen via
call_from_thread" and the exception, and add/import the logger if missing.

Source: Linters/SAST tools

🧹 Nitpick comments (18)
massgen/configs/tools/permissions/per_agent_roles.yaml (2)

20-21: 💤 Low value

Prefer cost-effective models in example configs.

The coding guidelines recommend using cost-effective models like gpt-5-nano, gpt-5-mini, or gemini-2.5-flash in example configurations. Consider switching from gpt-5 to a more economical option so users copying this config don't incur unnecessary costs.

💡 Suggested diff
     backend:
       type: "openai"
-      model: "gpt-5"
+      model: "gpt-5-mini"
       cwd: "workspace_researcher"
...
     backend:
       type: "openai"
-      model: "gpt-5"
+      model: "gpt-5-mini"
       cwd: "workspace_implementer"

Also applies to: 29-31

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/configs/tools/permissions/per_agent_roles.yaml` around lines 20 - 21,
Update the example config to use a cost-effective model instead of "gpt-5":
locate the "model" keys in the per_agent_roles.yaml entries (e.g., the block
where type: "openai" and model: "gpt-5" appear) and replace "gpt-5" with a
cheaper recommended option such as "gpt-5-mini" or "gpt-5-nano"; apply the same
replacement to the other occurrence noted around lines 29-31 so all example
model entries use a cost-effective variant.

Source: Coding guidelines


20-31: 💤 Low value

Prefer cost-effective models in example configs.

Both per_agent_roles.yaml and permission_engine.yaml use gpt-5, which is an expensive model. Per coding guidelines, example configurations should prefer cost-effective models like gpt-5-nano, gpt-5-mini, or gemini-2.5-flash to avoid unnecessary costs when users copy these configs. Consider switching to gpt-5-mini in both files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/configs/tools/permissions/per_agent_roles.yaml` around lines 20 - 31,
Replace the expensive "gpt-5" model entries with a cost-effective default (e.g.,
"gpt-5-mini") in the example configs: find occurrences where model: "gpt-5"
under OpenAI backends (look for backend.type: "openai" and model keys and the
top-level model: "gpt-5" in per_agent_roles.yaml and permission_engine.yaml) and
change them to model: "gpt-5-mini" (or another recommended low-cost model such
as "gpt-5-nano" or "gemini-2.5-flash") so example configs use a cheaper default
when copied.

Source: Coding guidelines

massgen/orchestrator_collaborators/midstream_injection_hook_installer.py (1)

369-369: 💤 Low value

Remove redundant import of HookType.

HookType is already imported at module level (line 29). This inline import is unnecessary.

♻️ Suggested fix
-        from massgen.mcp_tools.hooks import HookType
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/orchestrator_collaborators/midstream_injection_hook_installer.py` at
line 369, The inline import "from massgen.mcp_tools.hooks import HookType" in
midstream_injection_hook_installer.py is redundant because HookType is already
imported at module scope; remove this local import statement to avoid
duplication and potential shadowing, leaving the existing top-level HookType
import as the single source of that symbol.
massgen/system_prompt_sections.py (1)

1953-1961: ⚡ Quick win

Add Google-style docstrings to the new methods.

__init__() and build_content() are newly added in this changed surface, but neither has a Google-style docstring. That makes the prompt contract harder to inspect from generated docs and IDE help. As per coding guidelines, "For new or changed functions, include Google-style docstrings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/system_prompt_sections.py` around lines 1953 - 1961, The new methods
__init__ and build_content lack Google-style docstrings; add concise
Google-style docstrings to both methods in the Permission Guardrails section:
for __init__(self, role: str | None = None) document the purpose of the
constructor, describe the role parameter and its type/behavior and any
side-effects (e.g., setting self.role), and for build_content(self) document
what the method returns (str), what content it builds, and any important
invariants; place these docstrings immediately beneath each method signature
following Google docstring conventions.

Source: Coding guidelines

massgen/tests/test_permissions_optional.py (1)

11-202: 🏗️ Heavy lift

New functions need Google-style docstrings for compliance.

The newly added helper/test functions should include concise Google-style docstrings to match repository standards.

As per coding guidelines, **/*.py: “For new or changed functions, include Google-style docstrings”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/tests/test_permissions_optional.py` around lines 11 - 202, Add
concise Google-style docstrings to the newly introduced helper and test
functions so they comply with the repo standard: add a one-line summary and
brief Args/Returns sections where applicable for _Manager, _Backend, _Agent,
_install, _engine and each new test function (e.g.,
test_no_permissions_block_is_a_total_noop,
test_permissions_enabled_false_is_a_noop, etc.), placing the docstring
immediately below the def line; for helpers include parameter descriptions
(config, mgr, agent) and return types (tuple or object), and for tests add a
short description of the scenario being asserted.

Source: Coding guidelines

massgen/tests/test_permission_guardrail_prompt.py (1)

26-133: 🏗️ Heavy lift

Add Google-style docstrings to new test/helper functions.

This new module adds multiple functions without docstrings; please add concise Google-style docstrings consistently to each new function for guideline compliance.

As per coding guidelines, **/*.py: “For new or changed functions, include Google-style docstrings”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/tests/test_permission_guardrail_prompt.py` around lines 26 - 133, Add
concise Google-style docstrings to each new test and helper in this file:
include a one-line summary for every test function (e.g.,
test_is_permissions_enabled_truth_table,
test_active_when_enabled_and_chokepoint_backend,
test_inactive_when_no_permissions_block, test_inactive_when_disabled,
test_inactive_on_native_backend_even_with_permissions,
test_section_states_the_core_policy,
test_section_encourages_the_approval_flow_not_just_blocks,
test_section_mentions_role_when_provided,
test_section_omits_role_line_when_absent,
test_prompt_includes_guardrails_when_active,
test_prompt_excludes_guardrails_when_inactive) and for helpers/classes
(_ChokepointBackend, _ChokepointBackend.__init__,
_ChokepointBackend.set_permission_coordinator, _NativeBackend,
_NativeBackend.__init__, _render_with_gate, PermissionGuardrailSection usage)
provide a short summary plus Args and Returns sections where relevant (e.g.,
describe the backend/config param and returned string for _render_with_gate);
keep docstrings brief and Google-style consistent across all functions.

Source: Coding guidelines

massgen/backend/base_with_custom_tool_and_mcp.py (1)

571-582: ⚡ Quick win

Convert the new permission helpers to Google-style docstrings.

The added/changed helpers here use free-form prose docstrings, so autogenerated docs lose Args/Returns details. Please convert these new methods to the repo's required Google-style format.

As per coding guidelines, "**/*.py: For new or changed functions, include Google-style docstrings".

Also applies to: 2501-2550

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/backend/base_with_custom_tool_and_mcp.py` around lines 571 - 582,
Convert the free-form docstrings on set_permission_coordinator and
set_approval_provider to Google-style docstrings: for set_permission_coordinator
add an Args section documenting coordinator (type Any) and describe behavior
when unset, and add a Returns: None line; for set_approval_provider add an Args
section documenting provider (type Any) and mention that it assigns provider to
self._permission_coordinator.provider only when self._permission_coordinator is
not None, and add Returns: None. Keep descriptions concise and follow the repo's
Google-style formatting used elsewhere (Args:, Returns:).

Source: Coding guidelines

massgen/filesystem_manager/_filesystem_manager.py (1)

179-179: ⚡ Quick win

Document the new SRT read-only knobs in both constructor docstrings.

FilesystemManager.__init__ and SrtManager.__init__ both changed to expose new read-only behavior, but neither constructor now gives callers a complete Google-style explanation of the new inputs and their contract. Please document command_line_srt_read_only in massgen/filesystem_manager/_filesystem_manager.py and add/update the initializer docstring in massgen/filesystem_manager/_srt_manager.py so the execution-vs-fs_tools behavior is discoverable where these APIs are wired.

As per coding guidelines, "**/*.py: For new or changed functions, include Google-style docstrings".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/filesystem_manager/_filesystem_manager.py` at line 179, Add a
Google-style docstring entry for the new boolean parameter
command_line_srt_read_only in FilesystemManager.__init__ (in
massgen/filesystem_manager/_filesystem_manager.py) and update the initializer
docstring in SrtManager.__init__ (in massgen/filesystem_manager/_srt_manager.py)
to clearly describe the parameter, its default, and the contract: when True the
SRT will be treated read-only for command-line/execution paths vs fs_tools
mutations, what callers can expect (no writes via execution), and any
implications for error handling or fallback behavior; ensure parameter
description follows the project’s Google-style format and mentions how
execution-vs-fs_tools behavior is wired.

Source: Coding guidelines

massgen/backend/base.py (1)

229-233: ⚡ Quick win

Add a regression test for the role-to-SRT mapping at this boundary.

The new tests prove command_line_srt_read_only=True reaches SRT, but they bypass the new branch here. If permissions["role"] stops mapping to that flag, the current suite would still pass. Please add one backend-level test that constructs a minimal LLMBackend subclass with permissions={"role": "read-only"} (and "researcher") and asserts the generated execution settings have an empty allowWrite.

As per coding guidelines, "After implementing any feature that involves passing parameters through multiple layers ... verify the full wiring chain end-to-end by tracing the parameter from its origin to its final usage site."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/backend/base.py` around lines 229 - 233, Add a backend-level
regression test that verifies the role→SRT mapping: create a minimal LLMBackend
subclass (use LLMBackend) and call the path that builds execution settings with
permissions={"role": "read-only"} and permissions={"role": "researcher"}, then
assert the produced execution settings include command_line_srt_read_only
behavior by checking allowWrite is an empty set/collection in the final
execution settings (the value that SRT consumes). Focus the test on exercising
the full wiring from the provided permissions through the code path that sets
command_line_srt_read_only and results in allowWrite, so changes to the mapping
will cause the test to fail.

Source: Coding guidelines

massgen/tests/test_permission_denied_tool_visibility.py (1)

20-79: ⚡ Quick win

Exercise the real deny path, not only the emission helper.

These tests pin _emit_denied_tool_call() itself, but they don't prove the permission chokepoint still invokes it when execution is denied. A wiring regression in CustomToolAndMCPBackend would leave the UI blind while this file stays green. Please add one integration-style test that drives a denied tool call through the backend's actual execution path and asserts the same start/error-complete event sequence.

As per coding guidelines, "After implementing any feature that involves passing parameters through multiple layers ... verify the full wiring chain end-to-end by tracing the parameter from its origin to its final usage site."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/tests/test_permission_denied_tool_visibility.py` around lines 20 -
79, Add an integration-style test that ensures the backend's real execution path
invokes CustomToolAndMCPBackend._emit_denied_tool_call when a tool is denied:
monkeypatch get_event_emitter to a FakeEmitter (same as existing tests),
monkeypatch the backend's permission-check hook used by the execution entrypoint
to force a denial, then call the backend's public execution entrypoint on
CustomToolAndMCPBackend (the method that normally starts tool execution) with a
stub agent and the same call_id/tool_name/args as the unit tests, and finally
assert the emitted events sequence is ["start","complete"] and that the complete
event has is_error True and status "denied" and contains the denial reason
(reuse assertions from existing tests). Ensure the test cleans up monkeypatches
and does not call _emit_denied_tool_call directly but exercises the end-to-end
wiring.

Source: Coding guidelines

massgen/frontend/displays/textual/widgets/modals/tool_approval_modal.py (1)

59-91: ⚡ Quick win

Add Google-style docstrings to new modal methods.

Lines 59-91 introduce/modify functions without Google-style docstrings, which violates the repository Python guideline.

As per coding guidelines, "**/*.py: For new or changed functions, include Google-style docstrings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/frontend/displays/textual/widgets/modals/tool_approval_modal.py`
around lines 59 - 91, Add Google-style docstrings to each new/modified method in
the modal class: __init__, compose, on_button_pressed, action_reject,
action_allow_once, action_allow_session, and action_allow_always. For each
method provide a one-line summary, a blank line, and sections for Args (list
parameters and types: e.g., authz: AuthorizationObject, event: Button.Pressed
where applicable), Returns (e.g., None) and any Raises if relevant; for compose
mention it yields a ComposeResult and describe what UI it constructs. Ensure the
docstrings follow Google style and are placed immediately under each method
signature.

Source: Coding guidelines

massgen/tests/test_tool_approval_modal.py (1)

72-110: ⚡ Quick win

Cover the modal-exception path, not just the provider swap.

These tests prove the replacement happens, but they never invoke the installed callback. The fail-closed behavior in CoordinationUI._install_interactive_approval_provider() is the security-critical branch here; add a test where show_tool_approval_modal() raises and assert the resulting decision is denied. As per coding guidelines, "TDD is the default development methodology for this project" and "Cover edge cases, error conditions, and boundary values."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/tests/test_tool_approval_modal.py` around lines 72 - 110, Add a test
that covers the modal-exception path by installing the interactive provider via
CoordinationUI._install_interactive_approval_provider (use the existing
_ui/_Disp/_Orch/_Agent fixtures) but make the display's show_tool_approval_modal
raise an exception; then retrieve the installed CallbackApprovalProvider from
agent.backend._permission_coordinator.provider and invoke its callback (or
simulate an approval request) and assert the result is a denied decision
(fail-closed). Ensure the test also asserts the provider was installed as
CallbackApprovalProvider before calling the callback and that an agent with no
coordinator or automation_mode behavior remains unchanged where appropriate.

Source: Coding guidelines

massgen/frontend/coordination_ui.py (1)

815-818: ⚡ Quick win

Add Google-style docstrings to the new functions in massgen/frontend/coordination_ui.py and massgen/tests/test_tool_approval_modal.py.

CoordinationUI._install_interactive_approval_provider() in massgen/frontend/coordination_ui.py and the new helper/test functions in massgen/tests/test_tool_approval_modal.py are missing the docstring format required for changed Python functions in this repo. As per coding guidelines, "**/*.py: For new or changed functions, include Google-style docstrings`".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/frontend/coordination_ui.py` around lines 815 - 818, Add Google-style
docstrings to the changed/added Python functions: at minimum add one to
CoordinationUI._install_interactive_approval_provider in
massgen/frontend/coordination_ui.py and to each new helper/test function in
massgen/tests/test_tool_approval_modal.py; each docstring should start with a
one-line summary, a short description if needed, and include appropriate
sections like Args (with parameter names and types), Returns (type and
description) and Raises (if the function can raise), matching the project's
Google-style docstring format so linters/pass checks will accept them.

Source: Coding guidelines

massgen/permissions/approval_provider.py (1)

27-31: ⚡ Quick win

Add Google-style docstrings to the new permission helpers.

The new functions/methods in massgen/permissions/approval_provider.py and massgen/permissions/persistence.py have brief inline docstrings, but they don't follow the repository's Google-style requirement for changed Python functions. Bringing these up to the project standard will make the approval/persistence contracts easier to maintain.

As per coding guidelines, "**/*.py: For new or changed functions, include Google-style docstrings".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/permissions/approval_provider.py` around lines 27 - 31, Replace the
brief inline docstrings for the new permission helpers with Google-style
docstrings: update ApprovalProvider.request_approval (and the new
functions/methods added in persistence.py) to include a short summary line, an
Args section documenting the AuthorizationObject parameter, a Returns section
describing the ApprovalDecision return type, and a Raises section if the method
can raise exceptions; ensure the docstrings follow the repository's Google-style
format and cover parameter types and return semantics so callers and
implementers can rely on the contract.

Source: Coding guidelines

massgen/permissions/activation.py (1)

15-35: ⚡ Quick win

Please convert the new public docstrings to Google style.

These new helpers/classes mostly use short narrative docstrings, but this repo requires Google-style docstrings on changed Python functions. Adding Args:, Returns:, and Raises: where applicable will also keep the generated API docs consistent across the new permissions modules.

As per coding guidelines, **/*.py: For new or changed functions, include Google-style docstrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/permissions/activation.py` around lines 15 - 35, Update the two
public functions is_permissions_enabled and guardrail_prompt_active to use
Google-style docstrings: replace the narrative triple-quote text with a
docstring that includes an Args: section describing the parameter (perms: Any or
backend: Any), a Returns: section stating the bool return value and meaning, and
a Raises: section if applicable (none here, so you can omit Raises or state
"None"); keep descriptions concise and preserve existing behavior and examples
from the original docstrings so API docs remain consistent.

Source: Coding guidelines

massgen/permissions/session_cache.py (1)

14-37: ⚡ Quick win

Add Google-style docstrings to the new cache API.

SessionApprovalCache is a new production primitive, but the class and most of its methods are undocumented. That breaks the repo's Python docstring rule and leaves this approval-path behavior under-specified for future callers.

As per coding guidelines, **/*.py: For new or changed functions, include Google-style docstrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/permissions/session_cache.py` around lines 14 - 37, Add Google-style
docstrings for the new SessionApprovalCache class and its public methods to
satisfy the repo rule; document the class purpose, the constructor (__init__),
and each method: key_for(agent_id, tool, normalized_pattern) (describe returned
key format), grant(key, scope) (explain that ALWAYS is persisted by caller but
still cached), check(key) (state that it returns bool, consumes ONCE grants by
deleting them), and clear() (clears in-memory grants). Include Args/Returns
sections and mention GrantScope semantics where relevant so callers understand
behavior.

Source: Coding guidelines

massgen/permissions/coordinator.py (1)

24-130: ⚡ Quick win

Add Google-style docstrings to the coordinator entry points.

This file introduces the coordinator that owns the approval flow, but the new class and most of its methods are missing the required function docstrings. That leaves a central control path under-documented for future maintenance.

As per coding guidelines, **/*.py: For new or changed functions, include Google-style docstrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/permissions/coordinator.py` around lines 24 - 130, Add Google-style
docstrings to the PermissionCoordinator class and its public/important methods
(__init__, resolve_ask, _budget_tripped, _audit, and the static _preview)
describing purpose, args, return types, and raised exceptions (if any). For
__init__ document each parameter (provider, cache, risk_classifier,
persist_always, ledger, budget) and their types/behavior; for resolve_ask
describe the parameters (agent_id, tool, arguments, reason), the return tuple
(allowed: bool, feedback: str|None), side-effects (cache, ledger, budget,
persist_always) and when it queries provider vs cache; for _budget_tripped and
_audit describe inputs and effects on ledger/auditing; for _preview state what
it returns and how it derives the preview. Keep wording concise and follow
Google-style format (Args:, Returns:, Raises:).

Source: Coding guidelines

massgen/permissions/ledger.py (1)

28-116: ⚡ Quick win

Document the new ledger and budget APIs with Google-style docstrings.

The file adds new production types and methods, but most of the callable surface still relies on comments rather than required function docstrings. That makes the audit/budget contracts harder to consume and falls short of the repo's Python standard.

As per coding guidelines, **/*.py: For new or changed functions, include Google-style docstrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@massgen/permissions/ledger.py` around lines 28 - 116, The new ApprovalLedger
and ApprovalBudget types lack Google-style docstrings; add concise docstrings
for the classes and each public method to match repo standards: for
ApprovalLedger add class docstring and method docstrings for __init__ (params:
path, run_id), record (params: authz, allowed, operator, scope, source,
feedback; returns: dict entry; notes about append-only file, seq increment,
error handling), entries (returns: list of dict; skips corrupt lines), and for
ApprovalBudget add class docstring and method docstrings for __init__ (param:
max_consecutive_auto), check_auto (param: agent_id; returns: bool; side-effect:
increments counter), and reset (param: agent_id; side-effect: clears counter).
Ensure each docstring follows Google style with Args, Returns, and Raises where
applicable and mentions types/behaviour (e.g., idempotent resume of seq, file
encoding, that IO errors are logged not raised).

Source: Coding guidelines


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ce5f008e-bd50-4f5e-a580-2dc042de973b

📥 Commits

Reviewing files that changed from the base of the PR and between 01674a4 and c4e829f.

📒 Files selected for processing (52)
  • CHANGELOG.md
  • README.md
  • README_PYPI.md
  • RELEASE_NOTES_v0.1.97.md
  • ROADMAP.md
  • ROADMAP_v0.1.98.md
  • docs/announcements/archive/v0.1.96.md
  • docs/announcements/current-release.md
  • docs/dev_notes/permission_systems_research.md
  • docs/dev_notes/permissions_p2_followups.md
  • docs/source/index.rst
  • massgen/backend/_excluded_params.py
  • massgen/backend/base.py
  • massgen/backend/base_with_custom_tool_and_mcp.py
  • massgen/configs/README.md
  • massgen/configs/tools/permissions/per_agent_roles.yaml
  • massgen/configs/tools/permissions/permission_engine.yaml
  • massgen/configs/tools/permissions/permission_modal_interactive.yaml
  • massgen/filesystem_manager/_filesystem_manager.py
  • massgen/filesystem_manager/_srt_manager.py
  • massgen/frontend/coordination_ui.py
  • massgen/frontend/displays/textual/widgets/modals/tool_approval_modal.py
  • massgen/frontend/displays/textual_terminal_display.py
  • massgen/orchestrator_collaborators/midstream_injection_hook_installer.py
  • massgen/permissions/__init__.py
  • massgen/permissions/activation.py
  • massgen/permissions/approval_provider.py
  • massgen/permissions/coordinator.py
  • massgen/permissions/hardline.py
  • massgen/permissions/hooks.py
  • massgen/permissions/ledger.py
  • massgen/permissions/models.py
  • massgen/permissions/persistence.py
  • massgen/permissions/risk_classifier.py
  • massgen/permissions/rules.py
  • massgen/permissions/session_cache.py
  • massgen/system_message_builder.py
  • massgen/system_prompt_sections.py
  • massgen/tests/test_approval_ledger.py
  • massgen/tests/test_approval_provider.py
  • massgen/tests/test_file_approval_provider.py
  • massgen/tests/test_permission_coordinator.py
  • massgen/tests/test_permission_denied_tool_visibility.py
  • massgen/tests/test_permission_guardrail_prompt.py
  • massgen/tests/test_permission_hooks.py
  • massgen/tests/test_permission_persistence.py
  • massgen/tests/test_permission_rules.py
  • massgen/tests/test_permissions_core.py
  • massgen/tests/test_permissions_optional.py
  • massgen/tests/test_srt_filesystem_integration.py
  • massgen/tests/test_srt_manager.py
  • massgen/tests/test_tool_approval_modal.py
✅ Files skipped from review due to trivial changes (8)
  • ROADMAP_v0.1.98.md
  • massgen/permissions/init.py
  • docs/announcements/archive/v0.1.96.md
  • massgen/permissions/hardline.py
  • massgen/configs/README.md
  • docs/announcements/current-release.md
  • README.md
  • ROADMAP.md

Comment thread massgen/backend/base_with_custom_tool_and_mcp.py
Comment thread massgen/backend/base_with_custom_tool_and_mcp.py
Comment thread massgen/frontend/coordination_ui.py
Comment thread massgen/frontend/coordination_ui.py
Comment thread massgen/permissions/approval_provider.py
Comment thread massgen/permissions/persistence.py
Comment thread massgen/permissions/risk_classifier.py
Comment thread massgen/permissions/rules.py
Comment thread massgen/permissions/rules.py
Comment thread massgen/system_prompt_sections.py
…ing yet)

The live in-TUI modal auto-swap doesn't fire (the provider swap runs before the
permission coordinator exists), so an interactive `ask` falls back to the
automation policy. Per release scope, remove all interactive-modal claims from the
v0.1.97 docs (CHANGELOG, README, configs README, index.rst, release notes,
announcement) and delete the now-pointless permission_modal_interactive.yaml demo.
v0.1.97 advertises only the live-verified approval paths: automation policy + file
handshake. Root cause + fix options recorded in permissions_p2_followups.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ncrispino

Copy link
Copy Markdown
Collaborator

@coderabbitai ignore

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
✅ Action performed

Reviews paused.

docs(release): drop interactive approval modal from v0.1.97 (not working yet)
@Henry-811 Henry-811 merged commit 007bd85 into main Jun 12, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants