Add a deslopification gate for finalized manuscripts by davidturturean · Pull Request #256 · psi-oss/get-physics-done

davidturturean · 2026-06-04T21:54:44Z

What this adds

A deslopification gate for GPD. It takes a finalized AI-written math or physics manuscript and makes it read like expert work without changing the science, with a by-line audit, and with substantive issues flagged rather than rewritten.

How it works

The deterministic core, src/gpd/core/deslopification.py, extracts a manuscript's protected spans (math, \cite keys, in-math numbers, theorem status), detects AI and agent writing tells with line locations, and provides an invariant checker. The checker re-extracts the protected spans after a proposed edit and rejects the edit if any of them changed, so a prose rewrite cannot alter the science undetected.
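To illustrate the re-extract-and-compare idea, here is a minimal sketch. The regexes and the names `extract_spans` and `edit_is_safe` are assumptions made for this example, not the contents of `deslopification.py`, and only inline math and citation keys are covered. Spans are compared as multisets so that dropping, duplicating, or altering one is caught.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; the real module covers more cases.
INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)
CITE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def extract_spans(text: str) -> Counter:
    """Collect protected spans (inline math and citation keys) as a multiset."""
    spans = Counter()
    for match in INLINE_MATH.finditer(text):
        spans[("math", match.group(1).strip())] += 1
    for match in CITE.finditer(text):
        for key in match.group(1).split(","):
            spans[("cite", key.strip())] += 1
    return spans


def edit_is_safe(before: str, after: str) -> bool:
    """Accept an edit only if the protected spans are unchanged."""
    return extract_spans(before) == extract_spans(after)


before = r"It is worth noting that $E = mc^2$ holds \cite{einstein1905}."
prose_only = r"The relation $E = mc^2$ holds \cite{einstein1905}."
changed_math = r"The relation $E = mc^3$ holds \cite{einstein1905}."

print(edit_is_safe(before, prose_only))    # prose changed, spans intact
print(edit_is_safe(before, changed_math))  # exponent changed inside math
```

The first edit only touches prose and is accepted; the second alters a math span and is rejected.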

The gpd-discipline-editor agent performs the nuanced rewrites, and every edit it proposes is verified by the same checker. The deterministic apply mode lands only the edits it can prove safe and writes the by-line audit. Anything substantive, such as a placeholder citation or undefined notation, is flagged rather than rewritten, since rewriting plausible-but-wrong content would remove the signals that warn a reviewer.

Commands

gpd deslop scan <manuscript> --mode audit|apply|ci
gpd deslop check <before> <after>
gpd validate deslop-invariants <before> <after>
gpd:deslop-paper [manuscript] [--mode audit|apply|ci]

Where it runs

The gate runs after the reward-hacking integrity gate and before peer review in the write-paper publication-review finalization, and as a standalone command. gpd:arxiv-submission blocks on unresolved deslopification release blockers.

Tests

tests/core/test_deslopification.py and tests/test_deslop_cli_commands.py cover span extraction, tell detection, the invariant checker (including that it rejects a planted change to a math span, a citation, a number, or theorem status), and the CLI. The existing CLI validation tests still pass.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added manuscript deslopification tools: deslop scan identifies AI language and release blockers; deslop check verifies that text edits preserve scientific content
- Integrated deslopification gate into publication workflow, generating audit trails and flagged issues
- Added validate deslop-invariants command for invariant verification
Documentation
- Added specifications for deslopification workflows, audit schemas, and flag definitions
Tests
- Added comprehensive test coverage for deslopification engine and CLI commands

A meaning-preserving pass that strips public-facing AI and agent writing tells from a manuscript while freezing all scientific content, emits a by-line audit, and flags substantive issues instead of rewriting them.

Deterministic core (core/deslopification.py): protected-span extraction, located tell detection, and an invariant checker that rejects any edit which changes a math span, citation key, in-math number, or theorem status. The gpd-discipline-editor agent does the nuanced rewrites, each gated by the same checker.

Adds gpd deslop scan/check, gpd validate deslop-invariants, and the gpd:deslop-paper command; wires the gate into write-paper finalization and the arxiv-submission blocking conditions.

11 tests; existing CLI tests still pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

CLAassistant · 2026-06-04T21:54:54Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

coderabbitai · 2026-06-04T21:54:58Z

📝 Walkthrough

This PR introduces a complete deslopification system that removes public-facing AI/agent artifacts from academic manuscripts while preserving scientific content through frozen invariants. The implementation includes a deterministic engine, detailed specifications, command orchestration, CLI integration, and comprehensive test coverage.

Changes

Deslopification Gate Feature

| Layer | File(s) | Summary |
| --- | --- | --- |
| Deslopification Engine Implementation and Tests | `src/gpd/core/deslopification.py`, `tests/core/test_deslopification.py` | Core module implementing protected-span extraction (math, citations, status tokens), tell detection with line/column locations and routing (KEEP/EDIT/FLAG), invariant checking via multiset diffing, and deterministic regex-driven style edits (only when invariants pass). Generates audit/flag/summary artifacts. Tests validate tell detection, invariant enforcement across prose-only edits and protected-span drifts, and manuscript scanning with placeholder blocking. |
| Gate Specification and Artifact Schemas | `src/gpd/specs/references/publication/deslopification-gate.md`, `src/gpd/specs/templates/paper/deslop-audit-schema.md`, `src/gpd/specs/templates/paper/deslop-flags-schema.md` | Defines semantics of deslopification gate with KEEP/EDIT/FLAG routing, invariant-check rules forbidding new claims and protected-span mutation, and mandatory audit/flags/summary artifact emission with fail-closed behavior. Specifies JSONL audit records (per-edit with `edit_id`, `location`, `original`, `new`, `tell_addressed`, `rationale`) rendered as human Markdown tables, and flags schema with severity/category enums and release-blocker controls. |
| Agent and Command Specifications | `src/gpd/agents/gpd-discipline-editor.md`, `src/gpd/commands/deslop-paper.md` | Agent spec defines owned artifacts, four-pass control flow (freeze spans, route, apply invariant-gated edits, emit artifacts), and critical rules (no invention, flag substantive issues, every edit audited). Command spec documents `audit`/`apply`/`ci` modes, `--strict` behavior, initialization from `gpd --raw init`, delegation to agent with four-pass prompt, and success criteria including clean gate status and zero release blockers. |
| CLI Commands and Testing | `src/gpd/cli.py`, `tests/test_deslop_cli_commands.py` | Adds `deslop_app` command group with `scan` (runs detection/edits, writes artifacts), `check` (validates invariants, exit code 2 on failure), and `validate deslop-invariants` alias delegating to `check`. Tests invoke CLI against fixture `.tex` files and assert exit codes, gate status, blocker detection, and invariant rejection for science-span drift. |
| Workflow Integration and Submission Policy | `src/gpd/commands/arxiv-submission.md`, `src/gpd/specs/workflows/write-paper/publication-review-finalization.md` | Integrates deslopification gate into `publication-review-finalization` workflow between integrity gating and `pre_submission_review`. Gate calls `gpd-discipline-editor` in `apply` mode, writes audit/flag/summary artifacts, consumes `DESLOP-SUMMARY.json` to decide peer-review blocking, logs outcomes to `CRITIQUE-LOG.md`, and handles release blockers via autonomy-based re-run or user decision. Updates `arxiv-submission` to treat unresolved blockers as submission blockers with remediation command. |
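The "tell detection with line/column locations and routing" described in the table can be pictured roughly as follows. This is a sketch under assumptions: the patterns, the `Tell` dataclass, and `detect_tells` are invented for illustration, and only the route labels (EDIT, FLAG, with untouched text implicitly KEEP) and the `placeholder_or_metadata` tag come from this thread.

```python
import re
from dataclasses import dataclass

# Hypothetical tell patterns; each maps to a route. EDIT means a style-only
# rewrite may be proposed, FLAG means a human must resolve the issue.
TELL_PATTERNS = [
    ("filler_phrase", re.compile(r"\bit is worth noting that\b", re.IGNORECASE), "EDIT"),
    ("placeholder_or_metadata", re.compile(r"\b(?:TODO|TBD)\b|\\cite\{\?+\}"), "FLAG"),
]


@dataclass(frozen=True)
class Tell:
    tag: str
    route: str
    line: int  # 1-based line number
    col: int   # 1-based column of the first matched character
    text: str


def detect_tells(text: str) -> list[Tell]:
    """Return every tell with its 1-based line and column."""
    tells = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for tag, pattern, route in TELL_PATTERNS:
            for match in pattern.finditer(line):
                tells.append(Tell(tag, route, line_no, match.start() + 1, match.group(0)))
    return tells


sample = "It is worth noting that the bound is tight.\nSee \\cite{??} for details. TODO: check."
for tell in detect_tells(sample):
    print(tell.route, tell.tag, f"{tell.line}:{tell.col}", repr(tell.text))
```

Keeping the route next to each pattern means a placeholder citation can never be silently rewritten: it is only ever reported.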

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

madeleinesong
cmaloney111

Poem

🐰 A Gate to Guard the Prose

With frozen spans and tells to find,
The deslop-editor rewinds,
Each math and cite held crystal-clear,
While prose alone may disappear.
No claims shall slip past audit's eye—
Just style and taste, preserved and dry! 🎀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add a deslopification gate for finalized manuscripts' clearly summarizes the main objective of the PR, implementing a deslopification gate, and is concise and specific. |
| Description check | ✅ Passed | The PR description comprehensively covers what changed, why it was added, how it works, the CLI commands, integration points, and testing. It follows the structure expected in the template with dedicated sections for changes, motivation, and testing, exceeding the basic requirements. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


marcpickett1 · 2026-06-04T22:04:29Z

🤖 RoastBot: You wrote a feature to detect slop... in an AI tool... using an AI. The ouroboros of self-loathing is complete.

coderabbitai

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/gpd/agents/gpd-discipline-editor.md`:
- Around line 64-65: The documentation requires DESLOP-AUDIT.jsonl and
DESLOP-AUDIT.md to be emitted unconditionally, but the current implementation in
src/gpd/core/deslopification.py only writes them inside the apply-mode block
(lines ~321-339) after invariant checks; reconcile by extracting the
audit-emission logic into a shared routine (e.g., emit_deslop_audit or
write_deslop_audit_jsonl/md) and call it for both apply and audit modes (ensure
it sets meaning_preserving: yes and writes one record per edit), while keeping
the edited .tex write only in apply() and preserving the invariants behavior (or
log invariant failures while still emitting the audit artifacts in audit mode if
that matches the contract).

In `@src/gpd/cli.py`:
- Around line 9552-9566: The deslop_check handler currently coerces unreadable
files to empty strings and derives is_tex from the raw argument, which can allow
missing inputs to slip past checks; instead, resolve each path with
_resolve_path_from_effective_cwd, call safe_read_file and if it returns None
treat that as a hard error (emit a clear message via typer.echo/process logger
and exit with a non-zero code) rather than substituting "", then pass the actual
file contents to check_invariants; also determine is_tex from the resolved
path's suffix in a case-insensitive way (e.g., resolved_path.suffix.lower() ==
".tex") before calling check_invariants.

In `@src/gpd/commands/deslop-paper.md`:
- Around line 33-34: The docs claim gpd:deslop-paper --strict fails on any
unresolved FLAG but the code only checks gate_status/release_blocker_count; fix
by wiring --strict end-to-end: add a --strict boolean to the gpd:deslop-paper
CLI handler and pass it into the deslopification scan flow (propagate through
the function/method that performs deslopification), compute
unresolved_flags_count (count of unresolved non-blocker flags) during the scan
alongside release_blocker_count and gate_status, and update the final exit logic
to return non-zero when --strict is true and unresolved_flags_count > 0 (while
preserving existing behavior when --strict is false). Alternatively, if you
prefer not to change behavior, update the doc snippets to remove/clarify
--strict and state that only release_blocker_count/gate_status cause non-zero
exit; make changes in the gpd:deslop-paper docs and any workflow wiring that
currently gates on release_blocker_count to reflect the chosen semantics.

In `@src/gpd/core/deslopification.py`:
- Around line 78-83: The invariant checker misses TeX display math delimited by
$$...$$ because extract_protected_spans never records that pattern; add a regex
for dollar-delimited display math (e.g., a new _DOLLAR_DISPLAY_MATH =
re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)) and ensure
extract_protected_spans treats _DOLLAR_DISPLAY_MATH the same as
_DISPLAY_MATH/_ENV_MATH/_INLINE_MATH so those spans are recorded and excluded
from the change comparison; update any other places referencing _DISPLAY_MATH
(and the similar patterns around lines where _PAREN_MATH/_DISPLAY_MATH/_ENV_MATH
are defined) so the new symbol is included in the protected-span extraction
logic.
- Around line 307-310: The code currently treats read failures as empty
manuscripts by using `_safe_read(path) or ""`; change this to fail hard: call
`_safe_read(path)`, check if the result is None (or an explicit error sentinel)
and if so raise an appropriate exception (e.g., FileNotFoundError or IOError)
with the path and underlying error info instead of substituting ""; apply the
same fix to the other occurrence that uses `_safe_read` (the similar block
around lines 385-388) so both `scan_manuscript` and the duplicate reader call
stop returning false-clean results for unreadable files.
- Around line 32-38: The current broad "except Exception" around importing
_CITATION_RE, _PLACEHOLDER_RE, and _strip_latex_comments from
gpd.core.arxiv_package (and the similar block at lines ~58-60) masks runtime
errors inside those modules; change the handlers to only catch
ImportError/ModuleNotFoundError (e.g., "except ImportError:" or "except
(ImportError, ModuleNotFoundError) as e:") and let any other Exception propagate
(or re-raise) so in-repo failures fail fast instead of silently falling back to
demo replacements.

In `@src/gpd/specs/references/publication/deslopification-gate.md`:
- Around line 91-105: Fix the markup in the CONCEPT_INTRO_GATE block: correct
the broken table/pipe syntax (ensure rows use consistent | column separators or
remove the stray table delimiter around the `<concept_intro_gate>` section), add
a language identifier to the fenced code block (e.g., ```yaml) to satisfy
fenced-code linting, and replace the invalid token `true_is_forbidden` with
valid JSON/YAML boolean or a quoted string (e.g., true or "true_is_forbidden")
so schema readers can parse the gate definition; refer to the CONCEPT_INTRO_GATE
block and the fenced code block shown in the diff when making these edits.
- Around line 126-129: The spec says `ci` should block for missing audit
coverage and unresolved notation/concept-order flags, but the gate_status is
only set to "blocked" when a finding is marked release_blocker; update the
gate-status calculation in the deslopification result assembly (the code that
currently assigns gate_status="blocked" from release_blocker findings) to also
treat findings representing missing audit coverage and unresolved
notation/concept-order flags as blocking: either treat those finding
types/severities as release_blocker-equivalent or add explicit checks that set
gate_status="blocked" when any finding has kind/type/flag indicating
missing_audit_coverage or notation/concept-order unresolved (reference the
variables/fields release_blocker, gate_status and the detector outputs from tell
detectors in the deslopification result code).

In `@src/gpd/specs/templates/paper/deslop-audit-schema.md`:
- Around line 13-22: The audit entry producer in src/gpd/core/deslopification.py
(around the block at lines 253-276 and the markdown totals at 279-287) currently
emits location: {"line": ...} and omits claim_ledger_changed; update that code
path to produce the canonical shape expected by the spec: emit location as
{"file": <filename>, "line_start": <start>, "line_end": <end>} and add
claim_ledger_changed (integer, default 0) to each audit entry object, and also
include claim_ledger_changed=0 in the markdown-rendered totals logic so the
rendered totals match the schema.

In `@src/gpd/specs/templates/paper/deslop-flags-schema.md`:
- Around line 15-23: The DESLOP-FLAGS template schema must match the runtime
emission from src/gpd/core/deslopification.py (around the code that emits
DESLOP-FLAGS): update this template to use the same "category" taxonomy
(tell-tag values like "placeholder_or_metadata"), change "location" to the
runtime shape {line:int, col:int} (instead of file/line_start/line_end), and
remove or make optional the "recommended_author_action" and "delegate_to" fields
that the runtime omits so consumers can parse consistently; ensure field names
and types exactly mirror what deslopification.py emits.

In `@src/gpd/specs/workflows/write-paper/publication-review-finalization.md`:
- Around line 117-125: The spec currently contradicts itself for
supervised|balanced modes: when release_blocker_count > 0 (gate_status: blocked)
the gate must fail-closed and not proceed, but the second option "accept as a
known limitation" allows proceeding; update the supervised|balanced branch (the
block describing autonomy=supervised|balanced and the listed options) to remove
or redefine option (2) so it enforces blocker resolution before
proceeding—either delete the "accept as a known limitation" choice or change it
to a non-proceeding state (e.g., "record as known limitation and put manuscript
on hold") and ensure the text explicitly requires re-running this gate after
resolution; reference release_blocker_count, gate_status,
autonomy=supervised|balanced, DESLOP-FLAGS.md and CRITIQUE-LOG.md when making
the change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 01cacca5-0aa3-4885-a092-6eaf27968ea1

📥 Commits

Reviewing files that changed from the base of the PR and between 0f41769 and b502105.

📒 Files selected for processing (11)

src/gpd/agents/gpd-discipline-editor.md
src/gpd/cli.py
src/gpd/commands/arxiv-submission.md
src/gpd/commands/deslop-paper.md
src/gpd/core/deslopification.py
src/gpd/specs/references/publication/deslopification-gate.md
src/gpd/specs/templates/paper/deslop-audit-schema.md
src/gpd/specs/templates/paper/deslop-flags-schema.md
src/gpd/specs/workflows/write-paper/publication-review-finalization.md
tests/core/test_deslopification.py
tests/test_deslop_cli_commands.py

coderabbitai · 2026-06-04T22:06:16Z

+Pass D; Emit the artifacts (non-optional): `DESLOP-AUDIT.jsonl` + `DESLOP-AUDIT.md` (one record per edit, by line, with `meaning_preserving: yes`), `DESLOP-FLAGS.md`, and `DESLOP-SUMMARY.json`. In `apply` mode also write the edited `.tex`; in `audit` mode do not touch the manuscript.
+</method>


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

“Non-optional artifacts” requirement conflicts with current apply-only audit emission.

This says DESLOP-AUDIT.jsonl and .md are always emitted, but src/gpd/core/deslopification.py:321-339 writes those only in apply mode (and only if invariants pass). Please reconcile this requirement so the agent contract matches executable behavior.
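One way to reconcile the two is a shared emission routine that runs in every mode, with the manuscript write kept behind the `apply` check. The sketch below is an assumption about how that could look, not the code in `deslopification.py`: `emit_deslop_audit` and `run_gate` are hypothetical names, the record fields follow the audit schema listed in the walkthrough, and the Markdown rendering ignores `|` characters in the text for brevity.

```python
import json
import tempfile
from pathlib import Path


def emit_deslop_audit(edits: list[dict], out_dir: Path) -> None:
    """Write DESLOP-AUDIT.jsonl and DESLOP-AUDIT.md, one record per edit."""
    out_dir.mkdir(parents=True, exist_ok=True)
    records = [{**edit, "meaning_preserving": "yes"} for edit in edits]
    jsonl = "".join(json.dumps(record) + "\n" for record in records)
    (out_dir / "DESLOP-AUDIT.jsonl").write_text(jsonl, encoding="utf-8")

    rows = ["| edit_id | line | original | new |", "| --- | --- | --- | --- |"]
    for record in records:
        rows.append(
            f"| {record['edit_id']} | {record['location']['line']} "
            f"| {record['original']} | {record['new']} |"
        )
    (out_dir / "DESLOP-AUDIT.md").write_text("\n".join(rows) + "\n", encoding="utf-8")


def run_gate(manuscript: Path, edited_text: str, edits: list[dict], mode: str, out_dir: Path) -> None:
    """Emit the audit in every mode; touch the manuscript only in apply mode."""
    emit_deslop_audit(edits, out_dir)
    if mode == "apply":
        manuscript.write_text(edited_text, encoding="utf-8")


# Usage: audit mode writes the audit artifacts but leaves the manuscript alone.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    paper = root / "paper.tex"
    paper.write_text("It is worth noting that $x=1$.\n", encoding="utf-8")
    edits = [{"edit_id": "E001", "location": {"line": 1},
              "original": "It is worth noting that $x=1$.", "new": "Here $x=1$."}]
    run_gate(paper, "Here $x=1$.\n", edits, mode="audit", out_dir=root / "out")
    audit_lines = (root / "out" / "DESLOP-AUDIT.jsonl").read_text(encoding="utf-8").splitlines()
    manuscript_after_audit = paper.read_text(encoding="utf-8")
```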


coderabbitai · 2026-06-04T22:06:16Z

+@deslop_app.command("check")
+def deslop_check(
+    before: str = typer.Argument(..., help="Path to the pre-edit text"),
+    after: str = typer.Argument(..., help="Path to the post-edit text"),
+) -> None:
+    """Prove an edit changed no protected span (math, citations, numbers, theorem status)."""
+    from gpd.core.deslopification import check_invariants
+    from gpd.core.utils import safe_read_file
+
+    b = safe_read_file(_resolve_path_from_effective_cwd(before)) or ""
+    a = safe_read_file(_resolve_path_from_effective_cwd(after)) or ""
+    report = check_invariants(b, a, is_tex=before.endswith(".tex"))
+    _output(report)
+    if not report["passed"]:
+        raise typer.Exit(code=2)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail closed on unreadable inputs in deslop_check (don’t coerce to empty text).

Using safe_read_file(... ) or "" can let missing/unreadable files be treated as empty content, which can incorrectly pass invariant checks and bypass the gate. Also, derive is_tex from the resolved path suffix (case-insensitive) instead of raw string matching.

Suggested fix

 @deslop_app.command("check")
 def deslop_check(
@@
 ) -> None:
@@
-    from gpd.core.deslopification import check_invariants
-    from gpd.core.utils import safe_read_file
-
-    b = safe_read_file(_resolve_path_from_effective_cwd(before)) or ""
-    a = safe_read_file(_resolve_path_from_effective_cwd(after)) or ""
-    report = check_invariants(b, a, is_tex=before.endswith(".tex"))
+    from gpd.core.deslopification import check_invariants
+
+    before_path, b = _load_text_document_or_error(before)
+    _after_path, a = _load_text_document_or_error(after)
+    report = check_invariants(b, a, is_tex=before_path.suffix.lower() == ".tex")
     _output(report)
     if not report["passed"]:
         raise typer.Exit(code=2)
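The suggested fix relies on a loader that fails instead of returning an empty string. Its definition is not shown in this thread, so the following is only a sketch of what such a helper might look like; `load_text_or_fail`, `UnreadableInputError`, and `is_tex` are hypothetical names, and a real CLI handler would presumably catch the error and exit with a non-zero code.

```python
from pathlib import Path


class UnreadableInputError(Exception):
    """Raised when a gate input cannot be read (hypothetical name)."""


def load_text_or_fail(raw_path: str) -> tuple[Path, str]:
    """Return (resolved path, contents), or raise instead of returning ""."""
    path = Path(raw_path).expanduser().resolve()
    try:
        return path, path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        raise UnreadableInputError(f"cannot read {path}: {exc}") from exc


def is_tex(path: Path) -> bool:
    """Case-insensitive suffix check, so PAPER.TEX also counts as TeX."""
    return path.suffix.lower() == ".tex"
```

Raising here matters because two missing files would otherwise both read as `""`, and comparing two empty texts finds no protected span changed.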


coderabbitai · 2026-06-04T22:06:16Z

+gpd:deslop-paper GPD/publication/<slug>/manuscript --mode apply --strict
+gpd:deslop-paper <manuscript> --mode ci            # fail if release-blocking slop remains


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "Locate deslop-paper command/spec wiring:"
rg -n -C3 'deslop-paper|--strict|strict' src tests
echo
echo "Locate CLI options for deslop-related commands:"
rg -n -C3 '@.*command\(|def .*deslop|--mode|--strict|no-write' src/gpd/cli.py
echo
echo "Locate strict-mode assertions in tests:"
rg -n -C3 'deslop.*strict|--strict|gate_status|blocked|exit code|non-zero' tests

Repository: psi-oss/get-physics-done

Length of output: 50382

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "Find all occurrences of the command name `deslop-paper`:"
rg -n "deslop-paper" src tests
echo
echo "Find all occurrences of `deslop` in the CLI entrypoint:"
rg -n "deslop" src/gpd/cli.py
echo
echo "Search CLI option declarations for `deslop`-related commands (narrow patterns):"
rg -n --context 3 "(deslop.*command|command.*deslop|def .*deslop|--mode|--no-write|--strict)" src/gpd/cli.py
echo
echo "Search for strict parsing/enforcement around deslop in the CLI tree:"
rg -n --context 3 "(deslop.*strict|strict.*deslop|--strict.*deslop|deslop.*--strict)" src/gpd

Repository: psi-oss/get-physics-done

Length of output: 6323

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "Locate core deslopification implementation for scan/check:"
fd -t f "deslopification*" src/gpd/core || true
rg -n "def (scan_manuscript|check_invariants)\b|scan_manuscript\(" -S src/gpd/core
echo
echo "Find strict handling inside deslopification core:"
rg -n "strict" -S src/gpd/core/deslopification.py src/gpd/core || true
echo
echo "Inspect deslop-paper command spec template/wiring that runs `gpd --raw init deslop-paper`:"
sed -n '1,140p' src/gpd/commands/deslop-paper.md

Repository: psi-oss/get-physics-done

Length of output: 50441

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "=== Inspect deslop-paper spec ==="
sed -n '1,140p' src/gpd/commands/deslop-paper.md
echo
sed -n '140,260p' src/gpd/commands/deslop-paper.md || true
echo
echo "=== Inspect deslopification core ==="
sed -n '1,220p' src/gpd/core/deslopification.py
echo
sed -n '220,520p' src/gpd/core/deslopification.py
echo
echo "=== Search for strict handling inside core/deslopification.py ==="
rg -n "strict" src/gpd/core/deslopification.py
echo
echo "=== Trace how `init deslop-paper` is dispatched/parsed ==="
rg -n "init deslop-paper|deslop-paper" src/gpd | head -n 200
rg -n "deslopification|scan_manuscript|check_invariants" src/gpd | head -n 200
echo
echo "=== Search for scan_manuscript callers with strict kwarg ==="
rg -n "scan_manuscript\\(.*strict|check_invariants\\(.*strict|deslop.*strict" src/gpd/core | head -n 200

Repository: psi-oss/get-physics-done

Length of output: 22946

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== Find strict usage in deslop-paper spec =="
rg -n "strict" src/gpd/commands/deslop-paper.md
echo
echo "== Show remaining parts of deslop-paper spec (possible exit logic) =="
sed -n '260,520p' src/gpd/commands/deslop-paper.md
echo
echo "== Search for deslop-paper strict behavior in code/tests =="
rg -n --no-heading "(deslop-paper|DESLOP-SUMMARY|DESLOP-FLAGS|gate_status: blocked|gate_status).*strict" src tests -S || true
echo
echo "== Search for any tests covering deslop-paper/strict =="
rg -n --no-heading "deslop-paper" tests -S || true
echo
echo "== Search for deslop-paper command handler / init dispatch =="
rg -n --no-heading "init deslop-paper|deslop-paper" src/gpd -S | head -n 200

Repository: psi-oss/get-physics-done

Length of output: 2345

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== Find where gpd:deslop-paper is registered/mentioned in code =="
rg -n "gpd:deslop-paper|deslop-paper" src/gpd -S
echo
echo "== Find init command implementation (the `gpd --raw init deslop-paper` path) =="
rg -n "raw.*init|@.*init\\b|command\$\"init\"\$|def .*init\\b" src/gpd/cli.py src/gpd/core -S
echo
echo "== Check for a generic --strict option in CLI command initialization =="
rg -n "--strict" src/gpd/cli.py src/gpd/core/ -S
echo
echo "== Trace how command specs frontmatter like argument-hint/display_signature are turned into CLI params =="
rg -n "argument-hint|display_signature|registry.*command|parse.*command.*spec|allowed-tools" src/gpd/core src/gpd/registry* src/gpd/ -S | head -n 200

Repository: psi-oss/get-physics-done

Length of output: 4367

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== Where does CLI define --strict? (top-level and subcommands) =="
rg -n --context 2 "--strict" src/gpd/cli.py | head -n 200
echo
echo "== Inspect gpd --raw init implementation in src/gpd/cli.py =="
sed -n '3680,3860p' src/gpd/cli.py
echo
echo "== Search init payload schema for a 'strict' field =="
rg -n --context 2 "raw.*init|init\(" src/gpd/core src/gpd/cli.py | head -n 200
rg -n --context 2 "\"strict\"|strict:" src/gpd/core | head -n 200
echo
echo "== Search for how command-spec args are parsed/passed to init execution =="
rg -n --context 2 "display_signature|argument-hint|command spec|frontmatter|allowed-tools|parse.*frontmatter" src/gpd/core src/gpd/cli.py | head -n 200
echo
echo "== Find how command name 'deslop-paper' is loaded into init execution =="
rg -n --context 2 "load.*command|registry.*command|commands/.*\.md|deslop-paper" src/gpd/core src/gpd/cli.py | head -n 200

Repository: psi-oss/get-physics-done

Length of output: 563

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== Find where DESLOP-SUMMARY.json is read/parsed =="
rg -n --context 3 "DESLOP-SUMMARY\.json|DESLOP-FLAGS\.md|DESLOP-AUDIT\.md|DESLOP-AUDIT\.jsonl" src/gpd -S
echo
echo "== Find any strict/flags exit logic for deslopification artifacts =="
rg -n --context 3 "(deslop|DESLOP).*(strict|flag|blocker|gate_status)" src/gpd -S
echo
echo "== Find any generic exit condition logic keyed by gate_status for deslopification =="
rg -n --context 3 "gate_status.*(strict|flag|blocker)|release_blocker_count" src/gpd -S

Repository: psi-oss/get-physics-done

Length of output: 34070

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== Search for literal CLI option '--strict' in gpd CLI implementation =="
rg -n --context 3 -- "--strict" src/gpd/cli.py
echo
echo "== Search for strict handling specifically tied to deslop-paper/deslopification in src/gpd =="
rg -n --context 3 "(deslop-paper|deslopification).*strict|strict.*(deslop-paper|deslopification)" src/gpd -S
echo
echo "== Search for 'strict' in the deslopification gate/workflow templates =="
rg -n --context 3 "deslopification-gate|DESLOP|deslop scan|deslop-paper|gate_status" src/gpd/specs/workflows src/gpd/commands/deslop-paper.md -S

Repository: psi-oss/get-physics-done

Length of output: 37731

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== Exit/strict logic around gate_status & release_blocker_count (deslop) =="
rg -n --context 3 "(deslop|DESLOP|deslopification).*?(gate_status|release_blocker_count|flag_count)" src/gpd -S
echo
echo "== Search for strict gating keyed by 'flag'/'blocker' in general workflow runner =="
rg -n --context 3 "strict.*(gate_status|release_blocker_count|flag_count|FLAG|blocker)" src/gpd -S
echo
echo "== Search for generic command-spec 'strict' parameter propagation into init/run =="
rg -n --context 3 "(raw init|init\\s+|command spec|frontmatter|display_signature|argument-hint).*strict" src/gpd -S
echo
echo "== Search for deslop-paper used as an authority/command spec in registry =="
rg -n --context 3 "deslop-paper" src/gpd/core src/gpd/cli.py -S

Repository: psi-oss/get-physics-done

Length of output: 9401

Fix misleading --strict documentation/behavior for gpd:deslop-paper
src/gpd/commands/deslop-paper.md claims --strict makes the command fail when any unresolved FLAG remains (non-zero exit), but the deslopification implementation only returns non-zero when gate_status is blocked (i.e., release blockers), with no strict input anywhere in the deslopification scan flow. The workflow wiring likewise gates on release_blocker_count, not unresolved non-blocker flags—so the --strict semantics described in the spec/snippets at lines 33-34, 43-44, and 76-77 are not actually enforced. Update the snippets/spec to match reality or wire --strict end-to-end (init → scan/check → exit logic).
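If `--strict` were wired end-to-end, the final exit decision might reduce to something like the sketch below. This is an illustration only: `deslop_exit_code` is a hypothetical function, and reusing exit code 2 (the code `deslop check` returns on failure) for a blocked scan is an assumption.

```python
def deslop_exit_code(release_blocker_count: int, unresolved_flags_count: int, strict: bool) -> int:
    """Map gate results to a process exit code (0 = pass, 2 = fail)."""
    if release_blocker_count > 0:
        return 2  # blocked regardless of --strict
    if strict and unresolved_flags_count > 0:
        return 2  # --strict promotes any unresolved flag to a failure
    return 0


# Without --strict, non-blocking flags do not fail the run.
print(deslop_exit_code(release_blocker_count=0, unresolved_flags_count=3, strict=False))
# With --strict, the same result fails.
print(deslop_exit_code(release_blocker_count=0, unresolved_flags_count=3, strict=True))
```

Keeping the decision in one pure function makes the documented semantics directly testable, independent of how the flag reaches the scan.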


coderabbitai · 2026-06-04T22:06:16Z

+try:
+    from gpd.core.arxiv_package import (  # type: ignore
+        _CITATION_RE as _CITATION_RE,
+        _PLACEHOLDER_RE as _PLACEHOLDER_RE,
+        _strip_latex_comments as _strip_latex_comments,
+    )
+except Exception:  # standalone / demo path


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast on internal import bugs instead of silently using demo fallbacks.

These except Exception blocks also catch runtime errors raised inside gpd.core.arxiv_package / gpd.core.utils, so a broken in-repo dependency quietly downgrades the gate to the local fallback regexes and file helpers. That can change what gets protected or written without any visible failure.
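The difference between the two handlers can be reproduced in isolation. The sketch below is illustrative only: `broken_dep_demo`, `module_that_does_not_exist_demo`, `FALLBACK`, and `load` are hypothetical names, and the temporary module stands in for an in-repo dependency that raises while it is being imported.

```python
import importlib
import sys
import tempfile
from pathlib import Path

FALLBACK = "local-fallback"  # stands in for the demo regexes and helpers


def load(name: str, narrow: bool):
    """Import `name`, falling back only on the exception types we choose to catch."""
    caught = ImportError if narrow else Exception
    try:
        return importlib.import_module(name)
    except caught:
        return FALLBACK


# A hypothetical dependency that has a bug at import time.
_tmp = tempfile.mkdtemp()
Path(_tmp, "broken_dep_demo.py").write_text("raise RuntimeError('bug inside dependency')\n")
sys.path.insert(0, _tmp)
importlib.invalidate_caches()

# Genuinely absent module: falling back is the intended standalone path.
missing = load("module_that_does_not_exist_demo", narrow=True)

# Broad handler: the RuntimeError is swallowed and the fallback is used silently.
masked = load("broken_dep_demo", narrow=False)

# Narrow handler: the same bug propagates instead of downgrading the gate.
try:
    load("broken_dep_demo", narrow=True)
    surfaced = False
except RuntimeError:
    surfaced = True
```

`ModuleNotFoundError` is a subclass of `ImportError`, so catching `ImportError` alone still covers the standalone case.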

Suggested fix

 try:
     from gpd.core.arxiv_package import (  # type: ignore
         _CITATION_RE as _CITATION_RE,
         _PLACEHOLDER_RE as _PLACEHOLDER_RE,
         _strip_latex_comments as _strip_latex_comments,
     )
-except Exception:  # standalone / demo path
+except ImportError:  # standalone / demo path

 try:
     from gpd.core.utils import atomic_write as _atomic_write, safe_read_file as _safe_read  # type: ignore
-except Exception:
+except ImportError:

Also applies to: 58-60

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/core/deslopification.py` around lines 32 - 38, The current broad "except Exception" around importing _CITATION_RE, _PLACEHOLDER_RE, and _strip_latex_comments from gpd.core.arxiv_package (and the similar block at lines ~58-60) masks runtime errors inside those modules; change the handlers to only catch ImportError/ModuleNotFoundError (e.g., "except ImportError:" or "except (ImportError, ModuleNotFoundError) as e:") and let any other Exception propagate (or re-raise) so in-repo failures fail fast instead of silently falling back to demo replacements.

coderabbitai · 2026-06-04T22:06:16Z

+_INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)
+_PAREN_MATH = re.compile(r"\\\((.+?)\\\)", re.DOTALL)
+_DISPLAY_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)
+_ENV_MATH = re.compile(
+    r"\\begin\{(equation\*?|align\*?|gather\*?|multline\*?|eqnarray\*?)\}(.+?)\\end\{\1\}", re.DOTALL
+)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

$$...$$ display math is currently invisible to the invariant checker.

extract_protected_spans never records raw TeX display math, so a change like $$x=1$$ → $$x=2$$ slips through as passed=True. That leaves a direct hole in the “math never changes” guarantee.
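A regression check for this case could look like the sketch below. It uses the dollar-delimited pattern proposed in this comment. The helpers `display_spans` and `spans_unchanged` are hypothetical stand-ins for the real extraction and invariant-check functions, whose signatures are not shown in this thread.

```python
import re

# Pattern proposed in the review for $$...$$ display math.
_DOUBLE_DOLLAR_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)


def display_spans(tex: str) -> set[str]:
    """Collect whitespace-normalized bodies of $$...$$ spans (hypothetical helper)."""
    return {" ".join(m.group(1).split()) for m in _DOUBLE_DOLLAR_MATH.finditer(tex)}


def spans_unchanged(before: str, after: str) -> bool:
    """True when an edit leaves every $$...$$ span intact (hypothetical helper)."""
    return display_spans(before) == display_spans(after)
```

A prose-only rewrite such as "Note that $$x=1$$." to "Observe that $$x=1$$." keeps the span set equal, while changing the body to `x=2` does not.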

Suggested fix

+_DOUBLE_DOLLAR_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)
 _INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)
 _PAREN_MATH = re.compile(r"\\\((.+?)\\\)", re.DOTALL)
 _DISPLAY_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)

-    for rx in (_INLINE_MATH, _PAREN_MATH, _DISPLAY_MATH):
+    for rx in (_DOUBLE_DOLLAR_MATH, _INLINE_MATH, _PAREN_MATH, _DISPLAY_MATH):
         math.update(_norm(m.group(1)) for m in rx.finditer(body))

Also applies to: 104-106

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/core/deslopification.py` around lines 78 - 83, The invariant checker misses TeX display math delimited by $$...$$ because extract_protected_spans never records that pattern; add a regex for dollar-delimited display math (e.g., a new _DOLLAR_DISPLAY_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)) and ensure extract_protected_spans treats _DOLLAR_DISPLAY_MATH the same as _DISPLAY_MATH/_ENV_MATH/_INLINE_MATH so those spans are recorded and excluded from the change comparison; update any other places referencing _DISPLAY_MATH (and the similar patterns around lines where _PAREN_MATH/_DISPLAY_MATH/_ENV_MATH are defined) so the new symbol is included in the protected-span extraction logic.

coderabbitai · 2026-06-04T22:06:16Z

+</math_physics_tell_catalogue>
+
+<concept_intro_gate>
+For every nonstandard term/symbol, the notation ledger records `{object, kind, first_use,
+first_definition, motivation_before_definition, example_present, nonexample_or_boundary_case_present,
+used_in_theorem_or_proof, one_use_only, status}`. Fail-closed checks (flag, do not auto-fix):
+
+```
+CONCEPT_INTRO_GATE:
+  require first_definition <= first_technical_use
+  require motivation before definition unless locally standard
+  require example or boundary case for new named machinery
+  require symbol-collision check against universal conventions (∇, ∂, ℏ, ...)
+  flag one-use symbols
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix markdown/table and fenced-code lint violations in this section.

Line 91 breaks table pipe/column rules, and Line 98 has a fenced block without language. Also, true_is_forbidden is not valid JSON and may confuse schema readers.

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 91-91: Table pipe style
Expected: leading_and_trailing; Actual: no_leading_or_trailing; Missing leading pipe

(MD055, table-pipe-style)

[warning] 91-91: Table pipe style
Expected: leading_and_trailing; Actual: no_leading_or_trailing; Missing trailing pipe

(MD055, table-pipe-style)

[warning] 91-91: Table column count
Expected: 3; Actual: 1; Too few cells, row will be missing data

(MD056, table-column-count)

[warning] 98-98: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/specs/references/publication/deslopification-gate.md` around lines 91 - 105, Fix the markup in the CONCEPT_INTRO_GATE block: correct the broken table/pipe syntax (ensure rows use consistent | column separators or remove the stray table delimiter around the `<concept_intro_gate>` section), add a language identifier to the fenced code block (e.g., ```yaml) to satisfy fenced-code linting, and replace the invalid token `true_is_forbidden` with valid JSON/YAML boolean or a quoted string (e.g., true or "true_is_forbidden") so schema readers can parse the gate definition; refer to the CONCEPT_INTRO_GATE block and the fenced code block shown in the diff when making these edits.

coderabbitai · 2026-06-04T22:06:16Z

+`ci` mode sets `blocked` if any release blocker remains: public-facing scaffolding leakage,
+placeholder/submission-time-check citations, missing/incomplete audit coverage, or unresolved
+notation/concept-order flags.
+</gate_status>


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

ci blocking semantics here exceed what the engine currently enforces.

This section says ci blocks for missing audit coverage and unresolved notation/concept-order flags, but src/gpd/core/deslopification.py:307-339 sets gate_status="blocked" only from release_blocker findings produced by tell detectors. That contract gap can let CI pass when this spec says it must fail.
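One way to close the gap on the engine side is to fold the extra spec conditions into the status calculation. The sketch below is an assumption-laden illustration: the finding keys `release_blocker` and `kind`, the kind names in `SPEC_BLOCKING_KINDS`, the `audit_coverage_complete` flag, and the `"passed"` status are all hypothetical and would need to be mapped onto whatever the tell detectors actually emit.

```python
# Hypothetical finding kinds that the spec treats as blocking in ci mode.
SPEC_BLOCKING_KINDS = {
    "unresolved_notation_flag",
    "unresolved_concept_order_flag",
}


def ci_gate_status(findings: list[dict], audit_coverage_complete: bool) -> str:
    """Return "blocked" if any condition listed in the spec's ci section holds."""
    if not audit_coverage_complete:
        return "blocked"  # missing or incomplete audit coverage
    for finding in findings:
        if finding.get("release_blocker"):
            return "blocked"  # what the engine already enforces, per the review
        if finding.get("kind") in SPEC_BLOCKING_KINDS:
            return "blocked"  # what the spec additionally promises
    return "passed"
```

The alternative is to narrow the spec text so that it lists only the release-blocker findings the engine enforces today.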

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/specs/references/publication/deslopification-gate.md` around lines 126 - 129, The spec says `ci` should block for missing audit coverage and unresolved notation/concept-order flags, but the gate_status is only set to "blocked" when a finding is marked release_blocker; update the gate-status calculation in the deslopification result assembly (the code that currently assigns gate_status="blocked" from release_blocker findings) to also treat findings representing missing audit coverage and unresolved notation/concept-order flags as blocking: either treat those finding types/severities as release_blocker-equivalent or add explicit checks that set gate_status="blocked" when any finding has kind/type/flag indicating missing_audit_coverage or notation/concept-order unresolved (reference the variables/fields release_blocker, gate_status and the detector outputs from tell detectors in the deslopification result code).

coderabbitai · 2026-06-04T22:06:16Z

+  "edit_id": "DSE-0042",
+  "location": {"file": "main.tex", "line_start": 281, "line_end": 286},
+  "original": "Disposition tag. Per PFAFFIAN-APPLICABILITY.md §7.3, the disposition is (b) CONDITIONAL HOLDS...",
+  "new": "The decidability result is conditional on Conjecture CPA. Existing Pfaffian and cellular-decomposition results provide the framework, but the mirror-octic constants have not yet been computed.",
+  "tell_addressed": "agent_scaffolding_leakage",
+  "rationale": "Removes internal project-provenance language while preserving the public conditional status.",
+  "meaning_preserving": "yes",
+  "protected_spans_changed": false,
+  "claim_ledger_changed": false
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Audit schema fields do not match the current producer output.

The schema requires location.file/line_start/line_end and claim_ledger_changed, but src/gpd/core/deslopification.py:253-276 currently writes location: {"line": ...} and no claim_ledger_changed. Also the markdown-rendered totals in src/gpd/core/deslopification.py:279-287 do not include claim_ledger_changed=0 as specified here. Please align spec and producer to one canonical shape.
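If the spec's shape is chosen as canonical, the producer-side change amounts to a small normalization step. The sketch below is a hypothetical adapter, not existing code: `to_canonical_audit_entry` is an invented name, and it assumes a single-line producer location maps to `line_start == line_end` and that the per-entry `claim_ledger_changed` defaults to `False`, as in the schema example.

```python
def to_canonical_audit_entry(entry: dict, filename: str) -> dict:
    """Convert a producer record with location {"line": n} to the spec's shape."""
    canonical = dict(entry)
    location = dict(entry.get("location", {}))  # copy so the input is not mutated
    if "line" in location:
        line = location.pop("line")
        location.setdefault("line_start", line)
        location.setdefault("line_end", line)  # assumed: single-line edit
    location.setdefault("file", filename)
    canonical["location"] = location
    canonical.setdefault("claim_ledger_changed", False)
    return canonical
```

For example, `{"edit_id": "DSE-0042", "location": {"line": 281}}` with `filename="main.tex"` gains `file`, `line_start`, `line_end`, and `claim_ledger_changed` without altering the other fields.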

Also applies to: 49-50

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/specs/templates/paper/deslop-audit-schema.md` around lines 13 - 22, The audit entry producer in src/gpd/core/deslopification.py (around the block at lines 253-276 and the markdown totals at 279-287) currently emits location: {"line": ...} and omits claim_ledger_changed; update that code path to produce the canonical shape expected by the spec: emit location as {"file": <filename>, "line_start": <start>, "line_end": <end>} and add claim_ledger_changed (integer, default 0) to each audit entry object, and also include claim_ledger_changed=0 in the markdown-rendered totals logic so the rendered totals match the schema.

coderabbitai · 2026-06-04T22:06:16Z

+  "severity": "blocker | major | minor",
+  "category": "rigor | notation | citation | physics | evidence | metadata | theorem_status",
+  "location": {"file": "references.tex", "line_start": 412, "line_end": 414},
+  "excerpt": "Submission-time check: confirm exact published title...",
+  "why_not_auto_edited": "The correct bibliographic data must be verified from a source; inventing it would violate the no-misrepresentation rule.",
+  "recommended_author_action": "Verify the final published title, venue, volume, pages, DOI/arXiv identifier, then rerun bibliography audit.",
+  "blocks_public_release": true,
+  "delegate_to": "gpd-bibliographer"
+}


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

DESLOP-FLAGS schema diverges from runtime output format.

src/gpd/core/deslopification.py:351-367 emits category as tell tags (e.g., placeholder_or_metadata), location as {line,col}, and omits recommended_author_action and delegate_to. This template defines a different required shape and taxonomy, so consumers cannot reliably parse both.
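A contract test would catch this kind of drift early. The sketch below checks a record against the fields visible in the template excerpt above. `schema_mismatches` is a hypothetical helper, the field sets cover only what this excerpt shows, and the sample record in the usage note is invented to mirror the runtime shape described in this comment.

```python
# Fields visible in the template excerpt; the full template may define more.
TEMPLATE_FIELDS = {
    "severity",
    "category",
    "location",
    "excerpt",
    "why_not_auto_edited",
    "recommended_author_action",
    "blocks_public_release",
    "delegate_to",
}
TEMPLATE_LOCATION_KEYS = {"file", "line_start", "line_end"}


def schema_mismatches(record: dict) -> list[str]:
    """List the ways a flag record departs from the template excerpt."""
    problems = [f"missing field: {name}" for name in sorted(TEMPLATE_FIELDS - set(record))]
    location = record.get("location") or {}
    missing_keys = sorted(TEMPLATE_LOCATION_KEYS - set(location))
    if missing_keys:
        problems.append("location missing keys: " + ", ".join(missing_keys))
    return problems
```

Running it on a record with `location: {"line": 412, "col": 1}` and no `recommended_author_action` or `delegate_to` reports both missing fields and the three missing location keys.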

Also applies to: 26-35

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/specs/templates/paper/deslop-flags-schema.md` around lines 15 - 23, The DESLOP-FLAGS template schema must match the runtime emission from src/gpd/core/deslopification.py (around the code that emits DESLOP-FLAGS): update this template to use the same "category" taxonomy (tell-tag values like "placeholder_or_metadata"), change "location" to the runtime shape {line:int, col:int} (instead of file/line_start/line_end), and remove or make optional the "recommended_author_action" and "delegate_to" fields that the runtime omits so consumers can parse consistently; ensure field names and types exactly mirror what deslopification.py emits.

coderabbitai · 2026-06-04T22:06:16Z

+3. If `release_blocker_count > 0` (`gate_status: blocked`): do NOT proceed to peer
+   review.
+   - `autonomy=yolo`: record blockers in `CRITIQUE-LOG.md` and `gpd_return.issues`;
+     recommend the author actions in `DESLOP-FLAGS.md` (resolve placeholder citations,
+     scaffolding leakage, notation order).
+   - `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location,
+     why, recommended action) and ask whether to (1) resolve now via the delegated
+     owner, (2) accept as a known limitation, or (3) hold the manuscript. Re-run this
+     gate after resolution.


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve the blocker-decision contradiction in supervised/balanced mode.

Line 117 says blocked manuscripts must not proceed, but Line 123 includes “accept as a known limitation,” which conflicts with fail-closed semantics and with Line 125 (“Re-run this gate after resolution”). Remove or redefine option (2) so every allowed path is consistent with blocker resolution before proceeding.
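The fail-closed rule can be stated as a small decision function in which no user choice leads to peer review while blockers remain. This is a sketch of the intended semantics only: `BlockerDecision`, `next_step`, and the returned step names are hypothetical and do not appear in the workflow spec.

```python
from enum import Enum
from typing import Optional


class BlockerDecision(Enum):
    """Choices offered in supervised/balanced mode after the suggested tightening."""

    RESOLVE_NOW = "resolve_now"  # resolve via the delegated owner
    HOLD = "hold"                # hold the manuscript


def next_step(release_blocker_count: int, decision: Optional[BlockerDecision] = None) -> str:
    """Return the next workflow step; blocked manuscripts never reach peer review."""
    if release_blocker_count == 0:
        return "peer_review"
    if decision is BlockerDecision.RESOLVE_NOW:
        return "resolve_then_rerun_gate"
    return "hold"  # default when blocked, including when no decision was given
```

Because the enum has no "accept as a known limitation" member, the contradiction flagged above cannot be expressed in this model.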

Suggested spec-tightening diff

-   - `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location,
-     why, recommended action) and ask whether to (1) resolve now via the delegated
-     owner, (2) accept as a known limitation, or (3) hold the manuscript. Re-run this
-     gate after resolution.
+   - `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location,
+     why, recommended action) and ask whether to (1) resolve now via the delegated
+     owner, or (2) hold the manuscript. Re-run this gate after resolution.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/specs/workflows/write-paper/publication-review-finalization.md` around lines 117 - 125, The spec currently contradicts itself for supervised|balanced modes: when release_blocker_count > 0 (gate_status: blocked) the gate must fail-closed and not proceed, but the second option "accept as a known limitation" allows proceeding; update the supervised|balanced branch (the block describing autonomy=supervised|balanced and the listed options) to remove or redefine option (2) so it enforces blocker resolution before proceeding—either delete the "accept as a known limitation" choice or change it to a non-proceeding state (e.g., "record as known limitation and put manuscript on hold") and ensure the text explicitly requires re-running this gate after resolution; reference release_blocker_count, gate_status, autonomy=supervised|balanced, DESLOP-FLAGS.md and CRITIQUE-LOG.md when making the change.

coderabbitai Bot reviewed Jun 4, 2026


		Pass D; Emit the artifacts (non-optional): `DESLOP-AUDIT.jsonl` + `DESLOP-AUDIT.md` (one record per edit, by line, with `meaning_preserving: yes`), `DESLOP-FLAGS.md`, and `DESLOP-SUMMARY.json`. In `apply` mode also write the edited `.tex`; in `audit` mode do not touch the manuscript.
		</method>

		gpd:deslop-paper GPD/publication/<slug>/manuscript --mode apply --strict
		gpd:deslop-paper <manuscript> --mode ci # fail if release-blocking slop remains
