Add a deslopification gate for finalized manuscripts #256
A meaning-preserving pass that strips public-facing AI and agent writing tells from a manuscript while freezing all scientific content, emits a by-line audit, and flags substantive issues instead of rewriting them. Deterministic core (core/deslopification.py): protected-span extraction, located tell detection, and an invariant checker that rejects any edit which changes a math span, citation key, in-math number, or theorem status. The gpd-discipline-editor agent does the nuanced rewrites, each gated by the same checker. Adds gpd deslop scan/check, gpd validate deslop-invariants, and the gpd:deslop-paper command; wires the gate into write-paper finalization and the arxiv-submission blocking conditions. 11 tests; existing CLI tests still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 Walkthrough
This PR introduces a complete deslopification system that removes public-facing AI/agent artifacts from academic manuscripts while preserving scientific content through frozen invariants. The implementation includes a deterministic engine, detailed specifications, command orchestration, CLI integration, and comprehensive test coverage.
Changes: Deslopification Gate Feature
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
🤖 RoastBot: You wrote a feature to detect slop... in an AI tool... using an AI. The ouroboros of self-loathing is complete.
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/gpd/agents/gpd-discipline-editor.md`:
- Around line 64-65: The documentation requires DESLOP-AUDIT.jsonl and
DESLOP-AUDIT.md to be emitted unconditionally, but the current implementation in
src/gpd/core/deslopification.py only writes them inside the apply-mode block
(lines ~321-339) after invariant checks; reconcile by extracting the
audit-emission logic into a shared routine (e.g., emit_deslop_audit or
write_deslop_audit_jsonl/md) and call it for both apply and audit modes (ensure
it sets meaning_preserving: yes and writes one record per edit), while keeping
the edited .tex write only in apply() and preserving the invariants behavior (or
log invariant failures while still emitting the audit artifacts in audit mode if
that matches the contract).
In `@src/gpd/cli.py`:
- Around line 9552-9566: The deslop_check handler currently coerces unreadable
files to empty strings and derives is_tex from the raw argument, which can allow
missing inputs to slip past checks; instead, resolve each path with
_resolve_path_from_effective_cwd, call safe_read_file and if it returns None
treat that as a hard error (emit a clear message via typer.echo/process logger
and exit with a non-zero code) rather than substituting "", then pass the actual
file contents to check_invariants; also determine is_tex from the resolved
path's suffix in a case-insensitive way (e.g., resolved_path.suffix.lower() ==
".tex") before calling check_invariants.
In `@src/gpd/commands/deslop-paper.md`:
- Around line 33-34: The docs claim gpd:deslop-paper --strict fails on any
unresolved FLAG but the code only checks gate_status/release_blocker_count; fix
by wiring --strict end-to-end: add a --strict boolean to the gpd:deslop-paper
CLI handler and pass it into the deslopification scan flow (propagate through
the function/method that performs deslopification), compute
unresolved_flags_count (count of unresolved non-blocker flags) during the scan
alongside release_blocker_count and gate_status, and update the final exit logic
to return non-zero when --strict is true and unresolved_flags_count > 0 (while
preserving existing behavior when --strict is false). Alternatively, if you
prefer not to change behavior, update the doc snippets to remove/clarify
--strict and state that only release_blocker_count/gate_status cause non-zero
exit; make changes in the gpd:deslop-paper docs and any workflow wiring that
currently gates on release_blocker_count to reflect the chosen semantics.
In `@src/gpd/core/deslopification.py`:
- Around line 78-83: The invariant checker misses TeX display math delimited by
$$...$$ because extract_protected_spans never records that pattern; add a regex
for dollar-delimited display math (e.g., a new _DOLLAR_DISPLAY_MATH =
re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)) and ensure
extract_protected_spans treats _DOLLAR_DISPLAY_MATH the same as
_DISPLAY_MATH/_ENV_MATH/_INLINE_MATH so those spans are recorded and excluded
from the change comparison; update any other places referencing _DISPLAY_MATH
(and the similar patterns around lines where _PAREN_MATH/_DISPLAY_MATH/_ENV_MATH
are defined) so the new symbol is included in the protected-span extraction
logic.
- Around line 307-310: The code currently treats read failures as empty
manuscripts by using `_safe_read(path) or ""`; change this to fail hard: call
`_safe_read(path)`, check if the result is None (or an explicit error sentinel)
and if so raise an appropriate exception (e.g., FileNotFoundError or IOError)
with the path and underlying error info instead of substituting ""; apply the
same fix to the other occurrence that uses `_safe_read` (the similar block
around lines 385-388) so both `scan_manuscript` and the duplicate reader call
stop returning false-clean results for unreadable files.
- Around line 32-38: The current broad "except Exception" around importing
_CITATION_RE, _PLACEHOLDER_RE, and _strip_latex_comments from
gpd.core.arxiv_package (and the similar block at lines ~58-60) masks runtime
errors inside those modules; change the handlers to only catch
ImportError/ModuleNotFoundError (e.g., "except ImportError:" or "except
(ImportError, ModuleNotFoundError) as e:") and let any other Exception propagate
(or re-raise) so in-repo failures fail fast instead of silently falling back to
demo replacements.
In `@src/gpd/specs/references/publication/deslopification-gate.md`:
- Around line 91-105: Fix the markup in the CONCEPT_INTRO_GATE block: correct
the broken table/pipe syntax (ensure rows use consistent | column separators or
remove the stray table delimiter around the `<concept_intro_gate>` section), add
a language identifier to the fenced code block (e.g., ```yaml) to satisfy
fenced-code linting, and replace the invalid token `true_is_forbidden` with
valid JSON/YAML boolean or a quoted string (e.g., true or "true_is_forbidden")
so schema readers can parse the gate definition; refer to the CONCEPT_INTRO_GATE
block and the fenced code block shown in the diff when making these edits.
- Around line 126-129: The spec says `ci` should block for missing audit
coverage and unresolved notation/concept-order flags, but the gate_status is
only set to "blocked" when a finding is marked release_blocker; update the
gate-status calculation in the deslopification result assembly (the code that
currently assigns gate_status="blocked" from release_blocker findings) to also
treat findings representing missing audit coverage and unresolved
notation/concept-order flags as blocking: either treat those finding
types/severities as release_blocker-equivalent or add explicit checks that set
gate_status="blocked" when any finding has kind/type/flag indicating
missing_audit_coverage or notation/concept-order unresolved (reference the
variables/fields release_blocker, gate_status and the detector outputs from tell
detectors in the deslopification result code).
In `@src/gpd/specs/templates/paper/deslop-audit-schema.md`:
- Around line 13-22: The audit entry producer in src/gpd/core/deslopification.py
(around the block at lines 253-276 and the markdown totals at 279-287) currently
emits location: {"line": ...} and omits claim_ledger_changed; update that code
path to produce the canonical shape expected by the spec: emit location as
{"file": <filename>, "line_start": <start>, "line_end": <end>} and add
claim_ledger_changed (integer, default 0) to each audit entry object, and also
include claim_ledger_changed=0 in the markdown-rendered totals logic so the
rendered totals match the schema.
In `@src/gpd/specs/templates/paper/deslop-flags-schema.md`:
- Around line 15-23: The DESLOP-FLAGS template schema must match the runtime
emission from src/gpd/core/deslopification.py (around the code that emits
DESLOP-FLAGS): update this template to use the same "category" taxonomy
(tell-tag values like "placeholder_or_metadata"), change "location" to the
runtime shape {line:int, col:int} (instead of file/line_start/line_end), and
remove or make optional the "recommended_author_action" and "delegate_to" fields
that the runtime omits so consumers can parse consistently; ensure field names
and types exactly mirror what deslopification.py emits.
In `@src/gpd/specs/workflows/write-paper/publication-review-finalization.md`:
- Around line 117-125: The spec currently contradicts itself for
supervised|balanced modes: when release_blocker_count > 0 (gate_status: blocked)
the gate must fail-closed and not proceed, but the second option "accept as a
known limitation" allows proceeding; update the supervised|balanced branch (the
block describing autonomy=supervised|balanced and the listed options) to remove
or redefine option (2) so it enforces blocker resolution before
proceeding—either delete the "accept as a known limitation" choice or change it
to a non-proceeding state (e.g., "record as known limitation and put manuscript
on hold") and ensure the text explicitly requires re-running this gate after
resolution; reference release_blocker_count, gate_status,
autonomy=supervised|balanced, DESLOP-FLAGS.md and CRITIQUE-LOG.md when making
the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 01cacca5-0aa3-4885-a092-6eaf27968ea1
📒 Files selected for processing (11)
- src/gpd/agents/gpd-discipline-editor.md
- src/gpd/cli.py
- src/gpd/commands/arxiv-submission.md
- src/gpd/commands/deslop-paper.md
- src/gpd/core/deslopification.py
- src/gpd/specs/references/publication/deslopification-gate.md
- src/gpd/specs/templates/paper/deslop-audit-schema.md
- src/gpd/specs/templates/paper/deslop-flags-schema.md
- src/gpd/specs/workflows/write-paper/publication-review-finalization.md
- tests/core/test_deslopification.py
- tests/test_deslop_cli_commands.py
Pass D; Emit the artifacts (non-optional): `DESLOP-AUDIT.jsonl` + `DESLOP-AUDIT.md` (one record per edit, by line, with `meaning_preserving: yes`), `DESLOP-FLAGS.md`, and `DESLOP-SUMMARY.json`. In `apply` mode also write the edited `.tex`; in `audit` mode do not touch the manuscript.
</method>
“Non-optional artifacts” requirement conflicts with current apply-only audit emission.
This says DESLOP-AUDIT.jsonl and .md are always emitted, but src/gpd/core/deslopification.py:321-339 writes those only in apply mode (and only if invariants pass). Please reconcile this requirement so the agent contract matches executable behavior.
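One way to reconcile the contract with the code is a shared emitter that both modes call. The sketch below is illustrative only: `emit_deslop_audit`, `run_deslop`, and the record fields `line` and `tell` are assumptions, not names taken from `deslopification.py`, and the invariant check that precedes writing in the real engine is omitted.

```python
import json
from pathlib import Path


def emit_deslop_audit(edits: list[dict], out_dir: Path) -> tuple[Path, Path]:
    """Write DESLOP-AUDIT.jsonl and DESLOP-AUDIT.md (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    jsonl_path = out_dir / "DESLOP-AUDIT.jsonl"
    md_path = out_dir / "DESLOP-AUDIT.md"

    # One JSON record per edit, each marked meaning-preserving.
    records = [{**edit, "meaning_preserving": "yes"} for edit in edits]
    jsonl_path.write_text(
        "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records),
        encoding="utf-8",
    )

    # Human-readable companion: one bullet per edit, ordered by line.
    lines = ["# Deslop audit", ""]
    for r in sorted(records, key=lambda r: r.get("line", 0)):
        lines.append(f"- line {r.get('line', '?')}: {r.get('tell', 'unknown')}")
    md_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return jsonl_path, md_path


def run_deslop(mode: str, edits: list[dict], out_dir: Path,
               manuscript: Path, edited_text: str) -> None:
    # Audit artifacts are emitted in both modes.
    emit_deslop_audit(edits, out_dir)
    # Only apply mode touches the manuscript.
    if mode == "apply":
        manuscript.write_text(edited_text, encoding="utf-8")
```

Keeping the manuscript write outside the emitter means `audit` mode cannot modify the `.tex` by accident.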
@deslop_app.command("check")
def deslop_check(
    before: str = typer.Argument(..., help="Path to the pre-edit text"),
    after: str = typer.Argument(..., help="Path to the post-edit text"),
) -> None:
    """Prove an edit changed no protected span (math, citations, numbers, theorem status)."""
    from gpd.core.deslopification import check_invariants
    from gpd.core.utils import safe_read_file

    b = safe_read_file(_resolve_path_from_effective_cwd(before)) or ""
    a = safe_read_file(_resolve_path_from_effective_cwd(after)) or ""
    report = check_invariants(b, a, is_tex=before.endswith(".tex"))
    _output(report)
    if not report["passed"]:
        raise typer.Exit(code=2)
Fail closed on unreadable inputs in deslop_check (don’t coerce to empty text).
Using safe_read_file(... ) or "" can let missing/unreadable files be treated as empty content, which can incorrectly pass invariant checks and bypass the gate. Also, derive is_tex from the resolved path suffix (case-insensitive) instead of raw string matching.
Suggested fix
 @deslop_app.command("check")
 def deslop_check(
@@
 ) -> None:
@@
-    from gpd.core.deslopification import check_invariants
-    from gpd.core.utils import safe_read_file
-
-    b = safe_read_file(_resolve_path_from_effective_cwd(before)) or ""
-    a = safe_read_file(_resolve_path_from_effective_cwd(after)) or ""
-    report = check_invariants(b, a, is_tex=before.endswith(".tex"))
+    from gpd.core.deslopification import check_invariants
+
+    before_path, b = _load_text_document_or_error(before)
+    _after_path, a = _load_text_document_or_error(after)
+    report = check_invariants(b, a, is_tex=before_path.suffix.lower() == ".tex")
     _output(report)
     if not report["passed"]:
         raise typer.Exit(code=2)
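The fail-closed behavior can be illustrated on its own, outside the CLI. This is a minimal sketch under stated assumptions: `load_text_or_fail`, `is_tex`, and `UnreadableInputError` are hypothetical names, and the repository's own helpers may resolve paths differently.

```python
from pathlib import Path


class UnreadableInputError(Exception):
    """Raised when a gate input cannot be read (hypothetical name)."""


def load_text_or_fail(raw_path: str) -> tuple[Path, str]:
    """Return the resolved path and its text, or raise; never substitute ''."""
    path = Path(raw_path).expanduser().resolve()
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        raise UnreadableInputError(f"cannot read {path}: {exc}") from exc
    return path, text


def is_tex(path: Path) -> bool:
    # Decide from the resolved path's suffix, case-insensitively.
    return path.suffix.lower() == ".tex"
```

A caller would catch `UnreadableInputError`, print the message, and exit non-zero, so a missing file can never be compared as an empty document.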
gpd:deslop-paper GPD/publication/<slug>/manuscript --mode apply --strict
gpd:deslop-paper <manuscript> --mode ci # fail if release-blocking slop remains
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Locate deslop-paper command/spec wiring:"
rg -n -C3 'deslop-paper|--strict|strict' src tests
echo
echo "Locate CLI options for deslop-related commands:"
rg -n -C3 '@.*command\(|def .*deslop|--mode|--strict|no-write' src/gpd/cli.py
echo
echo "Locate strict-mode assertions in tests:"
rg -n -C3 'deslop.*strict|--strict|gate_status|blocked|exit code|non-zero' tests
Repository: psi-oss/get-physics-done
Length of output: 50382
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Find all occurrences of the command name `deslop-paper`:"
rg -n "deslop-paper" src tests
echo
echo "Find all occurrences of `deslop` in the CLI entrypoint:"
rg -n "deslop" src/gpd/cli.py
echo
echo "Search CLI option declarations for `deslop`-related commands (narrow patterns):"
rg -n --context 3 "(deslop.*command|command.*deslop|def .*deslop|--mode|--no-write|--strict)" src/gpd/cli.py
echo
echo "Search for strict parsing/enforcement around deslop in the CLI tree:"
rg -n --context 3 "(deslop.*strict|strict.*deslop|--strict.*deslop|deslop.*--strict)" src/gpd
Repository: psi-oss/get-physics-done
Length of output: 6323
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Locate core deslopification implementation for scan/check:"
fd -t f "deslopification*" src/gpd/core || true
rg -n "def (scan_manuscript|check_invariants)\b|scan_manuscript\(" -S src/gpd/core
echo
echo "Find strict handling inside deslopification core:"
rg -n "strict" -S src/gpd/core/deslopification.py src/gpd/core || true
echo
echo "Inspect deslop-paper command spec template/wiring that runs `gpd --raw init deslop-paper`:"
sed -n '1,140p' src/gpd/commands/deslop-paper.md
Repository: psi-oss/get-physics-done
Length of output: 50441
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Inspect deslop-paper spec ==="
sed -n '1,140p' src/gpd/commands/deslop-paper.md
echo
sed -n '140,260p' src/gpd/commands/deslop-paper.md || true
echo
echo "=== Inspect deslopification core ==="
sed -n '1,220p' src/gpd/core/deslopification.py
echo
sed -n '220,520p' src/gpd/core/deslopification.py
echo
echo "=== Search for strict handling inside core/deslopification.py ==="
rg -n "strict" src/gpd/core/deslopification.py
echo
echo "=== Trace how `init deslop-paper` is dispatched/parsed ==="
rg -n "init deslop-paper|deslop-paper" src/gpd | head -n 200
rg -n "deslopification|scan_manuscript|check_invariants" src/gpd | head -n 200
echo
echo "=== Search for scan_manuscript callers with strict kwarg ==="
rg -n "scan_manuscript\\(.*strict|check_invariants\\(.*strict|deslop.*strict" src/gpd/core | head -n 200
Repository: psi-oss/get-physics-done
Length of output: 22946
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Find strict usage in deslop-paper spec =="
rg -n "strict" src/gpd/commands/deslop-paper.md
echo
echo "== Show remaining parts of deslop-paper spec (possible exit logic) =="
sed -n '260,520p' src/gpd/commands/deslop-paper.md
echo
echo "== Search for deslop-paper strict behavior in code/tests =="
rg -n --no-heading "(deslop-paper|DESLOP-SUMMARY|DESLOP-FLAGS|gate_status: blocked|gate_status).*strict" src tests -S || true
echo
echo "== Search for any tests covering deslop-paper/strict =="
rg -n --no-heading "deslop-paper" tests -S || true
echo
echo "== Search for deslop-paper command handler / init dispatch =="
rg -n --no-heading "init deslop-paper|deslop-paper" src/gpd -S | head -n 200
Repository: psi-oss/get-physics-done
Length of output: 2345
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Find where gpd:deslop-paper is registered/mentioned in code =="
rg -n "gpd:deslop-paper|deslop-paper" src/gpd -S
echo
echo "== Find init command implementation (the `gpd --raw init deslop-paper` path) =="
rg -n "raw.*init|@.*init\\b|command\\(\"init\"\\)|def .*init\\b" src/gpd/cli.py src/gpd/core -S
echo
echo "== Check for a generic --strict option in CLI command initialization =="
rg -n "--strict" src/gpd/cli.py src/gpd/core/ -S
echo
echo "== Trace how command specs frontmatter like argument-hint/display_signature are turned into CLI params =="
rg -n "argument-hint|display_signature|registry.*command|parse.*command.*spec|allowed-tools" src/gpd/core src/gpd/registry* src/gpd/ -S | head -n 200
Repository: psi-oss/get-physics-done
Length of output: 4367
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Where does CLI define --strict? (top-level and subcommands) =="
rg -n --context 2 "--strict" src/gpd/cli.py | head -n 200
echo
echo "== Inspect gpd --raw init implementation in src/gpd/cli.py =="
sed -n '3680,3860p' src/gpd/cli.py
echo
echo "== Search init payload schema for a 'strict' field =="
rg -n --context 2 "raw.*init|init\(" src/gpd/core src/gpd/cli.py | head -n 200
rg -n --context 2 "\"strict\"|strict:" src/gpd/core | head -n 200
echo
echo "== Search for how command-spec args are parsed/passed to init execution =="
rg -n --context 2 "display_signature|argument-hint|command spec|frontmatter|allowed-tools|parse.*frontmatter" src/gpd/core src/gpd/cli.py | head -n 200
echo
echo "== Find how command name 'deslop-paper' is loaded into init execution =="
rg -n --context 2 "load.*command|registry.*command|commands/.*\.md|deslop-paper" src/gpd/core src/gpd/cli.py | head -n 200
Repository: psi-oss/get-physics-done
Length of output: 563
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Find where DESLOP-SUMMARY.json is read/parsed =="
rg -n --context 3 "DESLOP-SUMMARY\.json|DESLOP-FLAGS\.md|DESLOP-AUDIT\.md|DESLOP-AUDIT\.jsonl" src/gpd -S
echo
echo "== Find any strict/flags exit logic for deslopification artifacts =="
rg -n --context 3 "(deslop|DESLOP).*(strict|flag|blocker|gate_status)" src/gpd -S
echo
echo "== Find any generic exit condition logic keyed by gate_status for deslopification =="
rg -n --context 3 "gate_status.*(strict|flag|blocker)|release_blocker_count" src/gpd -S
Repository: psi-oss/get-physics-done
Length of output: 34070
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for literal CLI option '--strict' in gpd CLI implementation =="
rg -n --context 3 -- "--strict" src/gpd/cli.py
echo
echo "== Search for strict handling specifically tied to deslop-paper/deslopification in src/gpd =="
rg -n --context 3 "(deslop-paper|deslopification).*strict|strict.*(deslop-paper|deslopification)" src/gpd -S
echo
echo "== Search for 'strict' in the deslopification gate/workflow templates =="
rg -n --context 3 "deslopification-gate|DESLOP|deslop scan|deslop-paper|gate_status" src/gpd/specs/workflows src/gpd/commands/deslop-paper.md -S
Repository: psi-oss/get-physics-done
Length of output: 37731
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Exit/strict logic around gate_status & release_blocker_count (deslop) =="
rg -n --context 3 "(deslop|DESLOP|deslopification).*?(gate_status|release_blocker_count|flag_count)" src/gpd -S
echo
echo "== Search for strict gating keyed by 'flag'/'blocker' in general workflow runner =="
rg -n --context 3 "strict.*(gate_status|release_blocker_count|flag_count|FLAG|blocker)" src/gpd -S
echo
echo "== Search for generic command-spec 'strict' parameter propagation into init/run =="
rg -n --context 3 "(raw init|init\\s+|command spec|frontmatter|display_signature|argument-hint).*strict" src/gpd -S
echo
echo "== Search for deslop-paper used as an authority/command spec in registry =="
rg -n --context 3 "deslop-paper" src/gpd/core src/gpd/cli.py -S
Repository: psi-oss/get-physics-done
Length of output: 9401
Fix misleading --strict documentation/behavior for gpd:deslop-paper
src/gpd/commands/deslop-paper.md claims --strict makes the command fail when any unresolved FLAG remains (non-zero exit), but the deslopification implementation only returns non-zero when gate_status is blocked (i.e., release blockers), with no strict input anywhere in the deslopification scan flow. The workflow wiring likewise gates on release_blocker_count, not unresolved non-blocker flags—so the --strict semantics described in the spec/snippets at lines 33-34, 43-44, and 76-77 are not actually enforced. Update the snippets/spec to match reality or wire --strict end-to-end (init → scan/check → exit logic).
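If `--strict` is wired through rather than removed from the docs, the exit decision could look roughly like this. It is a sketch, not the project's logic: `deslop_exit_code` is a hypothetical function, `unresolved_flags_count` is a proposed field that the current summary does not carry, and the specific exit codes are assumptions.

```python
def deslop_exit_code(summary: dict, strict: bool = False) -> int:
    """Map a scan summary to a process exit code (illustrative only)."""
    blocked = (
        summary.get("gate_status") == "blocked"
        or summary.get("release_blocker_count", 0) > 0
    )
    if blocked:
        # Release blockers fail the gate regardless of --strict.
        return 2
    if strict and summary.get("unresolved_flags_count", 0) > 0:
        # Under --strict, any unresolved non-blocker flag also fails.
        return 1
    return 0
```

Without `strict`, the behavior matches what the review describes as the current state: only release blockers produce a non-zero exit.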
try:
    from gpd.core.arxiv_package import (  # type: ignore
        _CITATION_RE as _CITATION_RE,
        _PLACEHOLDER_RE as _PLACEHOLDER_RE,
        _strip_latex_comments as _strip_latex_comments,
    )
except Exception:  # standalone / demo path
Fail fast on internal import bugs instead of silently using demo fallbacks.
These except Exception blocks also catch runtime errors raised inside gpd.core.arxiv_package / gpd.core.utils, so a broken in-repo dependency quietly downgrades the gate to the local fallback regexes and file helpers. That can change what gets protected or written without any visible failure.
Suggested fix
-try:
+try:
     from gpd.core.arxiv_package import (  # type: ignore
         _CITATION_RE as _CITATION_RE,
         _PLACEHOLDER_RE as _PLACEHOLDER_RE,
         _strip_latex_comments as _strip_latex_comments,
     )
-except Exception:  # standalone / demo path
+except ImportError:  # standalone / demo path

-try:
+try:
     from gpd.core.utils import atomic_write as _atomic_write, safe_read_file as _safe_read  # type: ignore
-except Exception:
+except ImportError:

Also applies to: 58-60
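The difference between the two handlers is easiest to see with a stand-in for the import. The sketch below uses hypothetical callables in place of the real `gpd.core` imports: a missing module falls back quietly, while a bug raised during import propagates.

```python
def load_with_fallback(importer, fallback):
    """Call importer(); fall back only when the import itself is unavailable."""
    try:
        return importer()
    except ImportError:  # also covers ModuleNotFoundError, its subclass
        return fallback


def missing_dependency():
    # Stand-in for running without the package installed.
    raise ModuleNotFoundError("No module named 'gpd'")


def broken_dependency():
    # Stand-in for a runtime bug raised while the module is being imported.
    raise RuntimeError("bug inside the imported module")
```

With `except Exception`, the second case would also return the fallback, which is the silent downgrade the comment warns about.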
_INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)
_PAREN_MATH = re.compile(r"\\\((.+?)\\\)", re.DOTALL)
_DISPLAY_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)
_ENV_MATH = re.compile(
    r"\\begin\{(equation\*?|align\*?|gather\*?|multline\*?|eqnarray\*?)\}(.+?)\\end\{\1\}", re.DOTALL
)
$$...$$ display math is currently invisible to the invariant checker.
extract_protected_spans never records raw TeX display math, so a change like $$x=1$$ → $$x=2$$ slips through as passed=True. That leaves a direct hole in the “math never changes” guarantee.
Suggested fix
+_DOUBLE_DOLLAR_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)
 _INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)
 _PAREN_MATH = re.compile(r"\\\((.+?)\\\)", re.DOTALL)
 _DISPLAY_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)

-    for rx in (_INLINE_MATH, _PAREN_MATH, _DISPLAY_MATH):
+    for rx in (_DOUBLE_DOLLAR_MATH, _INLINE_MATH, _PAREN_MATH, _DISPLAY_MATH):
         math.update(_norm(m.group(1)) for m in rx.finditer(body))

Also applies to: 104-106
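The proposed pattern can be exercised in isolation to see what it captures. This sketch reuses only the regex from the suggestion; `normalize`, `display_math_spans`, and `display_math_unchanged` are hypothetical stand-ins, and the real `_norm` helper may normalize differently.

```python
import re

# Regex taken from the suggested fix: unescaped $$ ... $$, spanning newlines.
DOUBLE_DOLLAR_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)


def normalize(span: str) -> str:
    # Assumed normalization: collapse runs of whitespace.
    return " ".join(span.split())


def display_math_spans(text: str) -> set[str]:
    """Collect the normalized contents of every $$...$$ block."""
    return {normalize(m.group(1)) for m in DOUBLE_DOLLAR_MATH.finditer(text)}


def display_math_unchanged(before: str, after: str) -> bool:
    # Prose around the math may change; the set of math spans may not.
    return display_math_spans(before) == display_math_spans(after)
```

Because the comparison is on sets, it ignores order and repeated spans; a stricter checker could compare ordered lists instead.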
</math_physics_tell_catalogue>

<concept_intro_gate>
For every nonstandard term/symbol, the notation ledger records `{object, kind, first_use,
first_definition, motivation_before_definition, example_present, nonexample_or_boundary_case_present,
used_in_theorem_or_proof, one_use_only, status}`. Fail-closed checks (flag, do not auto-fix):

```
CONCEPT_INTRO_GATE:
  require first_definition <= first_technical_use
  require motivation before definition unless locally standard
  require example or boundary case for new named machinery
  require symbol-collision check against universal conventions (∇, ∂, ℏ, ...)
  flag one-use symbols
```
Fix markdown/table and fenced-code lint violations in this section.
Line 91 breaks table pipe/column rules, and Line 98 has a fenced block without language. Also, true_is_forbidden is not valid JSON and may confuse schema readers.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 91-91: Table pipe style
Expected: leading_and_trailing; Actual: no_leading_or_trailing; Missing leading pipe
(MD055, table-pipe-style)
[warning] 91-91: Table pipe style
Expected: leading_and_trailing; Actual: no_leading_or_trailing; Missing trailing pipe
(MD055, table-pipe-style)
[warning] 91-91: Table column count
Expected: 3; Actual: 1; Too few cells, row will be missing data
(MD056, table-column-count)
[warning] 98-98: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
`ci` mode sets `blocked` if any release blocker remains: public-facing scaffolding leakage,
placeholder/submission-time-check citations, missing/incomplete audit coverage, or unresolved
notation/concept-order flags.
</gate_status>
ci blocking semantics here exceed what the engine currently enforces.
This section says ci blocks for missing audit coverage and unresolved notation/concept-order flags, but src/gpd/core/deslopification.py:307-339 sets gate_status="blocked" only from release_blocker findings produced by tell detectors. That contract gap can let CI pass when this spec says it must fail.
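Closing the gap on the engine side could mean widening what counts as blocking when the status is computed. The following is a sketch under assumptions: the category names, the `category` field, and the non-blocked status value `"pass"` are illustrative, not taken from `deslopification.py`.

```python
# Hypothetical categories that the spec lists as blocking in ci mode.
BLOCKING_CATEGORIES = frozenset({
    "missing_audit_coverage",
    "notation_unresolved",
    "concept_order_unresolved",
})


def compute_gate_status(findings: list[dict]) -> dict:
    """Return gate_status and release_blocker_count for a list of findings."""
    blockers = [
        f for f in findings
        if f.get("release_blocker") or f.get("category") in BLOCKING_CATEGORIES
    ]
    return {
        "gate_status": "blocked" if blockers else "pass",
        "release_blocker_count": len(blockers),
    }
```

The alternative is to narrow the spec so that `ci` promises only what the tell detectors already mark as `release_blocker`.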
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/gpd/specs/references/publication/deslopification-gate.md` around lines
126 - 129, The spec says `ci` should block for missing audit coverage and
unresolved notation/concept-order flags, but the gate_status is only set to
"blocked" when a finding is marked release_blocker; update the gate-status
calculation in the deslopification result assembly (the code that currently
assigns gate_status="blocked" from release_blocker findings) to also treat
findings representing missing audit coverage and unresolved
notation/concept-order flags as blocking: either treat those finding
types/severities as release_blocker-equivalent or add explicit checks that set
gate_status="blocked" when any finding has kind/type/flag indicating
missing_audit_coverage or notation/concept-order unresolved (reference the
variables/fields release_blocker, gate_status and the detector outputs from tell
detectors in the deslopification result code).
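A minimal sketch of the broadened gate-status rule, assuming findings carry a `kind` string and a `release_blocker` flag. The `Finding` class, the kind names in `BLOCKING_KINDS`, and the non-blocked status value `"pass"` are hypothetical and are not taken from `deslopification.py`.

```python
from dataclasses import dataclass

# Hypothetical kind names for the extra categories the spec says `ci` must block on.
BLOCKING_KINDS = {
    "missing_audit_coverage",
    "notation_unresolved",
    "concept_order_unresolved",
}


@dataclass(frozen=True)
class Finding:
    kind: str
    release_blocker: bool = False


def compute_gate_status(findings, mode="ci"):
    """Return "blocked" if any finding blocks release under the given mode."""
    for finding in findings:
        # Explicit release blockers block in every mode.
        if finding.release_blocker:
            return "blocked"
        # In ci mode, the spec's additional categories block as well.
        if mode == "ci" and finding.kind in BLOCKING_KINDS:
            return "blocked"
    return "pass"  # assumed name for the non-blocked status
```

Keeping the extra kinds in one set makes the spec-to-engine mapping easy to audit; the alternative is to have the detectors mark those findings as `release_blocker` at the source.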
| "edit_id": "DSE-0042", | ||
| "location": {"file": "main.tex", "line_start": 281, "line_end": 286}, | ||
| "original": "Disposition tag. Per PFAFFIAN-APPLICABILITY.md §7.3, the disposition is (b) CONDITIONAL HOLDS...", | ||
| "new": "The decidability result is conditional on Conjecture CPA. Existing Pfaffian and cellular-decomposition results provide the framework, but the mirror-octic constants have not yet been computed.", | ||
| "tell_addressed": "agent_scaffolding_leakage", | ||
| "rationale": "Removes internal project-provenance language while preserving the public conditional status.", | ||
| "meaning_preserving": "yes", | ||
| "protected_spans_changed": false, | ||
| "claim_ledger_changed": false | ||
| } |
Audit schema fields do not match the current producer output.
The schema requires `location.file`/`line_start`/`line_end` and `claim_ledger_changed`, but `src/gpd/core/deslopification.py:253-276` currently writes `location: {"line": ...}` and no `claim_ledger_changed`. Also, the markdown-rendered totals in `src/gpd/core/deslopification.py:279-287` do not include `claim_ledger_changed=0` as specified here. Please align spec and producer to one canonical shape.
Also applies to: 49-50
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/gpd/specs/templates/paper/deslop-audit-schema.md` around lines 13 - 22,
The audit entry producer in src/gpd/core/deslopification.py (around the block at
lines 253-276 and the markdown totals at 279-287) currently emits location:
{"line": ...} and omits claim_ledger_changed; update that code path to produce
the canonical shape expected by the spec: emit location as {"file": <filename>,
"line_start": <start>, "line_end": <end>} and add claim_ledger_changed (boolean,
default false, as in the schema example) to each audit entry object, and also
include claim_ledger_changed=0 in the markdown-rendered totals logic so the
rendered totals match the schema.
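The canonical entry shape can be sketched as below. The field names follow the schema excerpt quoted above; the function names `build_audit_entry` and `render_totals`, and the exact totals string, are hypothetical. The sketch assumes the per-entry `claim_ledger_changed` flag is a boolean, as in the schema example, and that the totals line reports a count derived from it.

```python
def build_audit_entry(edit_id, file, line_start, line_end,
                      original, new, tell, rationale):
    """Build one audit entry in the shape shown in the schema excerpt."""
    return {
        "edit_id": edit_id,
        "location": {"file": file, "line_start": line_start, "line_end": line_end},
        "original": original,
        "new": new,
        "tell_addressed": tell,
        "rationale": rationale,
        "meaning_preserving": "yes",
        "protected_spans_changed": False,
        "claim_ledger_changed": False,
    }


def render_totals(entries):
    """Render a totals line; the format here is an assumption, not the real output."""
    spans = sum(1 for e in entries if e["protected_spans_changed"])
    ledger = sum(1 for e in entries if e["claim_ledger_changed"])
    return (f"edits={len(entries)} "
            f"protected_spans_changed={spans} "
            f"claim_ledger_changed={ledger}")
```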
| "severity": "blocker | major | minor", | ||
| "category": "rigor | notation | citation | physics | evidence | metadata | theorem_status", | ||
| "location": {"file": "references.tex", "line_start": 412, "line_end": 414}, | ||
| "excerpt": "Submission-time check: confirm exact published title...", | ||
| "why_not_auto_edited": "The correct bibliographic data must be verified from a source; inventing it would violate the no-misrepresentation rule.", | ||
| "recommended_author_action": "Verify the final published title, venue, volume, pages, DOI/arXiv identifier, then rerun bibliography audit.", | ||
| "blocks_public_release": true, | ||
| "delegate_to": "gpd-bibliographer" | ||
| } |
DESLOP-FLAGS schema diverges from runtime output format.
`src/gpd/core/deslopification.py:351-367` emits `category` as tell tags (e.g., `placeholder_or_metadata`), `location` as `{line,col}`, and omits `recommended_author_action` and `delegate_to`. This template defines a different required shape and taxonomy, so consumers cannot reliably parse both.
Also applies to: 26-35
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/gpd/specs/templates/paper/deslop-flags-schema.md` around lines 15 - 23,
The DESLOP-FLAGS template schema must match the runtime emission from
src/gpd/core/deslopification.py (around the code that emits DESLOP-FLAGS):
update this template to use the same "category" taxonomy (tell-tag values like
"placeholder_or_metadata"), change "location" to the runtime shape {line:int,
col:int} (instead of file/line_start/line_end), and remove or make optional the
"recommended_author_action" and "delegate_to" fields that the runtime omits so
consumers can parse consistently; ensure field names and types exactly mirror
what deslopification.py emits.
| 3. If `release_blocker_count > 0` (`gate_status: blocked`): do NOT proceed to peer | ||
| review. | ||
| - `autonomy=yolo`: record blockers in `CRITIQUE-LOG.md` and `gpd_return.issues`; | ||
| recommend the author actions in `DESLOP-FLAGS.md` (resolve placeholder citations, | ||
| scaffolding leakage, notation order). | ||
| - `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location, | ||
| why, recommended action) and ask whether to (1) resolve now via the delegated | ||
| owner, (2) accept as a known limitation, or (3) hold the manuscript. Re-run this | ||
| gate after resolution. |
Resolve the blocker-decision contradiction in supervised/balanced mode.
Line 117 says blocked manuscripts must not proceed, but Line 123 includes “accept as a known limitation,” which conflicts with fail-closed semantics and with Line 125 (“Re-run this gate after resolution”). Remove or redefine option (2) so every allowed path is consistent with blocker resolution before proceeding.
Suggested spec-tightening diff
- - `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location,
- why, recommended action) and ask whether to (1) resolve now via the delegated
- owner, (2) accept as a known limitation, or (3) hold the manuscript. Re-run this
- gate after resolution.
+ - `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location,
+ why, recommended action) and ask whether to (1) resolve now via the delegated
+ owner, or (2) hold the manuscript. Re-run this gate after resolution.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/gpd/specs/workflows/write-paper/publication-review-finalization.md`
around lines 117 - 125, The spec currently contradicts itself for
supervised|balanced modes: when release_blocker_count > 0 (gate_status: blocked)
the gate must fail-closed and not proceed, but the second option "accept as a
known limitation" allows proceeding; update the supervised|balanced branch (the
block describing autonomy=supervised|balanced and the listed options) to remove
or redefine option (2) so it enforces blocker resolution before
proceeding—either delete the "accept as a known limitation" choice or change it
to a non-proceeding state (e.g., "record as known limitation and put manuscript
on hold") and ensure the text explicitly requires re-running this gate after
resolution; reference release_blocker_count, gate_status,
autonomy=supervised|balanced, DESLOP-FLAGS.md and CRITIQUE-LOG.md when making
the change.
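The fail-closed rule can be illustrated with a small decision function. This is a sketch only: the function name and the action labels are hypothetical, and the real workflow is a prose spec rather than code.

```python
def allowed_actions(release_blocker_count, autonomy):
    """List the actions open to the workflow; no blocked path leads to peer review."""
    if release_blocker_count == 0:
        return ["proceed_to_peer_review"]
    if autonomy == "yolo":
        # Record blockers in CRITIQUE-LOG.md and gpd_return.issues, then stop.
        return ["record_blockers"]
    if autonomy in ("supervised", "balanced"):
        # "Accept as a known limitation" is deliberately absent.
        return ["resolve_now", "hold_manuscript"]
    raise ValueError(f"unknown autonomy mode: {autonomy!r}")
```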
What this adds
A deslopification gate for GPD. It takes a finalized AI-written math or physics manuscript and makes it read like expert work without changing the science, with a by-line audit, and with substantive issues flagged rather than rewritten.
How it works
The deterministic core,
src/gpd/core/deslopification.py, extracts a manuscript's protected spans (math,\citekeys, in-math numbers, theorem status), detects AI and agent writing tells with line locations, and provides an invariant checker. The checker re-extracts the protected spans after a proposed edit and rejects the edit if any of them changed, so a prose rewrite cannot alter the science undetected.The
gpd-discipline-editoragent performs the nuanced rewrites, and every edit it proposes is verified by the same checker. The deterministic apply mode lands only the edits it can prove safe and writes the by-line audit. Anything substantive, such as a placeholder citation or undefined notation, is flagged rather than rewritten, since rewriting plausible-but-wrong content would remove the signals that warn a reviewer.Commands
Where it runs
The gate runs after the reward-hacking integrity gate and before peer review in the write-paper publication-review finalization, and as a standalone command. gpd:arxiv-submission blocks on unresolved deslopification release blockers.
Tests
tests/core/test_deslopification.py and tests/test_deslop_cli_commands.py cover span extraction, tell detection, the invariant checker (including that it rejects a planted change to a math span, a citation, a number, or theorem status), and the CLI. The existing CLI validation tests still pass.
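The invariant check that these tests exercise can be sketched as follows. This is an illustration under simplifying assumptions, not the code in `deslopification.py`: the regexes cover only inline `$...$` math, `\cite`-style commands, and a handful of theorem-like environments, and the names `extract_protected_spans` and `check_edit` are hypothetical.

```python
import re

# Simplified, assumed patterns; a real extractor would need to handle
# display math, nested braces, and more environments.
MATH_RE = re.compile(r"\$[^$]+\$")
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")
STATUS_RE = re.compile(
    r"\\begin\{(theorem|lemma|proposition|corollary|conjecture)\}"
)


def extract_protected_spans(text):
    """Collect the spans an edit must leave untouched, in document order."""
    math = MATH_RE.findall(text)
    cite_keys = [key.strip()
                 for group in CITE_RE.findall(text)
                 for key in group.split(",")]
    numbers = [n for span in math for n in NUMBER_RE.findall(span)]
    return {
        "math": math,
        "cite_keys": cite_keys,
        "in_math_numbers": numbers,
        "theorem_status": STATUS_RE.findall(text),
    }


def check_edit(original, edited):
    """Return (ok, changed_categories) by re-extracting spans after the edit."""
    before = extract_protected_spans(original)
    after = extract_protected_spans(edited)
    changed = [name for name in before if before[name] != after[name]]
    return (not changed, changed)
```

A prose-only rewrite passes, while a planted change inside a math span is rejected:

```python
original = r"It is important to note that $E = mc^2$ holds \cite{einstein1905}."
print(check_edit(original, r"The relation $E = mc^2$ holds \cite{einstein1905}."))
print(check_edit(original, r"The relation $E = mc^3$ holds \cite{einstein1905}."))
```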
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
`deslop scan` identifies AI language and release blockers; `deslop check` verifies that text edits preserve scientific content
`validate deslop-invariants` command for invariant verification
Documentation
Tests