Add a deslopification gate for finalized manuscripts #256

Open
davidturturean wants to merge 1 commit into psi-oss:main from davidturturean:add-deslopification-gate

Conversation

@davidturturean

davidturturean commented Jun 4, 2026

What this adds

A deslopification gate for GPD. It takes a finalized AI-written math or physics manuscript and makes it read like expert work without changing the science. It emits a by-line audit and flags substantive issues rather than rewriting them.

How it works

The deterministic core, src/gpd/core/deslopification.py, extracts a manuscript's protected spans (math, \cite keys, in-math numbers, theorem status), detects AI and agent writing tells with line locations, and provides an invariant checker. The checker re-extracts the protected spans after a proposed edit and rejects the edit if any of them changed, so a prose rewrite cannot alter the science undetected.
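
A minimal sketch of the re-extract-and-compare idea described above. The regexes and the names `protected_spans` and `edit_preserves_science` are illustrative assumptions, not the code in src/gpd/core/deslopification.py; only inline `$...$` math and `\cite` keys are covered, and in-math numbers and theorem status are left out for brevity.

```python
import re
from collections import Counter

# Simplified patterns (assumptions for illustration only).
INLINE_MATH = re.compile(r"(?<!\\)\$(.+?)(?<!\\)\$", re.DOTALL)
CITE = re.compile(r"\\cite[a-zA-Z]*\{([^}]*)\}")


def protected_spans(text: str) -> Counter:
    """Collect protected spans as a multiset, so a dropped duplicate is also noticed."""
    spans = Counter()
    for match in INLINE_MATH.finditer(text):
        spans[("math", match.group(1).strip())] += 1
    for match in CITE.finditer(text):
        for key in match.group(1).split(","):
            spans[("cite", key.strip())] += 1
    return spans


def edit_preserves_science(before: str, after: str) -> dict:
    """Re-extract spans from both versions and reject the edit if any span changed."""
    old, new = protected_spans(before), protected_spans(after)
    removed = old - new  # spans present before but missing after
    added = new - old    # spans that appear only after the edit
    return {
        "passed": not removed and not added,
        "removed": sorted(removed.elements()),
        "added": sorted(added.elements()),
    }


before = r"It is worth noting that $E = mc^2$ holds \cite{smith2020}."
prose_only = r"The relation $E = mc^2$ holds \cite{smith2020}."
math_drift = r"The relation $E = mc^3$ holds \cite{smith2020}."

print(edit_preserves_science(before, prose_only)["passed"])
print(edit_preserves_science(before, math_drift))
```

The prose-only rewrite leaves both multisets equal, while the changed exponent shows up as one removed and one added math span.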

The gpd-discipline-editor agent performs the nuanced rewrites, and every edit it proposes is verified by the same checker. The deterministic apply mode lands only the edits it can prove safe and writes the by-line audit. Anything substantive, such as a placeholder citation or undefined notation, is flagged rather than rewritten, since rewriting plausible-but-wrong content would remove the signals that warn a reviewer.
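
The split between rewriting and flagging can be pictured with a small located-tell detector. This is a sketch under stated assumptions: the two patterns, the `filler_opener` tag, and the `Tell` and `detect_tells` names are hypothetical; only the EDIT/FLAG routes and the `placeholder_or_metadata` tag appear elsewhere in this PR.

```python
import re
from dataclasses import dataclass

# Hypothetical rules: (tag, pattern, route). Style tells are routed to EDIT,
# substantive problems such as placeholder citations are routed to FLAG.
TELL_RULES = [
    ("filler_opener", re.compile(r"\bIt is worth noting that\b", re.IGNORECASE), "EDIT"),
    ("placeholder_or_metadata", re.compile(r"\\cite\{\s*(TODO|\?+)\s*\}"), "FLAG"),
]


@dataclass(frozen=True)
class Tell:
    tag: str
    route: str
    line: int  # 1-based line number
    col: int   # 1-based column of the first matched character
    text: str


def detect_tells(text: str) -> list[Tell]:
    """Return every rule match with its line and column location."""
    found = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for tag, pattern, route in TELL_RULES:
            for match in pattern.finditer(line):
                found.append(Tell(tag, route, line_no, match.start() + 1, match.group(0)))
    return found


sample = "It is worth noting that the bound is tight.\nSee \\cite{TODO} for details.\n"
for tell in detect_tells(sample):
    print(tell.route, tell.tag, f"{tell.line}:{tell.col}")
```

Only EDIT-routed tells would be candidates for a rewrite; FLAG-routed ones stay in the text and are reported to the author.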

Commands

gpd deslop scan <manuscript> --mode audit|apply|ci
gpd deslop check <before> <after>
gpd validate deslop-invariants <before> <after>
gpd:deslop-paper [manuscript] [--mode audit|apply|ci]

Where it runs

The gate runs after the reward-hacking integrity gate and before peer review in the write-paper publication-review finalization, and as a standalone command. gpd:arxiv-submission blocks on unresolved deslopification release blockers.

Tests

tests/core/test_deslopification.py and tests/test_deslop_cli_commands.py cover span extraction, tell detection, the invariant checker (including that it rejects a planted change to a math span, a citation, a number, or theorem status), and the CLI. The existing CLI validation tests still pass.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added manuscript deslopification tools: deslop scan identifies AI language and release blockers; deslop check verifies that text edits preserve scientific content
    • Integrated deslopification gate into publication workflow, generating audit trails and flagged issues
    • Added validate deslop-invariants command for invariant verification
  • Documentation

    • Added specifications for deslopification workflows, audit schemas, and flag definitions
  • Tests

    • Added comprehensive test coverage for deslopification engine and CLI commands

A meaning-preserving pass that strips public-facing AI and agent writing tells
from a manuscript while freezing all scientific content, emits a by-line audit,
and flags substantive issues instead of rewriting them.

Deterministic core (core/deslopification.py): protected-span extraction, located
tell detection, and an invariant checker that rejects any edit which changes a
math span, citation key, in-math number, or theorem status. The
gpd-discipline-editor agent does the nuanced rewrites, each gated by the same
checker. Adds gpd deslop scan/check, gpd validate deslop-invariants, and the
gpd:deslop-paper command; wires the gate into write-paper finalization and the
arxiv-submission blocking conditions. 11 tests; existing CLI tests still pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Walkthrough

This PR introduces a complete deslopification system that removes public-facing AI/agent artifacts from academic manuscripts while preserving scientific content through frozen invariants. The implementation includes a deterministic engine, detailed specifications, command orchestration, CLI integration, and comprehensive test coverage.

Changes

Deslopification Gate Feature

| Layer | File(s) | Summary |
| --- | --- | --- |
| Deslopification Engine Implementation and Tests | src/gpd/core/deslopification.py, tests/core/test_deslopification.py | Core module implementing protected-span extraction (math, citations, status tokens), tell detection with line/column locations and routing (KEEP/EDIT/FLAG), invariant checking via multiset diffing, and deterministic regex-driven style edits (only when invariants pass). Generates audit/flag/summary artifacts. Tests validate tell detection, invariant enforcement across prose-only edits and protected-span drifts, and manuscript scanning with placeholder blocking. |
| Gate Specification and Artifact Schemas | src/gpd/specs/references/publication/deslopification-gate.md, src/gpd/specs/templates/paper/deslop-audit-schema.md, src/gpd/specs/templates/paper/deslop-flags-schema.md | Defines semantics of deslopification gate with KEEP/EDIT/FLAG routing, invariant-check rules forbidding new claims and protected-span mutation, and mandatory audit/flags/summary artifact emission with fail-closed behavior. Specifies JSONL audit records (per-edit with edit_id, location, original, new, tell_addressed, rationale) rendered as human Markdown tables, and flags schema with severity/category enums and release-blocker controls. |
| Agent and Command Specifications | src/gpd/agents/gpd-discipline-editor.md, src/gpd/commands/deslop-paper.md | Agent spec defines owned artifacts, four-pass control flow (freeze spans, route, apply invariant-gated edits, emit artifacts), and critical rules (no invention, flag substantive issues, every edit audited). Command spec documents audit/apply/ci modes, --strict behavior, initialization from gpd --raw init, delegation to agent with four-pass prompt, and success criteria including clean gate status and zero release blockers. |
| CLI Commands and Testing | src/gpd/cli.py, tests/test_deslop_cli_commands.py | Adds deslop_app command group with scan (runs detection/edits, write artifacts), check (validates invariants, exit code 2 on failure), and validate deslop-invariants alias delegating to check. Tests invoke CLI against fixture .tex files and assert exit codes, gate status, blocker detection, and invariant rejection for science-span drift. |
| Workflow Integration and Submission Policy | src/gpd/commands/arxiv-submission.md, src/gpd/specs/workflows/write-paper/publication-review-finalization.md | Integrates deslopification gate into publication-review-finalization workflow between integrity gating and pre_submission_review. Gate calls gpd-discipline-editor in apply mode, writes audit/flag/summary artifacts, consumes DESLOP-SUMMARY.json to decide peer-review blocking, logs outcomes to CRITIQUE-LOG.md, and handles release blockers via autonomy-based re-run or user decision. Updates arxiv-submission to treat unresolved blockers as submission blockers with remediation command. |
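
The per-edit audit record named in the Gate Specification row can be sketched as follows. The field names come from this review (edit_id, location, original, new, tell_addressed, rationale, meaning_preserving, claim_ledger_changed), and the location shape follows the file/line_start/line_end form that the audit-schema finding calls canonical. The helper names `make_audit_record`, `to_jsonl` and `to_markdown`, and the rendered column choice, are assumptions for illustration.

```python
import json


def make_audit_record(edit_id, file, line_start, line_end,
                      original, new, tell_addressed, rationale):
    """Build one audit entry; every field is plain JSON-serializable data."""
    return {
        "edit_id": edit_id,
        "location": {"file": file, "line_start": line_start, "line_end": line_end},
        "original": original,
        "new": new,
        "tell_addressed": tell_addressed,
        "rationale": rationale,
        "meaning_preserving": "yes",
        "claim_ledger_changed": 0,  # default assumed from the schema finding
    }


def to_jsonl(records) -> str:
    """One JSON object per line, one line per edit."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def _cell(value) -> str:
    # Escape pipes and flatten newlines so a value cannot break the table row.
    return str(value).replace("|", "\\|").replace("\n", " ")


def to_markdown(records) -> str:
    """Render the same records as a human-readable Markdown table."""
    rows = ["| edit_id | location | original | new | tell_addressed |",
            "| --- | --- | --- | --- | --- |"]
    for r in records:
        loc = r["location"]
        where = f'{loc["file"]}:{loc["line_start"]}-{loc["line_end"]}'
        cells = [r["edit_id"], where, r["original"], r["new"], r["tell_addressed"]]
        rows.append("| " + " | ".join(_cell(c) for c in cells) + " |")
    return "\n".join(rows) + "\n"


record = make_audit_record(
    "E001", "manuscript.tex", 12, 12,
    "It is worth noting that the bound is tight.", "The bound is tight.",
    "filler_opener", "Removes a filler opener; no protected span touched.",
)
print(to_jsonl([record]), end="")
print(to_markdown([record]), end="")
```

Keeping JSONL as the source of truth and deriving the Markdown from it means the two artifacts cannot drift apart.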

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • madeleinesong
  • cmaloney111

Poem

🐰 A Gate to Guard the Prose

With frozen spans and tells to find,
The deslop-editor rewinds,
Each math and cite held crystal-clear,
While prose alone may disappear.
No claims shall slip past audit's eye—
Just style and taste, preserved and dry! 🎀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add a deslopification gate for finalized manuscripts' clearly summarizes the main objective of the PR, implementing a deslopification gate, and is concise and specific. |
| Description check | ✅ Passed | The PR description comprehensively covers what changed, why it was added, how it works, the CLI commands, integration points, and testing. It follows the structure expected in the template with dedicated sections for changes, motivation, and testing, exceeding the basic requirements. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@marcpickett1

Collaborator

🤖 RoastBot: You wrote a feature to detect slop... in an AI tool... using an AI. The ouroboros of self-loathing is complete.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/gpd/agents/gpd-discipline-editor.md`:
- Around line 64-65: The documentation requires DESLOP-AUDIT.jsonl and
DESLOP-AUDIT.md to be emitted unconditionally, but the current implementation in
src/gpd/core/deslopification.py only writes them inside the apply-mode block
(lines ~321-339) after invariant checks; reconcile by extracting the
audit-emission logic into a shared routine (e.g., emit_deslop_audit or
write_deslop_audit_jsonl/md) and call it for both apply and audit modes (ensure
it sets meaning_preserving: yes and writes one record per edit), while keeping
the edited .tex write only in apply() and preserving the invariants behavior (or
log invariant failures while still emitting the audit artifacts in audit mode if
that matches the contract).

In `@src/gpd/cli.py`:
- Around line 9552-9566: The deslop_check handler currently coerces unreadable
files to empty strings and derives is_tex from the raw argument, which can allow
missing inputs to slip past checks; instead, resolve each path with
_resolve_path_from_effective_cwd, call safe_read_file and if it returns None
treat that as a hard error (emit a clear message via typer.echo/process logger
and exit with a non-zero code) rather than substituting "", then pass the actual
file contents to check_invariants; also determine is_tex from the resolved
path's suffix in a case-insensitive way (e.g., resolved_path.suffix.lower() ==
".tex") before calling check_invariants.

In `@src/gpd/commands/deslop-paper.md`:
- Around line 33-34: The docs claim gpd:deslop-paper --strict fails on any
unresolved FLAG but the code only checks gate_status/release_blocker_count; fix
by wiring --strict end-to-end: add a --strict boolean to the gpd:deslop-paper
CLI handler and pass it into the deslopification scan flow (propagate through
the function/method that performs deslopification), compute
unresolved_flags_count (count of unresolved non-blocker flags) during the scan
alongside release_blocker_count and gate_status, and update the final exit logic
to return non-zero when --strict is true and unresolved_flags_count > 0 (while
preserving existing behavior when --strict is false). Alternatively, if you
prefer not to change behavior, update the doc snippets to remove/clarify
--strict and state that only release_blocker_count/gate_status cause non-zero
exit; make changes in the gpd:deslop-paper docs and any workflow wiring that
currently gates on release_blocker_count to reflect the chosen semantics.

In `@src/gpd/core/deslopification.py`:
- Around line 78-83: The invariant checker misses TeX display math delimited by
$$...$$ because extract_protected_spans never records that pattern; add a regex
for dollar-delimited display math (e.g., a new _DOLLAR_DISPLAY_MATH =
re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)) and ensure
extract_protected_spans treats _DOLLAR_DISPLAY_MATH the same as
_DISPLAY_MATH/_ENV_MATH/_INLINE_MATH so those spans are recorded and excluded
from the change comparison; update any other places referencing _DISPLAY_MATH
(and the similar patterns around lines where _PAREN_MATH/_DISPLAY_MATH/_ENV_MATH
are defined) so the new symbol is included in the protected-span extraction
logic.
- Around line 307-310: The code currently treats read failures as empty
manuscripts by using `_safe_read(path) or ""`; change this to fail hard: call
`_safe_read(path)`, check if the result is None (or an explicit error sentinel)
and if so raise an appropriate exception (e.g., FileNotFoundError or IOError)
with the path and underlying error info instead of substituting ""; apply the
same fix to the other occurrence that uses `_safe_read` (the similar block
around lines 385-388) so both `scan_manuscript` and the duplicate reader call
stop returning false-clean results for unreadable files.
- Around line 32-38: The current broad "except Exception" around importing
_CITATION_RE, _PLACEHOLDER_RE, and _strip_latex_comments from
gpd.core.arxiv_package (and the similar block at lines ~58-60) masks runtime
errors inside those modules; change the handlers to only catch
ImportError/ModuleNotFoundError (e.g., "except ImportError:" or "except
(ImportError, ModuleNotFoundError) as e:") and let any other Exception propagate
(or re-raise) so in-repo failures fail fast instead of silently falling back to
demo replacements.

In `@src/gpd/specs/references/publication/deslopification-gate.md`:
- Around line 91-105: Fix the markup in the CONCEPT_INTRO_GATE block: correct
the broken table/pipe syntax (ensure rows use consistent | column separators or
remove the stray table delimiter around the `<concept_intro_gate>` section), add
a language identifier to the fenced code block (e.g., ```yaml) to satisfy
fenced-code linting, and replace the invalid token `true_is_forbidden` with
valid JSON/YAML boolean or a quoted string (e.g., true or "true_is_forbidden")
so schema readers can parse the gate definition; refer to the CONCEPT_INTRO_GATE
block and the fenced code block shown in the diff when making these edits.
- Around line 126-129: The spec says `ci` should block for missing audit
coverage and unresolved notation/concept-order flags, but the gate_status is
only set to "blocked" when a finding is marked release_blocker; update the
gate-status calculation in the deslopification result assembly (the code that
currently assigns gate_status="blocked" from release_blocker findings) to also
treat findings representing missing audit coverage and unresolved
notation/concept-order flags as blocking: either treat those finding
types/severities as release_blocker-equivalent or add explicit checks that set
gate_status="blocked" when any finding has kind/type/flag indicating
missing_audit_coverage or notation/concept-order unresolved (reference the
variables/fields release_blocker, gate_status and the detector outputs from tell
detectors in the deslopification result code).

In `@src/gpd/specs/templates/paper/deslop-audit-schema.md`:
- Around line 13-22: The audit entry producer in src/gpd/core/deslopification.py
(around the block at lines 253-276 and the markdown totals at 279-287) currently
emits location: {"line": ...} and omits claim_ledger_changed; update that code
path to produce the canonical shape expected by the spec: emit location as
{"file": <filename>, "line_start": <start>, "line_end": <end>} and add
claim_ledger_changed (integer, default 0) to each audit entry object, and also
include claim_ledger_changed=0 in the markdown-rendered totals logic so the
rendered totals match the schema.

In `@src/gpd/specs/templates/paper/deslop-flags-schema.md`:
- Around line 15-23: The DESLOP-FLAGS template schema must match the runtime
emission from src/gpd/core/deslopification.py (around the code that emits
DESLOP-FLAGS): update this template to use the same "category" taxonomy
(tell-tag values like "placeholder_or_metadata"), change "location" to the
runtime shape {line:int, col:int} (instead of file/line_start/line_end), and
remove or make optional the "recommended_author_action" and "delegate_to" fields
that the runtime omits so consumers can parse consistently; ensure field names
and types exactly mirror what deslopification.py emits.

In `@src/gpd/specs/workflows/write-paper/publication-review-finalization.md`:
- Around line 117-125: The spec currently contradicts itself for
supervised|balanced modes: when release_blocker_count > 0 (gate_status: blocked)
the gate must fail-closed and not proceed, but the second option "accept as a
known limitation" allows proceeding; update the supervised|balanced branch (the
block describing autonomy=supervised|balanced and the listed options) to remove
or redefine option (2) so it enforces blocker resolution before
proceeding—either delete the "accept as a known limitation" choice or change it
to a non-proceeding state (e.g., "record as known limitation and put manuscript
on hold") and ensure the text explicitly requires re-running this gate after
resolution; reference release_blocker_count, gate_status,
autonomy=supervised|balanced, DESLOP-FLAGS.md and CRITIQUE-LOG.md when making
the change.
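
The `$$...$$` finding in the list above can be reproduced in isolation. The pattern is the one quoted in that comment; the helper name `display_math_spans` and the sample strings are assumptions, and this sketch does not show how the real extract_protected_spans orders its patterns.

```python
import re

# Pattern quoted from the review comment (there named _DOLLAR_DISPLAY_MATH).
DOLLAR_DISPLAY_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)


def display_math_spans(text: str) -> list[str]:
    """Return the body of every unescaped $$...$$ block, whitespace-trimmed."""
    return [match.group(1).strip() for match in DOLLAR_DISPLAY_MATH.finditer(text)]


before = "We obtain\n$$\nH = p^2 / 2m\n$$\nas expected."
after = "We obtain\n$$\nH = p^2 / 3m\n$$\nas expected."

print(display_math_spans(before))
print(display_math_spans(before) == display_math_spans(after))
```

Once these spans are recorded, the changed denominator is visible to a span comparison; without the pattern, both versions would look like unprotected prose.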

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 01cacca5-0aa3-4885-a092-6eaf27968ea1

📥 Commits

Reviewing files that changed from the base of the PR and between 0f41769 and b502105.

📒 Files selected for processing (11)
  • src/gpd/agents/gpd-discipline-editor.md
  • src/gpd/cli.py
  • src/gpd/commands/arxiv-submission.md
  • src/gpd/commands/deslop-paper.md
  • src/gpd/core/deslopification.py
  • src/gpd/specs/references/publication/deslopification-gate.md
  • src/gpd/specs/templates/paper/deslop-audit-schema.md
  • src/gpd/specs/templates/paper/deslop-flags-schema.md
  • src/gpd/specs/workflows/write-paper/publication-review-finalization.md
  • tests/core/test_deslopification.py
  • tests/test_deslop_cli_commands.py

Comment on lines +64 to +65
Pass D: Emit the artifacts (non-optional): `DESLOP-AUDIT.jsonl` + `DESLOP-AUDIT.md` (one record per edit, by line, with `meaning_preserving: yes`), `DESLOP-FLAGS.md`, and `DESLOP-SUMMARY.json`. In `apply` mode also write the edited `.tex`; in `audit` mode do not touch the manuscript.
</method>

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

“Non-optional artifacts” requirement conflicts with current apply-only audit emission.

This says DESLOP-AUDIT.jsonl and .md are always emitted, but src/gpd/core/deslopification.py:321-339 writes those only in apply mode (and only if invariants pass). Please reconcile this requirement so the agent contract matches executable behavior.

Comment thread src/gpd/cli.py
Comment on lines +9552 to +9566
@deslop_app.command("check")
def deslop_check(
    before: str = typer.Argument(..., help="Path to the pre-edit text"),
    after: str = typer.Argument(..., help="Path to the post-edit text"),
) -> None:
    """Prove an edit changed no protected span (math, citations, numbers, theorem status)."""
    from gpd.core.deslopification import check_invariants
    from gpd.core.utils import safe_read_file

    b = safe_read_file(_resolve_path_from_effective_cwd(before)) or ""
    a = safe_read_file(_resolve_path_from_effective_cwd(after)) or ""
    report = check_invariants(b, a, is_tex=before.endswith(".tex"))
    _output(report)
    if not report["passed"]:
        raise typer.Exit(code=2)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail closed on unreadable inputs in deslop_check (don’t coerce to empty text).

Using safe_read_file(... ) or "" can let missing/unreadable files be treated as empty content, which can incorrectly pass invariant checks and bypass the gate. Also, derive is_tex from the resolved path suffix (case-insensitive) instead of raw string matching.

Suggested fix
 @deslop_app.command("check")
 def deslop_check(
@@
 ) -> None:
@@
-    from gpd.core.deslopification import check_invariants
-    from gpd.core.utils import safe_read_file
-
-    b = safe_read_file(_resolve_path_from_effective_cwd(before)) or ""
-    a = safe_read_file(_resolve_path_from_effective_cwd(after)) or ""
-    report = check_invariants(b, a, is_tex=before.endswith(".tex"))
+    from gpd.core.deslopification import check_invariants
+
+    before_path, b = _load_text_document_or_error(before)
+    _after_path, a = _load_text_document_or_error(after)
+    report = check_invariants(b, a, is_tex=before_path.suffix.lower() == ".tex")
     _output(report)
     if not report["passed"]:
         raise typer.Exit(code=2)
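
For readers without the repository at hand, the fail-closed behavior asked for here can be sketched with the standard library alone. `load_text_or_fail`, `is_tex_path` and `UnreadableInputError` are hypothetical names; the `_load_text_document_or_error` helper in the suggested fix is not shown in this review and may behave differently.

```python
from pathlib import Path


class UnreadableInputError(Exception):
    """Raised instead of silently treating an unreadable file as empty text."""


def load_text_or_fail(raw_path: str) -> tuple[Path, str]:
    """Resolve the path and return (path, contents), or raise if it cannot be read."""
    path = Path(raw_path).expanduser().resolve()
    try:
        return path, path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        # Fail closed: a missing or undecodable file must never compare as "".
        raise UnreadableInputError(f"cannot read {path}: {exc}") from exc


def is_tex_path(path: Path) -> bool:
    """Decide TeX handling from the resolved suffix, ignoring case."""
    return path.suffix.lower() == ".tex"
```

A CLI handler would catch `UnreadableInputError`, print the message, and exit non-zero, so two missing files can no longer pass as two identical empty manuscripts.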

Comment on lines +33 to +34
gpd:deslop-paper GPD/publication/<slug>/manuscript --mode apply --strict
gpd:deslop-paper <manuscript> --mode ci # fail if release-blocking slop remains

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Locate deslop-paper command/spec wiring:"
rg -n -C3 'deslop-paper|--strict|strict' src tests

echo
echo "Locate CLI options for deslop-related commands:"
rg -n -C3 '@.*command\(|def .*deslop|--mode|--strict|no-write' src/gpd/cli.py

echo
echo "Locate strict-mode assertions in tests:"
rg -n -C3 'deslop.*strict|--strict|gate_status|blocked|exit code|non-zero' tests

Repository: psi-oss/get-physics-done

Length of output: 50382


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Find all occurrences of the command name `deslop-paper`:"
rg -n "deslop-paper" src tests

echo
echo "Find all occurrences of `deslop` in the CLI entrypoint:"
rg -n "deslop" src/gpd/cli.py

echo
echo "Search CLI option declarations for `deslop`-related commands (narrow patterns):"
rg -n --context 3 "(deslop.*command|command.*deslop|def .*deslop|--mode|--no-write|--strict)" src/gpd/cli.py

echo
echo "Search for strict parsing/enforcement around deslop in the CLI tree:"
rg -n --context 3 "(deslop.*strict|strict.*deslop|--strict.*deslop|deslop.*--strict)" src/gpd

Repository: psi-oss/get-physics-done

Length of output: 6323


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Locate core deslopification implementation for scan/check:"
fd -t f "deslopification*" src/gpd/core || true
rg -n "def (scan_manuscript|check_invariants)\b|scan_manuscript\(" -S src/gpd/core

echo
echo "Find strict handling inside deslopification core:"
rg -n "strict" -S src/gpd/core/deslopification.py src/gpd/core || true

echo
echo "Inspect deslop-paper command spec template/wiring that runs `gpd --raw init deslop-paper`:"
sed -n '1,140p' src/gpd/commands/deslop-paper.md

Repository: psi-oss/get-physics-done

Length of output: 50441


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Inspect deslop-paper spec ==="
sed -n '1,140p' src/gpd/commands/deslop-paper.md
echo
sed -n '140,260p' src/gpd/commands/deslop-paper.md || true

echo
echo "=== Inspect deslopification core ==="
sed -n '1,220p' src/gpd/core/deslopification.py
echo
sed -n '220,520p' src/gpd/core/deslopification.py

echo
echo "=== Search for strict handling inside core/deslopification.py ==="
rg -n "strict" src/gpd/core/deslopification.py

echo
echo "=== Trace how `init deslop-paper` is dispatched/parsed ==="
rg -n "init deslop-paper|deslop-paper" src/gpd | head -n 200
rg -n "deslopification|scan_manuscript|check_invariants" src/gpd | head -n 200

echo
echo "=== Search for scan_manuscript callers with strict kwarg ==="
rg -n "scan_manuscript\\(.*strict|check_invariants\\(.*strict|deslop.*strict" src/gpd/core | head -n 200

Repository: psi-oss/get-physics-done

Length of output: 22946


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Find strict usage in deslop-paper spec =="
rg -n "strict" src/gpd/commands/deslop-paper.md
echo
echo "== Show remaining parts of deslop-paper spec (possible exit logic) =="
sed -n '260,520p' src/gpd/commands/deslop-paper.md

echo
echo "== Search for deslop-paper strict behavior in code/tests =="
rg -n --no-heading "(deslop-paper|DESLOP-SUMMARY|DESLOP-FLAGS|gate_status: blocked|gate_status).*strict" src tests -S || true

echo
echo "== Search for any tests covering deslop-paper/strict =="
rg -n --no-heading "deslop-paper" tests -S || true

echo
echo "== Search for deslop-paper command handler / init dispatch =="
rg -n --no-heading "init deslop-paper|deslop-paper" src/gpd -S | head -n 200

Repository: psi-oss/get-physics-done

Length of output: 2345


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Find where gpd:deslop-paper is registered/mentioned in code =="
rg -n "gpd:deslop-paper|deslop-paper" src/gpd -S

echo
echo "== Find init command implementation (the `gpd --raw init deslop-paper` path) =="
rg -n "raw.*init|@.*init\\b|command\\(\"init\"\\)|def .*init\\b" src/gpd/cli.py src/gpd/core -S

echo
echo "== Check for a generic --strict option in CLI command initialization =="
rg -n "--strict" src/gpd/cli.py src/gpd/core/ -S

echo
echo "== Trace how command specs frontmatter like argument-hint/display_signature are turned into CLI params =="
rg -n "argument-hint|display_signature|registry.*command|parse.*command.*spec|allowed-tools" src/gpd/core src/gpd/registry* src/gpd/ -S | head -n 200

Repository: psi-oss/get-physics-done

Length of output: 4367


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Where does CLI define --strict? (top-level and subcommands) =="
rg -n --context 2 "--strict" src/gpd/cli.py | head -n 200

echo
echo "== Inspect gpd --raw init implementation in src/gpd/cli.py =="
sed -n '3680,3860p' src/gpd/cli.py

echo
echo "== Search init payload schema for a 'strict' field =="
rg -n --context 2 "raw.*init|init\(" src/gpd/core src/gpd/cli.py | head -n 200
rg -n --context 2 "\"strict\"|strict:" src/gpd/core | head -n 200

echo
echo "== Search for how command-spec args are parsed/passed to init execution =="
rg -n --context 2 "display_signature|argument-hint|command spec|frontmatter|allowed-tools|parse.*frontmatter" src/gpd/core src/gpd/cli.py | head -n 200

echo
echo "== Find how command name 'deslop-paper' is loaded into init execution =="
rg -n --context 2 "load.*command|registry.*command|commands/.*\.md|deslop-paper" src/gpd/core src/gpd/cli.py | head -n 200

Repository: psi-oss/get-physics-done

Length of output: 563


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Find where DESLOP-SUMMARY.json is read/parsed =="
rg -n --context 3 "DESLOP-SUMMARY\.json|DESLOP-FLAGS\.md|DESLOP-AUDIT\.md|DESLOP-AUDIT\.jsonl" src/gpd -S

echo
echo "== Find any strict/flags exit logic for deslopification artifacts =="
rg -n --context 3 "(deslop|DESLOP).*(strict|flag|blocker|gate_status)" src/gpd -S

echo
echo "== Find any generic exit condition logic keyed by gate_status for deslopification =="
rg -n --context 3 "gate_status.*(strict|flag|blocker)|release_blocker_count" src/gpd -S

Repository: psi-oss/get-physics-done

Length of output: 34070


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Search for literal CLI option '--strict' in gpd CLI implementation =="
rg -n --context 3 -- "--strict" src/gpd/cli.py

echo
echo "== Search for strict handling specifically tied to deslop-paper/deslopification in src/gpd =="
rg -n --context 3 "(deslop-paper|deslopification).*strict|strict.*(deslop-paper|deslopification)" src/gpd -S

echo
echo "== Search for 'strict' in the deslopification gate/workflow templates =="
rg -n --context 3 "deslopification-gate|DESLOP|deslop scan|deslop-paper|gate_status" src/gpd/specs/workflows src/gpd/commands/deslop-paper.md -S

Repository: psi-oss/get-physics-done

Length of output: 37731


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Exit/strict logic around gate_status & release_blocker_count (deslop) =="
rg -n --context 3 "(deslop|DESLOP|deslopification).*?(gate_status|release_blocker_count|flag_count)" src/gpd -S

echo
echo "== Search for strict gating keyed by 'flag'/'blocker' in general workflow runner =="
rg -n --context 3 "strict.*(gate_status|release_blocker_count|flag_count|FLAG|blocker)" src/gpd -S

echo
echo "== Search for generic command-spec 'strict' parameter propagation into init/run =="
rg -n --context 3 "(raw init|init\\s+|command spec|frontmatter|display_signature|argument-hint).*strict" src/gpd -S

echo
echo "== Search for deslop-paper used as an authority/command spec in registry =="
rg -n --context 3 "deslop-paper" src/gpd/core src/gpd/cli.py -S

Repository: psi-oss/get-physics-done

Length of output: 9401


Fix misleading --strict documentation/behavior for gpd:deslop-paper
src/gpd/commands/deslop-paper.md claims --strict makes the command fail when any unresolved FLAG remains (non-zero exit), but the deslopification implementation only returns non-zero when gate_status is blocked (i.e., release blockers), with no strict input anywhere in the deslopification scan flow. The workflow wiring likewise gates on release_blocker_count, not unresolved non-blocker flags—so the --strict semantics described in the spec/snippets at lines 33-34, 43-44, and 76-77 are not actually enforced. Update the snippets/spec to match reality or wire --strict end-to-end (init → scan/check → exit logic).
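
If --strict is wired end-to-end rather than removed from the docs, the final exit decision could look like the sketch below. The function name `deslop_exit_code`, the `unresolved_flags_count` input, the `"clean"` status string, and the use of exit code 2 (borrowed from deslop check) are assumptions, not the current implementation.

```python
def deslop_exit_code(gate_status: str, release_blocker_count: int,
                     unresolved_flags_count: int, strict: bool = False) -> int:
    """Return the process exit code for a finished deslopification scan."""
    # Existing behavior: release blockers always fail the gate.
    if gate_status == "blocked" or release_blocker_count > 0:
        return 2
    # Proposed behavior: under --strict, any unresolved non-blocker flag fails too.
    if strict and unresolved_flags_count > 0:
        return 2
    return 0


print(deslop_exit_code("clean", 0, 3))               # non-strict: flags are tolerated
print(deslop_exit_code("clean", 0, 3, strict=True))  # strict: flags fail the run
```

Keeping the non-strict branch unchanged preserves today's behavior for workflows that gate only on release_blocker_count.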

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/commands/deslop-paper.md` around lines 33 - 34, The docs claim
gpd:deslop-paper --strict fails on any unresolved FLAG but the code only checks
gate_status/release_blocker_count; fix by wiring --strict end-to-end: add a
--strict boolean to the gpd:deslop-paper CLI handler and pass it into the
deslopification scan flow (propagate through the function/method that performs
deslopification), compute unresolved_flags_count (count of unresolved
non-blocker flags) during the scan alongside release_blocker_count and
gate_status, and update the final exit logic to return non-zero when --strict is
true and unresolved_flags_count > 0 (while preserving existing behavior when
--strict is false). Alternatively, if you prefer not to change behavior, update
the doc snippets to remove/clarify --strict and state that only
release_blocker_count/gate_status cause non-zero exit; make changes in the
gpd:deslop-paper docs and any workflow wiring that currently gates on
release_blocker_count to reflect the chosen semantics.
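
If `--strict` is wired end-to-end, the final exit decision could look like the minimal sketch below. The function name `deslop_exit_code` and the non-blocked status string `"clear"` are assumptions made for illustration; only `gate_status`, `blocked`, and `unresolved_flags_count` come from the finding above.

```python
def deslop_exit_code(gate_status: str, unresolved_flags_count: int, strict: bool = False) -> int:
    """Map a scan result to a process exit code (hypothetical helper)."""
    # Existing behavior described in the review: release blockers always fail.
    if gate_status == "blocked":
        return 1
    # Proposed behavior: with --strict, any unresolved non-blocker flag also fails.
    if strict and unresolved_flags_count > 0:
        return 1
    return 0


if __name__ == "__main__":
    # "clear" stands in for whatever non-blocked status the engine reports.
    print(deslop_exit_code("clear", unresolved_flags_count=3, strict=False))
    print(deslop_exit_code("clear", unresolved_flags_count=3, strict=True))
```

Keeping the blocker check first means `--strict` can only add failures and never masks an existing one.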

Comment on lines +32 to +38
try:
    from gpd.core.arxiv_package import (  # type: ignore
        _CITATION_RE as _CITATION_RE,
        _PLACEHOLDER_RE as _PLACEHOLDER_RE,
        _strip_latex_comments as _strip_latex_comments,
    )
except Exception:  # standalone / demo path


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast on internal import bugs instead of silently using demo fallbacks.

These except Exception blocks also catch runtime errors raised inside gpd.core.arxiv_package / gpd.core.utils, so a broken in-repo dependency quietly downgrades the gate to the local fallback regexes and file helpers. That can change what gets protected or written without any visible failure.

Suggested fix
-try:
+try:
     from gpd.core.arxiv_package import (  # type: ignore
         _CITATION_RE as _CITATION_RE,
         _PLACEHOLDER_RE as _PLACEHOLDER_RE,
         _strip_latex_comments as _strip_latex_comments,
     )
-except Exception:  # standalone / demo path
+except ImportError:  # standalone / demo path
-try:
+try:
     from gpd.core.utils import atomic_write as _atomic_write, safe_read_file as _safe_read  # type: ignore
-except Exception:
+except ImportError:

Also applies to: 58-60

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/core/deslopification.py` around lines 32 - 38, The current broad
"except Exception" around importing _CITATION_RE, _PLACEHOLDER_RE, and
_strip_latex_comments from gpd.core.arxiv_package (and the similar block at
lines ~58-60) masks runtime errors inside those modules; change the handlers to
only catch ImportError/ModuleNotFoundError (e.g., "except ImportError:" or
"except (ImportError, ModuleNotFoundError) as e:") and let any other Exception
propagate (or re-raise) so in-repo failures fail fast instead of silently
falling back to demo replacements.
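
The difference between the two handlers can be sketched in isolation. In the example below, `load_citation_re`, its `importer` parameter, and the fallback regex are all hypothetical stand-ins, not the repository's helpers; the point is only that a missing module triggers the fallback while any other error propagates.

```python
import importlib
import re

# Illustrative fallback only; the real _CITATION_RE in the repository may differ.
_FALLBACK_CITATION_RE = re.compile(r"\\cite\{([^}]*)\}")


def load_citation_re(module_name, importer=importlib.import_module):
    """Return (pattern, source), falling back only when the module is missing."""
    try:
        module = importer(module_name)
    except ImportError:  # also covers ModuleNotFoundError, its subclass
        return _FALLBACK_CITATION_RE, "fallback"
    # Any other exception raised during import is deliberately not caught.
    return module._CITATION_RE, "package"


if __name__ == "__main__":
    pattern, source = load_citation_re("gpd_no_such_module_for_demo")
    print(source, pattern.findall(r"see \cite{a,b}"))
```

Under these assumptions, an importer that raises `RuntimeError` surfaces the error to the caller instead of silently selecting the fallback pattern.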

Comment on lines +78 to +83
_INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)
_PAREN_MATH = re.compile(r"\\\((.+?)\\\)", re.DOTALL)
_DISPLAY_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)
_ENV_MATH = re.compile(
    r"\\begin\{(equation\*?|align\*?|gather\*?|multline\*?|eqnarray\*?)\}(.+?)\\end\{\1\}", re.DOTALL
)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

$$...$$ display math is currently invisible to the invariant checker.

extract_protected_spans never records raw TeX display math, so a change like $$x=1$$ → $$x=2$$ slips through as passed=True. That leaves a direct hole in the “math never changes” guarantee.

Suggested fix
+_DOUBLE_DOLLAR_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)
 _INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)
 _PAREN_MATH = re.compile(r"\\\((.+?)\\\)", re.DOTALL)
 _DISPLAY_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)
-    for rx in (_INLINE_MATH, _PAREN_MATH, _DISPLAY_MATH):
+    for rx in (_DOUBLE_DOLLAR_MATH, _INLINE_MATH, _PAREN_MATH, _DISPLAY_MATH):
         math.update(_norm(m.group(1)) for m in rx.finditer(body))

Also applies to: 104-106

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/core/deslopification.py` around lines 78 - 83, The invariant checker
misses TeX display math delimited by $$...$$ because extract_protected_spans
never records that pattern; add a regex for dollar-delimited display math (e.g.,
a new _DOUBLE_DOLLAR_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$",
re.DOTALL)) and ensure extract_protected_spans treats _DOUBLE_DOLLAR_MATH the
same as _DISPLAY_MATH/_ENV_MATH/_INLINE_MATH so those spans are recorded and any
change to them fails the invariant check; update any other places that iterate
over _INLINE_MATH/_PAREN_MATH/_DISPLAY_MATH (including the loop around lines
104-106) so the new pattern is included in the protected-span extraction logic.
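
A self-contained sketch of the idea, assuming the two regexes quoted in this comment: `math_spans` and `math_unchanged` are hypothetical names, and the real `extract_protected_spans` covers more span types than shown here. The sketch consumes `$$...$$` first and blanks it out, so the single-dollar pattern cannot re-match the inner delimiters.

```python
import re

_DOUBLE_DOLLAR_MATH = re.compile(r"(?<!\\)\$\$(.+?)(?<!\\)\$\$", re.DOTALL)
_INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)


def _norm(text):
    """Collapse whitespace so reflowed math still compares equal."""
    return " ".join(text.split())


def math_spans(body):
    """Collect ($$...$$ and $...$) math spans as (kind, normalized_text) pairs."""
    spans = set()

    def _take_display(match):
        spans.add(("display", _norm(match.group(1))))
        return " "  # blank the span before the inline pass runs

    rest = _DOUBLE_DOLLAR_MATH.sub(_take_display, body)
    spans.update(("inline", _norm(m.group(1))) for m in _INLINE_MATH.finditer(rest))
    return spans


def math_unchanged(before, after):
    """True when an edit left every recorded math span untouched."""
    return math_spans(before) == math_spans(after)


if __name__ == "__main__":
    print(math_unchanged("Let $$x=1$$ hold.", "Let $$x=2$$ hold."))
    print(math_unchanged("We see that $$x=1$$ holds.", "Hence $$x=1$$ holds."))
```

Tagging each span with its kind is a design choice of this sketch: it also rejects an edit that silently turns display math into inline math.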

Comment on lines +91 to +105
</math_physics_tell_catalogue>

<concept_intro_gate>
For every nonstandard term/symbol, the notation ledger records `{object, kind, first_use,
first_definition, motivation_before_definition, example_present, nonexample_or_boundary_case_present,
used_in_theorem_or_proof, one_use_only, status}`. Fail-closed checks (flag, do not auto-fix):

```
CONCEPT_INTRO_GATE:
require first_definition <= first_technical_use
require motivation before definition unless locally standard
require example or boundary case for new named machinery
require symbol-collision check against universal conventions (∇, ∂, ℏ, ...)
flag one-use symbols
```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix markdown/table and fenced-code lint violations in this section.

Line 91 breaks table pipe/column rules, and Line 98 has a fenced block without language. Also, true_is_forbidden is not valid JSON and may confuse schema readers.

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 91-91: Table pipe style
Expected: leading_and_trailing; Actual: no_leading_or_trailing; Missing leading pipe

(MD055, table-pipe-style)


[warning] 91-91: Table pipe style
Expected: leading_and_trailing; Actual: no_leading_or_trailing; Missing trailing pipe

(MD055, table-pipe-style)


[warning] 91-91: Table column count
Expected: 3; Actual: 1; Too few cells, row will be missing data

(MD056, table-column-count)


[warning] 98-98: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/specs/references/publication/deslopification-gate.md` around lines 91
- 105, Fix the markup in the CONCEPT_INTRO_GATE block: correct the broken
table/pipe syntax (ensure rows use consistent | column separators or remove the
stray table delimiter around the `<concept_intro_gate>` section), add a language
identifier to the fenced code block (e.g., ```yaml) to satisfy fenced-code
linting, and replace the invalid token `true_is_forbidden` with valid JSON/YAML
boolean or a quoted string (e.g., true or "true_is_forbidden") so schema readers
can parse the gate definition; refer to the CONCEPT_INTRO_GATE block and the
fenced code block shown in the diff when making these edits.
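
For readers unfamiliar with the gate being linted, the `CONCEPT_INTRO_GATE` rules quoted above can be read as a pure function over one notation-ledger entry. The sketch below is an interpretation, not the repository's code: the `locally_standard` field, the `"named_machinery"` kind value, the flag names, the use of line numbers for `first_use`/`first_definition`, and the three-symbol convention set are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

UNIVERSAL_SYMBOLS = {"∇", "∂", "ℏ"}  # illustrative subset of universal conventions


@dataclass
class LedgerEntry:
    object: str
    kind: str
    first_use: int                      # line of first technical use (assumed unit)
    first_definition: Optional[int]     # None when never defined
    motivation_before_definition: bool
    example_present: bool
    nonexample_or_boundary_case_present: bool
    one_use_only: bool
    locally_standard: bool = False      # hypothetical field, not in the quoted ledger


def concept_intro_flags(entry: LedgerEntry) -> List[str]:
    """Return flags only; the gate is fail-closed and never auto-fixes."""
    flags = []
    if entry.first_definition is None or entry.first_definition > entry.first_use:
        flags.append("defined_after_first_use")
    if not entry.motivation_before_definition and not entry.locally_standard:
        flags.append("missing_motivation")
    has_case = entry.example_present or entry.nonexample_or_boundary_case_present
    if entry.kind == "named_machinery" and not has_case:
        flags.append("missing_example_or_boundary_case")
    if entry.object in UNIVERSAL_SYMBOLS:
        flags.append("symbol_collision")
    if entry.one_use_only:
        flags.append("one_use_symbol")
    return flags
```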

Comment on lines +126 to +129
`ci` mode sets `blocked` if any release blocker remains: public-facing scaffolding leakage,
placeholder/submission-time-check citations, missing/incomplete audit coverage, or unresolved
notation/concept-order flags.
</gate_status>


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

ci blocking semantics here exceed what the engine currently enforces.

This section says ci blocks for missing audit coverage and unresolved notation/concept-order flags, but src/gpd/core/deslopification.py:307-339 sets gate_status="blocked" only from release_blocker findings produced by tell detectors. That contract gap can let CI pass when this spec says it must fail.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/specs/references/publication/deslopification-gate.md` around lines
126 - 129, The spec says `ci` should block for missing audit coverage and
unresolved notation/concept-order flags, but the gate_status is only set to
"blocked" when a finding is marked release_blocker; update the gate-status
calculation in the deslopification result assembly (the code that currently
assigns gate_status="blocked" from release_blocker findings) to also treat
findings representing missing audit coverage and unresolved
notation/concept-order flags as blocking: either treat those finding
types/severities as release_blocker-equivalent or add explicit checks that set
gate_status="blocked" when any finding has kind/type/flag indicating
missing_audit_coverage or notation/concept-order unresolved (reference the
variables/fields release_blocker, gate_status and the detector outputs from tell
detectors in the deslopification result code).
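
One way to close the gap is to treat the spec's extra conditions as blocker-equivalent when the result is assembled. The following is a rough sketch under assumed names: `summarize_gate`, the `kind` key, the three kind strings, and the `"clear"` status are hypothetical, while `release_blocker`, `gate_status`, and `release_blocker_count` are taken from the comment.

```python
# Hypothetical finding kinds; the real detector tags may differ.
SPEC_BLOCKING_KINDS = {
    "missing_audit_coverage",
    "unresolved_notation_order",
    "unresolved_concept_order",
}


def summarize_gate(findings):
    """Return gate_status and release_blocker_count for a list of finding dicts."""
    blockers = [
        f for f in findings
        # Block on explicit release blockers and on the spec-listed kinds.
        if f.get("release_blocker") or f.get("kind") in SPEC_BLOCKING_KINDS
    ]
    status = "blocked" if blockers else "clear"
    return {"gate_status": status, "release_blocker_count": len(blockers)}


if __name__ == "__main__":
    print(summarize_gate([{"kind": "missing_audit_coverage"}]))
```

Deriving both fields from the same list keeps `gate_status` and `release_blocker_count` from disagreeing, which matters because the workflow wiring gates on the count.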

Comment on lines +13 to +22
"edit_id": "DSE-0042",
"location": {"file": "main.tex", "line_start": 281, "line_end": 286},
"original": "Disposition tag. Per PFAFFIAN-APPLICABILITY.md §7.3, the disposition is (b) CONDITIONAL HOLDS...",
"new": "The decidability result is conditional on Conjecture CPA. Existing Pfaffian and cellular-decomposition results provide the framework, but the mirror-octic constants have not yet been computed.",
"tell_addressed": "agent_scaffolding_leakage",
"rationale": "Removes internal project-provenance language while preserving the public conditional status.",
"meaning_preserving": "yes",
"protected_spans_changed": false,
"claim_ledger_changed": false
}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Audit schema fields do not match the current producer output.

The schema requires location.file/line_start/line_end and claim_ledger_changed, but src/gpd/core/deslopification.py:253-276 currently writes location: {"line": ...} and no claim_ledger_changed. Also the markdown-rendered totals in src/gpd/core/deslopification.py:279-287 do not include claim_ledger_changed=0 as specified here. Please align spec and producer to one canonical shape.

Also applies to: 49-50

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/specs/templates/paper/deslop-audit-schema.md` around lines 13 - 22,
The audit entry producer in src/gpd/core/deslopification.py (around the block at
lines 253-276 and the markdown totals at 279-287) currently emits location:
{"line": ...} and omits claim_ledger_changed; update that code path to produce
the canonical shape expected by the spec: emit location as {"file": <filename>,
"line_start": <start>, "line_end": <end>} and add claim_ledger_changed (boolean,
default false, as in the schema example) to each audit entry object, and also
include claim_ledger_changed=0 as a count in the markdown-rendered totals so the
rendered totals match the schema.
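
A producer that emits the canonical shape might look like this minimal sketch. The helper names `make_audit_entry` and `render_totals`, and the exact totals line format, are assumptions; the field names and the boolean `claim_ledger_changed` follow the schema excerpt quoted above.

```python
def make_audit_entry(edit_id, file, line_start, line_end, original, new,
                     tell_addressed, rationale):
    """Build one audit entry in the schema's shape (hypothetical helper)."""
    return {
        "edit_id": edit_id,
        "location": {"file": file, "line_start": line_start, "line_end": line_end},
        "original": original,
        "new": new,
        "tell_addressed": tell_addressed,
        "rationale": rationale,
        # Assumption: only edits that passed the invariant checker are written.
        "meaning_preserving": "yes",
        "protected_spans_changed": False,
        "claim_ledger_changed": False,
    }


def render_totals(entries):
    """Render a one-line totals summary; the format is illustrative."""
    spans = sum(1 for e in entries if e["protected_spans_changed"])
    ledger = sum(1 for e in entries if e["claim_ledger_changed"])
    return (f"edits={len(entries)} protected_spans_changed={spans} "
            f"claim_ledger_changed={ledger}")
```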

Comment on lines +15 to +23
"severity": "blocker | major | minor",
"category": "rigor | notation | citation | physics | evidence | metadata | theorem_status",
"location": {"file": "references.tex", "line_start": 412, "line_end": 414},
"excerpt": "Submission-time check: confirm exact published title...",
"why_not_auto_edited": "The correct bibliographic data must be verified from a source; inventing it would violate the no-misrepresentation rule.",
"recommended_author_action": "Verify the final published title, venue, volume, pages, DOI/arXiv identifier, then rerun bibliography audit.",
"blocks_public_release": true,
"delegate_to": "gpd-bibliographer"
}


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

DESLOP-FLAGS schema diverges from runtime output format.

src/gpd/core/deslopification.py:351-367 emits category as tell tags (e.g., placeholder_or_metadata), location as {line,col}, and omits recommended_author_action and delegate_to. This template defines a different required shape and taxonomy, so consumers cannot reliably parse both.

Also applies to: 26-35

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/specs/templates/paper/deslop-flags-schema.md` around lines 15 - 23,
The DESLOP-FLAGS template schema must match the runtime emission from
src/gpd/core/deslopification.py (around the code that emits DESLOP-FLAGS):
update this template to use the same "category" taxonomy (tell-tag values like
"placeholder_or_metadata"), change "location" to the runtime shape {line:int,
col:int} (instead of file/line_start/line_end), and remove or make optional the
"recommended_author_action" and "delegate_to" fields that the runtime omits so
consumers can parse consistently; ensure field names and types exactly mirror
what deslopification.py emits.
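
Whichever shape is chosen, a small shape check makes the divergence visible to consumers. The sketch below assumes the runtime shape described in this comment (`category` as a tell tag, `location` as `{line, col}`, with `recommended_author_action` and `delegate_to` optional); `flag_shape_problems` is a hypothetical helper, not part of the repository.

```python
def flag_shape_problems(flag):
    """Return a list of mismatches against the assumed runtime flag shape."""
    problems = []
    if not isinstance(flag.get("category"), str):
        problems.append("category must be a tell-tag string")
    location = flag.get("location")
    if not isinstance(location, dict) or set(location) != {"line", "col"}:
        problems.append("location must be {line, col}")
    elif not all(isinstance(location[key], int) for key in ("line", "col")):
        problems.append("line and col must be integers")
    # Fields the runtime omits are accepted when absent, checked when present.
    for optional in ("recommended_author_action", "delegate_to"):
        if optional in flag and not isinstance(flag[optional], str):
            problems.append(f"{optional} must be a string when present")
    return problems
```

A record written in the template's `file`/`line_start`/`line_end` form fails the location check, which is the mismatch the comment describes.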

Comment on lines +117 to +125
3. If `release_blocker_count > 0` (`gate_status: blocked`): do NOT proceed to peer
review.
- `autonomy=yolo`: record blockers in `CRITIQUE-LOG.md` and `gpd_return.issues`;
recommend the author actions in `DESLOP-FLAGS.md` (resolve placeholder citations,
scaffolding leakage, notation order).
- `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location,
why, recommended action) and ask whether to (1) resolve now via the delegated
owner, (2) accept as a known limitation, or (3) hold the manuscript. Re-run this
gate after resolution.


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve the blocker-decision contradiction in supervised/balanced mode.

Line 117 says blocked manuscripts must not proceed, but Line 123 includes “accept as a known limitation,” which conflicts with fail-closed semantics and with Line 125 (“Re-run this gate after resolution”). Remove or redefine option (2) so every allowed path is consistent with blocker resolution before proceeding.

Suggested spec-tightening diff
-  - `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location,
-    why, recommended action) and ask whether to (1) resolve now via the delegated
-    owner, (2) accept as a known limitation, or (3) hold the manuscript. Re-run this
-    gate after resolution.
+  - `autonomy=supervised|balanced`: present the `DESLOP-FLAGS.md` blockers (location,
+    why, recommended action) and ask whether to (1) resolve now via the delegated
+    owner, or (2) hold the manuscript. Re-run this gate after resolution.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/specs/workflows/write-paper/publication-review-finalization.md`
around lines 117 - 125, The spec currently contradicts itself for
supervised|balanced modes: when release_blocker_count > 0 (gate_status: blocked)
the gate must fail-closed and not proceed, but the second option "accept as a
known limitation" allows proceeding; update the supervised|balanced branch (the
block describing autonomy=supervised|balanced and the listed options) to remove
or redefine option (2) so it enforces blocker resolution before
proceeding—either delete the "accept as a known limitation" choice or change it
to a non-proceeding state (e.g., "record as known limitation and put manuscript
on hold") and ensure the text explicitly requires re-running this gate after
resolution; reference release_blocker_count, gate_status,
autonomy=supervised|balanced, DESLOP-FLAGS.md and CRITIQUE-LOG.md when making
the change.
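
The fail-closed reading can be stated as a table of allowed next steps in which no blocked path reaches peer review. This is a sketch under assumed names: `blocked_gate_actions` and the action strings are hypothetical labels for the options discussed above.

```python
def blocked_gate_actions(release_blocker_count, autonomy):
    """List the next steps the workflow may offer (hypothetical helper)."""
    if release_blocker_count == 0:
        return ["proceed_to_peer_review"]
    if autonomy == "yolo":
        # Record blockers in CRITIQUE-LOG.md and gpd_return.issues, then stop.
        return ["record_blockers_and_stop"]
    if autonomy in ("supervised", "balanced"):
        # No "accept as a known limitation" option: every path resolves or holds.
        return ["resolve_now_then_rerun_gate", "hold_manuscript"]
    raise ValueError(f"unknown autonomy mode: {autonomy!r}")
```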
