Add source-grounded gpd:ideate MVP#249
Conversation
|
Adam Levine seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 WalkthroughWalkthroughAdds the gpd:ideate MVP: command/workflow, source normalization, durable blackboard/transcript/report artifacts, three internal agent specs (paper-digester, ideator, critic), templates, tests, and supporting docs/runbooks. Changesgpd:ideate MVP
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
src/gpd/specs/workflows/ideate.md (2)
210-306: 💤 Low valueOptional: Add language specifiers to code fences.
Static analysis flagged code blocks at lines 237 and 272 for missing language specifiers. Adding
textorpythonwould silence the warnings.📝 Optional fix
Line 237:
-``` +```text task(Line 272:
-``` +```text task(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/specs/workflows/ideate.md` around lines 210 - 306, Two fenced code blocks containing the task(...) specs for the ideation and critic spawns (the blocks that use IDEATOR_MODEL and CRITIC_MODEL / subagent_type="gpd-ideator" and subagent_type="gpd-ideation-critic") are missing language specifiers; add a language tag (e.g., ```text) to the opening fence of each of those two blocks so static analysis warnings are silenced and the blocks render consistently.
161-208: 💤 Low valueOptional: Add language specifier to code fence.
Static analysis flagged the code block at line 172 for missing a language specifier. While this is low priority since the block contains structured spawn syntax rather than executable shell, adding
textorpythonwould silence the warning.📝 Optional fix
-``` +```text task(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/specs/workflows/ideate.md` around lines 161 - 208, Add a language specifier to the fenced code block that begins with the spawn/task manifest (the block containing task( ... <spawn_contract> ... ) so the Markdown linter stops flagging it; replace the opening ``` with a language tag such as ```text (or ```ini/```yaml if you prefer a more specific hint) for the block that includes the task( and <spawn_contract> constructs, leaving the block contents unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/gpd/core/ideate_blackboard.py`:
- Around line 78-79: The _TURN_HEADING_RE is too permissive and matches "## Turn
N" inside message bodies; change it to only match top-level markdown turn
headings by requiring a document start or blank line before the heading and a
blank line (or end-of-string) after it — e.g. replace _TURN_HEADING_RE with a
pattern that uses a lookbehind for start-or-newline ((?<=\A|\n)) and a lookahead
for a blank line or end ((?=\n\s*\n|\Z)) around "##[ \t]+Turn[ \t]+(\d+)\b", and
update the other occurrence referenced in the comment (the second
_TURN_HEADING_RE usage) to use the same stricter regex so turn parsing ignores
in-body matches.
- Around line 748-752: The function _append_turn_body currently calls
body.replace("No turns recorded.", "") which can delete that phrase from earlier
turns; instead only remove that placeholder when the entire body equals (after
stripping) "No turns recorded." so historical content is preserved. Change the
logic in _append_turn_body to test if body.strip() == "No turns recorded." and
if so set body = "" (or body = body.rstrip() -> ""), otherwise leave body
unchanged, then proceed to concatenate "\n\n" + rendered_turn.rstrip() and call
_ensure_trailing_newline as before.
In `@src/gpd/core/ideate_sources.py`:
- Around line 247-249: The code is currently re-tokenizing a single-item
Sequence[str] that contains spaces; change the condition so we only call
_argument_tokens when the original input was a single raw string (not an
already-tokenized Sequence[str]) — e.g., detect whether the caller passed a str
vs a Sequence[str] and if the input is an already-provided sequence (like
list/tuple) preserve arguments as-is by returning tuple(arguments) instead of
calling _argument_tokens; update the branch around arguments and the
_argument_tokens call accordingly.
- Around line 395-435: _scan_directory_sources currently calls
directory.iterdir() which can raise OSError/PermissionError and crash
normalization; wrap the iteration in a try/except that catches OSError and
PermissionError, and on exception return a _DirectoryScan with selected=() and
warnings containing a descriptive message that includes the exception text (so
the caller branch that creates a blocked _NormalizedInput will run). Update
_scan_directory_sources (and any place that builds the warning tuple) to include
the directory path (use _relative_posix(directory, root) if helpful) and the
exception details in the warning so callers of _scan_directory_sources and the
caller logic that checks .selected and .warnings (the code that builds the
blocked IdeateSourceRecord) will produce the blocked row with warnings instead
of crashing.
---
Nitpick comments:
In `@src/gpd/specs/workflows/ideate.md`:
- Around line 210-306: Two fenced code blocks containing the task(...) specs for
the ideation and critic spawns (the blocks that use IDEATOR_MODEL and
CRITIC_MODEL / subagent_type="gpd-ideator" and
subagent_type="gpd-ideation-critic") are missing language specifiers; add a
language tag (e.g., ```text) to the opening fence of each of those two blocks so
static analysis warnings are silenced and the blocks render consistently.
- Around line 161-208: Add a language specifier to the fenced code block that
begins with the spawn/task manifest (the block containing task( ...
<spawn_contract> ... ) so the Markdown linter stops flagging it; replace the
opening ``` with a language tag such as ```text (or ```ini/```yaml if you prefer
a more specific hint) for the block that includes the task( and <spawn_contract>
constructs, leaving the block contents unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: d347a32f-d902-4ff8-95b1-f3c32ee1d575
📒 Files selected for processing (31)
CHANGELOG.mddocs/dev/gpd-ideate-local-demo.mddocs/dev/gpd-ideate-mvp-plan.mdsrc/gpd/agents/gpd-ideation-critic.mdsrc/gpd/agents/gpd-ideator.mdsrc/gpd/agents/gpd-paper-digester.mdsrc/gpd/commands/ideate.mdsrc/gpd/core/config.pysrc/gpd/core/constants.pysrc/gpd/core/ideate_blackboard.pysrc/gpd/core/ideate_sources.pysrc/gpd/registry.pysrc/gpd/specs/references/help/detailed-command-reference.mdsrc/gpd/specs/references/orchestration/model-profiles.mdsrc/gpd/specs/templates/ideate-blackboard.mdsrc/gpd/specs/templates/ideate-transcript.mdsrc/gpd/specs/templates/ideation-report.mdsrc/gpd/specs/workflows/help.mdsrc/gpd/specs/workflows/ideate.mdsrc/gpd/specs/workflows/set-profile.mdtests/README.mdtests/core/test_agent_prompt_budget.pytests/core/test_agent_spawn_policy.pytests/core/test_command_prompt_budget.pytests/core/test_config.pytests/core/test_ideate_blackboard.pytests/core/test_ideate_prompt_contract.pytests/core/test_ideate_sources.pytests/core/test_spawn_contract_inventory.pytests/repo_graph_contract.jsontests/test_metadata_consistency.py
| _TURN_HEADING_RE = re.compile(r"(?m)^##[ \t]+Turn[ \t]+(\d+)\b.*$") | ||
|
|
There was a problem hiding this comment.
Turn-number parsing is too broad and can reject valid updates.
The current turn regex matches any ## Turn N line anywhere in the transcript body, including user/model message content. That can trigger false non-contiguous-turn failures during append.
Proposed fix
-_TURN_HEADING_RE = re.compile(r"(?m)^##[ \t]+Turn[ \t]+(\d+)\b.*$")
+_TURN_HEADING_RE = re.compile(r"(?m)^##[ \t]+Turn[ \t]+(\d+)[ \t]*$(?=\n\nMetadata:\n)")Also applies to: 755-756
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/gpd/core/ideate_blackboard.py` around lines 78 - 79, The _TURN_HEADING_RE
is too permissive and matches "## Turn N" inside message bodies; change it to
only match top-level markdown turn headings by requiring a document start or
blank line before the heading and a blank line (or end-of-string) after it —
e.g. replace _TURN_HEADING_RE with a pattern that uses a lookbehind for
start-or-newline ((?<=\A|\n)) and a lookahead for a blank line or end
((?=\n\s*\n|\Z)) around "##[ \t]+Turn[ \t]+(\d+)\b", and update the other
occurrence referenced in the comment (the second _TURN_HEADING_RE usage) to use
the same stricter regex so turn parsing ignores in-body matches.
| def _append_turn_body(body: str, rendered_turn: str) -> str: | ||
| body = body.rstrip() | ||
| if "No turns recorded." in body: | ||
| body = body.replace("No turns recorded.", "").rstrip() | ||
| return _ensure_trailing_newline(body + "\n\n" + rendered_turn.rstrip()) |
There was a problem hiding this comment.
Appending a new turn can silently mutate prior turn content.
_append_turn_body removes "No turns recorded." globally. If earlier turn text contains that phrase, it gets deleted on later appends, which rewrites historical transcript content.
Proposed fix
def _append_turn_body(body: str, rendered_turn: str) -> str:
body = body.rstrip()
- if "No turns recorded." in body:
- body = body.replace("No turns recorded.", "").rstrip()
+ body = re.sub(
+ r"(?ms)(^##[ \t]+Turns[ \t]*\n\n)No turns recorded\.[ \t]*\n?",
+ r"\1",
+ body,
+ count=1,
+ ).rstrip()
return _ensure_trailing_newline(body + "\n\n" + rendered_turn.rstrip())🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/gpd/core/ideate_blackboard.py` around lines 748 - 752, The function
_append_turn_body currently calls body.replace("No turns recorded.", "") which
can delete that phrase from earlier turns; instead only remove that placeholder
when the entire body equals (after stripping) "No turns recorded." so historical
content is preserved. Change the logic in _append_turn_body to test if
body.strip() == "No turns recorded." and if so set body = "" (or body =
body.rstrip() -> ""), otherwise leave body unchanged, then proceed to
concatenate "\n\n" + rendered_turn.rstrip() and call _ensure_trailing_newline as
before.
| if len(arguments) == 1 and isinstance(arguments[0], str) and " " in arguments[0]: | ||
| return _argument_tokens(arguments[0]) | ||
| return tuple(str(item) for item in arguments) |
There was a problem hiding this comment.
Preserve already-tokenized Sequence[str] inputs.
At Line 247, a single sequence item containing spaces is re-tokenized, which can break valid one-argument inputs (like quoted file paths with spaces) passed as Sequence[str].
Proposed fix
def _argument_tokens(arguments: str | Sequence[str] | None) -> tuple[str, ...]:
if arguments is None:
return ()
if isinstance(arguments, str):
try:
return tuple(shlex.split(arguments))
except ValueError:
return tuple(part for part in arguments.split() if part)
- if len(arguments) == 1 and isinstance(arguments[0], str) and " " in arguments[0]:
- return _argument_tokens(arguments[0])
return tuple(str(item) for item in arguments)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if len(arguments) == 1 and isinstance(arguments[0], str) and " " in arguments[0]: | |
| return _argument_tokens(arguments[0]) | |
| return tuple(str(item) for item in arguments) | |
| def _argument_tokens(arguments: str | Sequence[str] | None) -> tuple[str, ...]: | |
| if arguments is None: | |
| return () | |
| if isinstance(arguments, str): | |
| try: | |
| return tuple(shlex.split(arguments)) | |
| except ValueError: | |
| return tuple(part for part in arguments.split() if part) | |
| return tuple(str(item) for item in arguments) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/gpd/core/ideate_sources.py` around lines 247 - 249, The code is currently
re-tokenizing a single-item Sequence[str] that contains spaces; change the
condition so we only call _argument_tokens when the original input was a single
raw string (not an already-tokenized Sequence[str]) — e.g., detect whether the
caller passed a str vs a Sequence[str] and if the input is an already-provided
sequence (like list/tuple) preserve arguments as-is by returning
tuple(arguments) instead of calling _argument_tokens; update the branch around
arguments and the _argument_tokens call accordingly.
| scan = _scan_directory_sources(directory, root=root, max_papers=max_papers) | ||
| if not scan.selected: | ||
| return _NormalizedInput( | ||
| records=( | ||
| IdeateSourceRecord( | ||
| source_id="", | ||
| kind="blocked", | ||
| input=raw, | ||
| normalized_ref=_relative_posix(directory, root), | ||
| status="blocked", | ||
| warnings=("Directory contains no supported .tex or .pdf paper sources.", *scan.warnings), | ||
| ), | ||
| ), | ||
| warnings=scan.warnings, | ||
| ) | ||
|
|
||
| records = [ | ||
| _normalize_paper_path( | ||
| _relative_posix(path, root), | ||
| path, | ||
| suffix=path.suffix.lower(), | ||
| root=root, | ||
| kdoc_index=kdoc_index, | ||
| kind="directory_item", | ||
| ) | ||
| for path in scan.selected | ||
| ] | ||
| if scan.warnings: | ||
| records[0] = _with_warnings(records[0], scan.warnings) | ||
| return _NormalizedInput(records=tuple(records), warnings=scan.warnings) | ||
|
|
||
|
|
||
| def _scan_directory_sources(directory: Path, *, root: Path, max_papers: int | None) -> _DirectoryScan: | ||
| paper_candidates = [ | ||
| path | ||
| for path in sorted(directory.iterdir(), key=lambda item: item.name.casefold()) | ||
| if path.is_file() | ||
| and not path.name.startswith(".") | ||
| and path.name.lower() not in _IGNORED_DIRECTORY_FILENAMES | ||
| and path.suffix.lower() in _SUPPORTED_PAPER_SUFFIXES | ||
| ] |
There was a problem hiding this comment.
Handle unreadable directories without crashing source normalization.
At Line 430, directory.iterdir() can raise OSError/PermissionError for valid-but-unreadable paths, which currently aborts normalization instead of returning a blocked row with warnings.
Proposed fix
def _normalize_directory(
@@
) -> _NormalizedInput:
scan = _scan_directory_sources(directory, root=root, max_papers=max_papers)
if not scan.selected:
+ base_warning: tuple[str, ...] = ()
+ if not scan.warnings:
+ base_warning = ("Directory contains no supported .tex or .pdf paper sources.",)
return _NormalizedInput(
records=(
IdeateSourceRecord(
source_id="",
kind="blocked",
input=raw,
normalized_ref=_relative_posix(directory, root),
status="blocked",
- warnings=("Directory contains no supported .tex or .pdf paper sources.", *scan.warnings),
+ warnings=(*base_warning, *scan.warnings),
),
),
warnings=scan.warnings,
)
@@
def _scan_directory_sources(directory: Path, *, root: Path, max_papers: int | None) -> _DirectoryScan:
+ try:
+ entries = sorted(directory.iterdir(), key=lambda item: item.name.casefold())
+ except OSError as exc:
+ return _DirectoryScan(
+ selected=(),
+ warnings=(f"Unable to read ideate source directory {_relative_posix(directory, root)}: {exc}",),
+ )
paper_candidates = [
path
- for path in sorted(directory.iterdir(), key=lambda item: item.name.casefold())
+ for path in entries
if path.is_file()
and not path.name.startswith(".")
and path.name.lower() not in _IGNORED_DIRECTORY_FILENAMES
and path.suffix.lower() in _SUPPORTED_PAPER_SUFFIXES
]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/gpd/core/ideate_sources.py` around lines 395 - 435,
_scan_directory_sources currently calls directory.iterdir() which can raise
OSError/PermissionError and crash normalization; wrap the iteration in a
try/except that catches OSError and PermissionError, and on exception return a
_DirectoryScan with selected=() and warnings containing a descriptive message
that includes the exception text (so the caller branch that creates a blocked
_NormalizedInput will run). Update _scan_directory_sources (and any place that
builds the warning tuple) to include the directory path (use
_relative_posix(directory, root) if helpful) and the exception details in the
warning so callers of _scan_directory_sources and the caller logic that checks
.selected and .warnings (the code that builds the blocked IdeateSourceRecord)
will produce the blocked row with warnings instead of crashing.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/gpd/specs/workflows/ideate.md (1)
148-181: 💤 Low valueAdd language identifiers to code fences for markdown consistency.
The code fences at lines 148 and 208 are missing language identifiers. While these blocks show template/pseudo-code patterns rather than executable code, adding an identifier (e.g.,
textoryaml) would satisfy markdown linting rules and improve consistency.📝 Suggested language identifier addition
For line 148:
-``` +```text task( subagent_type="gpd-paper-digester",For line 208:
-``` +```text task( subagent_type="gpd-ideator",Also applies to: 208-241
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/gpd/specs/workflows/ideate.md` around lines 148 - 181, Add language identifiers to the fenced code blocks wrapping the task definitions to satisfy markdown linting: locate the task(...) block that begins with subagent_type="gpd-paper-digester" (the first task block) and the later task(...) block with subagent_type="gpd-ideator", and change their opening fences from ``` to ```text (or ```yaml if you prefer structured rendering) so both code fences declare a language identifier for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/gpd/specs/workflows/ideate.md`:
- Around line 148-181: Add language identifiers to the fenced code blocks
wrapping the task definitions to satisfy markdown linting: locate the task(...)
block that begins with subagent_type="gpd-paper-digester" (the first task block)
and the later task(...) block with subagent_type="gpd-ideator", and change their
opening fences from ``` to ```text (or ```yaml if you prefer structured
rendering) so both code fences declare a language identifier for consistency.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 2c8327e9-79cd-4189-bb53-89e25bd711f4
📒 Files selected for processing (14)
src/gpd/agents/gpd-ideation-critic.mdsrc/gpd/agents/gpd-ideator.mdsrc/gpd/agents/gpd-paper-digester.mdsrc/gpd/commands/ideate.mdsrc/gpd/core/return_fields.pysrc/gpd/specs/workflows/ideate.mdtests/adapters/test_runtime_projection_diagnostics_budget.pytests/core/test_agent_prompt_budget.pytests/core/test_cli.pytests/core/test_command_prompt_budget.pytests/core/test_ideate_blackboard.pytests/core/test_ideate_prompt_contract.pytests/core/test_ideate_sources.pytests/core/test_prompt_surface_diagnostics_budget.py
✅ Files skipped from review due to trivial changes (1)
- src/gpd/commands/ideate.md
🚧 Files skipped from review as they are similar to previous changes (6)
- tests/core/test_command_prompt_budget.py
- tests/core/test_agent_prompt_budget.py
- tests/core/test_ideate_prompt_contract.py
- tests/core/test_ideate_sources.py
- src/gpd/agents/gpd-ideator.md
- tests/core/test_ideate_blackboard.py
|
🤖 RoastBot: "Source-grounded ideation." That's just reading, Adam. |
Summary
Verification
Demo
Summary by CodeRabbit
New Features
Documentation
Tests
Chores