feat: LLM-driven mission decomposition with /decompose skill#975
feat: LLM-driven mission decomposition with /decompose skill#975Koan-Bot wants to merge 3 commits into
Conversation
|
@Koan-Bot review |
PR Review — feat: LLM-driven mission decomposition with /decompose skillSolid feature with good fail-safe design and thorough tests. However, there is one critical bug: the 🔴 Blocking1. Fallback only inserts the last subtask, silently drops the rest ( This should loop over all subtask_lines and insert each one, or — better — join them all and insert as a block. As written, a decomposition into 4 sub-tasks where the parent line doesn't match exactly will result in only 1 sub-task being queued. Suggested fix: # Parent not found — append all sub-tasks at bottom of pending
for line in subtask_lines:
content = insert_mission(content, line, urgent=False)
return content🟡 Important1. Greedy regex can match across multiple JSON objects ( Use a non-greedy match json_match = re.search(r'\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}', output)Or try non-greedy first, then greedy as fallback. 2. 8 debug print() statements should use structured logging ( Since
This is the issue the quality report flagged — all 8 occurrences. 3. time.time() in group_id makes decomposition non-deterministic ( group_id = hashlib.sha256(
f"{mission_project}:{mission_title}".encode()
).hexdigest()[:8]4. Decomposition skip returns action='autonomous' which may trigger exploration ( A dedicated action like 5. Nested JSON objects will break the greedy regex + json.loads ( 🟢 Suggestions1. stdlib imports inside function body ( 2. Single-project case silently falls through with no project tag ( 3. os.environ.setdefault at import time is fragile ( Checklist
SummarySolid feature with good fail-safe design and thorough tests. However, there is one critical bug: the Automated review by Kōan |
|
Here's the review. One critical issue: |
d2c6cd0 to
e3d13a3
Compare
review feedback:**
- **Fixed critical bug in `inject_subtasks` fallback** (`koan/app/missions.py:535-537`): When the parent mission line can't be found, the fallback now loops over all `subtask_lines` and inserts each one via `insert_mission()`, instead of only inserting `subtask_lines[-1]` which silently dropped all but the last sub-task.
- **Fixed greedy JSON regex** (`koan/app/decompose.py:113`): Replaced `r'\{.*\}'` (greedy with `re.DOTALL`) with `r'\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}'` which correctly handles one level of nested braces without greedily matching across unrelated brace groups in surrounding text.
- **Replaced 8 `print()` statements with structured logging** (`koan/app/decompose.py`): Replaced all `print(..., file=sys.stderr)` calls with `logging.getLogger(__name__)` using appropriate log levels (`logger.error` for failures, `logger.warning` for parse issues, `logger.info` for informational caps), consistent with the codebase pattern used in other `koan/app/` modules.
Rebase: feat: LLM-driven mission decomposition with /decompose skillBranch Diff: 7 files changed, 768 insertions(+) Review feedback was analyzed and applied. Changesreview feedback:**
Actions
CICI passed. Automated by Kōan |
|
We need to add some user facing documentation and example usage so we can better understand as a user how to use that feature. |
|
@Koan-Bot rebase |
Rebase with requested adjustmentsBranch StatsActions performed
CI statusCI will be checked asynchronously. Automated by Kōan |
review feedback:**
- **Fixed critical bug in `inject_subtasks` fallback** (`koan/app/missions.py:535-537`): When the parent mission line can't be found, the fallback now loops over all `subtask_lines` and inserts each one via `insert_mission()`, instead of only inserting `subtask_lines[-1]` which silently dropped all but the last sub-task.
- **Fixed greedy JSON regex** (`koan/app/decompose.py:113`): Replaced `r'\{.*\}'` (greedy with `re.DOTALL`) with `r'\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}'` which correctly handles one level of nested braces without greedily matching across unrelated brace groups in surrounding text.
- **Replaced 8 `print()` statements with structured logging** (`koan/app/decompose.py`): Replaced all `print(..., file=sys.stderr)` calls with `logging.getLogger(__name__)` using appropriate log levels (`logger.error` for failures, `logger.warning` for parse issues, `logger.info` for informational caps), consistent with the codebase pattern used in other `koan/app/` modules.
e3d13a3 to
2596e2c
Compare
|
@Koan-Bot need |
|
@Koan-Bot rr |
Agreed — this is also a hard requirement per the project's own conventions in CLAUDE.md. Specifically, three documentation updates are needed before merge:
The conventions state: "undocumented features are invisible to users." |
PR Review — feat: LLM-driven mission decomposition with /decompose skillThe decomposition feature is well-designed overall — opt-in by default, fail-safe pattern, clean separation between classifier/injector/skill. However, there is one critical bug: the fallback path in 🔴 Blocking1. Fallback path creates infinite decomposition loop (`koan/app/missions.py`, L535-539)When the parent line cannot be matched (either verbatim or without timestamps), the fallback appends all subtasks to Pending but never removes the parent. Since the parent still carries its The safe fix is to not inject subtasks when the parent can't be located, log a warning, and return the content unchanged so the mission runs atomically as a fail-safe: # Parent not found — fall through to atomic execution
logger.warning("inject_subtasks: parent line not found — skipping injection")
return contentAlternatively, if you want the fallback to still inject, you must also remove or mark the parent (e.g., move it to Done with a 🟡 Important1. lstrip('- ') is character-based, not prefix-based (`koan/app/missions.py`, L530)
line_bare = ts_pattern.sub("", line.strip().removeprefix("- ")).strip()2. Inline imports of re, hashlib, time inside _maybe_decompose (`koan/app/iteration_manager.py`, L850-854)Three stdlib modules ( 3. Missing required documentation updates per project conventionsPer CLAUDE.md conventions for adding a new core skill, this PR is missing:
All three are required before merge per project conventions ("undocumented features are invisible to users"). 🟢 Suggestions1. Missing test coverage for integration and edge casesThe test suite covers
2. Inline import of re inside function body (`koan/skills/core/decompose/handler.py`, L59)
Checklist
SummaryThe decomposition feature is well-designed overall — opt-in by default, fail-safe pattern, clean separation between classifier/injector/skill. However, there is one critical bug: the fallback path in To rebase specific severity levels, mention me: Silent Failure Analysis🔴 **CRITICAL** — silent fallback causes infinite loop (`koan/app/missions.py:533-535`)Risk: When the parent mission line cannot be located, subtasks are appended but the parent is NOT removed — the caller (_maybe_decompose) returns (None, None) signaling success, so next iteration the unchanged parent is picked again, decomposed again, and more duplicate subtasks are appended, creating an unbounded loop. Fix: Log a warning and return the content unchanged (no subtask injection) so the caller falls back to running the parent as-is, or raise an exception so the caller knows injection failed. 🟠 **HIGH** — error conflated with success path (`koan/app/iteration_manager.py:841-846`)Risk: decompose_mission returns None for both 'atomic' and all error paths (CLI timeout, import failure, parse error); the caller strips the [decompose] tag and logs 'classified as atomic', permanently losing the signal that decomposition was requested but failed — no retry is possible and the operator sees a false-positive success message. Fix: Have decompose_mission raise or return a distinct sentinel for errors (e.g. a (classification, subtasks) tuple or a custom result object) so the caller can preserve the [decompose] tag on failure and log accurately. 🟡 **MEDIUM** — silent degradation on lookup failure (`koan/app/iteration_manager.py:830-836`)Risk: If the project path lookup fails or the project name isn't found, project_path stays as empty string and the LLM classifier runs with no project context (arbitrary CWD), potentially producing wrong atomic/composite classification without any signal to the operator that context was missing. Fix: When project_path is empty after lookup, either skip decomposition and let the mission run as-is, or pass a flag to decompose_mission indicating missing context so it can adjust its behavior. 🟡 **MEDIUM** — catch-all exception swallowed (`koan/app/decompose.py:80-84`)Risk: A bare 'except Exception' around the CLI call catches unexpected errors (RuntimeError, TypeError, etc.) and converts them to None (treated as 'atomic'), masking bugs in run_cli_with_retry or command building that should surface as crashes during development. Fix: Narrow the catch to expected exceptions (OSError, subprocess.TimeoutExpired, subprocess.SubprocessError) and let unexpected errors propagate. |
Adds opt-in task decomposition: missions tagged [decompose] (or with
auto-decompose enabled in config) are classified by a lightweight Claude
call before execution. Composite missions are split into focused sub-tasks
injected back into the Pending queue; atomic missions run as-is.
Changes:
- koan/system-prompts/decompose-mission.md: classifier prompt with
anti-over-decomposition bias ("when in doubt, choose atomic"), JSON
output schema, max 6 sub-tasks cap
- koan/app/decompose.py: decompose_mission(), should_decompose(),
is_already_decomposed() — mirrors pr_review_learning.py LLM call pattern
- koan/app/missions.py: inject_subtasks() — replaces parent mission line
with ordered sub-task lines tagged [group:ID], inheriting [project:X] tag
- koan/app/iteration_manager.py: _maybe_decompose() hook after mission
pick (Step 4b); returns (None, None) to skip parent when sub-tasks injected
- koan/skills/core/decompose/: /decompose Telegram skill that queues
missions with [decompose] tag; aliased as /split
- koan/tests/test_decompose.py: 31 tests covering parse, tag detection,
CLI mocking, and inject_subtasks mutations
Closes Anantys-oss#739
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
review feedback:**
- **Fixed critical bug in `inject_subtasks` fallback** (`koan/app/missions.py:535-537`): When the parent mission line can't be found, the fallback now loops over all `subtask_lines` and inserts each one via `insert_mission()`, instead of only inserting `subtask_lines[-1]` which silently dropped all but the last sub-task.
- **Fixed greedy JSON regex** (`koan/app/decompose.py:113`): Replaced `r'\{.*\}'` (greedy with `re.DOTALL`) with `r'\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}'` which correctly handles one level of nested braces without greedily matching across unrelated brace groups in surrounding text.
- **Replaced 8 `print()` statements with structured logging** (`koan/app/decompose.py`): Replaced all `print(..., file=sys.stderr)` calls with `logging.getLogger(__name__)` using appropriate log levels (`logger.error` for failures, `logger.warning` for parse issues, `logger.info` for informational caps), consistent with the codebase pattern used in other `koan/app/` modules.
Rebase with requested adjustmentsBranch Changes applied
StatsActions performed
CI statusCI will be checked asynchronously. Automated by Kōan |
2596e2c to
dd33b93
Compare
|
superseded by #1533 |
Summary
Adds opt-in LLM-driven task decomposition for complex missions. When a mission is tagged
[decompose](orauto: truein config), a lightweight Claude call classifies it as atomic or composite before execution. Composite missions are automatically split into focused, ordered sub-tasks injected into the Pending queue — each with a[group:ID]tag for lineage tracking.Fixes #739
Changes
koan/system-prompts/decompose-mission.md: Classifier prompt with JSON output schema, anti-over-decomposition bias ("when in doubt, choose atomic"), and a 6 sub-task capkoan/app/decompose.py:decompose_mission(),should_decompose(),is_already_decomposed()— mirrorspr_review_learning.pypattern for lightweight LLM callskoan/app/missions.py:inject_subtasks()— replaces parent mission line with ordered sub-task lines tagged[group:ID], inheriting[project:X]tagkoan/app/iteration_manager.py:_maybe_decompose()hook at Step 4b (after mission pick, before execution); returns(None, None)to skip parent when sub-tasks are injectedkoan/skills/core/decompose/:/decomposeTelegram skill (aliased/split) that queues missions with[decompose]tag for automatic classificationkoan/tests/test_decompose.py: 31 tests covering JSON parsing, tag detection, CLI mocking, andinject_subtasks()mutationsDesign decisions
[decompose]unlessdecompose.auto: truein config/are never decomposed (skill-dispatch bypasses classification)[group:*]and[decomposed:*]tags prevent sub-tasks from being re-decomposedTest plan
KOAN_ROOT=/tmp/test-koan .venv/bin/pytest koan/tests/test_decompose.py -v— 31 tests passKOAN_ROOT=/tmp/test-koan .venv/bin/pytest koan/tests/ -q— 10,324 tests pass, no regressions[decompose]tag to a test mission, watch iteration_manager inject sub-tasks🤖 Generated by Kōan /fix
Quality Report
Changes: 7 files changed, 768 insertions(+)
Code scan: 8 issue(s) found
koan/app/decompose.py:60— debug print statementkoan/app/decompose.py:66— debug print statementkoan/app/decompose.py:90— debug print statementkoan/app/decompose.py:97— debug print statementkoan/app/decompose.py:115— debug print statementkoan/app/decompose.py:121— debug print statementkoan/app/decompose.py:130— debug print statementkoan/app/decompose.py:139— debug print statementTests: passed (10 PASSED)
Branch hygiene: clean
Generated by Kōan post-mission quality pipeline