feat(sdlc): SDLC state persistence layer (#1704) #1813
tomgreen981111-cipher wants to merge 1 commit into
…tracking (Anantys-oss#1704) Implements instance/sdlc/{issue_name}/ workspace layout with atomic STATE.json reads/writes, phase artifact paths, and workspace archival for terminal phases (PRODUCTION_READY, ABANDONED). Foundation required by Anantys-oss#1707 (/sdlc orchestrator) and Anantys-oss#1706 (approval checkpoint) — both need a place to read and persist phase state between missions.
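The archival rule in the commit message (only terminal phases are archived) can be sketched roughly as follows. This is an illustration, not the code in this PR: `Phase`, `archive_workspace`, the lowercase enum values, and the choice to add the timestamp suffix only when an archive name is already taken are all assumptions. Only the phase names and the `_archived` directory come from this page.

```python
import shutil
from datetime import datetime, timezone
from enum import Enum
from pathlib import Path
from typing import Optional


class Phase(str, Enum):
    # Only the phases named on this page; the string values are assumed.
    RESEARCH = "research"
    IMPLEMENTATION = "implementation"
    PRODUCTION_READY = "production_ready"
    ABANDONED = "abandoned"

    @property
    def is_terminal(self) -> bool:
        return self in (Phase.PRODUCTION_READY, Phase.ABANDONED)


def archive_workspace(sdlc_root: Path, issue_name: str, phase: Phase) -> Optional[Path]:
    """Move a terminal workspace under _archived/; return the new path or None."""
    if not phase.is_terminal:
        return None  # active workspaces stay where they are
    source = sdlc_root / issue_name
    if not source.is_dir():
        return None
    archive_root = sdlc_root / "_archived"
    archive_root.mkdir(parents=True, exist_ok=True)
    target = archive_root / issue_name
    if target.exists():
        # Assumed policy: keep the earlier archive and suffix the new one.
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
        target = archive_root / f"{issue_name}-{stamp}"
    shutil.move(str(source), str(target))
    return target
```

Returning `None` for non-terminal phases keeps the call safe to make unconditionally at the end of each mission.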
PR Review - feat(sdlc): SDLC state persistence layer (#1704)

Solid foundation for SDLC state persistence with comprehensive test coverage. Three warnings need attention before merge.
🟡 Important
1. Path traversal risk in _sanitise_issue_name (`koan/app/sdlc_state.py`, L201-204)
The sanitization replaces unsafe chars but doesn't prevent all path traversal. After sanitization, More concerning: the function strips

🟢 Suggestions
1. Docstring overstates atomic_write_json guarantees (`koan/app/sdlc_state.py`, L170-172)
The docstring claims If the locking is per-process only, concurrent SDLC runs could still corrupt state.

Checklist
To rebase specific severity levels, use:

Silent Failure Analysis

🟡 **MEDIUM** - silent null return on error (`koan/app/sdlc_state.py:108-118`)
Risk: Corrupt JSON or read errors are silently converted to None, making it impossible for callers to distinguish between 'not started' and 'data corruption'.
Fix: Log the exception before returning None, or raise a custom exception for callers that need to handle corruption differently from 'not started'.

🟡 **MEDIUM** - silent fallback on deserialization failure (`koan/app/sdlc_state.py:127-137`)
Risk: Invalid enum values in persisted state are silently replaced with defaults, potentially masking data corruption or version mismatches.
Fix: Log a warning when falling back to default values, or surface the unknown value so operators can detect state file drift.

🟡 **MEDIUM** - unvalidated path construction (`koan/app/sdlc_state.py:169-174`)
Risk: The artifact_name is not validated against SDLC_ARTIFACTS before use, allowing arbitrary file paths to be constructed (potential path injection if caller is untrusted).
Fix: Validate artifact_name against SDLC_ARTIFACTS or use a whitelist check before constructing the path.

🟡 **MEDIUM** - test asserts on non-deterministic value (`koan/tests/test_sdlc_state.py:124-130`)
Risk: The test asserts that started_at is non-empty but doesn't validate its format, allowing malformed timestamps to pass.
Fix: Add a regex or datetime parse assertion to validate the timestamp format matches ISO 8601.

Automated review by Kōan (Ollama-launch · model qwen3.5:cloud)
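One possible shape for the first fix above (log before returning None) is sketched below. It is not taken from `koan/app/sdlc_state.py`: the function name `read_state_file` and the logger name are hypothetical, and the sketch only handles the raw JSON step, not the conversion into a state object.

```python
import json
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger("sdlc_state_sketch")  # hypothetical logger name


def read_state_file(state_path: Path) -> Optional[dict]:
    """Return the parsed STATE.json, or None if it is missing or unusable."""
    if not state_path.exists():
        # 'Not started' is an expected condition, so it stays silent.
        return None
    try:
        data = json.loads(state_path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # json.JSONDecodeError and UnicodeDecodeError both subclass ValueError.
        logger.warning("STATE.json at %s is unreadable or corrupt: %s", state_path, exc)
        return None
    if not isinstance(data, dict):
        logger.warning(
            "STATE.json at %s holds a %s, expected an object",
            state_path,
            type(data).__name__,
        )
        return None
    return data
```

Callers still receive None in both cases, so the existing contract is unchanged, but a corrupt file now leaves a trace in the logs.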
BabyKoan left a comment
Blocking issues found — see the review comment above.
Agree with all three warnings. Most urgent:
PR Review - feat(sdlc): SDLC state persistence layer (#1704)

Well-structured persistence layer with strong test coverage, but
🔴 Blocking
1. Unguarded int() cast in from_dict breaks load_sdlc_state's error contract (`koan/app/sdlc_state.py`, L141)
The function is otherwise carefully defensive: enum fields get try/except with fallback. But this single unguarded cast means any STATE.json with

🟡 Important
1. Exception handler too narrow for the call it wraps (`koan/app/sdlc_state.py`, L108-118)
The except clause catches Two options, either is fine:
Option 2 is cleaner: put resilience where the knowledge lives.

2. get_artifact_path accepts arbitrary filenames (`koan/app/sdlc_state.py`, L169-180)
This is internal code today, but the docstring already mentions the whitelist ("must be one of

if artifact_name not in SDLC_ARTIFACTS:
    raise ValueError(f"Unknown artifact: {artifact_name!r}")

3. Missing test for from_dict with corrupt field types
The test suite covers unknown enum values and missing fields but doesn't exercise Add a test that passes structurally-valid JSON with wrong value types and assert it either returns a sensible default or

Checklist
To rebase specific severity levels, mention me:

Silent Failure Analysis

🟠 **HIGH** - incomplete exception guard (`koan/app/sdlc_state.py:170-175`)
Risk: The docstring promises None for malformed STATE.json, but
Fix: Widen the except clause to

🟡 **MEDIUM** - silent fallback on corrupt state (`koan/app/sdlc_state.py:128-135`)
Risk: An invalid phase or risk value in STATE.json silently resets to defaults with no logging. A corrupted state file could regress a workflow from IMPLEMENTATION back to RESEARCH without any trace.
Fix: Log a warning (via

🟡 **MEDIUM** - silent null return hides corruption (`koan/app/sdlc_state.py:163-175`)
Risk: Callers cannot distinguish 'never started' from 'STATE.json exists but is corrupt'. Both return None, so a corrupted workflow silently appears as unstarted, potentially causing duplicate SDLC runs on the same issue.
Fix: Log a warning when the file exists but can't be parsed, so the silent-restart scenario is at least observable in logs.

🟡 **MEDIUM** - silent name collision (`koan/app/sdlc_state.py:245-252`)
Risk: Any issue name composed entirely of non-alphanumeric characters (e.g.
Fix: Raise

Automated review by Kōan (Claude · model claude-opus-4-6)
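The blocking item and the silent-fallback finding both point at field-level parsing in `from_dict`. A minimal sketch of guarded coercion helpers follows. The helper names `coerce_int` and `coerce_enum`, the logger name, and the `retry_count` field used in the usage lines are hypothetical (the review does not name the integer field), and the `RiskLevel` values are assumed from the Low/Medium/High labels in the PR description.

```python
import logging
from enum import Enum
from typing import Any, Type, TypeVar

logger = logging.getLogger("sdlc_state_sketch")  # hypothetical logger name
E = TypeVar("E", bound=Enum)


class RiskLevel(str, Enum):
    # Stand-in for SdlcRiskLevel; the string values are assumed.
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


def coerce_int(value: Any, default: int, field: str) -> int:
    """int() that falls back to a default instead of raising."""
    try:
        # Note: floats are truncated, e.g. 3.7 becomes 3.
        return int(value)
    except (TypeError, ValueError, OverflowError):
        logger.warning("field %r has non-integer value %r; using %r", field, value, default)
        return default


def coerce_enum(enum_cls: Type[E], value: Any, default: E, field: str) -> E:
    """Enum lookup that logs the unknown value before falling back."""
    try:
        return enum_cls(value)
    except (TypeError, ValueError):
        logger.warning("field %r has unknown value %r; using %r", field, value, default.value)
        return default


# Usage with a hypothetical raw dict loaded from STATE.json:
raw = {"retry_count": "seven", "risk": "Extreme"}
retries = coerce_int(raw.get("retry_count"), 0, "retry_count")
risk = coerce_enum(RiskLevel, raw.get("risk"), RiskLevel.LOW, "risk")
```

Putting the guard next to each field keeps `load_sdlc_state`'s narrow except clause valid, which matches the "resilience where the knowledge lives" preference stated in the review.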
Koan-Bot left a comment
Blocking issues found — see the review comment above.
What
Add `sdlc_state.py`, the persistence foundation for the /sdlc multi-phase orchestration skill.

Why
Issue #1704: without durable cross-mission state, SDLC phases are isolated runs that share zero structured data. A failed mid-implementation run loses all research and architecture work, the orchestrator can't make data-driven routing decisions, and the approval checkpoint (#1706) has nowhere to persist its `approved` flag.

This is the prerequisite for #1707 (/sdlc orchestrator skill) and #1706 (human approval checkpoint).

How
- `SdlcPhase` str-enum (RESEARCH → PRODUCTION_READY / ABANDONED) with `is_terminal` property
- `SdlcRiskLevel` enum (Low/Medium/High)
- `SdlcState` dataclass with `to_dict()` / `from_dict()` round-trip for all field types (lists, dicts, enums, bool)
- `instance/sdlc/{issue_name}/STATE.json` + named artifact files (RESEARCH.md, ADR.md, PLAN.md, IMPLEMENTATION.md, SECURITY.md, QA.md, SRE.md, REVIEW.md, DOCS.md)
- `save_sdlc_state` uses `atomic_write_json` (temp + rename + `fcntl.flock`); a crash mid-write never corrupts existing state
- `archive_sdlc_workspace` moves terminal workspaces to `sdlc/_archived/` without clobbering prior archives (timestamp suffix)
- `list_sdlc_workspaces` returns active (non-archived) workspace names

Testing
55 unit tests covering: phase/risk enum round-trips, `from_dict` with unknown/missing fields, `get_sdlc_workspace` idempotency and path safety, `load_sdlc_state` resilience (missing workspace, corrupt JSON, empty file), save/load cycle with multiple isolated issues, artifact path computation, archive behaviour for terminal vs active phases, and list excluding `_archived`.

Closes #1704
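
For readers unfamiliar with the temp + rename + `fcntl.flock` pattern mentioned under How, here is a minimal sketch of the general technique. It is not the project's `atomic_write_json`: the function name `atomic_write_json_sketch` and the sidecar `.lock` file are assumptions, and `fcntl` makes it Unix-only.

```python
import fcntl
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def atomic_write_json_sketch(path: Path, payload: Any) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial."""
    path.parent.mkdir(parents=True, exist_ok=True)
    lock_path = path.with_name(path.name + ".lock")  # assumed sidecar lock file
    with open(lock_path, "a") as lock_file:
        # Advisory lock: serialises writers that also take this lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        # The temp file lives in the same directory so the rename stays
        # on one filesystem.
        fd, tmp_name = tempfile.mkstemp(
            dir=path.parent, prefix=path.name + ".", suffix=".tmp"
        )
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                json.dump(payload, tmp, indent=2)
                tmp.flush()
                os.fsync(tmp.fileno())
            os.replace(tmp_name, path)  # atomic rename on POSIX
        except BaseException:
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
            raise
    # The lock is released when lock_file is closed.
```

Because `flock` is advisory, it protects against concurrent runs only if every writer goes through the same lock, which is relevant to the first review's question about the locking guarantees.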