This roadmap turns the PRD into iterative, testable phases. Each phase ships a working vertical slice that can be demoed and verified independently.
- Ship in slices: Every phase produces a working demo artifact.
- Deterministic tests: Use fixtures and mocks to make results reproducible.
- Progressively enhance: Start with heuristics, then add LLM grouping, then polish.
- No scope creep: Follow PRD "Out of Scope".
- Framework: Next.js (App Router), Tailwind CSS
- Diff rendering: `react-diff-view`
- LLM: AI SDK (`ai`) with `@ai-sdk/openai` provider; structured outputs (Zod)
- Validation: `zod` schemas for all external boundaries
- Testing: Vitest + React Testing Library; Playwright for minimal e2e
- State: In-memory/session only; simple TTL cache for PR URL → result
- Feature flags: `process.env.NEXT_PUBLIC_*` where client-visible; server-only otherwise
- Goal: Runnable Next.js app with baseline deps and CI checks.
- Scope:
- Init Next.js + Tailwind; add `react-diff-view`, `ai`, `@ai-sdk/openai`, `@ai-sdk/react`, `zod`, `cross-fetch`, `vitest`, `@testing-library/react`, `msw`, `playwright` (optional for smoke).
- Basic layout shell and placeholder pages.
- Scripts: dev, test, lint, typecheck, e2e.
- Deliverables:
- App compiles locally; CI runs tests and lint.
- Tests:
- Vitest: sanity render of home page.
- Playwright (optional): open `/` and see placeholder text.
- Acceptance: `pnpm dev` runs; `pnpm test` passes; deploys to Vercel with placeholder page.
- Goal: Accept a public GitHub PR URL and validate.
- Scope:
- Input form with URL parsing.
- Server validator to ensure public PR and `.diff` endpoint derivation.
- Deliverables:
- Derive `https://github.com/<org>/<repo>/pull/<id>.diff` from input.
- Tests:
- Unit: URL parser (good/bad cases).
- Acceptance: Valid PR URL enables “Analyze” and previews derived `.diff` URL.
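
The URL parsing and `.diff` derivation described above could be sketched as follows. This is a minimal sketch, not the project's actual parser: the names `ParsedPr` and `parsePrUrl` are hypothetical, and the choice to accept trailing segments such as `/files` is an assumption. It only checks the URL's shape; whether the PR is actually public still has to be confirmed server-side.

```typescript
// Hypothetical result shape; the PRD may define its own.
interface ParsedPr {
  org: string;
  repo: string;
  id: number;
  diffUrl: string;
}

// Matches https://github.com/<org>/<repo>/pull/<id>, optionally followed by
// extra path segments (e.g. /files), a query string, or a fragment.
const PR_URL = /^https:\/\/github\.com\/([\w.-]+)\/([\w.-]+)\/pull\/(\d+)(?:[\/?#].*)?$/;

function parsePrUrl(input: string): ParsedPr | null {
  const match = PR_URL.exec(input.trim());
  if (!match) return null;
  const [, org, repo, idText] = match;
  const id = Number(idText);
  // Reject zero and numbers too large to represent exactly.
  if (!Number.isSafeInteger(id) || id <= 0) return null;
  return {
    org,
    repo,
    id,
    diffUrl: `https://github.com/${org}/${repo}/pull/${id}.diff`,
  };
}
```

Returning `null` for anything that does not match keeps the form logic simple: the “Analyze” button can be enabled only when the result is non-null.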
- Goal: Fetch raw `.diff` via server API.
- Scope:
- `GET /api/diff?prUrl=…` → fetch `.diff` with unauthenticated GitHub request.
- Handle rate limits and large diffs (size cap + friendly error).
- Deliverables:
- Returns raw unified diff as text; standard error envelope.
- Tests:
- Unit: input validation; 4xx/5xx handling.
- Integration: fixture-backed response using MSW.
- Acceptance: API returns the diff for a known fixture in under 3 s locally.
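
One way to structure the fetch step so it stays testable is to inject the fetcher, as in the sketch below. Everything here is an assumption for illustration: the envelope shape, the error codes, the 500,000-character cap, and the decision to treat HTTP 403 and 429 as rate limiting are not taken from the PRD.

```typescript
// Minimal response shape, so the global fetch or a test double can be passed in.
interface MinimalResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}
type Fetcher = (url: string) => Promise<MinimalResponse>;

// Hypothetical "standard error envelope".
type DiffResult =
  | { ok: true; diff: string }
  | { ok: false; error: { code: string; message: string } };

// Assumed cap, measured in string length (UTF-16 code units), not bytes.
const MAX_DIFF_CHARS = 500_000;

function failure(code: string, message: string): DiffResult {
  return { ok: false, error: { code, message } };
}

async function fetchDiff(
  diffUrl: string,
  fetcher: Fetcher,
  maxChars: number = MAX_DIFF_CHARS,
): Promise<DiffResult> {
  let res: MinimalResponse;
  try {
    res = await fetcher(diffUrl);
  } catch {
    return failure("network_error", "Could not reach GitHub.");
  }
  // Assumption: unauthenticated rate limiting surfaces as 403 or 429.
  if (res.status === 403 || res.status === 429) {
    return failure("rate_limited", "GitHub rate limit reached. Try again later.");
  }
  if (res.status === 404) {
    return failure("not_found", "PR not found or not public.");
  }
  if (!res.ok) {
    return failure("upstream_error", `GitHub responded with status ${res.status}.`);
  }
  const diff = await res.text();
  if (diff.length > maxChars) {
    return failure("diff_too_large", "This PR is too large for the demo.");
  }
  return { ok: true, diff };
}
```

In a route handler the global `fetch` could be passed as the `fetcher` argument, while unit tests pass a stub; MSW-backed integration tests would exercise the same function through the real `fetch`.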
- Goal: Unified diff → normalized `diff_index` per PRD.
- Scope:
- Parse files, statuses, languages (by extension), and hunks.
- Generate stable `file_id` and `hunk_id` (`<path>#h<seq>`).
- Deliverables:
- `diff_index.json` matching PRD section 10.
- Tests:
- Snapshots on fixtures (single file, rename, binary skip, many hunks).
- Acceptance: Stable IDs and correct hunk headers for fixtures.
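
A rough sketch of the parsing step is shown below. The `DiffFile` and `Hunk` shapes are guesses, since PRD section 10 is not reproduced here, and two details are assumptions: `file_id` is taken to be the new path, and `<seq>` is taken to be 1-based. Binary files end up with an empty `hunks` array, and paths that themselves contain ` b/` are not handled.

```typescript
type FileStatus = "added" | "deleted" | "renamed" | "modified";

interface Hunk {
  hunk_id: string; // <path>#h<seq>, with seq assumed to start at 1
  header: string; // the raw "@@ ... @@" line
  lines: string[];
}

interface DiffFile {
  file_id: string; // assumed to be the new path
  path: string;
  status: FileStatus;
  language: string;
  hunks: Hunk[];
}

// Small illustrative extension table; a real one would be longer.
const LANG_BY_EXT: Record<string, string> = {
  ts: "typescript",
  tsx: "typescript",
  js: "javascript",
  md: "markdown",
  py: "python",
};

function languageOf(path: string): string {
  const dot = path.lastIndexOf(".");
  const ext = dot >= 0 ? path.slice(dot + 1).toLowerCase() : "";
  return LANG_BY_EXT[ext] ?? "unknown";
}

function parseUnifiedDiff(diff: string): DiffFile[] {
  const files: DiffFile[] = [];
  let file: DiffFile | null = null;
  let hunk: Hunk | null = null;
  // Drop one trailing newline so it does not become an empty hunk line.
  for (const line of diff.replace(/\n$/, "").split("\n")) {
    const start = /^diff --git a\/(.+) b\/(.+)$/.exec(line);
    if (start) {
      const path = start[2];
      file = { file_id: path, path, status: "modified", language: languageOf(path), hunks: [] };
      files.push(file);
      hunk = null;
      continue;
    }
    if (!file) continue;
    if (line.startsWith("@@")) {
      hunk = { hunk_id: `${file.file_id}#h${file.hunks.length + 1}`, header: line, lines: [] };
      file.hunks.push(hunk);
    } else if (hunk) {
      hunk.lines.push(line);
    } else if (line.startsWith("new file mode")) {
      file.status = "added";
    } else if (line.startsWith("deleted file mode")) {
      file.status = "deleted";
    } else if (line.startsWith("rename from")) {
      file.status = "renamed";
    }
  }
  return files;
}
```

Because IDs are derived only from the path and the hunk's position in the file, parsing the same diff twice yields the same IDs, which is what the snapshot tests rely on.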
- Goal: 2–6 coherent steps using heuristics only.
- Scope:
- Cluster by path prefixes, file types, keywords in headers/paths.
- Output PRD “LLM Output Schema” without model call.
- Deliverables:
- `steps[]` with titles/descriptions/objectives and `diff_refs` → `file_id` + `hunk_ids`.
- Tests:
- Zod validation of schema; deterministic grouping for fixtures.
- Acceptance: Typical PR fixture yields 2–6 sensible steps.
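
As an illustration of heuristic-only grouping, the sketch below buckets files by file type and top-level directory and caps the result at six steps by merging the smallest buckets. The bucket rules, titles, and the `FileEntry` input shape are all hypothetical, and the lower bound of two steps is not enforced here (a single-bucket PR yields one step).

```typescript
interface FileEntry {
  file_id: string;
  path: string;
  hunk_ids: string[];
}
interface DiffRef {
  file_id: string;
  hunk_ids: string[];
}
interface Step {
  title: string;
  description: string;
  objective: string;
  diff_refs: DiffRef[];
}

const MAX_STEPS = 6;

// Locale-independent comparison keeps ordering identical across machines.
const byText = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

// Illustrative rules only: docs and tests first, then top-level directory.
function bucketOf(path: string): string {
  if (/\.(md|mdx|txt)$/i.test(path)) return "Documentation";
  if (/(^|\/)(__tests__|tests?)\/|\.(test|spec)\.[jt]sx?$/i.test(path)) return "Tests";
  const slash = path.indexOf("/");
  return slash > 0 ? `Changes in ${path.slice(0, slash)}/` : "Root-level changes";
}

function groupHeuristically(files: FileEntry[]): Step[] {
  const buckets = new Map<string, DiffRef[]>();
  const sorted = [...files].sort((a, b) => byText(a.path, b.path));
  for (const file of sorted) {
    const title = bucketOf(file.path);
    const refs = buckets.get(title) ?? [];
    refs.push({ file_id: file.file_id, hunk_ids: [...file.hunk_ids] });
    buckets.set(title, refs);
  }
  // Largest buckets first; ties broken by title so output is deterministic.
  const ordered = [...buckets.entries()].sort(
    (a, b) => b[1].length - a[1].length || byText(a[0], b[0]),
  );
  let groups = ordered;
  if (ordered.length > MAX_STEPS) {
    // Merge everything past the fifth bucket into one catch-all step.
    const overflow = ordered.slice(MAX_STEPS - 1).flatMap(([, refs]) => refs);
    groups = [...ordered.slice(0, MAX_STEPS - 1), ["Other changes", overflow]];
  }
  return groups.map(([title, refs]) => ({
    title,
    description: `${refs.length} file(s) grouped by path and file type.`,
    objective: `Review: ${title}`,
    diff_refs: refs,
  }));
}
```

Sorting the input and breaking ties explicitly means the same fixture always produces the same steps in the same order, regardless of the order files appear in the diff.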
- Goal: Replace/augment heuristics with model-backed organization.
- Scope:
- Use `ai` with `@ai-sdk/openai`; prompt per PRD section 11.
- Structured outputs via `generateObject` + Zod schema; auto-retry on schema mismatch.
- Feature flag to toggle heuristic-only vs LLM.
- Deliverables:
- `POST /api/group` takes `{ diffIndex, metadata }` and returns PRD-compliant JSON.
- Tests:
- Unit: schema validation; retry logic via mocked provider.
- Integration: golden files for prompt → response mapping with deterministic mock.
- Acceptance: With flag on, grouping quality improves; stays within 2–6 steps and references only provided IDs.
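
The acceptance criteria above (2–6 steps, only provided IDs) and the retry-on-mismatch behavior can be checked independently of the model call. The sketch below is dependency-free on purpose: `generate` is a hypothetical injected function that, in the real app, would presumably wrap `generateObject` with the Zod schema, and in tests is a deterministic mock. The `Step` shape and the default of three attempts are assumptions.

```typescript
interface DiffRef {
  file_id: string;
  hunk_ids: string[];
}
interface Step {
  title: string;
  description: string;
  objective: string;
  diff_refs: DiffRef[];
}

// file_id → set of hunk_ids present in the diff index.
type KnownIds = Map<string, Set<string>>;

// Returns a list of problems; an empty list means the steps are acceptable.
function validateSteps(steps: Step[], known: KnownIds): string[] {
  const problems: string[] = [];
  if (steps.length < 2 || steps.length > 6) {
    problems.push(`expected 2-6 steps, got ${steps.length}`);
  }
  for (const step of steps) {
    for (const ref of step.diff_refs) {
      const hunks = known.get(ref.file_id);
      if (!hunks) {
        problems.push(`unknown file_id: ${ref.file_id}`);
        continue;
      }
      for (const id of ref.hunk_ids) {
        if (!hunks.has(id)) problems.push(`unknown hunk_id: ${id}`);
      }
    }
  }
  return problems;
}

// Calls the injected generator until its output validates or attempts run out.
async function groupWithRetry(
  generate: (attempt: number) => Promise<Step[]>,
  known: KnownIds,
  maxAttempts: number = 3,
): Promise<Step[]> {
  let lastProblems: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const steps = await generate(attempt);
    lastProblems = validateSteps(steps, known);
    if (lastProblems.length === 0) return steps;
  }
  throw new Error(`grouping failed after ${maxAttempts} attempts: ${lastProblems.join("; ")}`);
}
```

When all attempts fail, the caller could fall back to the heuristic grouping rather than surfacing the error, which keeps the demo usable if the model misbehaves.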
- Goal: Render steps and linked diffs.
- Scope:
- Left rail: step list with progress.
- Main panel: title, description, objective; `react-diff-view` filtered by `hunk_ids`.
- Deliverables:
- Navigable read-only step view.
- Tests:
- Component tests with fixture data.
- Acceptance: User sees steps and associated diffs; no notes/gamification yet.
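
Filtering the diff down to a single step's hunks can happen before anything reaches the rendering component. The helper below is a sketch under assumed shapes (`DiffFile`, `Hunk`, and `DiffRef` are hypothetical stand-ins for the PRD's schema); it does not touch `react-diff-view` itself.

```typescript
interface Hunk {
  hunk_id: string;
  header: string;
  lines: string[];
}
interface DiffFile {
  file_id: string;
  path: string;
  hunks: Hunk[];
}
interface DiffRef {
  file_id: string;
  hunk_ids: string[];
}

// Returns only the files and hunks that a step references, in ref order.
// Unknown file_ids and files with no matching hunks are skipped.
function hunksForStep(index: DiffFile[], refs: DiffRef[]): DiffFile[] {
  const byId = new Map<string, DiffFile>();
  for (const file of index) byId.set(file.file_id, file);

  const result: DiffFile[] = [];
  for (const ref of refs) {
    const file = byId.get(ref.file_id);
    if (!file) continue;
    const wanted = new Set(ref.hunk_ids);
    const hunks = file.hunks.filter((h) => wanted.has(h.hunk_id));
    if (hunks.length > 0) result.push({ ...file, hunks });
  }
  return result;
}
```

The original index is never mutated, so the same `diff_index` can back every step in the left rail.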
- Goal: Progressive disclosure with XP.
- Scope:
- Next/Previous; +10 XP per completed step; pixel progress bar; completion screen.
- Simple in-memory state machine.
- Deliverables:
- XP counter; “Quest Complete”.
- Tests:
- State machine unit tests; progress transition tests.
- Acceptance: Completing all steps shows completion screen and XP total.
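
The in-memory state machine could be as small as two pure transition functions, sketched below. The state shape and function names are hypothetical; the +10 XP per completed step comes from the scope above, and awarding XP only the first time a step is completed is an assumption.

```typescript
interface QuestState {
  current: number; // index of the step being viewed
  completed: boolean[]; // one flag per step
  xp: number;
  finished: boolean; // true once the last step is completed
}

const XP_PER_STEP = 10;

function initQuest(totalSteps: number): QuestState {
  if (!Number.isInteger(totalSteps) || totalSteps < 1) {
    throw new Error("a quest needs at least one step");
  }
  const completed: boolean[] = new Array(totalSteps).fill(false);
  return { current: 0, completed, xp: 0, finished: false };
}

// "Next": mark the current step complete, award XP once, then advance.
function completeAndAdvance(state: QuestState): QuestState {
  if (state.finished) return state;
  const completed = [...state.completed];
  const firstTime = !completed[state.current];
  completed[state.current] = true;
  const isLast = state.current === completed.length - 1;
  return {
    current: isLast ? state.current : state.current + 1,
    completed,
    xp: state.xp + (firstTime ? XP_PER_STEP : 0),
    finished: isLast,
  };
}

// "Previous": step back without touching XP or completion flags.
function goBack(state: QuestState): QuestState {
  if (state.finished || state.current === 0) return state;
  return { ...state, current: state.current - 1 };
}

// Fraction of steps completed, for the progress bar.
function progress(state: QuestState): number {
  return state.completed.filter(Boolean).length / state.completed.length;
}
```

Because each transition returns a new object, the functions slot directly into a React reducer and are easy to unit-test without rendering anything.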
- Goal: Notes per step + export at end.
- Scope:
- Textarea per step; export combined notes as markdown/plaintext.
- No persistence beyond session.
- Deliverables:
- Notes UI with export on completion screen.
- Tests:
- Component tests; export includes step titles and notes.
- Acceptance: Notes captured and downloadable; preserved across navigation.
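
For the export, a plain function from step titles and notes to a markdown string is enough. The heading text and the `_No notes._` placeholder below are illustrative choices, not PRD requirements.

```typescript
interface StepNote {
  title: string;
  note: string;
}

// Builds one markdown document with a numbered heading per step.
function exportNotesMarkdown(steps: StepNote[]): string {
  const lines: string[] = ["# Review notes", ""];
  steps.forEach((step, i) => {
    lines.push(`## ${i + 1}. ${step.title}`, "");
    // Empty or whitespace-only notes get a placeholder so no step is dropped.
    lines.push(step.note.trim() || "_No notes._", "");
  });
  return lines.join("\n");
}
```

On the completion screen the resulting string could be offered as a download (for example via a `Blob`) or copied to the clipboard; both happen client-side, so nothing persists beyond the session.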
- Goal: Cache PR URL → grouped result; idempotent runs.
- Scope:
- In-memory cache with TTL; key by normalized PR URL.
- Return cached results when available.
- Deliverables:
- Cache module with log counters.
- Tests:
- Unit: hit/miss/eviction; concurrency safety.
- Acceptance: Second run for same PR returns instantly from cache.
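
A TTL cache with counters fits in a small class, as in the sketch below. The class name, counter names, and the normalization rule (lowercasing org and repo, dropping everything after the PR number) are assumptions. The clock is injectable so eviction tests do not need to sleep.

```typescript
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;
  hits = 0;
  misses = 0;
  evictions = 0;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) {
      this.misses++;
      return undefined;
    }
    if (this.now() >= entry.expiresAt) {
      // Expired entries are evicted lazily, on read.
      this.entries.delete(key);
      this.evictions++;
      this.misses++;
      return undefined;
    }
    this.hits++;
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

const PR_PATH = /github\.com\/([\w.-]+)\/([\w.-]+)\/pull\/(\d+)/i;

// Maps equivalent PR URLs (case, trailing /files, query string) to one key.
function normalizePrUrl(url: string): string {
  const match = PR_PATH.exec(url.trim());
  if (!match) return url.trim();
  return `github.com/${match[1].toLowerCase()}/${match[2].toLowerCase()}/pull/${match[3]}`;
}
```

This version evicts only when an expired key is read, which is adequate for a short-lived demo process; a long-running server would also want a size bound or periodic sweep.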
- Goal: Keep UI responsive for larger diffs.
- Scope:
- Lazy-load diffs per step; virtualize large hunks.
- Streaming UI affordances (loading placeholders, blinking cursor animation per PRD).
- Deliverables:
- Progressive rendering; smoother interaction on large fixture.
- Tests:
- Manual perf checks; component tests remain deterministic.
- Acceptance: Large fixture remains smooth; initial render <2s locally.
- Goal: Apply 90s Apple-inspired theme.
- Scope:
- Beige/cream palette, pixel borders, dotted grid, “Press Start 2P”.
- Code styles: keywords green, strings orange, comments gray; filename headers.
- Deliverables:
- Themed components and CSS tokens.
- Tests:
- Visual sanity (manual or Storybook/Chromatic optional).
- Acceptance: UI reflects PRD aesthetic and remains readable.
- Goal: Polished demo flow and deploy.
- Scope:
- Env config for OpenAI provider; robust error handling.
- Seed public PR fixtures via quick links.
- README with demo instructions and limitations.
- Deliverables:
- Vercel deployment URL; demo script; troubleshooting.
- Tests:
- Smoke e2e on deployed preview with known PR.
- Acceptance: A judge can paste a public PR URL and complete the quest, with analysis typically finishing in about 10 s.
- 1 depends on 0
- 2 depends on 1
- 3 depends on 2
- 4 depends on 3
- 5 depends on 4
- 6 depends on 4 (read-only) and improves after 5
- 7 depends on 6
- 8 depends on 6
- 9 depends on 5
- 10 depends on 6
- 11 can run in parallel after 6
- 12 depends on 11 and core flows
- Small JS-only change
- Mixed frontend/backend change
- Rename + move
- Many small files
- Doc-only change
- Large PR with 50+ hunks (capped for demo)
- `OPENAI_API_KEY` (server)
- `FEATURE_USE_LLM=true|false` (server)
- `NEXT_PUBLIC_APP_NAME=PR QUEST` (client)
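
These variables could be read in one place, as in the sketch below. The `readConfig` name and the returned shape are hypothetical, and silently falling back to heuristic-only mode when the flag is on but the key is missing is an assumed design choice rather than something the PRD specifies. The environment is passed in as an argument so the function can be tested without touching `process.env`.

```typescript
type Env = Record<string, string | undefined>;

interface AppConfig {
  useLlm: boolean;
  openAiApiKey: string | undefined;
  appName: string;
}

function readConfig(env: Env): AppConfig {
  const flagOn = env.FEATURE_USE_LLM === "true";
  const key = env.OPENAI_API_KEY;
  const hasKey = typeof key === "string" && key.length > 0;
  return {
    // Assumption: without a key, fall back to heuristics instead of failing.
    useLlm: flagOn && hasKey,
    openAiApiKey: hasKey ? key : undefined,
    appName: env.NEXT_PUBLIC_APP_NAME ?? "PR QUEST",
  };
}
```

Server code would call `readConfig(process.env)`; only `NEXT_PUBLIC_APP_NAME` is safe to reference from client components, since Next.js exposes just the `NEXT_PUBLIC_`-prefixed variables to the browser.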
- `pnpm dev`: run app
- `pnpm test`: unit tests (Vitest)
- `pnpm e2e`: Playwright smoke
- `pnpm lint`: lint
- Achievement variants; additional badges
- More sophisticated clustering heuristics (AST-aware where cheap)
- Offline demo mode with embedded fixtures
- Storybook for component review