refactor(calendar): unify syllabus extraction onto agent (refactor #4) by Jose-Gael-Cruz-Lopez · Pull Request #79 · SaplingLearn/Sapling

Jose-Gael-Cruz-Lopez · 2026-05-05T04:18:32Z

Summary

Closes the four-refactor migration plan from ADR 0001. The agent (syllabus_extraction_agent) already shipped with refactor #1 and is used by routes/documents.py's agentic upload. This PR migrates the second consumer — services/calendar_service.py::parse_syllabus — onto the same agent, removing the duplicated parsing path that ADR 0005 called out as refactor #4 ("low-effort cleanup that fits anywhere").

This is the last named refactor in the migration plan.

What shipped

backend/agents/tools/syllabus_adapter.py — syllabus_to_wire_dict adapter. Maps the agent's date | None → ISO-8601 strings, defaults assignment_type="other", passes course_title and grading_categories through as additive keys. Pure logic, 10 tests.
backend/services/calendar_service.py — _extract_via_agent (new), parse_syllabus (preserved as legacy fallback per ADR 0001), extract_assignments_from_file rewritten agent-first with fallback on UsageLimitExceeded / UnexpectedModelBehavior / Exception. Pure dedup helpers untouched.
Tests — test_syllabus_adapter.py (10), TestExtractAssignmentsViaAgent (4) + invariant test in test_ocr_pipeline.py, TestImportExtractWireFormat (3) in test_calendar_routes.py. 596 pass on this branch; 3 pre-existing live-Supabase failures unchanged from PR re-architecture: agentic document upload + AES-256-GCM column encryption + dev-context vault #67/refactor(quiz): convert generate_quiz to quiz_agent (refactor #2) #71/refactor(learn): convert chat tutor to chat_tutor_agent (refactor #3) #78. No regressions.
backend/tests/evals/syllabus_extraction.py — three new evaluators (WireFormatRequiredKeysEvaluator, AssignmentTypeNonNullEvaluator, DueDateIsoStringEvaluator) wired into make_dataset(). Run on existing recorded cassettes; no re-record needed.
docs/decisions/0016-refactor-4-syllabus-unification-shipped.md — full ADR with surprises, consequences, and explicit rollback path.
backend/prompts/refactor-4-syllabus-unification/ — sub-agent playbook (8 files) used to dispatch this refactor. Committed for archive.

Prompt hash

97946a2b84b2 — unchanged from refactor #1. The adapter handled the contract translation entirely; no schema move means existing eval cassettes stay valid.

What's NOT in this PR (per ADR 0001 migration contract)

parse_syllabus and prompts/syllabus_extraction.txt are NOT deleted. They stay as the fallback target. Sub-agent E's cleanup PR removes them after this PR is on main for ~1 week. Same playbook PR refactor(quiz): convert generate_quiz to quiz_agent (refactor #2) #71 and PR refactor(learn): convert chat tutor to chat_tutor_agent (refactor #3) #78 followed.
process_and_save_syllabus is NOT deleted. Architecture.md flagged it as "exists for direct OCR→Gemini→DB use but is not currently wired"; only tests/test_ocr_pipeline.py's pre-existing-failing tests call it. Cleanup PR handles deletion if no consumer surfaces.
services/gemini_service.py is NOT deleted. Still alive as the chat fallback target (PR refactor(learn): convert chat tutor to chat_tutor_agent (refactor #3) #78) and the quiz fallback target (PR refactor(quiz): convert generate_quiz to quiz_agent (refactor #2) #71). After this PR plus Sub-agent E's cleanup, call_gemini_json has just one remaining caller (the quiz fallback). The eventual one-line gemini_service.py deletion PR ships once both fallbacks retire.

Migration plan complete

After this PR merges, all four named refactors from ADR 0001 are on main:

#	What	PR	ADR
1	Document upload (agentic orchestrator)	#67	0010, etc.
2	Quiz generation	#71	0013
2-iter	Adaptive quiz iteration	#77	0014
3	Chat tutor	#78	0015
4	Syllabus unification	(this PR)	0016

The remaining migration cleanup is three small PRs: Sub-agent E's syllabus deletion, PR #78's start_session/action follow-up, and the final services/gemini_service.py deletion once both fallbacks retire.

References

ADR 0001 — Adopt Pydantic AI (the migration plan).
ADR 0005 — Refactor Refine LLM Model selection for each function #2 quiz generation (named syllabus as Add landing page with liquid glass effects #4).
ADR 0013 — Refactor Refine LLM Model selection for each function #2 (quiz_agent) shipped.
ADR 0015 — Refactor Fix the learning loop for the context #3 (chat_tutor) shipped (PR refactor(learn): convert chat tutor to chat_tutor_agent (refactor #3) #78).
ADR 0016 — Refactor Add landing page with liquid glass effects #4 (syllabus unification) shipped (this PR).

Test plan

pytest tests/test_syllabus_adapter.py -q → 10 passed
pytest tests/test_ocr_pipeline.py -q → all green except the 3 pre-existing live-DB failures (unchanged)
pytest tests/test_calendar_routes.py -q → all green
pytest tests/ -q --ignore=tests/evals → 596 passed, 3 pre-existing failures (unchanged)
Live-mode eval check (SAPLING_EVAL_MODE=live pytest tests/evals/syllabus_extraction.py -q) — recommended; the three new evaluators run on existing cassettes so replay-mode passes already
Manual: import a real syllabus through /api/calendar/import-extract and confirm assignments come back with the expected shape (assignments + warnings + raw_text + the new course_title/grading_categories extras)

Rollback

Single revert of the merge commit drops the adapter + the new helper, reverts extract_assignments_from_file to its pre-refactor body, and the prompt + agent + the documents-upload path are unchanged (refactor #1 already shipped them). The new test files are pure additions — harmless if rolled back partially.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Syllabus extraction now powered by an agent-based approach with intelligent fallback to legacy extraction.
- Added support for capturing course title and grading categories from syllabus files.
Refactor
- Backend extraction logic refactored for improved reliability and consistency in handling syllabus documents.

Closes the four-refactor migration plan from ADR 0001. The agent (syllabus_extraction_agent) already shipped with refactor #1 and is used by routes/documents.py's agentic upload. This PR migrates the second consumer — services/calendar_service.py::parse_syllabus — onto the same agent, removing the duplicated parsing path that ADR 0005 called out as refactor #4 ("low-effort cleanup that fits anywhere"). What shipped: - agents/tools/syllabus_adapter.py — syllabus_to_wire_dict adapter. Maps the agent's date|None to ISO-8601 strings, defaults assignment_type to "other", passes course_title and grading_categories through as additive keys. Pure logic, 10 tests. - services/calendar_service.py — _extract_via_agent (new), parse_syllabus (preserved as legacy fallback per ADR 0001), extract_assignments_from_file rewritten agent-first with fallback on UsageLimitExceeded / UnexpectedModelBehavior / Exception. Pure dedup helpers (save_assignments_to_db, insert_new_assignments, load_existing_assignment_keys, assignment_dedupe_key) untouched. - tests/test_syllabus_adapter.py — 10 new tests. Pins legacy-shape contract, "other" default, ISO-8601 date serialization, mutable-default isolation, dedupe-key integration. - tests/test_ocr_pipeline.py — TestExtractAssignmentsViaAgent class with 4 new tests covering agent-success + 3 fallback triggers + the empty-text short-circuit. Plus test_agent_and_legacy_paths_share_required_keys pinning the superset-invariant on both paths. - tests/test_calendar_routes.py — TestImportExtractWireFormat with 3 regression tests confirming routes/calendar.py::extract tolerates both agent-path and legacy-path dict shapes. - tests/evals/syllabus_extraction.py — three new evaluators: WireFormatRequiredKeysEvaluator, AssignmentTypeNonNullEvaluator, DueDateIsoStringEvaluator. Run on existing recorded cassettes; no new cases needed. - prompts/refactor-4-syllabus-unification/ — sub-agent playbook (8 files) used to dispatch this refactor. Committed for archive. - docs/decisions/0016-refactor-4-syllabus-unification-shipped.md — ADR with surprises, consequences, rollback. Notes that this is the LAST named refactor in ADR 0001's migration plan. No prompt-hash bump (97946a2b84b2 unchanged from refactor #1) — the adapter handled the contract translation entirely. Existing eval cassettes stay valid. 596 tests pass on this branch; 3 pre-existing live-Supabase failures unchanged (test_skips_self_edges, test_save_to_db, test_full_pipeline). No regressions. Migration plan from ADR 0001 is COMPLETE after this PR merges. The remaining cleanup is three small PRs: gemini_service.py deletion, start_session/action migration on chat (PR #78 follow-up), and sub-agent E's deletion of legacy parse_syllabus + the prompt file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-05T04:18:38Z

Warning

Rate limit exceeded

@Jose-Gael-Cruz-Lopez has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 54 minutes and 41 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 28c2d07f-26e8-4174-b066-f84c2e1eb56d

📥 Commits

Reviewing files that changed from the base of the PR and between d2e7020 and 7ca027a.

📒 Files selected for processing (1)

backend/services/calendar_service.py

📝 Walkthrough

Walkthrough

This PR implements Refactor #4 (syllabus extraction unification) by introducing a typed agent-based syllabus parsing path with legacy fallback. It adds a wire-format adapter, converts the calendar service to async agent-first extraction with guardrail-triggered fallback, updates the calendar route to pass request IDs, extends test coverage with unit and regression tests, introduces wire-format validation evaluators, and provides comprehensive ADR and orchestration documentation.

Changes

Syllabus Extraction Unification

Layer / File(s)	Summary
Wire-Format Adapter `backend/agents/tools/syllabus_adapter.py`	New adapter `syllabus_to_wire_dict()` converts `SyllabusAssignments` to legacy dict shape, mapping `due_date` to ISO-8601 strings, defaulting `assignment_type` to `"other"`, and converting `grading_categories` to `{name, weight}` objects.
Agent-First Service Extraction `backend/services/calendar_service.py`	`extract_assignments_from_file` becomes async and now runs `syllabus_extraction_agent` via new `_extract_via_agent` helper; falls back to legacy `parse_syllabus` on `UsageLimitExceeded`, `UnexpectedModelBehavior`, or generic exceptions while ensuring required wire-format keys. `process_and_save_syllabus` updated to await async extraction and pass `user_id`.
Route Request ID Wiring `backend/routes/calendar.py`	`/extract` endpoint now derives `request_id` from `request.state.request_id` and passes it into `extract_assignments_from_file`.
Wire-Format Validation `backend/tests/evals/syllabus_extraction.py`	Three new evaluators validate adapter output: required keys (`assignments`, `warnings`, `raw_text`), non-null `assignment_type`, and ISO-8601 `due_date` strings. Evaluators registered in `make_dataset()`.
Adapter Unit Tests `backend/tests/test_syllabus_adapter.py`	Comprehensive test suite for `syllabus_to_wire_dict` covering legacy shape contract, default `assignment_type`, `due_date` serialization, `grading_categories` passthrough, empty assignments, and `assignment_dedupe_key` compatibility.
Route Regression Tests `backend/tests/test_calendar_routes.py`	New `TestImportExtractWireFormat` class verifies `POST /api/calendar/extract` returns agent-path and legacy-fallback results with correct wire-format shapes, including `warnings` passthrough and async chain integrity.
Service Integration Tests `backend/tests/test_ocr_pipeline.py`	Restructured Gemini/Supabase gating per-test; new `TestExtractAssignmentsViaAgent` class mocks agent dispatch and validates fallback on guardrail/generic exceptions; added contract test asserting both agent and legacy paths include required keys. `test_full_pipeline` converted to async via `asyncio.run()`.
Refactor Orchestration & ADR `backend/prompts/refactor-4-syllabus-unification/`, `docs/decisions/0016-refactor-4-syllabus-unification-shipped.md`	Comprehensive prompt pack (README, orchestrator overview, five sub-agent task specs, ADR template) documents refactor phases, constraints, and delivery checklist. ADR 0016 records shipped decision (agent-first + fallback), wire-format preservation via adapter, and test coverage.

Sequence Diagram

sequenceDiagram
    participant CalendarRoute as /extract Route
    participant CalendarSvc as calendar_service
    participant Agent as syllabus_extraction_agent
    participant Adapter as syllabus_to_wire_dict
    participant LegacyParse as parse_syllabus<br/>(Gemini)
    participant TextExtract as extract_text()

    CalendarRoute->>CalendarSvc: extract_assignments_from_file()<br/>(file_bytes, filename, content_type,<br/>user_id, request_id)
    
    CalendarSvc->>TextExtract: Extract text from file
    TextExtract-->>CalendarSvc: extracted_text
    
    alt Empty text
        CalendarSvc-->>CalendarRoute: {assignments: [],<br/>warnings: [],<br/>raw_text: ""}
    else Non-empty text
        CalendarSvc->>Agent: _extract_via_agent()<br/>(extracted_text, user_id, request_id)
        
        alt Agent succeeds
            Agent-->>Adapter: SyllabusAssignments
            Adapter-->>CalendarSvc: wire_dict<br/>(assignments, warnings,<br/>raw_text, course_title,<br/>grading_categories)
            CalendarSvc-->>CalendarRoute: wire_dict
        else UsageLimitExceeded /<br/>UnexpectedModelBehavior /<br/>Exception
            CalendarSvc->>LegacyParse: parse_syllabus()<br/>(extracted_text)
            LegacyParse-->>CalendarSvc: legacy_dict<br/>(assignments, warnings, raw_text)
            CalendarSvc->>CalendarSvc: Ensure raw_text,<br/>course_title,<br/>grading_categories present
            CalendarSvc-->>CalendarRoute: enriched_legacy_dict
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

SaplingLearn/Sapling#53: Modifies syllabus assignment wire-format shape and course-field normalization in downstream calendar/document save paths alongside the adapter and calendar_service changes in this PR.

Poem

🐰 The agent hops in, bold and spry,
With typed assignments under the sky,
But when it stumbles or needs to rest,
The legacy path knows how to test—
Wire format holds, no breakage in sight, 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 41.03% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically summarizes the main change: unifying syllabus extraction onto the agent as refactor `#4`, which is the primary objective of the PR.
Description check	✅ Passed	The PR description is comprehensive and well-structured, covering summary, what shipped, migration plan, references, test plan, and rollback instructions. It goes well beyond the minimal template and provides detailed context for this significant refactor.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch refactor/4-syllabus-unification

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

+
+from datetime import date
+
+import pytest


cloudflare-workers-and-pages · 2026-05-05T04:19:47Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Preview URL	Updated (UTC)
✅ Deployment successful! View logs	frontend	`7ca027a`	Commit Preview URL Branch Preview URL	May 05 2026, 04:39 AM

…symmetry Five issues from the self-review on PR #79. 1. CRITICAL: asyncio.run inside async route would 500 in production. `routes/calendar.py::extract` is `async def`. The sync `extract_assignments_from_file` was calling `asyncio.run(_extract_via_agent(...))` from inside the running event loop, which CPython rejects with "cannot be called from a running event loop". Tests missed it because direct unit tests called from sync test bodies (no loop) and the route tests mocked the helper entirely. - services/calendar_service.py: extract_assignments_from_file and process_and_save_syllabus are now `async def`. asyncio.run replaced with `await _extract_via_agent(...)`. `import asyncio` removed. Stale TODO removed. - routes/calendar.py: route now does `await extract_assignments_from_file(...)`. 2. user_id and request_id plumbed end-to-end. Logfire spans for syllabus extractions are now correlatable to a user + a request. routes/calendar.py reads request_id from request.state with current_request_id() fallback (mirrors quiz/learn pattern). process_and_save_syllabus passes user_id through. 3. Wire-format symmetry on the legacy fallback path. Both fallback branches now setdefault course_title=None and grading_categories=[] on the dict returned by parse_syllabus, so consumers see the same shape whether the agent fired or fell back. 4. Tests updated for the async API. 7 asyncio.run wrappings added across test_ocr_pipeline.py — TestExtractAssignmentsViaAgent (4), test_agent_and_legacy_paths_share_required_keys (2), and the live-DB test_full_pipeline (1). 5. Regression test that catches the asyncio.run bug. New test_import_extract_real_async_chain_works mocks only at the AGENT layer (syllabus_extraction_agent.run + extract_text_from_file), then POSTs to the real /api/calendar/extract route. The full chain — route -> service -> _extract_via_agent -> adapter — actually executes. If the awaits get unwired again, this test fails immediately. 597 tests pass on this branch (up from 596 by the +1 integration test). 3 pre-existing live-Supabase failures unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Optional kwarg so existing callers (the live-DB tests in test_ocr_pipeline.py) keep working without change. When a route ever wires this function, it should pass request.state.request_id through so the syllabus run's Logfire spans correlate with the user-facing request — same plumbing the /extract route does. Closes the last in-scope minor issue from PR #79's review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

backend/tests/evals/syllabus_extraction.py (1)

147-165: ⚡ Quick win

Make REQUIRED immutable.

Line 155 is a mutable class attribute on a dataclass, which Ruff is already flagging as RUF012. A frozenset/ClassVar keeps the intent the same and avoids shared mutable state.

Suggested change

+from typing import ClassVar
+
 `@dataclass`
 class WireFormatRequiredKeysEvaluator(Evaluator[str, SyllabusAssignments]):
@@
-    REQUIRED = {"assignments", "warnings", "raw_text"}
+    REQUIRED: ClassVar[frozenset[str]] = frozenset(
+        {"assignments", "warnings", "raw_text"}
+    )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/tests/evals/syllabus_extraction.py` around lines 147 - 165, The class
attribute REQUIRED on WireFormatRequiredKeysEvaluator is mutable; change it to
an immutable class-level constant by declaring it as a frozenset (or annotate
with typing.ClassVar[frozenset]) so it is not shared mutable state on the
dataclass; update the REQUIRED definition in the WireFormatRequiredKeysEvaluator
class accordingly and ensure any uses (e.g., self.REQUIRED.issubset(...))
continue to work.

backend/tests/test_calendar_routes.py (1)

283-297: ⚡ Quick win

Use factories instead of mutable class-level payloads.

AGENT_RESULT and LEGACY_RESULT are shared mutable dicts, and the current dict(...) copies are only shallow. Ruff is already flagging this as RUF012; returning a fresh payload per test avoids cross-test leakage.

Suggested change

 class TestImportExtractWireFormat:
     """Pins the wire format of POST /api/calendar/extract."""
 
-    AGENT_RESULT = {
-        "assignments": [
-            {
-                "title": "Lab 7: Recursion",
-                "due_date": "2026-03-15",
-                "assignment_type": "other",
-                "notes": "Hands-on recursion lab.",
-                "weight_pct": 10.0,
-            }
-        ],
-        "warnings": [],
-        "raw_text": "syllabus body",
-        "course_title": "CS 101",
-        "grading_categories": [{"name": "Labs", "weight": 0.4}],
-    }
-
-    LEGACY_RESULT = {
-        "assignments": [
-            {
-                "title": "HW1",
-                "due_date": "2026-03-01",
-                "assignment_type": "homework",
-                "notes": None,
-            }
-        ],
-        "warnings": [],
-        "raw_text": "fallback text",
-    }
+    `@staticmethod`
+    def _agent_result():
+        return {
+            "assignments": [
+                {
+                    "title": "Lab 7: Recursion",
+                    "due_date": "2026-03-15",
+                    "assignment_type": "other",
+                    "notes": "Hands-on recursion lab.",
+                    "weight_pct": 10.0,
+                }
+            ],
+            "warnings": [],
+            "raw_text": "syllabus body",
+            "course_title": "CS 101",
+            "grading_categories": [{"name": "Labs", "weight": 0.4}],
+        }
+
+    `@staticmethod`
+    def _legacy_result():
+        return {
+            "assignments": [
+                {
+                    "title": "HW1",
+                    "due_date": "2026-03-01",
+                    "assignment_type": "homework",
+                    "notes": None,
+                }
+            ],
+            "warnings": [],
+            "raw_text": "fallback text",
+        }

Also applies to: 299-310

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/tests/test_calendar_routes.py` around lines 283 - 297, AGENT_RESULT
and LEGACY_RESULT are module-level mutable dicts causing cross-test state;
replace them with factory functions (e.g., make_agent_result() and
make_legacy_result()) that return a fresh dict each call, update tests in
test_calendar_routes.py to call these factories instead of referencing the
module constants, and if any test needs a modified version use
copy.deepcopy(...) or construct the dict from the factory to avoid shallow-copy
pitfalls.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/prompts/refactor-4-syllabus-unification/00-orchestrator-overview.md`:
- Around line 61-66: The documented wire-format example incorrectly lists
course_id as part of each assignment; update the example in
00-orchestrator-overview.md so the adapter output matches the actual contract
(as produced by extract_assignments_from_file and consumed by
routes/calendar.py) by removing course_id from the assignment shape and keeping
the rule that missing fields (e.g., assignment_type) get a sensible default
("other"); ensure the text references the adapter/consumer symbols
(extract_assignments_from_file, routes/calendar.py) and the assignment field
names ({title, due_date, assignment_type, notes}) to avoid future drift.

In `@backend/prompts/refactor-4-syllabus-unification/05-sub-agent-E-cleanup.md`:
- Around line 83-86: The grep command is searching "backend/" after you already
cd'd into the backend root, which can cause false negatives; update the command
string "grep -rn
\"parse_syllabus\\|process_and_save_syllabus\\|prompts/syllabus_extraction.txt\"
backend/ --include=\"*.py\" --include=\"*.txt\"" to search the current directory
(e.g. replace backend/ with . or remove the explicit path) so it reliably
inspects files from the backend root; keep the same include patterns and flags.

In `@backend/services/calendar_service.py`:
- Around line 141-157: The fallback except branches that call
parse_syllabus(text) must also ensure the returned result always contains a
warnings key; update both exception handlers (the block catching
UsageLimitExceeded/UnexpectedModelBehavior and the generic Exception block where
parse_syllabus is called and result is assigned) to call
result.setdefault("warnings", []) in addition to the existing setdefault calls
for "raw_text", "course_title", and "grading_categories" so the API shape is
consistent regardless of which branch runs.

---

Nitpick comments:
In `@backend/tests/evals/syllabus_extraction.py`:
- Around line 147-165: The class attribute REQUIRED on
WireFormatRequiredKeysEvaluator is mutable; change it to an immutable
class-level constant by declaring it as a frozenset (or annotate with
typing.ClassVar[frozenset]) so it is not shared mutable state on the dataclass;
update the REQUIRED definition in the WireFormatRequiredKeysEvaluator class
accordingly and ensure any uses (e.g., self.REQUIRED.issubset(...)) continue to
work.

In `@backend/tests/test_calendar_routes.py`:
- Around line 283-297: AGENT_RESULT and LEGACY_RESULT are module-level mutable
dicts causing cross-test state; replace them with factory functions (e.g.,
make_agent_result() and make_legacy_result()) that return a fresh dict each
call, update tests in test_calendar_routes.py to call these factories instead of
referencing the module constants, and if any test needs a modified version use
copy.deepcopy(...) or construct the dict from the factory to avoid shallow-copy
pitfalls.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d9cc9142-3cbf-4ad2-a039-9eb70e1a9322

📥 Commits

Reviewing files that changed from the base of the PR and between dacf813 and d2e7020.

📒 Files selected for processing (16)

backend/agents/tools/syllabus_adapter.py
backend/prompts/refactor-4-syllabus-unification/00-orchestrator-overview.md
backend/prompts/refactor-4-syllabus-unification/01-sub-agent-A-adapter.md
backend/prompts/refactor-4-syllabus-unification/02-sub-agent-B-service.md
backend/prompts/refactor-4-syllabus-unification/03-sub-agent-C-routes.md
backend/prompts/refactor-4-syllabus-unification/04-sub-agent-D-evals.md
backend/prompts/refactor-4-syllabus-unification/05-sub-agent-E-cleanup.md
backend/prompts/refactor-4-syllabus-unification/06-adr-template.md
backend/prompts/refactor-4-syllabus-unification/README.md
backend/routes/calendar.py
backend/services/calendar_service.py
backend/tests/evals/syllabus_extraction.py
backend/tests/test_calendar_routes.py
backend/tests/test_ocr_pipeline.py
backend/tests/test_syllabus_adapter.py
docs/decisions/0016-refactor-4-syllabus-unification-shipped.md

coderabbitai · 2026-05-05T04:36:33Z

+- **Wire format unchanged.** `routes/calendar.py` reads
+  `{"assignments": [{title, due_date, assignment_type, notes, course_id}], "warnings": [...], "raw_text": ...}`
+  from `extract_assignments_from_file`. The adapter must produce
+  exactly that shape; if a field is genuinely not in the agent's
+  schema (`assignment_type` is the only one to watch), the adapter
+  fills a sensible default (`"other"`) rather than dropping the field.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Correct the wire-format contract example (course_id is not part of adapter output)

At Line 62, the documented assignment shape includes course_id, but the implemented adapter/tested contract does not emit that key. This mismatch can misdirect future refactors and tests.

Suggested fix

-- **Wire format unchanged.** `routes/calendar.py` reads - `{"assignments": [{title, due_date, assignment_type, notes, course_id}], "warnings": [...], "raw_text": ...}` +- **Wire format unchanged.** `routes/calendar.py` reads + `{"assignments": [{title, due_date, assignment_type, notes, weight_pct}], "warnings": [...], "raw_text": ...}`

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/prompts/refactor-4-syllabus-unification/00-orchestrator-overview.md` around lines 61 - 66, The documented wire-format example incorrectly lists course_id as part of each assignment; update the example in 00-orchestrator-overview.md so the adapter output matches the actual contract (as produced by extract_assignments_from_file and consumed by routes/calendar.py) by removing course_id from the assignment shape and keeping the rule that missing fields (e.g., assignment_type) get a sensible default ("other"); ensure the text references the adapter/consumer symbols (extract_assignments_from_file, routes/calendar.py) and the assignment field names ({title, due_date, assignment_type, notes}) to avoid future drift.

coderabbitai · 2026-05-05T04:36:33Z

+cd "/Users/josegaelcruzlopez/Documents/Startup_Projects /Sapling/backend"
+python -m pytest tests/ -q --no-header --ignore=tests/evals
+grep -rn "parse_syllabus\|process_and_save_syllabus\|prompts/syllabus_extraction.txt" backend/ --include="*.py" --include="*.txt"
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the verification grep path to avoid false negatives

At Line 83 you already cd into the backend root, so Line 85 should not search backend/ again. As written, it can silently miss matches and incorrectly validate cleanup completion.

Suggested fix

cd "/Users/josegaelcruzlopez/Documents/Startup_Projects /Sapling/backend" python -m pytest tests/ -q --no-header --ignore=tests/evals -grep -rn "parse_syllabus\|process_and_save_syllabus\|prompts/syllabus_extraction.txt" backend/ --include="*.py" --include="*.txt" +grep -rn "parse_syllabus\|process_and_save_syllabus\|prompts/syllabus_extraction.txt" . --include="*.py" --include="*.txt"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

cd "/Users/josegaelcruzlopez/Documents/Startup_Projects /Sapling/backend"

python -m pytest tests/ -q --no-header --ignore=tests/evals

grep -rn "parse_syllabus\|process_and_save_syllabus\|prompts/syllabus_extraction.txt" backend/ --include="*.py" --include="*.txt"

```

cd "/Users/josegaelcruzlopez/Documents/Startup_Projects /Sapling/backend"

python -m pytest tests/ -q --no-header --ignore=tests/evals

grep -rn "parse_syllabus\|process_and_save_syllabus\|prompts/syllabus_extraction.txt" . --include="*.py" --include="*.txt"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/prompts/refactor-4-syllabus-unification/05-sub-agent-E-cleanup.md` around lines 83 - 86, The grep command is searching "backend/" after you already cd'd into the backend root, which can cause false negatives; update the command string "grep -rn \"parse_syllabus\\|process_and_save_syllabus\\|prompts/syllabus_extraction.txt\" backend/ --include=\"*.py\" --include=\"*.txt\"" to search the current directory (e.g. replace backend/ with . or remove the explicit path) so it reliably inspects files from the backend root; keep the same include patterns and flags.

coderabbitai · 2026-05-05T04:36:33Z

+    except (UsageLimitExceeded, UnexpectedModelBehavior) as e:
+        logger.warning(
+            "Syllabus agent guardrails tripped; falling back to legacy",
+            exc_info=e,
+        )
+        result = parse_syllabus(text)
+        result.setdefault("raw_text", text)
+        result.setdefault("course_title", None)
+        result.setdefault("grading_categories", [])
+    except Exception:
+        logger.exception(
+            "Unexpected syllabus-agent failure; falling back to legacy"
+        )
+        result = parse_syllabus(text)
+        result.setdefault("raw_text", text)
+        result.setdefault("course_title", None)
+        result.setdefault("grading_categories", [])


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Backfill warnings in the fallback branches too.

Both fallback paths normalize raw_text, course_title, and grading_categories, but they still return a shape without warnings if parse_syllabus() omits it. That makes the API contract depend on which path ran.

Suggested change

except (UsageLimitExceeded, UnexpectedModelBehavior) as e: logger.warning( "Syllabus agent guardrails tripped; falling back to legacy", exc_info=e, ) result = parse_syllabus(text) + result.setdefault("warnings", []) result.setdefault("raw_text", text) result.setdefault("course_title", None) result.setdefault("grading_categories", []) except Exception: logger.exception( "Unexpected syllabus-agent failure; falling back to legacy" ) result = parse_syllabus(text) + result.setdefault("warnings", []) result.setdefault("raw_text", text) result.setdefault("course_title", None) result.setdefault("grading_categories", [])

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/services/calendar_service.py` around lines 141 - 157, The fallback except branches that call parse_syllabus(text) must also ensure the returned result always contains a warnings key; update both exception handlers (the block catching UsageLimitExceeded/UnexpectedModelBehavior and the generic Exception block where parse_syllabus is called and result is assigned) to call result.setdefault("warnings", []) in addition to the existing setdefault calls for "raw_text", "course_title", and "grading_categories" so the API shape is consistent regardless of which branch runs.

github-code-quality Bot found potential problems May 5, 2026

View reviewed changes

Comment thread backend/tests/test_syllabus_adapter.py

from datetime import date

import pytest

Jose-Gael-Cruz-Lopez and others added 2 commits May 5, 2026 00:31

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

Jose-Gael-Cruz-Lopez merged commit a8a6243 into main May 5, 2026
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(calendar): unify syllabus extraction onto agent (refactor #4)#79

refactor(calendar): unify syllabus extraction onto agent (refactor #4)#79
Jose-Gael-Cruz-Lopez merged 3 commits into
mainfrom
refactor/4-syllabus-unification

Jose-Gael-Cruz-Lopez commented May 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 5, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

cloudflare-workers-and-pages Bot commented May 5, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 5, 2026

Uh oh!

coderabbitai Bot May 5, 2026

Uh oh!

coderabbitai Bot May 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Jose-Gael-Cruz-Lopez commented May 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What shipped

Prompt hash

What's NOT in this PR (per ADR 0001 migration contract)

Migration plan complete

References

Test plan

Rollback

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

cloudflare-workers-and-pages Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying with Cloudflare Workers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Jose-Gael-Cruz-Lopez commented May 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 5, 2026 •

edited

Loading

cloudflare-workers-and-pages Bot commented May 5, 2026 •

edited

Loading