[integrations] Consolidation workers (bio + metadata) #14
alanshurafa wants to merge 7 commits into main
Bio worker synthesizes canonical biographical profiles from person_note, decision, and journal thoughts. Metadata normalization worker reclassifies thoughts with weak metadata (catch-all type, default importance, low confidence) via LLM with materiality and confidence guards. Both workers use OpenRouter-first three-tier LLM fallback, support dry-run mode, log to consolidation_log, and use wildcard CORS for flexible deployment.

Depends on: schemas/enhanced-thoughts, schemas/knowledge-graph

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add blank lines around headings (MD022), fenced code blocks (MD031), and between adjacent blockquotes (MD028). Fix broken link fragment (MD051) and remove extra blank line (MD012). No content changes. These errors were blocking CI on all open PRs since the lint check runs repo-wide.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a487268b56
    const { data: decisions } = await supabase
      .from("thoughts")
      .select("id, content, type, importance, created_at")
      .eq("type", "decision")
      .gte("importance", 4)
Restrict non-note sources by requested person name
When ?name= is provided, gatherSourceThoughts still pulls all high-importance decision rows (and the journal query below is also unconditional), so the profile prompt can include unrelated people/context and produce an inaccurate "Who is X" profile. This breaks the targeted-person mode whenever the brain contains mixed-person data.
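One way to address this is to apply the same person filter to the decision and journal rows before they reach the prompt. The sketch below is a minimal illustration, not the PR's code: the `Thought` shape mirrors the columns selected in the reviewed query, while `filterByPerson` and the plain substring match are assumptions. How thoughts are actually linked to people is not shown in this PR, so a real fix may need a stronger match than content text.

```typescript
// Hypothetical row shape, mirroring the columns selected in the reviewed query.
interface Thought {
  id: string;
  content: string;
  type: string;
  importance: number;
  created_at: string;
}

// Keep only thoughts whose content mentions the requested person.
// With no name (untargeted mode), every row passes through unchanged.
// Assumption: a case-insensitive substring match is an acceptable relevance test.
function filterByPerson(thoughts: Thought[], name?: string): Thought[] {
  const needle = name?.trim().toLowerCase();
  if (!needle) return thoughts;
  return thoughts.filter((t) => t.content.toLowerCase().includes(needle));
}
```

The same condition could instead be pushed into the database query so that unrelated rows are never fetched; the in-memory version is shown only because it is easy to test in isolation.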
    for (const { name, fn } of providers) {
      try {
        return await fn();
      } catch (err) {
Fall through to next LLM on empty primary output
The fallback loop returns immediately on the first provider that does not throw, even if it returns an empty or unusable body. In that case reclassifyThought gets empty/non-JSON text and the thought is skipped or errors out, while configured fallback providers are never attempted, reducing reclassification reliability.
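A possible shape for the fix is to validate each provider's output inside the loop and treat an unusable body the same way as a thrown error. This is a sketch under assumptions: the `Provider` interface is inferred from the `{ name, fn }` destructuring above, and `callWithFallback` and the `isUsable` parameter are hypothetical names, not part of the PR.

```typescript
// Hypothetical provider shape, inferred from the `{ name, fn }` destructuring in the reviewed loop.
interface Provider {
  name: string;
  fn: () => Promise<string>;
}

// Try each provider in order. A response that fails `isUsable` is recorded as a
// failure, so the next configured provider still gets a turn.
async function callWithFallback(
  providers: Provider[],
  isUsable: (text: string) => boolean = (text) => text.trim().length > 0,
): Promise<string> {
  const failures: string[] = [];
  for (const { name, fn } of providers) {
    try {
      const text = await fn();
      if (isUsable(text)) return text;
      failures.push(`${name}: unusable response`);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${name}: ${reason}`);
    }
  }
  // Only reached when every provider threw or returned unusable output.
  throw new Error(`All LLM providers failed (${failures.join("; ")})`);
}
```

A caller that expects JSON could pass a stricter `isUsable` that attempts `JSON.parse` and returns `false` on failure, which would also cover the non-JSON case mentioned in the review.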
      .eq("metadata->>generated_by", "consolidation-bio")
      .eq("metadata->>artifact_type", "biographical_profile")
Scope existing-profile lookup to the requested person
The existing-profile query ignores the name filter and always returns the latest biographical profile globally. If operators run consolidation-bio for different ?name= values, each run updates the same row and overwrites the previous person’s profile instead of keeping separate artifacts.
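Keeping one artifact per person requires that each profile records who it describes and that the lookup filters on that value. The sketch below shows the selection logic in memory. Note the assumptions: `subject_name` is a hypothetical metadata key (the PR does not show how, or whether, the subject is stored), and `ProfileRow` and `findExistingProfile` are illustrative names.

```typescript
// Hypothetical row shape; `subject_name` is an assumed metadata key, not one defined in this PR.
interface ProfileRow {
  id: string;
  created_at: string; // timestamp string parseable by Date.parse
  metadata: {
    generated_by?: string;
    artifact_type?: string;
    subject_name?: string;
  };
}

const normalizeName = (name: string): string => name.trim().toLowerCase();

// Return the newest biographical profile for one person, or undefined if none exists yet.
function findExistingProfile(rows: ProfileRow[], name: string): ProfileRow | undefined {
  const wanted = normalizeName(name);
  return rows
    .filter(
      (r) =>
        r.metadata.generated_by === "consolidation-bio" &&
        r.metadata.artifact_type === "biographical_profile" &&
        r.metadata.subject_name !== undefined &&
        normalizeName(r.metadata.subject_name) === wanted,
    )
    // filter() returned a fresh array, so sorting here does not mutate the input.
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at))[0];
}
```

In the actual query, the equivalent would likely be one more `.eq(...)` filter on the metadata column next to the two shown above, with the worker writing the same key whenever it inserts or updates a profile.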
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary

- `bio/index.ts`: Synthesizes canonical biographical profiles from person_note, decision, and journal thoughts via LLM. Updates existing profiles in place on subsequent runs.
- `metadata-norm/index.ts`: Finds thoughts with weak metadata (catch-all type, default importance, low confidence) and reclassifies them via LLM with materiality and confidence guards (confidence > 0.8, material change required).
- Both workers log to `consolidation_log`.
- `_shared/`: copied from the enhanced-mcp integration (PR 5).

Dependencies

- `schemas/enhanced-thoughts` (PR 1): for the type, importance, and sensitivity_tier columns
- `schemas/knowledge-graph` (PR 4): for the consolidation_log table

Gate compliance

Test plan

- Run the bio worker with `?dry_run=true` and verify profile output
- Run the metadata normalization worker with `?dry_run=true&limit=5` and verify reclassification results
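
For readers checking the reclassification results, the two guards named in the Summary (confidence above 0.8, material change required) could be combined roughly as follows. This is a sketch, not the worker's code: the names are hypothetical, and treating "material" as "type or importance differs" is an assumption, since the PR text does not define the term.

```typescript
// Hypothetical shapes for a thought's current metadata and the LLM's proposal.
interface Classification {
  type: string;
  importance: number;
}

interface Proposal extends Classification {
  confidence: number; // assumed to be in the range 0..1
}

// Threshold taken from the Summary: proposals must exceed 0.8 confidence.
const MIN_CONFIDENCE = 0.8;

// Assumption: a change is "material" when the type or the importance differs.
function isMaterialChange(current: Classification, proposed: Classification): boolean {
  return current.type !== proposed.type || current.importance !== proposed.importance;
}

// Apply a proposal only when it is both confident enough and actually changes something.
function shouldApply(current: Classification, proposal: Proposal): boolean {
  return proposal.confidence > MIN_CONFIDENCE && isMaterialChange(current, proposal);
}
```

In a dry run, a worker built this way would report which proposals pass `shouldApply` without writing them, which is what the `?dry_run=true&limit=5` check is meant to surface.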