Conversation
Async Supabase Edge Function that drains the entity_extraction_queue, calling an LLM to extract people, projects, topics, tools, orgs, and places from thoughts, then building a knowledge graph via the entities, edges, and thought_entities tables.

Features: batch processing with atomic claiming, retry/backoff with poison-item handling (max 5 attempts), dry-run mode, symmetric edge dedup, and system-generated thought skipping.

OB1 adaptations: OpenRouter-first LLM provider order, wildcard CORS, model constants from _shared/config.ts, and the thoughts table (not brain_thoughts).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
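The symmetric edge dedup named in the feature list can be sketched as follows. This is illustrative, not the PR's actual code: the helper name, key format, and the set of symmetric kinds are assumptions. The idea is to canonically order the endpoints of symmetric relations such as related_to, so that (A, B) and (B, A) produce the same key and upsert to a single edge row.

```ts
// Illustrative sketch of a dedup key for graph edges. For symmetric
// relation kinds (here only "related_to" is assumed symmetric), sort the
// endpoint ids so both directions of the same pair collapse to one key;
// directed kinds (works_on, uses) keep their orientation.
const SYMMETRIC_KINDS = new Set(["related_to"]);

function edgeKey(srcId: string, dstId: string, kind: string): string {
  const [a, b] =
    SYMMETRIC_KINDS.has(kind) && srcId > dstId ? [dstId, srcId] : [srcId, dstId];
  return `${a}|${kind}|${b}`;
}
```

With a unique index on this key, re-extracting the same pair in either direction bumps a support count instead of inserting a duplicate edge.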
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2a72f6cb8e
```ts
const dryRun = url.searchParams.get("dry_run") === "true";

// Step 1: Claim queue items
const claimed = await claimQueueItems(limit);
```
Avoid claiming queue items during dry runs
When dry_run=true, the handler still executes claimQueueItems(limit), which updates queue rows to processing; later the dry-run branch exits without calling markComplete or markError. This means a preview request mutates production queue state and can leave items stuck in processing, so subsequent real runs will skip them until a manual reset.
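A minimal sketch of the fix (the peekQueueItems/claimQueueItems names come from the PR's follow-up commit; the in-memory queue standing in for the entity_extraction_queue table is illustrative):

```ts
// Sketch with an in-memory queue standing in for the real table.
type QueueItem = { id: number; status: "pending" | "processing" };

// Read-only preview: safe for dry_run, leaves statuses untouched.
function peekQueueItems(queue: QueueItem[], limit: number): QueueItem[] {
  return queue.filter((i) => i.status === "pending").slice(0, limit);
}

// Real run: flips claimed rows to "processing".
function claimQueueItems(queue: QueueItem[], limit: number): QueueItem[] {
  const claimed = queue.filter((i) => i.status === "pending").slice(0, limit);
  for (const item of claimed) item.status = "processing";
  return claimed;
}

// The handler branches before any mutation happens.
function selectItems(queue: QueueItem[], limit: number, dryRun: boolean): QueueItem[] {
  return dryRun ? peekQueueItems(queue, limit) : claimQueueItems(queue, limit);
}
```

The key property: a dry run can be repeated any number of times and a subsequent real run still sees every item as "pending".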
```ts
  return [];
}

return pending;
```
Return only queue rows actually claimed
This function returns the originally selected pending rows even though the claim update is a separate statement; under concurrent workers, one worker can select rows, fail to update any of them because another worker claimed first, and still process those thoughts. That creates duplicate extraction work and can inflate graph edge support counts.
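One way to close the race (a sketch, not the PR's actual code): make the status check part of the claim itself and return only the rows the update actually touched, the in-memory analogue of a conditional UPDATE ... RETURNING.

```ts
type QueueItem = { id: number; status: "pending" | "processing" };

// Sketch: conditional update that returns only the rows it flipped.
// If another worker claimed an id first, its status is no longer
// "pending", the update skips it, and it is excluded from this batch.
function claimRows(queue: QueueItem[], ids: number[]): QueueItem[] {
  const claimed: QueueItem[] = [];
  for (const item of queue) {
    if (ids.includes(item.id) && item.status === "pending") {
      item.status = "processing";
      claimed.push(item);
    }
  }
  return claimed;
}
```

The returned array, not the earlier SELECT, becomes the worker's batch, so two concurrent workers can never process the same thought twice.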
```ts
if (thoughtError || !thought?.content) {
  console.error(`Failed to fetch thought ${item.thought_id}:`, thoughtError);
  if (!dryRun) await markError(item.thought_id, thoughtError?.message ?? "Thought not found", 0);
```
Preserve retry count on thought fetch failures
On thought lookup failure, markError is always called with attemptCount hardcoded to 0, so attempt_count is repeatedly reset to 1 instead of incrementing across retries. For missing/deleted thoughts this prevents reaching MAX_ATTEMPTS, causing perpetual requeueing instead of eventual terminal failed status.
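The retry arithmetic can be sketched as below (MAX_ATTEMPTS = 5 is from the PR description; the row shape and function signature are assumptions). The caller must thread the row's current attempt_count through rather than a literal 0:

```ts
const MAX_ATTEMPTS = 5; // per the PR's poison-item handling

type QueueRow = { attempt_count: number; status: "pending" | "failed" };

// Sketch: markError must receive the row's current attempt_count so the
// increment accumulates. Passing a hardcoded 0 resets the count to 1 on
// every failure, so the row never reaches the terminal "failed" state.
function markError(row: QueueRow, attemptCount: number): QueueRow {
  const next = attemptCount + 1;
  row.attempt_count = next;
  row.status = next >= MAX_ATTEMPTS ? "failed" : "pending";
  return row;
}
```

With the real count threaded through, a permanently missing thought is retried at most MAX_ATTEMPTS times and then parked as failed instead of requeueing forever.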
Add blank lines around headings (MD022), fenced code blocks (MD031), and between adjacent blockquotes (MD028). Fix broken link fragment (MD051) and remove extra blank line (MD012). No content changes. These errors were blocking CI on all open PRs since the lint check runs repo-wide. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each section's numbered list now restarts at 1 instead of continuing the global count (3-14), satisfying markdownlint MD029/ol-prefix rule. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- dry_run now uses peekQueueItems() (a read-only SELECT) instead of claimQueueItems(), so items stay "pending" during preview runs
- claimQueueItems() returns only rows actually claimed via .select(), preventing race conditions where concurrent workers see stale results
- markError() clears started_at and worker_version when resetting to "pending" so retryable items don't appear stale in monitoring

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
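The third fix can be sketched like this (column names started_at and worker_version are from the commit message above; the row shape and helper name are illustrative):

```ts
type QueueRow = {
  status: "pending" | "processing" | "failed";
  started_at: string | null;
  worker_version: string | null;
};

// Sketch: when a retryable item goes back to "pending", also clear the
// claim bookkeeping so monitoring doesn't flag it as a stale in-flight run.
function resetToPending(row: QueueRow): QueueRow {
  return { ...row, status: "pending", started_at: null, worker_version: null };
}
```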
Summary
- Drains entity_extraction_queue to build a knowledge graph
- Builds on schemas/knowledge-graph (PR [schemas] Knowledge graph tables and extraction trigger #5) for entities, edges, thought_entities, and queue tables

What It Does
Processes pending items from the extraction queue in batches. For each thought, calls an LLM to extract named entities (person, project, topic, tool, organization, place) and relationships (works_on, uses, related_to, etc.), then upserts into the graph tables.
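The entity vocabulary above can be captured as a narrow union type, useful for rejecting malformed LLM output before it reaches the entities table. A sketch: the constant and guard names are illustrative, and the relationship kinds are left out because the summary gives only examples ("etc.").

```ts
// Entity kinds listed in the summary, as a checked vocabulary.
const ENTITY_KINDS = ["person", "project", "topic", "tool", "organization", "place"] as const;
type EntityKind = (typeof ENTITY_KINDS)[number];

// Type guard: narrows an arbitrary string from the LLM's JSON output.
function isEntityKind(s: string): s is EntityKind {
  return (ENTITY_KINDS as readonly string[]).includes(s);
}
```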
Key Features
- Batch processing with atomic claiming
- Retry/backoff with poison-item handling (max 5 attempts)
- Dry-run mode
- Symmetric edge dedup
- Skips system-generated thoughts via metadata.generated_by

Files
- index.ts
- _shared/helpers.ts
- _shared/config.ts
- README.md
- metadata.json
- deno.json

Test plan
🤖 Generated with Claude Code