fix(search): make FTS sync idempotent to stop duplicate rows on backfill #61
Merged
tompscanlan merged 2 commits on Jun 17, 2026
buildFtsStatements only deleted the existing FTS row before inserting when the record was already in existingMap. Backfill runs with skipReplayDetection, leaving existingMap empty, so re-applied records looked new and appended duplicate FTS rows. The FTS5 virtual table has no uniqueness constraint, so duplicates accumulated and the search JOIN fanned each event out into one row per duplicate, breaking keyed lists in the appview. Delete-then-insert is now unconditional.
25e29ce to 32ace91
… fields The idempotent FTS sync returned early when buildFtsContent produced no content, skipping the delete. An update that cleared every searchable field therefore left the prior FTS row in place, so old terms kept matching through the search JOIN. Run the delete unconditionally and gate only the re-insert on content.
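The behavior described in the two commit messages can be sketched with an in-memory stand-in for the FTS table. This is only an illustration, not contrail's implementation: `FakeFtsTable`, `syncFts`, and the example URI are hypothetical names, and the array simply mimics the fact that FTS5 enforces no uniqueness on `uri`.

```typescript
// Hypothetical in-memory stand-in for an FTS5 table. Like FTS5, it enforces
// no uniqueness on uri, so a plain insert always appends a row.
type FtsRow = { rowid: number; uri: string; content: string };

class FakeFtsTable {
  rows: FtsRow[] = [];
  private nextRowid = 1;

  deleteByUri(uri: string): void {
    this.rows = this.rows.filter((row) => row.uri !== uri);
  }

  insert(uri: string, content: string): void {
    this.rows.push({ rowid: this.nextRowid++, uri, content });
  }
}

// Idempotent sync: the delete always runs; only the re-insert is gated on
// whether there is any searchable content left.
function syncFts(table: FakeFtsTable, uri: string, content: string | null): void {
  table.deleteByUri(uri);
  if (content !== null && content.trim() !== "") {
    table.insert(uri, content);
  }
}

const table = new FakeFtsTable();
syncFts(table, "at://example/event/1", "launch party");
syncFts(table, "at://example/event/1", "launch party"); // replayed by a backfill
const afterReplay = table.rows.length;
syncFts(table, "at://example/event/1", null); // update cleared every searchable field
const afterClear = table.rows.length;
console.log(afterReplay, afterClear); // prints "1 0"
```

Because the delete no longer depends on replay detection or on the presence of content, applying the same record any number of times leaves at most one row per `uri`.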
Makes contrail's D1 full-text-search sync idempotent so repeated backfills stop accumulating duplicate FTS rows.
Root cause
buildFtsStatements only deleted the existing FTS row before inserting when the record was already in existingMap. Backfill runs with skipReplayDetection, which leaves existingMap empty, so every re-applied record looks brand-new and appends another row to the fts_<coll> virtual table (FTS5 has no uniqueness constraint). Re-running a backfill therefore accumulates duplicate FTS rows, and the search JOIN (JOIN fts ON fts.uri = r.uri) then fans each matching record out into one result row per duplicate.
Downstream this surfaced as a hard failure in an appview consumer (atmo): a keyed list ({#each results as r (r.uri)}) throws on the duplicate keys and blanks the page. The records table is unaffected because it uses INSERT ... ON CONFLICT(uri) DO UPDATE.
Fix
Make the delete-then-insert unconditional so FTS sync is idempotent regardless of replay detection, and drop the now-unused existingMap parameter from buildFtsStatements.
Testing
skipReplayDetection returned 2 rows from queryRecords (RED), then 1 after the fix (GREEN).
@atmo-dev/contrail suite: 438 passed, 3 skipped (Postgres-only); typecheck clean.
fts_event rows, and the search JOIN for previously-fanning terms now returns rows == distinct_uris.
Note for existing indexes
Deployments that already accumulated duplicates need a one-time cleanup; this fix prevents new ones but does not retro-clean. Run it after deploying the fix, or duplicates re-accumulate.
There is one FTS5 virtual table per searchable collection, named fts_<short> (plus spaces_fts_<short> when spaces mode is used). List the ones that exist.
Then dedupe each, keeping the lowest rowid per uri. For the default events config that is a single table.
(The fts5 shadow tables *_data / *_idx / *_docsize / *_config are maintained automatically; only delete from the virtual table itself.)
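The cleanup statements themselves did not survive in this copy of the description, so the following is a hedged sketch of what such a cleanup could look like, not necessarily the SQL the author posted. `LIST_FTS_TABLES_SQL` and `buildDedupeSql` are hypothetical names. The listing query assumes that `sqlite_master` is queryable in the target D1 database and that the virtual tables were created with `CREATE VIRTUAL TABLE ... USING fts5`; the delete assumes each FTS table has a `uri` column, as the search JOIN implies.

```typescript
// Lists FTS5 virtual tables only. Shadow tables are stored as plain
// CREATE TABLE entries, so the filter on the sql column leaves them out.
const LIST_FTS_TABLES_SQL =
  "SELECT name FROM sqlite_master " +
  "WHERE type = 'table' AND sql LIKE 'CREATE VIRTUAL TABLE%fts5%'";

// Builds a statement that keeps the lowest rowid per uri and deletes the rest.
// Table names cannot be bound as parameters, so the name is checked against
// the fts_<short> / spaces_fts_<short> pattern before interpolation.
function buildDedupeSql(table: string): string {
  if (!/^(spaces_)?fts_[A-Za-z0-9_]+$/.test(table)) {
    throw new Error(`unexpected FTS table name: ${table}`);
  }
  return (
    `DELETE FROM ${table} WHERE rowid NOT IN ` +
    `(SELECT MIN(rowid) FROM ${table} GROUP BY uri)`
  );
}

console.log(LIST_FTS_TABLES_SQL);
console.log(buildDedupeSql("fts_event"));
```

The name check only guards against malformed input; it does not tell a virtual table from a shadow table, so pass it names returned by the listing query. Trying the generated statement against a copy of the database first is a reasonable precaution.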