
fix(postgres): create content trigram index with fastupdate=off #606

Open
flexiondotorg wants to merge 2 commits into kenn-io:main from flexiondotorg:fix/gin-trgm-fastupdate-off

Conversation

flexiondotorg commented Jun 8, 2026

Closes #605.

Adds WITH (fastupdate = off) to the creation of idx_messages_content_trgm in createContentSearchIndexesPG. The setting is also reapplied idempotently via ALTER INDEX on every schema bootstrap, so stores upgraded from a schema that created the index with the default fastupdate=on are covered too. A pgtest assertion guards reloptions. See #605 for the diagnosis and impact.

Validation: recreating the index with fastupdate=off reduced its size over the same data from 283 GB to 388 MB.

Existing oversized indexes stop growing immediately; a one-time REINDEX reclaims space already consumed.

idx_messages_content_trgm is created with PostgreSQL's default
fastupdate=on. The resulting GIN pending list is only merged into the
index by VACUUM, so under continuous `pg sync` ingest where long-lived
transactions pin the xmin horizon and starve autovacuum, it grows
without bound. In one deployment the index bloated to 283 GB for ~213k
rows and filled the disk, crashing PostgreSQL with ENOSPC.

Disabling fastupdate merges entries directly into the tree, keeping the
index bounded and predictable at a small per-insert cost.

roborev-ci Bot commented Jun 8, 2026

roborev: Combined Review (9522d5b)

The PR has one medium-severity schema migration issue; no security findings were reported.

Medium

  • internal/postgres/schema.go:697: CREATE INDEX IF NOT EXISTS only applies WITH (fastupdate = off) when idx_messages_content_trgm is first created. Existing PostgreSQL stores with the prior index will retain the default fastupdate=on, so the bounded-index fix will not take effect after upgrade.
    • Suggested fix: After ensuring the index exists, run an idempotent ALTER INDEX idx_messages_content_trgm SET (fastupdate = off) and add or extend a PostgreSQL schema test to assert reloptions includes fastupdate=off.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 39s), codex_security (codex/security, done, 19s) | Total: 1m5s

CREATE INDEX IF NOT EXISTS only applies WITH (fastupdate = off) on first
creation, so stores upgraded from an earlier schema retained the default
fastupdate=on. Reapply idempotently with ALTER INDEX on every schema
bootstrap, and assert reloptions in the pgtest suite.

Addresses review feedback on kenn-io#606.

roborev-ci Bot commented Jun 8, 2026

roborev: Combined Review (3f36eef)

Synthesis unavailable. Showing individual review outputs.

codex — default (done)

Review Findings

  • Severity: Low
  • Location: internal/postgres/search_content_pgtest_test.go:485
  • Problem: The added assertion only covers fresh schema creation, where CREATE INDEX ... WITH (fastupdate = off) is enough to pass. It does not exercise the upgrade path where an existing idx_messages_content_trgm was created with the prior default fastupdate=on, so the new ALTER INDEX behavior could regress unnoticed.
  • Fix: Add a pgtest that creates or resets the trigram index with fastupdate=on, reruns EnsureSchema, and asserts reloptions contains fastupdate=off.

Summary

This change creates the PostgreSQL content trigram index with fastupdate=off and reapplies that setting during schema setup.


codex — security (done)

Summary: The diff only changes PostgreSQL trigram index options and adds test coverage. No auth, input handling, secrets, filesystem, subprocess, or CI/CD surfaces are affected, and the formatted DDL continues to use a quoted catalog-derived schema name.

No issues found.

mariusvniekerk (Collaborator) commented Jun 8, 2026

So what's the performance impact of this change?

The alternative would be something like running REINDEX CONCURRENTLY on some cadence.

Development

Successfully merging this pull request may close these issues.

bug: GIN trigram index on messages.content bloats unbounded (fastupdate=on), fills disk

2 participants