@TylerHillery TylerHillery commented Nov 18, 2025

What kind of change does this PR introduce?

The storage.migrations table can sometimes fall out of sync, requiring migrations to be replayed on databases that already applied them. This PR makes all migrations fully idempotent, allowing you to clear storage.migrations and safely rerun the entire migration chain without errors, ensuring the final database state matches that of a clean first run.

Current behavior

  • Several functions are dropped with drop function if exists without cascade, which errors when the function has dependents such as RLS policies
  • Some roles are created without first checking whether they already exist
  • migrations/tenant/0038-iceberg-catalog-flag-on-buckets.sql creates an index on bucket_id without verifying that the column exists. The column is renamed in a later migration, so the index creation fails on a second run.
  • The iceberg catalog_id foreign keys are only created when the catalog_id column doesn't exist, so rerunning the migrations after clearing out the migrations table drops the FK constraint but never recreates it, because the column is still there.

New behavior

  • Making function creation idempotent is tricky: create or replace is already idempotent, but it fails when a function's return type changes, as it does for storage.get_size_by_bucket(), so such functions need to be dropped first. Dropping first introduces its own problem when something depends on the function; we can't drop it without cascade, but cascade would also remove every dependent object, including RLS policies. In general we should avoid cascade where we can and rely on create or replace; if we ever need to change a function's return type, we should add a new function with a _v2 suffix instead.

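To illustrate the constraint (the signature and body below are abbreviated, not the actual definitions), Postgres accepts a replace only while the return type is unchanged:

```sql
-- Idempotent: same signature and return type, so this can run any number of times.
create or replace function storage.get_size_by_bucket()
returns table (size bigint, bucket_id text)
language sql as $$
  select sum(size)::bigint, bucket_id from storage.objects group by bucket_id
$$;

-- Not idempotent across a signature change: if a later version alters the
-- returned columns, CREATE OR REPLACE is rejected with
--   ERROR:  cannot change return type of existing function
--   HINT:   Use DROP FUNCTION storage.get_size_by_bucket() first.
-- which is exactly the case that forces the drop-then-create pattern.
```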
  • Added existence checks when creating roles (same approach as super_user)

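The existence check mirrors the pattern already used for super_user; a minimal sketch (role name and attributes illustrative, not taken from the actual migrations):

```sql
-- Create the role only if it doesn't exist yet, so replays are a no-op.
do $$
begin
  if not exists (select 1 from pg_roles where rolname = 'example_role') then
    create role example_role nologin;
  end if;
end
$$;
```

CREATE ROLE has no IF NOT EXISTS clause in Postgres, which is why the guard has to live in a DO block around a pg_roles lookup.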
  • Updated 0038-iceberg-catalog-flag-on-buckets.sql to check for bucket_id before creating the index, since it’s renamed to bucket_name in migration 0048

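A sketch of the guarded index creation (table and index names illustrative): the index is only built while the pre-rename column is still present, so a replay that has already passed the 0048 rename skips it cleanly.

```sql
do $$
begin
  if exists (
    select 1 from information_schema.columns
    where table_schema = 'storage'
      and table_name   = 'iceberg_tables'
      and column_name  = 'bucket_id'          -- renamed to bucket_name in 0048
  ) then
    create index if not exists idx_iceberg_tables_bucket_id
      on storage.iceberg_tables (bucket_id);
  end if;
end
$$;
```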
  • Added a CI step that drops the migrations table and reruns the migrations, ensuring we don't introduce new migrations that aren't idempotent. It takes a pg_dump after the initial migration run, deletes the migrations table, reruns the migrations, takes another pg_dump, and diffs the two. The storage.migrations table is excluded from the comparison, as the recorded migration timestamps will always differ.

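The CI step can be sketched roughly as follows (paths, connection string, and the migrate command are illustrative, not the actual workflow):

```shell
# Dump the schema after the first full migration run.
pg_dump --schema-only --exclude-table=storage.migrations \
  "$DATABASE_URL" > first.sql

# Simulate a desynced bookkeeping table, then replay everything.
psql "$DATABASE_URL" -c 'truncate table storage.migrations'
npm run migrate

# A second dump must be byte-identical; any diff means some
# migration is not idempotent, and the step fails.
pg_dump --schema-only --exclude-table=storage.migrations \
  "$DATABASE_URL" > second.sql
diff first.sql second.sql
```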
  • Split the alter table ... add column catalog_id for both iceberg_namespaces and iceberg_tables so that the FK creation happens in a separate statement that first checks whether the constraint already exists. This guarantees the FK ends up in place no matter how many times the migrations are run in a row.
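A sketch of the split for one of the two tables (constraint name and referenced table are illustrative): the column add and the FK add are independent statements, each guarded on its own.

```sql
-- Column creation is idempotent on its own.
alter table storage.iceberg_namespaces
  add column if not exists catalog_id uuid;

-- FK creation is guarded separately, so it is recreated even when the
-- column already existed from a previous run.
do $$
begin
  if not exists (
    select 1 from pg_constraint
    where conname = 'iceberg_namespaces_catalog_id_fkey'
  ) then
    alter table storage.iceberg_namespaces
      add constraint iceberg_namespaces_catalog_id_fkey
      foreign key (catalog_id) references storage.iceberg_catalogs (id);
  end if;
end
$$;
```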

Additional context

@TylerHillery TylerHillery force-pushed the tyler/STORAGE-292/amend-migrations-to-prevent-crashes branch from f0ae728 to d504e26 Compare November 18, 2025 18:22
@coveralls

coveralls commented Nov 18, 2025

Pull Request Test Coverage Report for Build 19486459611

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.01%) to 75.913%

Files with Coverage Reduction   | New Missed Lines | %
src/internal/database/tenant.ts | 2                | 83.65%

Totals
Change from base Build 19426789519: -0.01%
Covered Lines: 25386
Relevant Lines: 33164

💛 - Coveralls

@TylerHillery TylerHillery force-pushed the tyler/STORAGE-292/amend-migrations-to-prevent-crashes branch from ffbb2b0 to 637ddef Compare November 18, 2025 22:56
@TylerHillery TylerHillery requested review from fenos and itslenny and removed request for fenos November 19, 2025 01:41
@TylerHillery TylerHillery marked this pull request as ready for review November 19, 2025 01:41
@TylerHillery TylerHillery requested a review from fenos November 19, 2025 01:41