Fix: Group transaction changes by table #3360
base: main
Conversation
Force-pushed from 3035e74 to 2380846
Force-pushed from 2380846 to e0f05c0
dimas-b left a comment
Nice fix. Thanks, @singhpk234 !
Just a couple of minor comments :)
    if (!updatedMetadata.changes().isEmpty()) {
      tableOps.commit(currentMetadata, updatedMetadata);
    // Process each table's changes in order
    changesByTable.forEach(
The new code logic looks correct to me. I think it's a worthy change to merge in its own right.
However, as for the issue discussed in #3352 (comment), I think this fix is effective, but it's not apparent that it will work correctly.
The basic problem is that Persistence is called with multiple entity objects for the same ID. This means that the updateEntitiesPropertiesIfNotChanged call on line must contain at most one entity per ID. This may or may not be true depending on what the catalog forwards to transactionMetaStoreManager for each commit on line 1108.
I do believe that one tableOps.commit() will result in one entity update, so it may be sufficient to just add a comment about that around line 1113. WDYT?
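The "at most one entity per ID" constraint is exactly what grouping changes by table buys. A minimal sketch of that grouping, using hypothetical stand-ins (plain strings like "A1" for a change to table A, not the actual Polaris change or identifier types):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupChangesSketch {
  // Hypothetical convention: the first character names the table ("A1" -> table A).
  static String tableOf(String change) {
    return change.substring(0, 1);
  }

  // Group interleaved changes by table, preserving each table's change order.
  // LinkedHashMap keeps first-seen table order so downstream commits are deterministic.
  static Map<String, List<String>> groupByTable(List<String> changes) {
    Map<String, List<String>> byTable = new LinkedHashMap<>();
    for (String change : changes) {
      byTable.computeIfAbsent(tableOf(change), k -> new ArrayList<>()).add(change);
    }
    return byTable;
  }

  public static void main(String[] args) {
    // One entry per table means one entity update per ID downstream.
    System.out.println(groupByTable(List.of("A1", "B1", "A2", "C1", "A3")));
  }
}
```

With one commit issued per grouped table, the batched persistence call then sees each entity ID at most once.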
Yeah, the logic also looks correct to me. +1 to adding a comment on the subtlety, though: we're coalescing all the updates for a given table into a single Polaris entity update, which is slightly different behavior than if the caller expected the various UpdateTableRequests in this commitTransaction to behave as if each were applied independently (but under a lock).
Note that this issue was a known limitation, and referenced in a TODO in TransactionWorkspaceMetaStoreManager:
Line 84 in 8abf19a
// TODO: If we want to support the semantic of opening a transaction in which multiple
// reads and writes occur on the same entities, where the reads are expected to see the writes
// within the transaction workspace that haven't actually been committed, we can augment this
// class by allowing these pendingUpdates to represent the latest state of the entity if we
// also increment entityVersion. We'd need to store both a "latest view" of all updated entities
// to serve reads within the same transaction while also storing the ordered list of
// pendingUpdates that ultimately need to be applied in order within the real MetaStoreManager.
The alternative "fix" described there, actually queuing up the sequential mutations per entity in that "uncommitted persistence layer", is more general but also more complex and probably has pitfalls.
The main implication is that by plugging into the MetaStoreManager layer we could intercept writes while still inheriting other relevant hooks, such as generating events and having entityVersion increments directly match actual update requests.
But I'm in favor of this more targeted change-coalescing fix here for now. We could either update/remove the TODO in TransactionWorkspaceMetaStoreManager and/or leave a comment in the code here referencing the other approach so any future changes to the way we handle these can more easily sort through it.
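The TODO's alternative could look roughly like this sketch (hypothetical names and a toy entity type, not the actual TransactionWorkspaceMetaStoreManager API): keep a "latest view" map to serve intra-transaction reads, alongside the ordered pendingUpdates that would replay in order at commit time:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TxnWorkspaceSketch {
  // Hypothetical entity: an id plus a version that bumps on every staged write.
  record Entity(long id, int entityVersion, String payload) {}

  // Latest staged state per entity id, serving reads within the transaction.
  private final Map<Long, Entity> latestView = new HashMap<>();
  // Ordered list of every staged write, to be replayed at commit time.
  private final List<Entity> pendingUpdates = new ArrayList<>();

  // Stage a write: bump entityVersion and record it in both structures.
  Entity stageUpdate(long id, String payload) {
    Entity prior = latestView.get(id);
    int nextVersion = prior == null ? 1 : prior.entityVersion() + 1;
    Entity staged = new Entity(id, nextVersion, payload);
    latestView.put(id, staged);
    pendingUpdates.add(staged);
    return staged;
  }

  // Reads inside the transaction see uncommitted writes.
  Entity read(long id) {
    return latestView.get(id);
  }

  List<Entity> pendingUpdates() {
    return pendingUpdates;
  }
}
```

This is the "more general but more complex" shape: reads see staged state, while the real MetaStoreManager would still apply pendingUpdates in order.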
    throw new BadRequestException(
        "Unsupported operation: commitTranaction with updateForStagedCreate: %s", change);
    }
    changesByTable.computeIfAbsent(change.identifier(), k -> new ArrayList<>()).add(change);
Should we check and throw in case the table exists already? It'd also be nice to have a test case.
Does the Iceberg REST Catalog spec allow more than one table update in one commitTransaction operation?
The spec doesn't mention the uniqueness of the table identifier:
polaris/spec/iceberg-rest-catalog-open-api.yaml
Line 3499 in 70ad92f
CommitTransactionRequest:
Within one CommitTableRequest, multiple updates are allowed:
polaris/spec/iceberg-rest-catalog-open-api.yaml
Line 3478 in 70ad92f
type: array
With that, I think it's undefined behavior whether the same table identifier may appear more than once. We could explicitly disallow it in Polaris, though.
I'd rather not disable this in Polaris unless the spec explicitly flagged it as a disallowed use case (which it did not).
Absent explicit spec guidance, applying multiple updates in sequence is a fairly straightforward operation, since each update is self-contained and will be validated against the current metadata.
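That sequential-application idea can be sketched with a toy metadata model (a version plus a property map standing in for Iceberg's TableMetadata; the base-version check stands in for real requirement validation, and all names here are hypothetical):

```java
import java.util.List;
import java.util.Map;

public class SequentialApplySketch {
  // Toy metadata: a version and a property map stand in for TableMetadata.
  record Metadata(int version, Map<String, String> props) {}

  // Toy self-contained update: expects a base version, sets one property.
  record Update(int expectedVersion, String key, String value) {}

  // Each update is validated against the *current* metadata, then applied,
  // so later updates to the same table see the effects of earlier ones.
  static Metadata applyAll(Metadata base, List<Update> updates) {
    Metadata current = base;
    for (Update u : updates) {
      if (u.expectedVersion() != current.version()) {
        throw new IllegalStateException(
            "Update expected version " + u.expectedVersion()
                + " but current is " + current.version());
      }
      Map<String, String> props = new java.util.HashMap<>(current.props());
      props.put(u.key(), u.value());
      current = new Metadata(current.version() + 1, props);
    }
    return current;
  }
}
```

Because each update is validated against the metadata produced by the previous one, duplicate table identifiers in a transaction compose naturally instead of needing to be rejected.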
SGTM.
    }
    });
    // Apply updates to builder
Maybe add a // TODO to refactor this to better share/reconcile with the update-application logic in CatalogHandlerUtils (I understand this divergence was already latent and not introduced by this PR, but as the update-application logic grows in complexity here it's going to start getting a lot worse).
adnanhemani left a comment
Awesome change!
Problem
1. Groups ALL changes by table, even if they appear interleaved in the input: [A1, B1, A2, C1, A3] → {A: [A1, A2, A3], B: [B1], C: [C1]}
2. For each table, processes changes sequentially:
- Validate R1 against base metadata → Apply U1 → update currentMetadata
- Validate R2 against updated metadata → Apply U2 → update currentMetadata
- Validate R3 against updated metadata → Apply U3 → update currentMetadata
3. Single commit per table prevents duplicate entity IDs in pendingUpdates
This ensures:
related discussion: #3352 (comment)
Checklist
- CHANGELOG.md (if needed)
- site/content/in-dev/unreleased (if needed)