Conversation


@singhpk234 singhpk234 commented Jan 6, 2026

Problem

1. Groups all changes by table, even if they appear interleaved in the input: [A1, B1, A2, C1, A3] → {A: [A1, A2, A3], B: [B1], C: [C1]}
2. For each table, processes changes sequentially:
   - Validate R1 against base metadata → apply U1 → update currentMetadata
   - Validate R2 against updated metadata → apply U2 → update currentMetadata
   - Validate R3 against updated metadata → apply U3 → update currentMetadata
3. Commits once per table, which prevents duplicate entity IDs in pendingUpdates

This ensures:

  • Each change's requirements validate against the evolved state
  • Updates are applied in order within each table
  • Only one commit per table (solves the original problem)
  • Conflicts detected early (e.g., second change expecting schema ID 0 fails when it's already 1)
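The grouping step can be sketched as follows. This is a minimal illustration, not the actual Polaris code; `Change` here is a hypothetical stand-in for an identifier-plus-update pair:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupByTable {
    // Hypothetical stand-in for a (table identifier, update) pair.
    record Change(String table, String update) {}

    // Group changes by table, preserving the relative order of changes
    // within each table (same computeIfAbsent pattern as in the PR).
    static Map<String, List<Change>> groupByTable(List<Change> changes) {
        Map<String, List<Change>> byTable = new LinkedHashMap<>();
        for (Change change : changes) {
            byTable.computeIfAbsent(change.table(), k -> new ArrayList<>()).add(change);
        }
        return byTable;
    }

    public static void main(String[] args) {
        // Interleaved input [A1, B1, A2, C1, A3] from the description above.
        List<Change> input = List.of(
            new Change("A", "A1"), new Change("B", "B1"),
            new Change("A", "A2"), new Change("C", "C1"),
            new Change("A", "A3"));
        Map<String, List<Change>> grouped = groupByTable(input);
        System.out.println(grouped.keySet()); // [A, B, C]
        grouped.get("A").forEach(c -> System.out.println(c.update())); // A1, A2, A3 in order
    }
}
```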

Related discussion: #3352 (comment)

Checklist

  • 🛡️ Don't disclose security issues! (contact [email protected])
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@github-project-automation github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Jan 6, 2026
@singhpk234 singhpk234 changed the title from "Fix: Group transaction changes by table in a transaction" to "Fix: Group transaction changes by table" Jan 6, 2026
@sfc-gh-prsingh sfc-gh-prsingh force-pushed the feature/optimize-transaction branch from 3035e74 to 2380846 Compare January 6, 2026 02:50
@sfc-gh-prsingh sfc-gh-prsingh force-pushed the feature/optimize-transaction branch from 2380846 to e0f05c0 Compare January 6, 2026 02:53

dimas-b commented Jan 6, 2026

nit: regarding "Only one commit per table (solves the original problem)", I'd simply say "Only one commit per table". This will end up in the git commit log, where the "original problem" might be hard to understand later on. It's also fine to give a summary of the problem instead of the "original" reference, if you prefer.


@dimas-b dimas-b left a comment


Nice fix. Thanks, @singhpk234 !

Just a couple of minor comments :)

```java
if (!updatedMetadata.changes().isEmpty()) {
  tableOps.commit(currentMetadata, updatedMetadata);
// Process each table's changes in order
changesByTable.forEach(
```

The new code logic looks correct to me. I think it's a worthy change to merge in its own right.

However, as for the issue discussed in #3352 (comment) , I think this fix is effective, but it's not apparent that it will work correctly.

The basic problem is that Persistence is called with multiple entity objects for the same ID. This means that the updateEntitiesPropertiesIfNotChanged call on line must contain at most one entity per ID. This may or may not be true depending on what the catalog forwards to transactionMetaStoreManager for each commit on line 1108.

I do believe that one tableOps.commit() will result in one entity update, so it may be sufficient to just add a comment about that around line 1113. WDYT?


Yeah, logic also looks correct to me. +1 to adding a comment on the subtlety though that we're coalescing all the updates for a given table into a single Polaris entity update, which is a slightly different behavior than if the caller expected the various UpdateTableRequests in this commitTransaction to really behave as if they were each applied independently (but as if under a lock).

Note that this issue was a known limitation, and referenced in a TODO in TransactionWorkspaceMetaStoreManager:

```java
// TODO: If we want to support the semantic of opening a transaction in which multiple
// reads and writes occur on the same entities, where the reads are expected to see the writes
// within the transaction workspace that haven't actually been committed, we can augment this
// class by allowing these pendingUpdates to represent the latest state of the entity if we
// also increment entityVersion. We'd need to store both a "latest view" of all updated entities
// to serve reads within the same transaction while also storing the ordered list of
// pendingUpdates that ultimately need to be applied in order within the real MetaStoreManager.
```

The alternative "fix" described there that is more general but more complex and probably has pitfalls is to really queue up the sequential mutations per entity in that "uncommitted persistence layer".

The main implications would be that if we plug into the MetaStoreManager layer, we can intercept but inherit other relevant hooks, such as generating events, having entityVersion increments directly match actual update requests, etc.

But I'm in favor of this more targeted change-coalescing fix here for now. We could either update/remove the TODO in TransactionWorkspaceMetaStoreManager and/or leave a comment in the code here referencing the other approach so any future changes to the way we handle these can more easily sort through it.
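A very rough sketch of that alternative, assuming a hypothetical workspace that keeps both an ordered pendingUpdates queue and a "latest view" per entity. Names and types here are illustrative, not the actual TransactionWorkspaceMetaStoreManager API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class TransactionWorkspace {
    // Hypothetical minimal entity state: id, version, and an opaque properties blob.
    record EntityState(long id, int entityVersion, String properties) {}

    // Ordered list of every mutation, to be replayed in order at commit time.
    private final List<EntityState> pendingUpdates = new ArrayList<>();
    // Latest view per entity id, serving reads within the same transaction.
    private final Map<Long, EntityState> latestView = new HashMap<>();

    void write(long id, String properties) {
        EntityState prev = latestView.get(id);
        int nextVersion = (prev == null ? 1 : prev.entityVersion() + 1);
        EntityState next = new EntityState(id, nextVersion, properties);
        pendingUpdates.add(next);   // keep every mutation, in order
        latestView.put(id, next);   // reads see the uncommitted write
    }

    Optional<EntityState> read(long id) {
        return Optional.ofNullable(latestView.get(id));
    }

    List<EntityState> drainPendingUpdates() {
        return List.copyOf(pendingUpdates);
    }

    public static void main(String[] args) {
        TransactionWorkspace ws = new TransactionWorkspace();
        ws.write(1L, "schema-id=1");
        ws.write(1L, "schema-id=2"); // second write to the same entity
        System.out.println(ws.read(1L).get().entityVersion()); // 2
        System.out.println(ws.drainPendingUpdates().size());   // 2
    }
}
```

The point of the sketch is the dual bookkeeping: reads within the transaction hit the latest view, while commit replays the full ordered queue, so entityVersion increments match the individual update requests.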

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jan 6, 2026
```java
throw new BadRequestException(
    "Unsupported operation: commitTransaction with updateForStagedCreate: %s", change);
}
changesByTable.computeIfAbsent(change.identifier(), k -> new ArrayList<>()).add(change);
```
@flyrain flyrain Jan 6, 2026


Should we check and throw in case the table exists already? It'd also be nice to have a test case.

@dimas-b dimas-b Jan 6, 2026


Does the Iceberg REST Catalog spec allow more than one table update in one commitTransaction operation?

@flyrain flyrain Jan 6, 2026


The spec doesn't mention the uniqueness of the table identifier in CommitTransactionRequest.

Within one CommitTableRequest, multiple updates are allowed, which should apply to the same snapshot.

With that, I think it's undefined behavior whether the same table identifier may appear more than once. We could explicitly disable it in Polaris though.


I'd rather not disable this in Polaris unless the spec explicitly flagged it as a disallowed use case (which it did not).

In lieu of an explicit spec, applying multiple updates in sequence is a fairly straightforward operation, since each update is self-contained and will be validated against the current metadata.
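That validate-then-apply loop can be sketched as follows. Metadata and Change are hypothetical minimal stand-ins (tracking only a schema id), not Iceberg's actual TableMetadata or UpdateTableRequest types:

```java
import java.util.List;

public class SequentialApply {
    // Hypothetical stand-ins: a metadata snapshot holding only a schema id,
    // and a change carrying one requirement (the expected schema id) whose
    // update bumps the schema id by one.
    record Metadata(int schemaId) {}
    record Change(int expectedSchemaId) {}

    // Validate each change's requirement against the evolved metadata,
    // then apply its update before moving to the next change.
    static Metadata applyAll(Metadata base, List<Change> changes) {
        Metadata current = base;
        for (Change change : changes) {
            if (change.expectedSchemaId() != current.schemaId()) {
                throw new IllegalStateException(
                    "Requirement failed: expected schema id " + change.expectedSchemaId()
                        + " but found " + current.schemaId());
            }
            current = new Metadata(current.schemaId() + 1); // the "update"
        }
        return current;
    }

    public static void main(String[] args) {
        // Two well-ordered changes evolve the metadata from schema id 0 to 2.
        System.out.println(applyAll(new Metadata(0),
            List.of(new Change(0), new Change(1))).schemaId()); // 2
        // A second change still expecting schema id 0 fails early, because the
        // first change already moved the schema id to 1.
        try {
            applyAll(new Metadata(0), List.of(new Change(0), new Change(0)));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```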


SGTM.


```java
    }
  });

// Apply updates to builder
```

Maybe add a // TODO to refactor this to better share/reconcile with the update-application logic in CatalogHandlerUtils (I understand this divergence was already latent and not introduced by this PR, but as the update-application logic grows in complexity here it's going to start getting a lot worse).


@adnanhemani adnanhemani left a comment


Awesome change!

