@@ -38,6 +38,7 @@
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
@@ -1042,57 +1043,71 @@ public void commitTransaction(CommitTransactionRequest commitTransactionRequest)
new TransactionWorkspaceMetaStoreManager(diagnostics, metaStoreManager);
((IcebergCatalog) baseCatalog).setMetaStoreManager(transactionMetaStoreManager);

commitTransactionRequest.tableChanges().stream()
.forEach(
change -> {
Table table = baseCatalog.loadTable(change.identifier());
if (!(table instanceof BaseTable baseTable)) {
throw new IllegalStateException(
"Cannot wrap catalog that does not produce BaseTable");
}
if (isCreate(change)) {
throw new BadRequestException(
"Unsupported operation: commitTranaction with updateForStagedCreate: %s",
change);
}
// Group all changes by table identifier to handle them atomically
// This prevents conflicts when multiple changes target the same table entity
// LinkedHashMap preserves insertion order for deterministic processing
Map<TableIdentifier, List<UpdateTableRequest>> changesByTable = new LinkedHashMap<>();
for (UpdateTableRequest change : commitTransactionRequest.tableChanges()) {
if (isCreate(change)) {
throw new BadRequestException(
"Unsupported operation: commitTranaction with updateForStagedCreate: %s", change);
}
changesByTable.computeIfAbsent(change.identifier(), k -> new ArrayList<>()).add(change);
@flyrain (Contributor) commented on Jan 6, 2026:

Should we check and throw in case the table exists already? It'd also be nice to have a test case.

@dimas-b (Contributor) commented on Jan 6, 2026:

Does the Iceberg REST Catalog spec allow more than one table update in one commitTransaction operation?

@flyrain (Contributor) replied on Jan 6, 2026:

The spec doesn't mention the uniqueness of the table identifier in CommitTransactionRequest. Within one CommitTableRequest, multiple updates are allowed, which should apply to the same snapshot.

With that, I think it's undefined behavior whether the same table identifier could be duplicated. We could explicitly disable it in Polaris, though.

A contributor replied:

I'd rather not disable this in Polaris unless the spec explicitly flagged it as a disallowed use case (which it did not).

In the absence of an explicit spec, applying multiple updates in sequence is a fairly straightforward operation, since each update is self-contained and is validated against the current metadata (sketched just below the grouping loop).

A contributor replied:

SGTM.

}
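
As an illustration of the sequential-application semantics discussed in the thread above, a minimal, self-contained sketch follows. The Change and Metadata records are hypothetical stand-ins, not the Iceberg UpdateTableRequest/TableMetadata types; the sketch only shows how grouping by identifier and validating each requirement against the evolving metadata makes duplicate identifiers behave as if applied sequentially under a lock.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for TableIdentifier/UpdateTableRequest/TableMetadata.
record Change(String identifier, int expectedSchemaId) {}

record Metadata(int schemaId) {}

public class SequentialApplySketch {
  public static void main(String[] args) {
    List<Change> tableChanges =
        List.of(new Change("db.t1", 0), new Change("db.t1", 1), new Change("db.t2", 0));

    // Group by identifier; LinkedHashMap keeps tables in first-appearance order.
    Map<String, List<Change>> changesByTable = new LinkedHashMap<>();
    for (Change change : tableChanges) {
      changesByTable.computeIfAbsent(change.identifier(), k -> new ArrayList<>()).add(change);
    }

    changesByTable.forEach(
        (id, changes) -> {
          Metadata current = new Metadata(0); // pretend this was loaded from the catalog
          for (Change change : changes) {
            // Requirement check runs against the evolving metadata: the second
            // change for db.t1 must expect schema ID 1, because the first bumped it.
            if (current.schemaId() != change.expectedSchemaId()) {
              throw new IllegalStateException("Requirement failed for " + id);
            }
            current = new Metadata(current.schemaId() + 1); // apply the update
          }
          // A single commit per table would happen here, from base to final metadata.
          System.out.println(id + " -> final schema ID " + current.schemaId());
        });
  }
}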

TableOperations tableOps = baseTable.operations();
TableMetadata currentMetadata = tableOps.current();

// Validate requirements; any CommitFailedExceptions will fail the overall request
change.requirements().forEach(requirement -> requirement.validate(currentMetadata));

// Apply changes
TableMetadata.Builder metadataBuilder = TableMetadata.buildFrom(currentMetadata);
change.updates().stream()
.forEach(
singleUpdate -> {
// Note: If location-overlap checking is refactored to be atomic, we could
// support validation within a single multi-table transaction as well, but
// will need to update the TransactionWorkspaceMetaStoreManager to better
// expose the concept of being able to read uncommitted updates.
if (singleUpdate instanceof MetadataUpdate.SetLocation setLocation) {
if (!currentMetadata.location().equals(setLocation.location())
&& !realmConfig.getConfig(
FeatureConfiguration.ALLOW_NAMESPACE_LOCATION_OVERLAP)) {
throw new BadRequestException(
"Unsupported operation: commitTransaction containing SetLocation"
+ " for table '%s' and new location '%s'",
change.identifier(),
((MetadataUpdate.SetLocation) singleUpdate).location());
}
}

// Apply updates to builder
singleUpdate.applyTo(metadataBuilder);
});

// Commit into transaction workspace we swapped the baseCatalog to use
TableMetadata updatedMetadata = metadataBuilder.build();
if (!updatedMetadata.changes().isEmpty()) {
tableOps.commit(currentMetadata, updatedMetadata);
// Process each table's changes in order
changesByTable.forEach(
A contributor commented:

The new code logic looks correct to me. I think it's a worthy change to merge in its own right.

However, as for the issue discussed in #3352 (comment), I think this fix is effective, but it's not apparent that it will work correctly.

The basic problem is that Persistence is called with multiple entity objects for the same ID. This means that the updateEntitiesPropertiesIfNotChanged call on line must contain at most one entity per ID. This may or may not be true depending on what the catalog forwards to transactionMetaStoreManager for each commit on line 1108.

I do believe that one tableOps.commit() will result in one entity update, so it may be sufficient to just add a comment about that around line 1113. WDYT?

A contributor replied:

Yeah, the logic also looks correct to me. +1 to adding a comment on the subtlety, though: we're coalescing all the updates for a given table into a single Polaris entity update, which is slightly different behavior than if the caller expected the various UpdateTableRequests in this commitTransaction to each behave as if applied independently (but as if under a lock).

Note that this issue was a known limitation, and referenced in a TODO in TransactionWorkspaceMetaStoreManager:

// TODO: If we want to support the semantic of opening a transaction in which multiple
// reads and writes occur on the same entities, where the reads are expected to see the writes
// within the transaction workspace that haven't actually been committed, we can augment this
// class by allowing these pendingUpdates to represent the latest state of the entity if we
// also increment entityVersion. We'd need to store both a "latest view" of all updated entities
// to serve reads within the same transaction while also storing the ordered list of
// pendingUpdates that ultimately need to be applied in order within the real MetaStoreManager.

The alternative "fix" described there that is more general but more complex and probably has pitfalls is to really queue up the sequential mutations per entity in that "uncommitted persistence layer".

The main implications would be that if we plug into the MetaStoreManager layer, we can intercept but inherit other relevant hooks, such as generating events, having entityVersion increments directly match actual update requests, etc.

But I'm in favor of this more targeted change-coalescing fix here for now. We could either update/remove the TODO in TransactionWorkspaceMetaStoreManager and/or leave a comment in the code here referencing the other approach so any future changes to the way we handle these can more easily sort through it.

(tableIdentifier, changes) -> {
Table table = baseCatalog.loadTable(tableIdentifier);
if (!(table instanceof BaseTable baseTable)) {
throw new IllegalStateException("Cannot wrap catalog that does not produce BaseTable");
}

TableOperations tableOps = baseTable.operations();
TableMetadata baseMetadata = tableOps.current();

// Apply each change sequentially: validate requirements against current state,
// then apply updates. This ensures conflicts are detected (e.g., if two changes
// both expect schema ID 0, the second will fail after the first increments it).
TableMetadata currentMetadata = baseMetadata;
for (UpdateTableRequest change : changes) {
// Validate requirements against the current metadata state
final TableMetadata metadataForValidation = currentMetadata;
change
.requirements()
.forEach(requirement -> requirement.validate(metadataForValidation));

// Apply this change's updates
TableMetadata.Builder metadataBuilder = TableMetadata.buildFrom(currentMetadata);
for (MetadataUpdate singleUpdate : change.updates()) {
// Note: If location-overlap checking is refactored to be atomic, we could
// support validation within a single multi-table transaction as well, but
// will need to update the TransactionWorkspaceMetaStoreManager to better
// expose the concept of being able to read uncommitted updates.
if (singleUpdate instanceof MetadataUpdate.SetLocation setLocation) {
if (!currentMetadata.location().equals(setLocation.location())
&& !realmConfig.getConfig(
FeatureConfiguration.ALLOW_NAMESPACE_LOCATION_OVERLAP)) {
throw new BadRequestException(
"Unsupported operation: commitTransaction containing SetLocation"
+ " for table '%s' and new location '%s'",
change.identifier(), setLocation.location());
}
}
});

// Apply updates to builder
A contributor commented:

Maybe add a // TODO to refactor this to better share/reconcile with the update-application logic in CatalogHandlerUtils (I understand this divergence was already latent and not introduced by this PR, but as the update-application logic grows in complexity here it's going to start getting a lot worse).

singleUpdate.applyTo(metadataBuilder);
}

// Update currentMetadata to reflect this change for subsequent requirement validation
currentMetadata = metadataBuilder.build();
}

// Commit all accumulated changes for this table in a single atomic operation
if (!currentMetadata.changes().isEmpty()) {
tableOps.commit(baseMetadata, currentMetadata);
}
});

// Commit the collected updates in a single atomic operation
List<EntityWithPath> pendingUpdates = transactionMetaStoreManager.getPendingUpdates();
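
To make the trade-off in the thread above concrete, a rough, self-contained sketch of the two designs follows: coalescing to at most one pending update per entity ID versus the TODO's ordered-pendingUpdates approach. The Entity record and the class are hypothetical, not the Polaris EntityWithPath or TransactionWorkspaceMetaStoreManager types.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entity snapshot: an ID plus a version the persistence layer
// compares on commit. Not the Polaris EntityWithPath type.
record Entity(long id, int version, String properties) {}

class PendingUpdatesSketch {
  // Coalescing (what one tableOps.commit() per table effectively guarantees):
  // at most one pending update per entity ID reaches the metastore.
  private final Map<Long, Entity> latestById = new LinkedHashMap<>();

  // The TODO's more general design: keep every mutation in order, and serve
  // in-transaction reads from latestById so later updates see earlier
  // uncommitted writes.
  private final List<Entity> orderedUpdates = new ArrayList<>();

  void recordUpdate(Entity updated) {
    orderedUpdates.add(updated);
    latestById.put(updated.id(), updated); // "latest view" for in-txn reads
  }

  Entity read(long id) {
    return latestById.get(id); // reads observe uncommitted writes
  }

  // Coalescing commits latestById.values() (one entry per ID); the general
  // design would instead replay orderedUpdates sequentially, incrementing
  // entityVersion once per mutation.
  List<Entity> toCommit() {
    return List.copyOf(latestById.values());
  }
}

The coalescing path sidesteps duplicate-ID conflicts in updateEntitiesPropertiesIfNotChanged, at the cost of fewer entityVersion increments than there were update requests.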