Skip to content

Conversation

@singhpk234
Copy link
Contributor

@singhpk234 singhpk234 commented Jan 5, 2026

About the change

Add support of cockroach DB in relational-jdbc persistence

There was pr long ago :

seems like the author got busy, all pending was to testing it E2E with cockroach with all the integ tests along with making it work with schema evolution logic. also marking them as co-author for their contribution.
Happy to close it if they wanna resume their original pr

co-author : @sathesuraj

Checklist

  • 🛡️ Don't disclose security issues! (contact [email protected])
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed) -- will add later
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed) - will add later

sfc-gh-prsingh and others added 6 commits January 4, 2026 20:57
CockroachDB uses PostgreSQL wire protocol and is indistinguishable
from PostgreSQL at the JDBC driver level. Compatibility is ensured
through INT4 schema declarations and comprehensive test coverage.

All PostgreSQL tests run against both PostgreSQL and CockroachDB.
- Created 5 new integration test classes that mirror PostgreSQL tests
- CockroachApplicationIT, CockroachManagementServiceIT, CockroachPolicyServiceIT,
  CockroachRestCatalogIT, CockroachViewFileIT
- All tests extend the same base test classes as PostgreSQL tests
- Use CockroachRelationalJdbcProfile for CockroachDB container lifecycle
- Tests run with same logic as PostgreSQL to ensure compatibility
- Enabled parallel test execution (maxParallelForks = 2)
- PostgreSQL and CockroachDB tests can now run concurrently

This ensures comprehensive test coverage for CockroachDB backend.
CockroachDB Support:
- Added COCKROACHDB to DatabaseType enum (maps to "cockroachdb" directory)
- Created separate schema directory: cockroachdb/schema-v1.sql
- CockroachDB schema v1 based on PostgreSQL schema v3 (includes all tables)
- Uses INT4 explicitly for integer columns (required for CockroachDB JDBC driver)

Database Type Detection and Validation:
- Implemented DatabaseType.inferFromConnection() to detect database from JDBC metadata
- If database-type is configured: uses configured type and validates against connection
- If configured type mismatches connection: throws IllegalStateException with clear error
- If no configured type: auto-detects from connection (CockroachDB, PostgreSQL, H2)
- Validation ensures configuration matches actual database

Configuration:
- New property: polaris.persistence.relational.jdbc.database-type
- Supported values: "postgresql", "cockroachdb", "h2"
- If set, configuration is authoritative and validated against connection
- If not set, auto-detects from JDBC connection metadata

Bootstrap Support:
- Added DatabaseType.getLatestSchemaVersion() method
- PostgreSQL: latest version is 3
- CockroachDB: latest version is 1
- Updated JdbcBootstrapUtils to use database-specific latest version
- Bootstrap automatically selects correct schema for each database type

Test Configuration:
- CockroachDB test profiles explicitly set database-type=cockroachdb
- Ensures proper identification even with PostgreSQL JDBC driver
- Added 5 integration test classes for server module
- Enabled parallel test execution (PostgreSQL and CockroachDB tests run concurrently)

PostgreSQL Schema:
- Kept original INTEGER/INT types (no changes to existing schemas)
- Removed trailing newlines from schema files
Comment on lines +142 to +148
Set<Long> seenEntityIds = new HashSet<>();
for (PolarisBaseEntity entity : entities) {
if (!seenEntityIds.add(entity.getId())) {
throw new RetryOnConcurrencyException(
"Multiple updates to entity id '%s' in the same transaction", entity.getId());
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the difference and why its required:

PostgreSQL SERIALIZABLE:

  • Uses Serializable Snapshot Isolation (SSI)
  • When it detects a write-write conflict within the same transaction, it typically throws SQLSTATE 40001 (serialization_failure) immediately
  • It's designed to fail fast rather than retry

CockroachDB SERIALIZABLE:

  • Has automatic retry logic built-in at the database level
  • When it detects a conflict, it enters an internal retry loop hoping the conflict resolves
  • For conflicts within the same transaction (unresolvable), it can get stuck retrying

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why / when would this code be invoked with multiple entities having the same ID? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

precisely this test got stuck : https://github.com/apache/polaris/blob/main/integration-tests/src/main/java/org/apache/polaris/service/it/test/PolarisRestCatalogIntegrationBase.java#L1088

i think it more about the implementation of how transaction api is implemented in Polaris

if you see we could collect all the changes for one table and then apply then one shot rather than making then two calls ?

Copy link
Contributor

@dimas-b dimas-b Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting ... 🤔

Are you saying that our impl. of commitTransaction() will cause multiple PolarisBaseEntity objects to be written via the Persistence interface, while some of those objects share the same ID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

precisely :) atleast thats my understanding is, we should group by table-uuid collect all the changes and then call commit for individual tables

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, that looks like a bug in the commitTransaction() impl. 😅

+1 to grouping changes by table (ID) and writing in one batch.... However, it might be best to make that fix before introducing CockroachDB... just for the sake of clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you go : #3360

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet - let's merge #3360 , rebase and re-test whether this particular change is still necessary for CockroachDB, if you do not mind.

@jbonofre jbonofre self-requested a review January 5, 2026 14:06
@dimas-b
Copy link
Contributor

dimas-b commented Jan 5, 2026

co-author : @sathesuraj

Please use the standard Co-Authored-By tag in commits :)

https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to supporting CockroachDB and thanks for picking this work up, @singhpk234 !

Some comments below.

@Override
public Map<String, String> start() {
// CockroachDB testcontainers uses PostgreSQL wire protocol compatibility
cockroach = new CockroachContainer("cockroachdb/cockroach:v24.3.0");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use ContainerSpecHelper to enable managing docker image versions via Renovate.

Cf.

containerSpecHelper("postgres", PostgresRelationalJdbcLifeCycleManagement.class)

PRIMARY KEY (event_id)
);

-- INT4 type used directly in table definitions for CockroachDB JDBC compatibility
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably preferable to move this comment to the top of this file for visibility. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since CockroachDB uses mostly the same Polaris code a PostgreSQL, I believe the schema versions are best kept in sync. I do not think we have to support old schemas in CockroachDB, but the current one should probably be v3 as our PostgreSQL schema... WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about the same though it make it a bit tricky in sense where we take schema as part of bootstrap, hence thought its better to keep it v1 thought the contents are of v3 sql (as it would just have v3 no v1 / v2).

I additionally also thought how about we do this

ALTER COLUMN FROM INT to INT4 to make it work with existing pg version scripts but it seems like we the feature to enable this

Caused by: java.sql.SQLException: Failed due to 'ERROR: ALTER COLUMN TYPE from int to int4 is only supported experimentally

  Hint: See: https://go.crdb.dev/issue-v/49329/v24.3

apparently

SET experimental_enable_alter_column_type_general = true;
ALTER TABLE your_table_name ALTER COLUMN your_column_name TYPE INT4;

setting this flag would be confusing.
Considering we are above i resorted to v1 name, please let me know your thoughts considering above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[...] a bit tricky in sense where we take schema as part of bootstrap,

Sorry, @singhpk234 , I do not quite understand this part 🤔 what exactly is "tricky" in case CockroachDB uses v3 for the schema version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries at all (apologies for not being clear), what i meant is what happens some one does bootstrap with v2 in this case, since we give schema version as arg for bootstrapping

* Determines the correct schema version to use for bootstrapping a realm.

may be we can abstract from the user in the handling ?

secondly can be confusing why there is no v1 / v2 unlike other persistence impl.

I believe when we add mariadb / mysql support we would have to work with an assumption that the column name are same and types (for example JSONB vs JSON) are handled accordingly so i just thought of starting fresh with v1 for cockroach which is exactly same as v3 of postgres schema.

Copy link
Contributor

@dimas-b dimas-b Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but what about this?

public static List<String> getAllColumnNames(int schemaVersion) {
if (schemaVersion < 2) {
return ALL_COLUMNS;
} else {
return ALL_COLUMNS_V2;
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, i missed it, let me think this more

Comment on lines +142 to +148
Set<Long> seenEntityIds = new HashSet<>();
for (PolarisBaseEntity entity : entities) {
if (!seenEntityIds.add(entity.getId())) {
throw new RetryOnConcurrencyException(
"Multiple updates to entity id '%s' in the same transaction", entity.getId());
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why / when would this code be invoked with multiple entities having the same ID? 🤔

Set<Long> seenEntityIds = new HashSet<>();
for (PolarisBaseEntity entity : entities) {
if (!seenEntityIds.add(entity.getId())) {
throw new RetryOnConcurrencyException(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a "retry" exception as opposed to an IllegalArgumentException? I believe the behaviour of this method with multiple objects having the same ID is not well-defined and such situations should be avoided on the caller side.

Comment on lines +52 to +56
public int getLatestSchemaVersion() {
return switch (this) {
case POSTGRES -> 3; // PostgreSQL has schemas v1, v2, v3
case COCKROACHDB -> 1; // CockroachDB currently has only schema v1
case H2 -> 3; // H2 uses same schemas as PostgreSQL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my comment on the schema file: I believe all "latest" JDBC-based schemas should have the same version and evolve in sync. This is because from the caller side (Polaris JDBC persistence code), the "interface" of all JDBC schemas should be the same (column names and type compatibility).

Maintaining different column names and non-trivial type conversions separately for all JDBC backends is going to be too much overhead and will complicate Polaris code unnecessarily.

If the schemas can be assumed to be compatible from the client's (Polaris) perspective, using the same version number will only be natural. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to think this through more, i went ahead with the current one with the following in the mind :
#3352 (comment)

"quarkus.datasource.active",
"true",
"quarkus.datasource.db-kind",
"postgresql");
Copy link
Contributor

@flyrain flyrain Jan 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be cockroachdb? or quarkus.datasource.db-kind simplifies all postgresql-like ones as postgresql.

Suggested change
"postgresql");
"cockroachdb");

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants