Skip to content

[GLUTEN-12350][CORE] Place ConsistentHash virtual nodes by Node.key() instead of toString()#12351

Open
LuciferYang wants to merge 2 commits into
apache:mainfrom
LuciferYang:gluten-12350-consistent-hash-key
Open

[GLUTEN-12350][CORE] Place ConsistentHash virtual nodes by Node.key() instead of toString()#12351
LuciferYang wants to merge 2 commits into
apache:mainfrom
LuciferYang:gluten-12350-consistent-hash-key

Conversation

@LuciferYang

Copy link
Copy Markdown
Contributor

What changes are proposed in this pull request?

ConsistentHash.Partition.getPartitionKey() built its hash input from the node object directly:

return String.format("%s:%d", node, index);

%s resolves to node.toString(), so Node.key() was never used. This PR places virtual nodes on the ring by node.key(), adds a fail-fast check for a null key, and documents the key() contract.

Node.key() exists so the ring is keyed on a stable, logical id. Using toString() instead left key() as dead code, and the ring only happened to be correct because the one implementation (ExecutorNode) returns the same value from both methods. A future Node that implements only key() would be placed by Object.toString() (the identity hash), producing a different layout on every run and across JVMs — i.e. not consistent hashing at all. No user-facing change.

Fixes #12350.

How was this patch tested?

Added two unit tests in ConsistentHashTest: one asserts partitions are hashed by key() and never toString(), the other that a null key is rejected. The existing cases still pass.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

… instead of toString()

Partition.getPartitionKey() built the hash input from the node object, so
%s resolved to node.toString() and Node.key() was never used. It only
worked because the sole implementation, ExecutorNode, returns the same
value from both methods; a node overriding just key() would be placed on
the ring by its identity hash, giving an unstable distribution.

Hash by node.key(), reject a null key early, and document the key()
contract. Add tests for key()-based placement and null-key rejection.
Copilot AI review requested due to automatic review settings June 24, 2026 06:51
@github-actions github-actions Bot added the CORE works for Gluten Core label Jun 24, 2026
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@LuciferYang

Copy link
Copy Markdown
Contributor Author

cc @jackylee-ch

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes ConsistentHash virtual-node placement to hash partitions using Node.key() (stable logical identifier) rather than node.toString(), adds a fail-fast validation for null keys, and updates the Node.key() contract documentation to prevent unstable ring layouts across runs/JVMs.

Changes:

  • Hash partition keys using node.key() (and avoid String.format on the add path).
  • Add a precondition to reject nodes with null keys.
  • Add unit tests to ensure hashing is based on key() (not toString()) and that null keys are rejected.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
gluten-core/src/main/java/org/apache/gluten/hash/ConsistentHash.java Switch partition-key derivation to Node.key(), add null-key validation, and document the key() contract.
gluten-core/src/test/java/org/apache/gluten/hash/ConsistentHashTest.java Add tests asserting hashing uses key() and that null keys fail fast.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 201 to 205
private boolean add(T node) {
boolean added = false;
if (node != null && !nodes.containsKey(node)) {
Preconditions.checkArgument(node.key() != null, "Node key must not be null: %s", node);
Set<Partition<T>> partitions =

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in fc574dd. The null-key check now runs before nodes.containsKey(node), so a Node whose hashCode()/equals() reads key() fails fast with the IllegalArgumentException instead of NPEing inside the lookup.

Comment on lines +239 to +243
public String getPartitionKey() {
return String.format("%s:%d", node, index);
// Hash the node by its logical key() so the ring is stable and reproducible across JVMs.
// Avoid node.toString() (which may default to the identity hash) and String.format (slow on
// this hot path, called per virtual node during add()).
return node.key() + ":" + index;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in fc574dd — the partition key is now computed once in the Partition constructor and stored in a final field, so key() isn't re-invoked on each collision-retry in add().

…cache partition key

- Move the null-key check ahead of nodes.containsKey() so a Node whose
  hashCode()/equals() reads key() fails fast instead of NPEing in the lookup.
- Cache the partition key in Partition's constructor; it is immutable and was
  recomputed on every collision-retry in add().
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CORE works for Gluten Core

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ConsistentHash places virtual nodes by toString() instead of Node.key()

2 participants