[GLUTEN-12350][CORE] Place ConsistentHash virtual nodes by Node.key() instead of toString()#12351
[GLUTEN-12350][CORE] Place ConsistentHash virtual nodes by Node.key() instead of toString()#12351LuciferYang wants to merge 2 commits into
Conversation
… instead of toString() Partition.getPartitionKey() built the hash input from the node object, so %s resolved to node.toString() and Node.key() was never used. It only worked because the sole implementation, ExecutorNode, returns the same value from both methods; a node overriding just key() would be placed on the ring by its identity hash, giving an unstable distribution. Hash by node.key(), reject a null key early, and document the key() contract. Add tests for key()-based placement and null-key rejection.
|
Run Gluten Clickhouse CI on x86 |
|
cc @jackylee-ch |
There was a problem hiding this comment.
Pull request overview
This PR fixes ConsistentHash virtual-node placement to hash partitions using Node.key() (stable logical identifier) rather than node.toString(), adds a fail-fast validation for null keys, and updates the Node.key() contract documentation to prevent unstable ring layouts across runs/JVMs.
Changes:
- Hash partition keys using
node.key()(and avoidString.formaton the add path). - Add a precondition to reject nodes with
nullkeys. - Add unit tests to ensure hashing is based on
key()(nottoString()) and that null keys are rejected.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| gluten-core/src/main/java/org/apache/gluten/hash/ConsistentHash.java | Switch partition-key derivation to Node.key(), add null-key validation, and document the key() contract. |
| gluten-core/src/test/java/org/apache/gluten/hash/ConsistentHashTest.java | Add tests asserting hashing uses key() and that null keys fail fast. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| private boolean add(T node) { | ||
| boolean added = false; | ||
| if (node != null && !nodes.containsKey(node)) { | ||
| Preconditions.checkArgument(node.key() != null, "Node key must not be null: %s", node); | ||
| Set<Partition<T>> partitions = |
There was a problem hiding this comment.
Good catch — fixed in fc574dd. The null-key check now runs before nodes.containsKey(node), so a Node whose hashCode()/equals() reads key() fails fast with the IllegalArgumentException instead of NPEing inside the lookup.
| public String getPartitionKey() { | ||
| return String.format("%s:%d", node, index); | ||
| // Hash the node by its logical key() so the ring is stable and reproducible across JVMs. | ||
| // Avoid node.toString() (which may default to the identity hash) and String.format (slow on | ||
| // this hot path, called per virtual node during add()). | ||
| return node.key() + ":" + index; |
There was a problem hiding this comment.
Done in fc574dd — the partition key is now computed once in the Partition constructor and stored in a final field, so key() isn't re-invoked on each collision-retry in add().
…cache partition key - Move the null-key check ahead of nodes.containsKey() so a Node whose hashCode()/equals() reads key() fails fast instead of NPEing in the lookup. - Cache the partition key in Partition's constructor; it is immutable and was recomputed on every collision-retry in add().
|
Run Gluten Clickhouse CI on x86 |
What changes are proposed in this pull request?
ConsistentHash.Partition.getPartitionKey()built its hash input from the node object directly:%sresolves tonode.toString(), soNode.key()was never used. This PR places virtual nodes on the ring bynode.key(), adds a fail-fast check for a null key, and documents thekey()contract.Node.key()exists so the ring is keyed on a stable, logical id. UsingtoString()instead leftkey()as dead code, and the ring only happened to be correct because the one implementation (ExecutorNode) returns the same value from both methods. A futureNodethat implements onlykey()would be placed byObject.toString()(the identity hash), producing a different layout on every run and across JVMs — i.e. not consistent hashing at all. No user-facing change.Fixes #12350.
How was this patch tested?
Added two unit tests in
ConsistentHashTest: one asserts partitions are hashed bykey()and nevertoString(), the other that a null key is rejected. The existing cases still pass.Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.8)