[GLUTEN-10992][VL] Fix MatchError for KeyGroupedPartitioning in native shuffle by brijrajk · Pull Request #12335 · apache/gluten

brijrajk · 2026-06-22T19:05:40Z

What changes were proposed in this pull request?

When Spark 4.0's V2 bucketing shuffle (spark.sql.v2.bucketing.shuffle.enabled=true) is used in a join where only one side reports partitioning, Spark generates a ShuffleExchangeExec with KeyGroupedPartitioning as its output partitioning.

The default case _ => in VeloxSparkPlanExecApi.genColumnarShuffleExchange created a ColumnarShuffleExchangeExec for this node without validation. When the query executed, ExecUtil.genShuffleDependency crashed with a scala.MatchError because its native partitioning match had no case for KeyGroupedPartitioning.

Changes:

VeloxSparkPlanExecApi.genColumnarShuffleExchange: add an explicit case _: KeyGroupedPartitioning => before the default that adds a fallback tag and returns the vanilla ShuffleExchangeExec. This prevents a ColumnarShuffleExchangeExec from being created for an unsupported partitioning type.
ExecUtil.genShuffleDependency: add an explicit wildcard case other => that throws GlutenNotSupportException instead of the cryptic scala.MatchError, as a defensive guard for any future unknown partitioning types.
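The two changes above can be illustrated with a minimal, self-contained sketch. It does not use Spark or Gluten classes: every type and method below (Partitioning, ShufflePlan, NotSupportedException, ShuffleSketch and its methods) is a hypothetical stand-in chosen for illustration, and the real signatures in VeloxSparkPlanExecApi and ExecUtil may differ.

```scala
// Hypothetical stand-ins for illustration only; these are NOT the Spark or Gluten classes.
trait Partitioning // not sealed, so the compiler cannot flag a missing case
case object SinglePartition extends Partitioning
final case class HashPartitioning(numPartitions: Int) extends Partitioning
final case class KeyGroupedPartitioning(numPartitions: Int) extends Partitioning

// Stand-in for an "unsupported operation" exception type.
final class NotSupportedException(msg: String) extends RuntimeException(msg)

sealed trait ShufflePlan
final case class ColumnarShuffle(partitioning: Partitioning) extends ShufflePlan
final case class VanillaShuffle(partitioning: Partitioning, fallbackReason: String)
    extends ShufflePlan

object ShuffleSketch {
  // Before: no default case, so an unhandled partitioning throws scala.MatchError.
  def nativePartitioningBefore(p: Partitioning): String = p match {
    case SinglePartition     => "single"
    case HashPartitioning(_) => "hash"
  }

  // After (defensive guard): an explicit wildcard turns the failure into a
  // descriptive, domain-specific exception.
  def nativePartitioningAfter(p: Partitioning): String = p match {
    case SinglePartition     => "single"
    case HashPartitioning(_) => "hash"
    case other =>
      throw new NotSupportedException(
        s"Unsupported partitioning for native shuffle: ${other.getClass.getSimpleName}")
  }

  // After (planning-time fix): the unsupported partitioning is caught before the
  // default case, so no columnar shuffle is created for it.
  def planShuffle(p: Partitioning): ShufflePlan = p match {
    case _: KeyGroupedPartitioning =>
      VanillaShuffle(p, "KeyGroupedPartitioning is not supported by native shuffle")
    case _ => ColumnarShuffle(p)
  }
}
```

In this sketch the planning-time case is the actual fix, since it keeps the unsupported node on the vanilla path; the wildcard in the second method only improves the error if some other unhandled partitioning reaches that code later.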

How was this patch tested?

The existing testGluten("SPARK-41471: shuffle one side: only one side reports partitioning") tests in GlutenKeyGroupedPartitioningSuite (both spark40 and spark41) reproduce the crash exactly. They set V2_BUCKETING_SHUFFLE_ENABLED=true with only one bucketed side, which triggers a ShuffleExchangeExec with KeyGroupedPartitioning output, and then call checkAnswer. After this fix, these tests pass without a MatchError.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (https://claude.ai/code)

Related issue: #10992

…e shuffle When Spark 4.0's V2 bucketing shuffle (spark.sql.v2.bucketing.shuffle.enabled=true) is used in a join where only one side reports partitioning, Spark generates a ShuffleExchangeExec with KeyGroupedPartitioning as its output. The default case in VeloxSparkPlanExecApi.genColumnarShuffleExchange created a ColumnarShuffleExchangeExec for this node, which then crashed with a scala.MatchError in ExecUtil.genShuffleDependency because KeyGroupedPartitioning was not handled in the native partitioning match. Fix by adding an explicit KeyGroupedPartitioning case to genColumnarShuffleExchange that marks the shuffle for fallback to vanilla Spark. Also harden ExecUtil.genShuffleDependency with an explicit wildcard that throws GlutenNotSupportException instead of a cryptic MatchError for any future unknown partitioning types. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions Bot added the VELOX label Jun 22, 2026

brijrajk wants to merge 1 commit into apache:main from brijrajk:fix/10992-keygrouped-partitioning-fallback
