[MINOR] Upgrade Spark 4.0 to 4.0.2 #12180
Merged
Bumps the spark-4.0 profile from 4.0.1 to 4.0.2 (patch release).
Changes:
- pom.xml / tools/gluten-it/pom.xml: spark.version
- .github/workflows/util/install-spark-resources.sh: download version
- .github/workflows/velox_backend_x86.yml: step names
- docs/get-started/{build-guide,getting-started}.md: supported versions
No shim code changes are required: 4.0.2 is a maintenance release with
no public API changes, and unlike 4.1.2 (SPARK-55337) it does not revert
any binary signatures that the spark40 shim depends on.
Notable upstream fixes that may affect Gluten behaviour (no Gluten code
change needed, but worth watching CI for plan-stability / metrics diffs):
- SPARK-54439 SPJ KeyGroupedPartitioning + join key size mismatch
- SPARK-53434 ColumnarRow#get should check isNullAt
- SPARK-54917 Upgrade ORC to 2.1.4
Generated-by: claude-opus-4.7
Run Gluten Clickhouse CI on x86
Member
@yaooqinn the KeyGroupedPartitioningSuite is re-written in Gluten tests:
ebca2f5 to
a51b5cb
a51b5cb to
ebca2f5
ebca2f5 to
a51b5cb
… 4.0
Spark 4.0.2 picks up SPARK-54439 (apache/spark#53142), a correctness fix in KeyGroupedShuffleSpec.createPartitioning() with two new tests in KeyGroupedPartitioningSuite.
The vanilla tests use the base collectShuffles helper, which only matches ShuffleExchangeExec, so they fail under Gluten, where the shuffle is a ColumnarShuffleExchangeExec.
Rather than excluding them, port them as testGluten overrides (same pattern as the existing SPARK-41471 tests) so they reuse the columnar-aware collectShuffles helper and keep coverage of the correctness fix.
Locally verified on Velox backend (Spark 4.0.2, Scala 2.13): both new tests pass (shuffles.size == 1 and checkAnswer), with no change to the set of pre-existing suite failures.
Generated-by: Claude Opus 4.8
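The mismatch described in the commit message above can be illustrated with a small toy model. This is only a sketch: the case classes below are hypothetical stand-ins that borrow the names of the plan nodes, not the real Spark or Gluten classes, and the two helper functions are assumptions about how a type-matching `collectShuffles` might behave (in particular, that the columnar-aware helper accepts both node types), not the actual test code.

```scala
// Toy model only: hypothetical stand-ins for physical plan nodes,
// NOT the real Spark / Gluten classes.
object ShuffleCollectSketch {
  sealed trait Plan { def children: Seq[Plan] }
  final case class Scan(table: String) extends Plan {
    val children: Seq[Plan] = Nil
  }
  final case class Join(left: Plan, right: Plan) extends Plan {
    val children: Seq[Plan] = Seq(left, right)
  }
  final case class ShuffleExchangeExec(child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
  }
  final case class ColumnarShuffleExchangeExec(child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
  }

  // Pre-order traversal that keeps every node the partial function accepts.
  def collect[A](plan: Plan)(pf: PartialFunction[Plan, A]): Seq[A] =
    pf.lift(plan).toSeq ++ plan.children.flatMap(c => collect(c)(pf))

  // Assumed shape of the base helper: matches the row-based shuffle only.
  def vanillaCollectShuffles(plan: Plan): Seq[Plan] =
    collect(plan) { case s: ShuffleExchangeExec => s }

  // Assumed shape of a columnar-aware helper: accepts both shuffle types.
  def columnarAwareCollectShuffles(plan: Plan): Seq[Plan] =
    collect(plan) {
      case s: ShuffleExchangeExec         => s
      case c: ColumnarShuffleExchangeExec => c
    }

  // A plan as it might look under Gluten: one side shuffled by a columnar node.
  val glutenPlan: Plan =
    Join(Scan("items"), ColumnarShuffleExchangeExec(Scan("purchases")))
}
```

In this model `vanillaCollectShuffles(glutenPlan)` finds nothing while `columnarAwareCollectShuffles(glutenPlan)` finds one shuffle, which is why an assertion such as `shuffles.size == 1` only holds when the columnar-aware helper is used.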
a51b5cb to
fbb4804
zhouyuan approved these changes on Jun 4, 2026
What changes were proposed in this pull request?
Bumps the spark-4.0 profile from 4.0.1 to 4.0.2 (patch release).
Sibling of #12177 (Spark 4.1 → 4.1.2), kept as a separate PR per one-concern-per-PR.
Files touched (6 / +7 −7)
- pom.xml / tools/gluten-it/pom.xml: spark.version
- .github/workflows/util/install-spark-resources.sh: download version
- .github/workflows/velox_backend_x86.yml: step names
- docs/get-started/{build-guide,getting-started}.md: supported versions
delta.version (also 4.0.1, but for Delta Lake) is intentionally not touched.
Why no shim code change
4.0.2 is a maintenance release. Unlike 4.1.2 (#12177), it does not revert any binary signatures the spark40 shim depends on — SPARK-55337 (MemoryStream binary-compat reversion) only lives on the 4.1.x branch.
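Upstream fixes worth watching CI for
- SPARK-54439 SPJ KeyGroupedPartitioning + join key size mismatch
- SPARK-53434 ColumnarRow#get should check isNullAt
- SPARK-54917 Upgrade ORC to 2.1.4
No Gluten-side code change is needed for any of the above.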
How was this patch tested?
Relying on the existing Spark 4.0 CI matrix (gluten-ut/spark40 + velox_backend_x86 4.0 jobs).
Was this patch authored or co-authored using generative AI tooling?
Yes
Generated-by: claude-opus-4.7