[SPARK-51548][SQL] Provides configuration to decide whether to copy objects before shuffle. #50318

Open
wants to merge 1 commit into
base: master

zhengchenyu
Contributor

What changes were proposed in this pull request?

Adds a configuration that decides whether objects are copied before shuffle.

Why are the changes needed?

In Spark SQL, whether an object is copied before shuffle is decided by the type of shuffle writer. For third-party shuffle writers, however, the copy is always forced, even though it is sometimes unnecessary. For example, the shuffle writer in Uniffle serializes each record directly as it processes it, so the copy is unnecessary there.
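To illustrate why a writer that serializes immediately does not need the copy, here is a minimal sketch in plain Python. It assumes the copy exists because upstream operators may reuse one mutable row object across records; the function names (`reused_rows`, `buffer_without_copy`, `buffer_with_copy`, `serialize_immediately`) are hypothetical and are not Spark or Uniffle APIs.

```python
import pickle


def reused_rows(values):
    """Yield the SAME mutable row object for every record (hypothetical model
    of an operator that reuses its output row)."""
    row = [None]
    for v in values:
        row[0] = v
        yield row


def buffer_without_copy(rows):
    # Holds references to the reused object, so every buffered entry
    # ends up showing the last value written into it.
    return [r for r in rows]


def buffer_with_copy(rows):
    # Copying each record before buffering preserves its value.
    return [list(r) for r in rows]


def serialize_immediately(rows):
    # Serializing each record as it arrives captures its current value,
    # so no defensive copy is needed.
    return [pickle.dumps(r) for r in rows]
```

In this model, only a writer that keeps references to records needs the copy; a writer that turns each record into bytes as soon as it sees it is unaffected by later reuse of the row object.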

Does this PR introduce any user-facing change?

Yes. This PR adds a new configuration, spark.sql.needToCopyObjectBeforeShuffle, which enables or disables copying objects before shuffle when the ShuffleManager is not SortShuffleManager.
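The resulting decision could be modeled roughly as follows. This is a simplified sketch in plain Python, not the patch itself: the function name, the two constants, and the default of `True` for the new setting are assumptions made for illustration, and the sort-based branch only approximates how Spark chooses by partition count.

```python
# Assumed values, for illustration only.
BYPASS_MERGE_THRESHOLD = 200             # assumed spark.shuffle.sort.bypassMergeThreshold
MAX_SERIALIZED_MODE_PARTITIONS = 1 << 24  # assumed limit for serialized sort shuffle


def need_to_copy_before_shuffle(
    is_sort_shuffle_manager: bool,
    num_partitions: int,
    need_to_copy_conf: bool = True,  # models spark.sql.needToCopyObjectBeforeShuffle
) -> bool:
    """Return True if records should be copied before being handed to the writer."""
    if is_sort_shuffle_manager:
        # Built-in manager: the decision depends on which writer will be used.
        if num_partitions <= BYPASS_MERGE_THRESHOLD:
            return False  # bypass-merge writer writes records out immediately
        if num_partitions <= MAX_SERIALIZED_MODE_PARTITIONS:
            return False  # serialized-mode writer serializes records immediately
        return True       # otherwise records may be buffered as objects
    # Third-party manager: previously always True, now taken from the setting.
    return need_to_copy_conf
```

With a third-party manager such as Uniffle's, `need_to_copy_before_shuffle(False, 500, need_to_copy_conf=False)` skips the copy, while leaving the setting at its assumed default keeps the previous forced-copy behavior.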

How was this patch tested?

No need to test.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Mar 19, 2025