[SPARK-53942][SS] Support changing stateless shuffle partitions upon restart of streaming query
### What changes were proposed in this pull request?
This PR proposes to support changing stateless shuffle partitions upon restart of a streaming query.
We don't introduce any new user-facing config - users can simply do the following to change the number of shuffle partitions:
* stop the query
* change the value of `spark.sql.shuffle.partitions`
* restart the query for the change to take effect
Note that state partitions remain fixed and are not affected by this change. The value of `spark.sql.shuffle.partitions` at batch 0 determines the number of state partitions, and that number does not change even if the config value is changed upon restart. See the usage sketch below.
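As a rough usage sketch (the rate source, console sink, checkpoint path, and partition counts below are placeholders for illustration, not taken from this PR), the flow looks like this on a Spark build that includes this change:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object ChangeStatelessShufflePartitions {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("change-stateless-shuffle-partitions")
      .config("spark.sql.shuffle.partitions", "200") // value recorded at batch 0; pins state partitions
      .getOrCreate()

    // Placeholder checkpoint location; the same path must be reused across restarts.
    val checkpointDir = "/tmp/checkpoints/stateless-shuffle-demo"

    // A stateless streaming query with an explicit shuffle (repartition by key).
    def startQuery() = spark.readStream
      .format("rate")
      .load()
      .selectExpr("value % 10 AS key", "value")
      .repartition(col("key")) // stateless shuffle, sized by spark.sql.shuffle.partitions
      .writeStream
      .format("console")
      .option("checkpointLocation", checkpointDir)
      .start()

    // First run: stateless shuffles use 200 partitions.
    val firstRun = startQuery()
    firstRun.processAllAvailable()
    firstRun.stop()

    // Change the config and restart from the same checkpoint: stateless shuffles now use
    // 400 partitions, while any stateful operator would keep the 200 state partitions
    // recorded for batch 0.
    spark.conf.set("spark.sql.shuffle.partitions", "400")
    val secondRun = startQuery()
    secondRun.processAllAvailable()
    secondRun.stop()

    spark.stop()
  }
}
```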
As an implementation detail, this PR adds a new "internal" SQL config `spark.sql.streaming.internal.stateStore.partitions` to distinguish stateless shuffle partitions from stateful (state store) shuffle partitions. Unlike other internal configs, which an admin might still legitimately set, this config is NOT meant to be user facing and no one should set it directly. We add it solely as a compatibility trick, nothing else. We don't guarantee compatibility for this config and make no promise that it will be available in the future; the config description carries a WARN saying so.
That said, the value of the new config is expected to be inherited from `spark.sql.shuffle.partitions`, assuming no one sets it directly.
To support compatibility, we employ a trick in the offset log: for stateful shuffle partitions we refer to `spark.sql.streaming.internal.stateStore.partitions` in the session config, while we keep writing `spark.sql.shuffle.partitions` to the offset log. We handle the rebinding between the two configs so that the persistence layer stays unchanged. This way the query can be both upgraded and downgraded.
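To make the rebinding concrete, here is a simplified, standalone sketch of the idea (the object and method names and the plain Maps are illustrative only, not the actual classes and code paths used in the implementation):

```scala
object OffsetLogRebindingSketch {
  // Key persisted in the offset log; kept as-is so older/newer Spark versions read the same format.
  val PersistedKey = "spark.sql.shuffle.partitions"
  // Internal session key that carries the fixed state partition count at runtime.
  val InternalStateStoreKey = "spark.sql.streaming.internal.stateStore.partitions"

  // Restart path: the value persisted under spark.sql.shuffle.partitions is rebound to the
  // internal key, so stateful operators keep their partitioning while stateless shuffles
  // follow the session's current spark.sql.shuffle.partitions value.
  def applyPersistedConf(persisted: Map[String, String],
                         session: Map[String, String]): Map[String, String] =
    persisted.get(PersistedKey) match {
      case Some(statePartitions) => session + (InternalStateStoreKey -> statePartitions)
      case None => session
    }

  // Write path: the internal key is mapped back to spark.sql.shuffle.partitions before
  // persisting, so the on-disk offset log format is unchanged.
  def confToPersist(session: Map[String, String]): Map[String, String] =
    Map(PersistedKey -> session.getOrElse(InternalStateStoreKey,
      session.getOrElse(PersistedKey, "200")))

  def main(args: Array[String]): Unit = {
    val persisted = Map(PersistedKey -> "200") // written when the query first ran
    val session = Map(PersistedKey -> "400")   // user changed the config before restart
    val effective = applyPersistedConf(persisted, session)
    println(effective)                // stateless shuffles use 400; state partitions stay at 200
    println(confToPersist(effective)) // offset log still records spark.sql.shuffle.partitions -> 200
  }
}
```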
### Why are the changes needed?
Whenever there is a need to change the parallelism of processing - e.g. the input volume changes over time, the size of a static table changes over time, or there is skew in a stream-static join (though AQE may help with this a bit) - the only official approach has been to discard the checkpoint and start a new query, implying a full backfill. (For workloads with a foreachBatch (FEB) sink, advanced (and adventurous) users could change the config in their user function, but that's arguably a hack.) Having to discard the checkpoint is one of the major pains of using Structured Streaming, and we want to address one of the known reasons for it.
### Does this PR introduce _any_ user-facing change?
Yes, users can change the shuffle partitions for stateless operators upon restart by changing the config `spark.sql.shuffle.partitions`.
### How was this patch tested?
New UTs.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52645 from HeartSaVioR/WIP-change-stateless-shuffle-partitions-in-streaming-query.
Authored-by: Jungtaek Lim <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>