[CORE] Deduplicate fallback reason when merging Appendable tags #12049
Open
liujiayi771 wants to merge 2 commits into apache:main
zhztheplayer reviewed May 8, 2026

Comment on lines +96 to +102:

    if (l.reason == r.reason || l.reason.contains(r.reason)) {
      l
    } else if (r.reason.contains(l.reason)) {
      r
    } else {
      FallbackTag.Appendable(s"${l.reason}; ${r.reason}")
    }

How about storing a hash set in FallbackTag.Appendable? It could be faster and more precise.
When AQE re-runs columnar rules across stages, GlutenFallbackReporter repeatedly calls FallbackTags.add on the same shared logicalLink with the same reason. The previous merge logic unconditionally concatenated the reasons, producing strings like "r; r; r; ..." that grew with every AQE iteration — especially noticeable when users manually fall back specific node types. Skip the concat when one reason already contains the other.
…ckSuite

Prior tests in the suite run gluten-enabled queries that post GlutenPlanFallbackEvent. Spark's LiveListenerBus dispatches events asynchronously, so events queued by earlier tests can still be delivered to a listener registered afterwards, contaminating the events buffer of the next test.

Add a withFallbackEventListener helper in GlutenSuiteUtils that drains the bus before registering the listener and removes it afterwards. The three event-listener tests in FallbackSuite share the same boilerplate and now go through this helper.

Summary

When AQE re-runs columnar rules across stages, GlutenFallbackReporter repeatedly calls FallbackTags.add on the same shared logicalLink with the same reason. The previous merge logic in FallbackTags.add unconditionally concatenated Appendable reasons, producing strings like "r; r; r; ..." that grow with every AQE iteration. This is especially noticeable when users manually fall back specific node types and the same fixed reason is added many times.

The relevant comment at GlutenFallbackReporter.scala:66-71 already explains the trigger.

This PR makes FallbackTag.Appendable store fallback reasons in a HashSet[String] and merge Appendable tags with set union. reason() joins the set only when the reason text is reported, so duplicate reasons are removed precisely without relying on substring checks. The reason order is not significant here; the important behavior is that the reported text contains each distinct fallback reason once.
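
The set-union merge described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: AppendableSketch, its merge method, the "; " separator in reason(), and the choice of an immutable HashSet are assumptions made for the example. Only the idea of keeping reasons in a HashSet[String] and joining them when the text is reported comes from the description.

```scala
import scala.collection.immutable.HashSet

// Hypothetical stand-in for FallbackTag.Appendable; names and shape are assumed.
final case class AppendableSketch(reasons: HashSet[String]) {
  // Join only when the text is needed. HashSet iteration order is unspecified,
  // which matches the note that reason order is not significant.
  def reason(): String = reasons.mkString("; ")

  // Set union drops exact duplicates and never compares substrings.
  def merge(other: AppendableSketch): AppendableSketch =
    AppendableSketch(reasons.union(other.reasons))
}

object AppendableSketch {
  // Convenience constructor for a tag carrying a single reason.
  def apply(reason: String): AppendableSketch = AppendableSketch(HashSet(reason))
}

object AppendableSketchDemo {
  def main(args: Array[String]): Unit = {
    val r = AppendableSketch("fallback requested by user")
    // Simulate the same reason being added on several AQE iterations.
    val merged = (1 to 5).foldLeft(r)((acc, _) => acc.merge(r))
    println(merged.reason()) // prints: fallback requested by user

    // A substring check would treat "a" as already covered by "ab"; the set keeps both.
    val distinct = AppendableSketch("a").merge(AppendableSketch("ab"))
    println(distinct.reasons.size) // prints: 2
  }
}
```

The second case in the demo is where a contains-based merge and a set-based merge differ: a reason that happens to be a substring of another is still a distinct reason and is kept.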
Drive-by: stabilize FallbackSuite

While testing, the existing test "no fallback event emitted for vanilla Spark execution with gluten disabled" (added in #12027) turned out to be flaky. Spark's LiveListenerBus dispatches events asynchronously, so GlutenPlanFallbackEvents posted by the previous test in the suite can still be delivered to the listener registered by the next test, contaminating its events buffer. The dedup change subtly shifted CPU/scheduling timing and exposed it.

The fix is bundled in this PR since it's directly triggered by these changes:
- Add GlutenSuiteUtils.withFallbackEventListener, which drains the bus before registering the listener and removes it afterwards.
- The three event-listener tests in FallbackSuite, which previously duplicated the same listener boilerplate, now go through this helper.
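
The drain, register, remove pattern behind the helper can be illustrated with a toy model. Everything below is an assumption made for the sketch: ToyBus is a small single-threaded asynchronous bus written for this example (it is not Spark's LiveListenerBus), events are plain strings, and ListenerSketch.withEventListener is a hypothetical analogue of the helper, not its actual signature.

```scala
import java.util.concurrent.{CopyOnWriteArrayList, Executors, ThreadFactory}
import scala.collection.mutable.ArrayBuffer

// Toy asynchronous bus; a hypothetical stand-in, NOT Spark's LiveListenerBus.
final class ToyBus {
  private val listeners = new CopyOnWriteArrayList[String => Unit]()
  // One daemon thread: events are delivered in FIFO order, off the caller's thread.
  private val executor = Executors.newSingleThreadExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "toy-bus")
      t.setDaemon(true)
      t
    }
  })

  def addListener(l: String => Unit): Unit = { listeners.add(l); () }
  def removeListener(l: String => Unit): Unit = { listeners.remove(l); () }

  // Queue the event; listeners registered at delivery time will see it.
  def post(event: String): Unit = executor.execute(new Runnable {
    def run(): Unit = {
      val it = listeners.iterator()
      while (it.hasNext) it.next()(event)
    }
  })

  // Blocks until every event posted so far has been delivered.
  def waitUntilEmpty(): Unit = {
    executor.submit(new Runnable { def run(): Unit = () }).get()
    ()
  }
}

object ListenerSketch {
  // Drain first, then register, run the body, and always unregister.
  def withEventListener[T](bus: ToyBus)(body: ArrayBuffer[String] => T): T = {
    bus.waitUntilEmpty() // events queued by earlier tests are delivered before we listen
    val events = ArrayBuffer.empty[String]
    val listener: String => Unit = e => events.synchronized { events += e; () }
    bus.addListener(listener)
    try body(events)
    finally bus.removeListener(listener)
  }
}

object ListenerSketchDemo {
  def main(args: Array[String]): Unit = {
    val bus = new ToyBus
    bus.post("stale event from an earlier test")
    val seen = ListenerSketch.withEventListener(bus) { events =>
      bus.post("fresh event")
      bus.waitUntilEmpty() // wait for delivery before reading the buffer
      events.toList
    }
    println(seen) // prints: List(fresh event)
  }
}
```

Without the initial drain, the stale event could still be sitting in the queue when the listener is added, which is the kind of cross-test contamination described above.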
Test plan

- FallbackTagSuite in gluten-core covering:
  - add of the same reason does not grow the reason string
  - Appendable reasons are preserved
- … contains-based implementation and passes after switching Appendable to HashSet-based dedup.
- mvn -pl gluten-core -DwildcardSuites=org.apache.gluten.extension.columnar.FallbackTagSuite test
- FallbackSuite continues to pass with the new helper; the previously flaky test is now stable across reruns.