Member

@andygrove andygrove commented Sep 7, 2025

Which issue does this PR close?

Part of #2344

Rationale for this change

There are two motivations for this PR:

  1. We sometimes find bugs in Comet expressions. Recent examples are length and to_pretty_string. We need to give users a way to disable these expressions so that Comet falls back to Spark rather than causing errors or producing incorrect results.
  2. Similarly, if a user wants to enable an expression that is marked as incompatible, they currently have to enable spark.comet.expression.allowIncompatible which enables all incompatible expressions, not just the specific one that they want to enable.

What changes are included in this PR?

Comet now looks for new per-expression configs, such as:

  • spark.comet.expression.ArrayUnion.enabled
  • spark.comet.expression.ArrayUnion.allowIncompatible

The expression name is the Spark expression class name, as documented in the supported expressions page.
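
As a rough illustration of the key format above, here is a minimal sketch of how the per-expression keys could be built and applied. The object `CometExprConfKeys` and its methods are hypothetical names used only for this example and are not part of Comet; only the key pattern is taken from the two configs listed above.

```scala
// Hypothetical helper, not part of Comet. Only the key format
// (spark.comet.expression.<ExprClassName>.<setting>) comes from the PR description.
object CometExprConfKeys {
  private val Prefix = "spark.comet.expression"

  // `exprName` is the Spark expression class name, e.g. "ArrayUnion".
  def enabledKey(exprName: String): String =
    s"$Prefix.$exprName.enabled"

  def allowIncompatKey(exprName: String): String =
    s"$Prefix.$exprName.allowIncompatible"

  // Builds the settings that ask Comet to fall back to Spark
  // for each of the named expressions.
  def disable(exprNames: String*): Map[String, String] =
    exprNames.map(n => enabledKey(n) -> "false").toMap
}

// Usage, assuming an existing SparkSession named `spark`:
//   CometExprConfKeys.disable("ArrayUnion").foreach { case (k, v) => spark.conf.set(k, v) }
//   spark.conf.set(CometExprConfKeys.allowIncompatKey("ArrayUnion"), "true")
```

The same keys can presumably also be passed as `--conf` options at submit time, like any other Spark config.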

Note that this mechanism will not work for the following expressions, because they have not yet been moved to the new serde framework:

    // TODO Literal
    // TODO SortOrder (?)
    // TODO PromotePrecision
    // TODO CheckOverflow
    // TODO KnownFloatingPointNormalized
    // TODO ScalarSubquery
    // TODO UnscaledValue
    // TODO MakeDecimal
    // TODO BloomFilterMightContain
    // TODO RegExpReplace
    TryCast
    ToPrettyString

How are these changes tested?

@andygrove andygrove changed the title from "feat: Allow config override per expression to allow incompatible expressions [WIP]" to "feat: Allow config override per expression to allow incompatible expressions" Sep 7, 2025
@andygrove andygrove marked this pull request as ready for review September 7, 2025 15:24
@andygrove andygrove marked this pull request as draft September 7, 2025 15:36
@andygrove andygrove changed the title from "feat: Allow config override per expression to allow incompatible expressions" to "feat: Allow config override per expression to allow incompatible expressions [WIP]" Sep 7, 2025
codecov-commenter commented Sep 7, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.66%. Comparing base (f09f8af) to head (e6e49df).
⚠️ Report is 494 commits behind head on main.

Files with missing lines Patch % Lines
...on/src/main/scala/org/apache/comet/CometConf.scala 12.50% 7 Missing ⚠️
.../scala/org/apache/comet/serde/QueryPlanSerde.scala 93.33% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2329      +/-   ##
============================================
+ Coverage     56.12%   57.66%   +1.53%     
- Complexity      976     1297     +321     
============================================
  Files           119      147      +28     
  Lines         11743    13523    +1780     
  Branches       2251     2390     +139     
============================================
+ Hits           6591     7798    +1207     
- Misses         4012     4457     +445     
- Partials       1140     1268     +128     

Comment on lines -231 to -232
    val COMET_EXEC_INITCAP_ENABLED: ConfigEntry[Boolean] =
      createExecEnabledConfig("initCap", defaultValue = false)
Member Author

InitCap is an expression, not an exec, so this was always confusing

@andygrove andygrove modified the milestone: 0.11.0 Sep 8, 2025
@andygrove andygrove changed the title from "feat: Allow config override per expression to allow incompatible expressions [WIP]" to "feat: Add dynamic enabled and allowIncompat configs for all supported expressions" Sep 11, 2025
@andygrove andygrove marked this pull request as ready for review September 11, 2025 16:38
@andygrove andygrove marked this pull request as draft September 11, 2025 16:48
Contributor

@parthchandra parthchandra left a comment

lgtm

Comment on lines 668 to 676
    def isExprEnabled(name: String, conf: SQLConf = SQLConf.get): Boolean = {
      val key = getExprEnabledConfigKey(name)
      // all expressions are enabled by default
      if (conf.contains(key)) {
        conf.getConfString(key) == "true"
      } else {
        true
      }
    }
Contributor

Suggested change
    - def isExprEnabled(name: String, conf: SQLConf = SQLConf.get): Boolean = {
    -   val key = getExprEnabledConfigKey(name)
    -   // all expressions are enabled by default
    -   if (conf.contains(key)) {
    -     conf.getConfString(key) == "true"
    -   } else {
    -     true
    -   }
    - }
    + def isExprEnabled(name: String, conf: SQLConf = SQLConf.get): Boolean = {
    +   val key = getExprEnabledConfigKey(name)
    +   conf.getConfString(key, "true").toLowerCase match {
    +     case "false" => false
    +     case _ => true
    +   }
    + }
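
The two versions of `isExprEnabled` differ in how they treat values other than the exact string "true". The sketch below is only an illustration of that difference: it replaces `SQLConf` with a plain `Map[String, String]`, and `enabledKey` is a hypothetical stand-in for `getExprEnabledConfigKey`, whose implementation is not shown in this thread (the key format is assumed from the PR description).

```scala
object ExprEnabledSketch {
  // Hypothetical stand-in for getExprEnabledConfigKey; key format assumed.
  def enabledKey(name: String): String =
    s"spark.comet.expression.$name.enabled"

  // Original logic: a missing key means enabled; a present key must be
  // exactly "true", so "TRUE" or "yes" would disable the expression.
  def isEnabledOriginal(name: String, conf: Map[String, String]): Boolean =
    conf.get(enabledKey(name)).forall(_ == "true")

  // Suggested logic: the expression is disabled only when the value is
  // "false" (ignoring case); anything else, or a missing key, means enabled.
  def isEnabledSuggested(name: String, conf: Map[String, String]): Boolean =
    conf.getOrElse(enabledKey(name), "true").toLowerCase match {
      case "false" => false
      case _ => true
    }
}
```

With this stand-in, a value such as "TRUE" is treated as disabled by the original logic and as enabled by the suggested one, while "False" disables the expression in both.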

Member Author

Thanks, I incorporated the feedback.

Comment on lines 682 to 685
    def isExprAllowIncompat(name: String, conf: SQLConf = SQLConf.get): Boolean = {
      val key = getExprAllowIncompatConfigKey(name)
      conf.contains(key) && conf.getConfString(key) == "true"
    }
Contributor

Suggested change
    - def isExprAllowIncompat(name: String, conf: SQLConf = SQLConf.get): Boolean = {
    -   val key = getExprAllowIncompatConfigKey(name)
    -   conf.contains(key) && conf.getConfString(key) == "true"
    - }
    + def isExprAllowIncompat(name: String, conf: SQLConf = SQLConf.get): Boolean = {
    +   isExprEnabled(name)
    + }

?
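
For context when weighing this suggestion: the original `isExprAllowIncompat` treats a missing key as false, whereas `isExprEnabled` treats a missing key as true. A minimal sketch of the two defaults side by side, using a plain `Map[String, String]` in place of `SQLConf` and hypothetical key helpers whose format is assumed from the PR description:

```scala
object ExprDefaultsSketch {
  // Hypothetical stand-ins for the key helpers; key format assumed.
  def enabledKey(name: String): String =
    s"spark.comet.expression.$name.enabled"
  def allowIncompatKey(name: String): String =
    s"spark.comet.expression.$name.allowIncompatible"

  // Opt-out flag: expressions are enabled unless explicitly set to "false".
  def isExprEnabled(name: String, conf: Map[String, String]): Boolean =
    conf.getOrElse(enabledKey(name), "true").toLowerCase != "false"

  // Opt-in flag, mirroring the original code under review: incompatible
  // behaviour is allowed only when the key is present and exactly "true".
  def isExprAllowIncompat(name: String, conf: Map[String, String]): Boolean =
    conf.get(allowIncompatKey(name)).contains("true")
}
```

With an empty config, the first function returns true and the second returns false, so the two lookups read different keys and start from opposite defaults.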

@andygrove andygrove added this to the 0.10.0 milestone Sep 11, 2025
@andygrove
Member Author

I am working on adding more tests

Contributor

@mbutrovich mbutrovich left a comment

LGTM, but it did make me wonder if we could meaningfully turn off the Literal expression. There are some cases where we hard-code instantiating them to support other expressions. That one almost doesn't make sense to be allowed to turn off, since you'd probably be turning off Comet at that point.

@andygrove andygrove marked this pull request as ready for review September 11, 2025 21:48
@andygrove andygrove merged commit 79516b6 into apache:main Sep 11, 2025
94 checks passed
@andygrove andygrove deleted the allow-incompat-per-expr branch September 11, 2025 23:29
andygrove added a commit to andygrove/datafusion-comet that referenced this pull request Sep 11, 2025
andygrove added a commit that referenced this pull request Sep 12, 2025
5 participants