Skip to content

[SPARK-51482][SQL] Support cast from string to time #50236

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Mar 11, 2025

What changes were proposed in this pull request?

In the PR, I propose to support CAST from STRING to TIME(n). The format of input strings should match to:

[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us]

Why are the changes needed?

Before the changes, such casting allowed by the SQL standard failed w/ the error:

scala> Seq("17:18:19").toDS.select($"value".cast(TimeType())).show()
org.apache.spark.sql.catalyst.ExtendedAnalysisException: [DATATYPE_MISMATCH.CAST_WITHOUT_SUGGESTION] Cannot resolve "CAST(value AS TIME(6))" due to data type mismatch: cannot cast "STRING" to "TIME(6)". SQLSTATE: 42K09;

Does this PR introduce any user-facing change?

Yes. After the changes, the cast above works as expected:

scala> Seq("17:18:19").toDS.select($"value".cast(TimeType())).show()
+--------+
|   value|
+--------+
|17:18:19|
+--------+

How was this patch tested?

By running the new tests:

$ build/sbt "test:testOnly *CastWithAnsiOffSuite"
$ build/sbt "test:testOnly *CastWithAnsiOnSuite"

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Mar 11, 2025
@MaxGekk MaxGekk changed the title [WIP][SQL] Support cast from string to time [WIP][SPARK-51482][SQL] Support cast from string to time Mar 12, 2025
@MaxGekk MaxGekk changed the title [WIP][SPARK-51482][SQL] Support cast from string to time [SPARK-51482][SQL] Support cast from string to time Mar 12, 2025
@MaxGekk MaxGekk marked this pull request as ready for review March 12, 2025 04:25
@MaxGekk
Copy link
Member Author

MaxGekk commented Mar 12, 2025

@yaooqinn @LuciferYang @dongjoon-hyun @HyukjinKwon Could you take a look at the PR when you have time, please.

} else {
code"""
scala.Option<Long> $longOpt =
org.apache.spark.sql.catalyst.util.DateTimeUtils.stringToTime($c);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can also use $dateTimeUtilsCls.stringToTime($c); ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, and changed other places too. Please, review the PR: #50251

}

case _ =>
(_, evPrim, evNull) => code"$evNull = true;"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(_, evPrim, evNull) => code"$evNull = true;"
(_, _, evNull) => code"$evNull = true;"

Copy link
Contributor

@beliefer beliefer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@MaxGekk
Copy link
Member Author

MaxGekk commented Mar 12, 2025

Merging to master. Thank you, @LuciferYang @beliefer for review.

@MaxGekk MaxGekk closed this in baf4b59 Mar 12, 2025
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, late LGTM.

anoopj pushed a commit to anoopj/spark that referenced this pull request Mar 15, 2025
### What changes were proposed in this pull request?
In the PR, I propose to support `CAST` from `STRING` to `TIME(n)`. The format of input strings should match to:
```
[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us]
```

### Why are the changes needed?
Before the changes, such casting allowed by the SQL standard failed w/ the error:
```scala
scala> Seq("17:18:19").toDS.select($"value".cast(TimeType())).show()
org.apache.spark.sql.catalyst.ExtendedAnalysisException: [DATATYPE_MISMATCH.CAST_WITHOUT_SUGGESTION] Cannot resolve "CAST(value AS TIME(6))" due to data type mismatch: cannot cast "STRING" to "TIME(6)". SQLSTATE: 42K09;
```

### Does this PR introduce _any_ user-facing change?
Yes. After the changes, the cast above works as expected:
```scala
scala> Seq("17:18:19").toDS.select($"value".cast(TimeType())).show()
+--------+
|   value|
+--------+
|17:18:19|
+--------+
```

### How was this patch tested?
By running the new tests:
```
$ build/sbt "test:testOnly *CastWithAnsiOffSuite"
$ build/sbt "test:testOnly *CastWithAnsiOnSuite"
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#50236 from MaxGekk/string-to-time.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants