Make TypedString preserve quote style #1679

Merged 11 commits on Jan 31, 2025

graup (Contributor) commented Jan 23, 2025

Fixes #1673. This is a breaking change.

TypedString contained a plain String, with no record of the quote style used. The parser built it with parse_literal_string, which only supports single- and double-quoted strings. In particular, it doesn't support BigQuery's triple-quoted strings, which caused the issue reported in #1673. It also doesn't round-trip properly: the string is always formatted with single quotes.

I think the cleanest fix is to have TypedString contain a Value instead, as IntroducedString and others do. This gives us immediate support for other quote styles and makes the formatting round-trippable.
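
To illustrate the round-trip problem, here is a minimal self-contained sketch. It does not use the crate's real types: `QuoteSketch`, `OldTypedString` and `NewTypedString` are hypothetical stand-ins that model only the quote style, and escaping is ignored entirely.

```rust
use std::fmt;

// Hypothetical stand-in for a string literal that remembers its quote style.
// This is NOT the crate's `Value` type, only a sketch of the idea.
#[derive(Debug, Clone, PartialEq)]
enum QuoteSketch {
    SingleQuoted(String),
    DoubleQuoted(String),
    TripleDoubleQuoted(String),
}

impl fmt::Display for QuoteSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each variant is written back with the delimiters it was parsed with.
        match self {
            QuoteSketch::SingleQuoted(s) => write!(f, "'{s}'"),
            QuoteSketch::DoubleQuoted(s) => write!(f, "\"{s}\""),
            QuoteSketch::TripleDoubleQuoted(s) => write!(f, "\"\"\"{s}\"\"\""),
        }
    }
}

// Old shape: only the contents survive, so formatting has to pick one style.
struct OldTypedString {
    data_type: String,
    value: String,
}

impl fmt::Display for OldTypedString {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} '{}'", self.data_type, self.value)
    }
}

// New shape: the quote style travels with the value.
struct NewTypedString {
    data_type: String,
    value: QuoteSketch,
}

impl fmt::Display for NewTypedString {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} {}", self.data_type, self.value)
    }
}

fn main() {
    let old = OldTypedString {
        data_type: "JSON".to_string(),
        value: "{}".to_string(),
    };
    println!("{old}");

    // The same contents in three different quote styles.
    for value in [
        QuoteSketch::SingleQuoted("{}".to_string()),
        QuoteSketch::DoubleQuoted("{}".to_string()),
        QuoteSketch::TripleDoubleQuoted("{}".to_string()),
    ] {
        let new = NewTypedString {
            data_type: "JSON".to_string(),
            value,
        };
        println!("{new}");
    }
}
```

With the old shape, a triple-quoted input could only ever be printed back with single quotes; with the new shape the original delimiters are kept.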

This is a breaking change but should be an easy fix in users' codebases, just (un)wrapping the value. Migration path:

  1. When constructing an AST node
Expr::TypedString {
    data_type: DataType::JSON,
--  value: r#"{"class" : {"students" : [{"name" : "Jane"}]}}"#.to_string()
++  value: Value::SingleQuotedString(
++      r#"{"class" : {"students" : [{"name" : "Jane"}]}}"#.to_string()
++  )
}
  2. When using AST parser results
if let Expr::TypedString { data_type, value } = expr {
--  let string_value: String = value;
++  let string_value: String = value.into_string().unwrap();
}

For convenience, I have added a method into_string -> Option<String> to Value to get the underlying string value.
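
As a rough sketch of how such an accessor behaves, and of how to avoid the `unwrap()` when the value might not be a string: `LiteralSketch` and `string_payload` below are hypothetical names for illustration, not the crate's API, and the real `Value` has more variants than shown here.

```rust
// Hypothetical, reduced stand-in for a literal value type.
#[derive(Debug, Clone, PartialEq)]
enum LiteralSketch {
    Number(String),
    SingleQuoted(String),
    TripleDoubleQuoted(String),
    Null,
}

impl LiteralSketch {
    // Consume the value and return its contents if it is any kind of string.
    fn into_string(self) -> Option<String> {
        match self {
            LiteralSketch::SingleQuoted(s) | LiteralSketch::TripleDoubleQuoted(s) => Some(s),
            LiteralSketch::Number(_) | LiteralSketch::Null => None,
        }
    }
}

// Turn the Option into an error instead of panicking on non-string values.
fn string_payload(value: LiteralSketch) -> Result<String, String> {
    value
        .into_string()
        .ok_or_else(|| "expected a string literal".to_string())
}

fn main() {
    let quoted = LiteralSketch::TripleDoubleQuoted("abc".to_string());
    println!("{:?}", string_payload(quoted));

    let number = LiteralSketch::Number("1".to_string());
    println!("{:?}", string_payload(number));
}
```

Callers that know the value is always a string can keep the `unwrap()` from the migration example; the `Result` version is only a suggestion for code that handles arbitrary input.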

graup (Contributor, Author) commented Jan 30, 2025

Thanks @iffyio, appreciate your suggestions. I've changed the method to into_string -> Option<String>. Also updated the PR description.

@PrettyWood

Thank you so much for working on that 🙏 We have PRQL/prql#5099 on prql side. Very much appreciated

iffyio (Contributor) left a comment

LGTM! Thanks @graup!
cc @alamb

iffyio changed the title from "Make TypedString contain Value instead of String to support and preserve other quote styles" to "Make TypedString preserve quote style" on Jan 31, 2025
iffyio merged commit 447142c into apache:main on Jan 31, 2025
9 checks passed
graup deleted the typed-string-fix branch on January 31, 2025 10:50
alamb (Contributor) commented Jan 31, 2025

Werd! The PR / code 🚂 keeps on running. Thanks again @iffyio for all you do to keep this repo moving forward

Vedin pushed a commit to Embucket/datafusion-sqlparser-rs that referenced this pull request Feb 3, 2025
Vedin added a commit to Embucket/datafusion-sqlparser-rs that referenced this pull request Feb 3, 2025
Successfully merging this pull request may close these issues.

Fails to parse TypedString expressions in BigQuery with triple quoted strings