-
Notifications
You must be signed in to change notification settings - Fork 628
Expand parse without semicolons #1949
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
4b4f5a1
7c17355
b83c245
53d7930
0026bb7
75cb98e
bbd32ed
4bde5b4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -67,7 +67,7 @@ impl Dialect for MsSqlDialect { | |
} | ||
|
||
fn supports_connect_by(&self) -> bool { | ||
true | ||
false | ||
} | ||
|
||
fn supports_eq_alias_assignment(&self) -> bool { | ||
|
@@ -123,6 +123,10 @@ impl Dialect for MsSqlDialect { | |
true | ||
} | ||
|
||
fn supports_statements_without_semicolon_delimiter(&self) -> bool { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤦 MS SQL!!!! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤷♂️🤣 |
||
true | ||
} | ||
|
||
/// See <https://learn.microsoft.com/en-us/sql/relational-databases/security/authentication-access/server-level-roles> | ||
fn get_reserved_grantees_types(&self) -> &[GranteesType] { | ||
&[GranteesType::Public] | ||
|
@@ -280,6 +284,9 @@ impl MsSqlDialect { | |
) -> Result<Vec<Statement>, ParserError> { | ||
let mut stmts = Vec::new(); | ||
loop { | ||
while let Token::SemiColon = parser.peek_token_ref().token { | ||
parser.advance_token(); | ||
} | ||
if let Token::EOF = parser.peek_token_ref().token { | ||
break; | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -266,6 +266,22 @@ impl ParserOptions { | |
self.unescape = unescape; | ||
self | ||
} | ||
|
||
/// Set if semicolon statement delimiters are required. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the usecase to have the same configuration information on the It seems there are a few places in the Parser that inconsistently check the options and/or the dialect flag. If we had only one way to specify the option, there would not be any room for inconsistencies Or maybe I misunderstand what is going on here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It's similar to the existing
It seems like testing parsing without semicolons would get a lot worse if it wasn't a dialect option |
||
/// | ||
/// If this option is `true`, the following SQL will not parse. If the option is `false`, the SQL will parse. | ||
/// | ||
/// ```sql | ||
/// SELECT 1 | ||
/// SELECT 2 | ||
/// ``` | ||
pub fn with_require_semicolon_stmt_delimiter( | ||
mut self, | ||
require_semicolon_stmt_delimiter: bool, | ||
) -> Self { | ||
self.require_semicolon_stmt_delimiter = require_semicolon_stmt_delimiter; | ||
self | ||
} | ||
} | ||
|
||
#[derive(Copy, Clone)] | ||
|
@@ -362,7 +378,11 @@ impl<'a> Parser<'a> { | |
state: ParserState::Normal, | ||
dialect, | ||
recursion_counter: RecursionCounter::new(DEFAULT_REMAINING_DEPTH), | ||
options: ParserOptions::new().with_trailing_commas(dialect.supports_trailing_commas()), | ||
options: ParserOptions::new() | ||
.with_trailing_commas(dialect.supports_trailing_commas()) | ||
.with_require_semicolon_stmt_delimiter( | ||
!dialect.supports_statements_without_semicolon_delimiter(), | ||
), | ||
} | ||
} | ||
|
||
|
@@ -484,13 +504,18 @@ impl<'a> Parser<'a> { | |
|
||
match self.peek_token().token { | ||
Token::EOF => break, | ||
|
||
// end of statement | ||
Token::Word(word) => { | ||
if expecting_statement_delimiter && word.keyword == Keyword::END { | ||
break; | ||
} | ||
} | ||
// don't expect a semicolon statement delimiter after a newline when not otherwise required | ||
Token::Whitespace(Whitespace::Newline) => { | ||
if !self.options.require_semicolon_stmt_delimiter { | ||
expecting_statement_delimiter = false; | ||
} | ||
} | ||
_ => {} | ||
} | ||
|
||
|
@@ -500,7 +525,7 @@ impl<'a> Parser<'a> { | |
|
||
let statement = self.parse_statement()?; | ||
stmts.push(statement); | ||
expecting_statement_delimiter = true; | ||
expecting_statement_delimiter = self.options.require_semicolon_stmt_delimiter; | ||
} | ||
Ok(stmts) | ||
} | ||
|
@@ -4541,6 +4566,14 @@ impl<'a> Parser<'a> { | |
return Ok(vec![]); | ||
} | ||
|
||
if end_token == Token::SemiColon && !self.options.require_semicolon_stmt_delimiter { | ||
if let Token::Word(ref kw) = self.peek_token().token { | ||
if kw.keyword != Keyword::NoKeyword { | ||
return Ok(vec![]); | ||
} | ||
} | ||
} | ||
|
||
if self.options.trailing_commas && self.peek_tokens() == [Token::Comma, end_token] { | ||
let _ = self.consume_token(&Token::Comma); | ||
return Ok(vec![]); | ||
|
@@ -4558,6 +4591,9 @@ impl<'a> Parser<'a> { | |
) -> Result<Vec<Statement>, ParserError> { | ||
let mut values = vec![]; | ||
loop { | ||
// ignore empty statements (between successive statement delimiters) | ||
while self.consume_token(&Token::SemiColon) {} | ||
|
||
match &self.peek_nth_token_ref(0).token { | ||
Token::EOF => break, | ||
Token::Word(w) => { | ||
|
@@ -4569,7 +4605,13 @@ impl<'a> Parser<'a> { | |
} | ||
|
||
values.push(self.parse_statement()?); | ||
self.expect_token(&Token::SemiColon)?; | ||
|
||
if self.options.require_semicolon_stmt_delimiter { | ||
self.expect_token(&Token::SemiColon)?; | ||
} | ||
|
||
// ignore empty statements (between successive statement delimiters) | ||
while self.consume_token(&Token::SemiColon) {} | ||
} | ||
Ok(values) | ||
} | ||
|
@@ -16464,7 +16506,28 @@ impl<'a> Parser<'a> { | |
|
||
/// Parse [Statement::Return] | ||
fn parse_return(&mut self) -> Result<Statement, ParserError> { | ||
match self.maybe_parse(|p| p.parse_expr())? { | ||
let rs = self.maybe_parse(|p| { | ||
let expr = p.parse_expr()?; | ||
|
||
match &expr { | ||
Expr::Value(_) | ||
| Expr::Function(_) | ||
| Expr::UnaryOp { .. } | ||
| Expr::BinaryOp { .. } | ||
| Expr::Case { .. } | ||
| Expr::Cast { .. } | ||
| Expr::Convert { .. } | ||
| Expr::Subquery(_) => Ok(expr), | ||
// todo: how to retstrict to variables? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems more like a semantic check -- this crate is focused on syntax parsing. Read more hre: https://github.com/apache/datafusion-sqlparser-rs?tab=readme-ov-file#syntax-vs-semantics I think the error is inappropriate in this case There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
What do you suggest?
The difficulty here is that without semicolon tokens the return statement is ambiguous. Consider the following perfectly valid T-SQL statement (which is also a test case example) which has several variations of
If you revert this change and run that test, you get this error:
The reason is that for the first I think I had said when introducing parse_return in a previous PR that we'd have to come back and tighten it up eventually, but I can't find that discussion 😐. |
||
Expr::Identifier(id) if id.value.starts_with('@') => Ok(expr), | ||
_ => parser_err!( | ||
"Non-returnable expression found following RETURN", | ||
p.peek_token().span.start | ||
), | ||
} | ||
})?; | ||
|
||
match rs { | ||
Some(expr) => Ok(Statement::Return(ReturnStatement { | ||
value: Some(ReturnStatementValue::Expr(expr)), | ||
})), | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to give an example here?
Something like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 👍