Parsing switch expression with when clause with closure #51482
This is actually quite tricky, due to how we're handling a related grammar ambiguity. As explained in the spec, constructs like this are ambiguous:
var x = switch (obj) {
_ when a + (b) => (c) => body
};
(because either
The spec says that "When
But this is not precisely what I implemented (oops). The rule I actually implemented was: a function expression cannot appear in a guard, unless it is inside grouping operators (parens, brackets, or curly braces). This means that a parenthesized group appearing in a guard is always parsed as a parenthesized expression or a record literal, and never as function literal arguments.
This is my fault--I didn't realize at the time that there were user-visible differences between what I was doing and what was specified. Looking at it now, I think there are two user-visible differences:
Unfortunately, implementing precisely what was specified would be very challenging, because the parser is event-based, and so it is not easy for it to determine whether a sequence of tokens forms a valid expression without committing to parsing that sequence of tokens. (It's technically possible using the
I'm considering a few alternatives:
Before I do any coding, I'll gather opinions from some folks about what to do.
So the spec says:
I'm actually a little surprised I wrote it that way because "forms a valid expression" feels very brittle and annoying to parse to me. I recall avoiding a similar formulation to resolve another ambiguity. I definitely wouldn't want a slight tweak to a guard that causes the stuff before
// OK, could be an expression but also happens to be a parameter list:
case _ when (a, b, c, {e, f, g}) => true => body;
// Oops, now it must be a parameter list:
case _ when (a, b, c, {e, int f, g}) => true => body;
I like that. My understanding of what you propose here is that there are basically two copies of the entire expression grammar. One allows function expressions and the other does not. All of the places in the grammar delimited by braces take you to the expression grammar that allows function expressions. The grammar rules for guards (and constructor initializers) take you to the one that doesn't. But within that function-less grammar, the rules for parenthesized expressions, argument lists, collection literals, etc. all hop you back over to the grammar where function expressions are allowed. That would mean these are all OK (syntactically):
case _ when (() => true) => body; // Parenthesized.
case _ when [() => true] => body; // List literal.
case _ when {() => true} => body; // Set literal.
case _ when {k: () => true} => body; // Map literal.
case _ when foo(() => true) => body; // Argument list (including `assert`, `super`, method calls).
case _ when [() => true] => body; // List literal.
case _ when list[() => true] => body; // Index operator.
But these are parse errors:
case _ when () => true => body; // Bare.
case _ when !() => true => body; // Prefix operand.
case _ when a + () => true => body; // Infix operand.
case _ when a = () => true => body; // Assignment value.
case _ when c ? a : () => true => body; // Conditional operand.
case _ when c ? () => true : a => body; // Conditional operand.
Do I have that right? If so, that sounds reasonable to me, though I worry about the complexity and how we specify it. ECMAScript has a whole thing around grammatical parameters and it would be nice to not have to go there.
We already know it's going to be parsing an expression. We don't know whether that expression will turn out to be a function expression or something else, but the thing after
We already resolve that parsing difficulty somehow, because you run into it basically everywhere:
var x = (a, b, c, d, e, f …
When the parser is at
Another approach I could see us taking is that we parse the guard expression with maximal munch: If it's possible for the
I think that's more or less how we handle function expression bodies when they have infix or postfix operators following them: the body eats them all so
I like the "only inside delimiters" rule too, like we do in initializer list expressions. (So it's not a new idea, and that's a good thing.)
Yes, that's a good description of what I implemented. Incidentally, as @lrhn mentioned above, I didn't invent this rule from scratch--I repurposed what the parser was already doing for constructor initializers. Curiously, I don't see anything in the spec prohibiting undelimited function expressions inside constructor initializers. I guess this is a piece of technical debt for the spec (@eernstg I'd be curious what the spec parser does 😃).
Yes, I believe so.
If you're worried about implementation complexity, you don't need to, because it's already implemented, and it's fairly simple (the parser just has a boolean that keeps track of whether we're in a context where function literals are allowed). If you're worried about specification complexity, then yeah, I see where you're coming from. But that horse has already fled the stables because we're already doing this for constructor initializers 😃.
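To illustrate the "boolean that keeps track of whether function literals are allowed" idea, here is a minimal toy recognizer. It is a sketch only: `ToyParser`, `allowFunctionLiterals`, and `parses` are hypothetical names, the input is a space-separated token string, and only `() => expr` literals, `+`, parentheses, and square brackets are modeled. The real parser is event-based and treats a bare `()` in a guard as a record literal rather than failing immediately, so this only approximates the rule.

```dart
/// Toy recognizer; hypothetical names, not the real Dart SDK parser.
class ToyParser {
  ToyParser(this.tokens);

  final List<String> tokens;
  int pos = 0;

  /// Whether a function literal may start at the current position.
  bool allowFunctionLiterals = true;

  String? get peek => pos < tokens.length ? tokens[pos] : null;

  void expect(String token) {
    if (peek != token) {
      throw FormatException('Expected "$token" but found "$peek" at $pos');
    }
    pos++;
  }

  /// True when the next three tokens are `(`, `)`, `=>`.
  bool get atFunctionLiteral =>
      pos + 2 < tokens.length &&
      tokens[pos] == '(' &&
      tokens[pos + 1] == ')' &&
      tokens[pos + 2] == '=>';

  /// guardedCase := 'when' expression '=>' expression
  void parseGuardedCase() {
    expect('when');
    final saved = allowFunctionLiterals;
    allowFunctionLiterals = false; // Guards start in the restricted mode.
    parseExpression();
    allowFunctionLiterals = saved;
    expect('=>');
    parseExpression();
  }

  /// expression := operand ('+' operand)*
  void parseExpression() {
    parseOperand();
    while (peek == '+') {
      pos++;
      parseOperand();
    }
  }

  /// operand := functionLiteral | '(' expression ')' | '[' expression ']'
  ///          | identifier
  void parseOperand() {
    if (atFunctionLiteral) {
      if (!allowFunctionLiterals) {
        throw FormatException('Function literal not allowed here at $pos');
      }
      pos += 3; // Consume '(' ')' '=>'.
      parseExpression();
      return;
    }
    final token = peek;
    if (token == '(' || token == '[') {
      pos++;
      final saved = allowFunctionLiterals;
      allowFunctionLiterals = true; // Delimiters re-enable function literals.
      parseExpression();
      allowFunctionLiterals = saved;
      expect(token == '(' ? ')' : ']');
      return;
    }
    if (token == null || !RegExp(r'^[A-Za-z_]\w*$').hasMatch(token)) {
      throw FormatException('Unexpected token "$token" at $pos');
    }
    pos++;
  }
}

/// Returns true when [source] is a complete guarded case in the toy grammar.
bool parses(String source) {
  final parser = ToyParser(source.split(' '));
  try {
    parser.parseGuardedCase();
    return parser.peek == null;
  } on FormatException {
    return false;
  }
}

void main() {
  print(parses('when ( ( ) => true ) => body')); // Parenthesized.
  print(parses('when [ ( ) => true ] => body')); // List literal.
  print(parses('when ( ) => true => body')); // Bare.
  print(parses('when a + ( ) => true => body')); // Infix operand.
}
```

The flag is saved and restored around each delimiter, so nesting works without a second copy of the grammar.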
Right. And the way the parser currently disambiguates that is to look at the token after the matching
Unfortunately I don't think that works, because we need this unambiguous case to parse correctly:
We know (even if we occasionally forget): #11509
I think we could make our spec-work easier if we introduce parameterized grammars, say something like:
<list>[element]{allowTrailingComma} ::=
element (',' element)* <trailingCommaOpt>{allowTrailingComma}
| <empty>
;
<trailingCommaOpt>{allowTrailingComma} ::=
',' -- if allowTrailingComma
| <empty>
;
<showClause> ::= 'show' <list>[<identifier>]{allowTrailingComma = false} ;
<hideClause> ::= 'hide' <list>[<identifier>]{allowTrailingComma = false} ;
<argumentList> ::= <list>[(<identifier> ':')? <expression>]{allowTrailingComma = true}
where the
Then we wouldn't have to have both
And then we could have
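As a rough illustration of how a parameterized rule like `<list>[element]{allowTrailingComma}` could map onto code, here is a small sketch. The function `matchesList` and the `isIdentifier` helper are hypothetical names invented for this example, and the input is assumed to be pre-tokenized into strings.

```dart
/// Hypothetical helper mirroring <list>[element]{allowTrailingComma}.
/// Returns true when [tokens] is a (possibly empty) comma-separated list whose
/// items all satisfy [isElement].
bool matchesList(
  List<String> tokens,
  bool Function(String token) isElement, {
  required bool allowTrailingComma,
}) {
  if (tokens.isEmpty) return true; // The <empty> alternative.
  var pos = 0;
  if (!isElement(tokens[pos++])) return false;
  while (pos < tokens.length && tokens[pos] == ',') {
    pos++; // Consume ','.
    // A comma at the very end is the <trailingCommaOpt> alternative.
    if (pos == tokens.length) return allowTrailingComma;
    if (!isElement(tokens[pos++])) return false;
  }
  return pos == tokens.length;
}

/// Assumed stand-in for the <identifier> rule.
bool isIdentifier(String token) =>
    RegExp(r'^[A-Za-z_]\w*$').hasMatch(token);

void main() {
  // Like <showClause>: no trailing comma allowed.
  print(matchesList(['a', ',', 'b', ','], isIdentifier,
      allowTrailingComma: false));
  // Like <argumentList>: trailing comma allowed.
  print(matchesList(['a', ',', 'b', ','], isIdentifier,
      allowTrailingComma: true));
}
```

The element recognizer plays the role of the `[element]` argument and the named boolean plays the role of the `{allowTrailingComma}` parameter, so one function covers both clause kinds.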
We now unconditionally prohibit function literals inside guards, regardless of whether their bodies take the form `=> expression` or a block. This matches what was implemented in the parser. See discussion in dart-lang/sdk#51482.
I'm not particularly worried about implementation complexity because it's pretty easy to make a parametric recursive descent parser by just, well, passing around parameters.
This one. :) Parametric grammars are sorely tempting but I can't help but feel like it leads to more syntactic complexity than we want. But in this case, we're already doing it, so I guess it's not harmful. Keeping guards consistent with constructor initializers sounds good to me. Any thoughts on how I should specify this in the proposal? My understanding is that the spec doesn't actually specify constructor initializers correctly right now.
I've sent a PR: dart-lang/language#2947
As of dart-lang/language@46411cb, the specification has been updated to match what was implemented (function literals are now prohibited in guards unless enclosed in parentheses, square brackets, or curly braces). So now the behaviour of the implementation is technically spec compliant. I'm keeping this issue open because I would still like to improve the error reporting if I can, but I'm reducing the priority to P2 since it's no longer a correctness issue.
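For readers who hit this rule in practice, a minimal sketch of guards that keep function literals inside delimiters may help. The names `holds` and `classify` are made up for illustration; the guard forms correspond to the "argument list" and "list literal" cases discussed in this thread.

```dart
/// Hypothetical helper: runs a zero-argument predicate.
bool holds(bool Function() predicate) => predicate();

String classify(int value) => switch (value) {
      // Function literal inside an argument list.
      1 when holds(() => value.isOdd) => 'odd one',
      // Function literal inside a list literal.
      2 when [() => value.isEven].every(holds) => 'even two',
      _ => 'other',
    };

void main() {
  print(classify(1));
  print(classify(2));
  print(classify(3));
}
```

Writing the first guard as `1 when () => value.isOdd => ...` would instead be a parse error under the updated spec.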
@stereotype441 wrote:
The spec parser does not include the
expression
: patternAssignment
| functionExpression // <----- This one derives `=>` function literals.
| throwExpression
| assignableExpression assignmentOperator expression
| conditionalExpression
| cascade
;
primary
: thisExpression
| SUPER unconditionalAssignableSelector
| SUPER argumentPart
| functionPrimary // <----- This one derives block function literals.
| literal
| identifier
| newExpression
| constObjectExpression
| constructorInvocation
| '(' expression ')'
| constructorTearoff
| switchExpression
;
The fact that
Here is the snippet of the spec parser's grammar that deals with initializer lists:
initializerListEntry
: SUPER arguments
| SUPER '.' identifierOrNew arguments
| fieldInitializer
| assertion
;
fieldInitializer
: (THIS '.')? identifier '=' initializerExpression
;
initializerExpression
: conditionalExpression
| cascade
;
(Actually, looking at this, I believe
Interestingly, I haven't seen any failures based on the fact that I've used the patterns feature spec grammar rule exactly as stated:
guardedPattern
: pattern (WHEN expression)?
;
The reason for this is probably that ANTLR disambiguates the grammar using some techniques that we don't use in the Dart parser. Hence, I changed
int get body => 2;
void main() {
var _ = switch (0) {
1 when (() => true) => body, // Parenthesized.
1 when [() => true] => body, // List literal.
1 when {() => true} => body, // Set literal.
1 when {k: () => true} => body, // Map literal.
1 when foo(() => true) => body, // Argument list.
1 when [() => true] => body, // List literal.
1 when list[() => true] => body, // Index operator.
// Agree on most error cases mentioned earlier in this issue:
// Error: 1 when () => true => body, // Bare.
// Error: 1 when !() => true => body, // Prefix operand.
// Error: 1 when a + () => true => body, // Infix operand.
// Error: 1 when a = () => true => body, // Assignment value.
// Error: 1 when c ? a : () => true => body, // Conditional operand.
1 when c ? () => true : a => body, // Conditional operand.
_ => body,
};
}
Not a big loss. We're talking an unconditional throw or assignment here. I guess if you wanted to remember the most recent argument in a static variable, you can do
Thanks to dart-lang/language@46411cb I believe this issue is now resolved; the code from the initial bug report is now disallowed by the spec, so it's expected that it won't parse successfully.
This is an excerpt from language/patterns/guard_error_test. This reports