Semicolon handling at top level is not quite right #155

Open
DavisVaughan opened this issue Nov 19, 2024 · 1 comment

DavisVaughan commented Nov 19, 2024

I think that at top level semicolons have to be preceded by an expression?

Interestingly, inside a `{`, any number of semicolons is blindly consumed, even with no expression in front of them.

It's possible that semicolons are only valid:

  • At top level, when preceded by an expression
  • Directly inside a `{` scope (but not nested deeper within it, e.g. inside a `[` that sits in the braces)

I also wonder if we should emit semicolons in the parse tree as real tokens. Otherwise they just look like whitespace to downstream consumers.
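
For comparison, here is a minimal sketch of how base R's own parse data treats semicolons. It only uses `parse()` and `utils::getParseData()`; the variable names are illustrative, and it says nothing about how the tree-sitter grammar should model this.

```r
# Parse two expressions separated by a semicolon, keeping source references
# so that parse data is available.
exprs <- parse(text = "1; 2", keep.source = TRUE)
pd <- utils::getParseData(exprs)

# Keep only terminal tokens (the leaves of the parse tree).
terminals <- pd[pd$terminal, c("token", "text")]
print(terminals, row.names = FALSE)
```

In base R's parse data the `;` shows up as a terminal of its own, so consumers of that data can tell it apart from whitespace.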

# Top level `;` is a parse error when it's all alone, but we happily consume it
parse(text = ";")
#> Error in parse(text = ";"): <text>:1:1: unexpected ';'
#> 1: ;
#>     ^
treesitter::text_parse(";", treesitter.r::language())
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> ;
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 1)])

# We eat them all, when we should error
treesitter::text_parse(";;;", treesitter.r::language())
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> ;;;
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 3)])

# This is fine at top level
parse(text = "1;")
#> expression(1)
treesitter::text_parse("1;", treesitter.r::language())
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> 1;
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 2)]
#>   (float [(0, 0), (0, 1)])
#> )

# Interestingly this works (and we parse that fine)
parse(text = "{ ; }")
#> expression({ ; })
parse(text = "{ ; ; }")
#> expression({ ; ; })
treesitter::text_parse("{ ; }", treesitter.r::language())
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> { ; }
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 5)]
#>   (braced_expression [(0, 0), (0, 5)]
#>     open: "{" [(0, 0), (0, 1)]
#>     close: "}" [(0, 4), (0, 5)]
#>   )
#> )
treesitter::text_parse("{ ; ; }", treesitter.r::language())
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> { ; ; }
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 7)]
#>   (braced_expression [(0, 0), (0, 7)]
#>     open: "{" [(0, 0), (0, 1)]
#>     close: "}" [(0, 6), (0, 7)]
#>   )
#> )

# Note that this doesn't work
parse(text = "x[;]")
#> Error in parse(text = "x[;]"): <text>:1:3: unexpected ';'
#> 1: x[;
#>       ^

# Nor does this, so it's not like newlines, which are consumed
# recursively within a `(` / `[` / `[[` scope
parse(text = "{ x[;] }")
#> Error in parse(text = "{ x[;] }"): <text>:1:5: unexpected ';'
#> 1: { x[;
#>         ^
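
The base R side of the cases above can be condensed into one check. This is a rough sketch: `parses_ok()` is a hypothetical helper written for this illustration, not part of any package, and it only reports whether `parse()` raises an error.

```r
# Hypothetical helper: TRUE if base R's parser accepts `text`,
# FALSE if parsing raises an error.
parses_ok <- function(text) {
  tryCatch(
    {
      parse(text = text)
      TRUE
    },
    error = function(e) FALSE
  )
}

# The inputs discussed in this issue.
cases <- c(";", ";;;", "1;", "{ ; }", "{ ; ; }", "x[;]", "{ x[;] }")

# Named logical vector: which inputs does base R accept?
accepted <- vapply(cases, parses_ok, logical(1))
print(accepted)
```

Only `1;`, `{ ; }`, and `{ ; ; }` should come back `TRUE`, which matches the two rules proposed at the top.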

DavisVaughan commented Nov 19, 2024

Notably, you can't do `_;` either. That hits a special pipe placeholder parse error:
https://github.com/wch/r-source/blob/988774e05497bcf2cfac47bfbec59d551432e3fb/src/main/gram.y#L1755
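
A small sketch for seeing that error locally, using base R only. The exact wording of the message depends on the R version, so it is captured and printed rather than assumed.

```r
# Capture the parse error message for `_;` instead of letting it abort.
# If parsing unexpectedly succeeded, `msg` would be NA.
msg <- tryCatch(
  {
    parse(text = "_;")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```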
