JSON example fails to lex unicode escapes like "unicode\u2028escape"
#458
Comments
"unicode\u2028escape"
Hi @kdy1, thanks for reporting this bug! Indeed, the regex used in the example is not perfect, and was observed to fail in some rare cases, which I excluded for simplicity. I am not sure whether it is possible to express something that matches all possible JSON strings as a regex, as they can include very complex patterns, like escapes. If you are aware of such a regex, please let me know. Otherwise, using a callback is probably the right way.
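To illustrate the callback route mentioned above, here is a minimal sketch of the scanning logic such a callback could run by hand. It is not the code from the logos example: the function name `scan_json_string` is hypothetical, it uses only the standard library, and wiring it into a logos callback is left out. It accepts the escapes allowed by the JSON grammar (`\"`, `\\`, `\/`, `\b`, `\f`, `\n`, `\r`, `\t`, and `\u` followed by four hex digits) and rejects everything else.

```rust
/// Returns the byte length of the JSON string literal at the start of
/// `input` (both quotes included), or `None` if `input` does not start
/// with a well-formed string literal.
/// Hypothetical helper for illustration; not part of logos.
fn scan_json_string(input: &str) -> Option<usize> {
    let bytes = input.as_bytes();
    if bytes.first() != Some(&b'"') {
        return None;
    }
    let mut i = 1;
    while i < bytes.len() {
        match bytes[i] {
            // Closing quote: the literal ends here.
            b'"' => return Some(i + 1),
            // Backslash: validate the escape that follows.
            b'\\' => match *bytes.get(i + 1)? {
                b'"' | b'\\' | b'/' | b'b' | b'f' | b'n' | b'r' | b't' => i += 2,
                b'u' => {
                    // `\u` must be followed by exactly four hex digits.
                    let hex = bytes.get(i + 2..i + 6)?;
                    if !hex.iter().all(|b| b.is_ascii_hexdigit()) {
                        return None;
                    }
                    i += 6;
                }
                _ => return None,
            },
            // Raw control characters are not allowed inside JSON strings.
            0x00..=0x1F => return None,
            // Any other byte, including UTF-8 continuation bytes, is content.
            _ => i += 1,
        }
    }
    // Ran out of input before finding the closing quote.
    None
}
```

Scanning bytes rather than chars is safe here because the quote and backslash bytes never occur inside a multi-byte UTF-8 sequence. Calling `scan_json_string` on the failing input `"unicode\u2028escape"` (with the escape written out as six literal characters) returns the length of the whole literal.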
I worked around it by creating another logos lexer and using it within the callback.
#[derive(Logos, Debug, Clone, Copy, PartialEq, Eq)]
enum StrContent {
    #[regex(r#"\\["'\\bfnrtv]"#, priority = 100)]
    #[regex(r#"\\0[0-7]*"#, priority = 100)]
    #[regex(r#"\\x[0-9a-fA-F]{2}"#, priority = 100)]
    #[regex(r#"\\u[0-9a-fA-F]{4}"#, priority = 100)]
    #[regex(r#"\\[^'"\\]+"#)]
    Escape,
    #[regex(r#"[^'"\\]+"#)]
    Normal,
    #[regex(r#"'"#)]
    SingleQuote,
    #[regex(r#"""#)]
    DoubleQuote,
}
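Once an escape has been recognized as a token, its text still has to be turned into the character it stands for. The following is a minimal sketch of that step, assuming the token slice is passed in as a `&str`. The name `decode_escape` is hypothetical, and the sketch covers only the JSON subset of escapes, not the `\'`, `\v`, `\x` and octal forms matched by the enum above.

```rust
/// Decodes a single JSON escape sequence such as `\n` or `\u2028`
/// into the character it denotes.
/// Hypothetical helper for illustration; not part of logos.
fn decode_escape(escape: &str) -> Option<char> {
    // Every escape starts with a backslash.
    let rest = escape.strip_prefix('\\')?;
    match rest {
        "\"" => Some('"'),
        "\\" => Some('\\'),
        "/" => Some('/'),
        "b" => Some('\u{0008}'),
        "f" => Some('\u{000C}'),
        "n" => Some('\n'),
        "r" => Some('\r'),
        "t" => Some('\t'),
        _ => {
            // Otherwise expect `u` followed by exactly four hex digits.
            let hex = rest.strip_prefix('u')?;
            if hex.len() != 4 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
                return None;
            }
            let code = u32::from_str_radix(hex, 16).ok()?;
            // `char::from_u32` returns `None` for lone surrogates
            // (U+D800..=U+DFFF); combining surrogate pairs is out of
            // scope for this sketch.
            char::from_u32(code)
        }
    }
}
```

With this sketch, the six-character slice `\u2028` decodes to the single character U+2028 (LINE SEPARATOR), which is the escape from the issue title.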
Nice! Do you have a complete example to share?
I have one, but as it's still on the fork branch and still WIP, I'll post a comment with a link to the main branch after finishing the PR.
Thanks!
The problem in the example seems to be the lack of a regex group for a single escape sequence. By the way, pay attention to the escape sequence
@pamburus The JSON example
@jeertmans
Nice @pamburus! If I remember correctly, my example failed to parse this: https://github.com/json-iterator/test-data/blob/master/large-file.json. If yours succeeds (or passes other examples that mine fails), please create a PR to change the regex ;-)
@jeertmans
While working on swc-project/swc#9807, I found that logos fails to lex some string literals. After some debugging, I found that the official example fails to lex Unicode escapes, even though it has
String
defined as

test.json:

cargo run --example json-borrowed test.json
The graph
cat test.json | jq
works without any issue.
Is there a working syntax for escapes in string literals?