Ensure github-linguist compatibility for the syntaxes #1659
Could you take a look at this? @mnxn
@mnxn, there is a slight (non-blocking) issue with the linguist PR that I'd love to hear your thoughts on: github-linguist/linguist#7126 (comment). could we have a
Edit: See this comment for how we can do it; it'll be zero maintenance.
f97f4cc to 675eae7
675eae7 to 1c979bf
@@ -279,7 +279,7 @@
   },
   {
     "comment": "destructured semantic value capture",
-    "begin": "(?<![[:word:]][[:space:]]*)\\(",
+    "begin": "(?<!\\w)\\(",
Not sure if this is a correct fix.
It resolves the following error, which is caused by `[[:space:]]*`:
- Invalid regex in grammar: `source.ocaml.menhir` (in `syntaxes/menhir.json`) contains a malformed regex (regex "`(?<![[:word:]][[:space:]]*)\(`": lookbehind assertion is not fixed length (at offset 26))
Previously, the regex would match neither `a(b, c) = x` nor `a (b, c) = x`. But now the second string matches.
I don't know what the ideal solution is here, but at the very least, neither of the strings above should match.
Also, the regex should stick with `[[:word:]]` instead of `\\w`.
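The difference between the two patterns can be checked outside the grammar. Below is a minimal Python sketch, assuming that Python's `\w` and `\s` are close enough to Oniguruma's `[[:word:]]` and `[[:space:]]` for ASCII input. Python's `re` also requires fixed-width lookbehind, so the original pattern is emulated by a hypothetical helper, `original_matches`, that inspects the text before each parenthesis.

```python
import re

# The replacement in this PR: only the character directly before "(" is checked.
PR_PATTERN = re.compile(r"(?<!\w)\(")


def original_matches(text: str) -> bool:
    # Hypothetical helper emulating the original variable-length lookbehind:
    # a "(" matches unless it is preceded by a word character followed by
    # zero or more whitespace characters.
    for m in re.finditer(r"\(", text):
        if not re.search(r"\w\s*\Z", text[: m.start()]):
            return True
    return False


for sample in ["a(b, c) = x", "a (b, c) = x"]:
    print(repr(sample), original_matches(sample), bool(PR_PATTERN.search(sample)))
```

Under these assumptions, the emulated original rejects both strings, while the replacement accepts `a (b, c) = x`.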
It seems like it accepts `(?<![[:word:]])(?<![[:space:]])\\(`. Can we do this instead? Rubular parses it fine, but I can't seem to build the .vsix to try it out.
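A quick way to see what this pattern accepts is to run an equivalent in Python. This is only a sketch: `\w` and `\s` stand in for `[[:word:]]` and `[[:space:]]`, and the third sample string is an illustrative input, not taken from a real grammar file.

```python
import re

# Two independent lookbehinds: each inspects only the single character before "(".
PROPOSED = re.compile(r"(?<!\w)(?<!\s)\(")

samples = ["a(b, c) = x", "a (b, c) = x", "| (b, c) = x"]
for s in samples:
    print(repr(s), bool(PROPOSED.search(s)))
```

All three samples are rejected here. A parenthesis preceded by whitespace never matches, regardless of what comes before the whitespace, which differs from the original pattern, where only a word character before the whitespace blocks the match.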
Hi @mnxn, sorry for the ping, but is there any way we can sort this out? This is the only remaining blocker AFAIK.
It doesn't look like that regex works. I think it might be fundamentally impossible to achieve the same behavior while staying fixed length.
I'm okay with relaxing the constraint a little bit and recognizing only 0 or 1 spaces like the examples I posted above. This regex should accomplish that: `(?<![[:word:]]|[[:word:]][[:space:]])\(`
BTW: you can try the extension by opening the repo in VS Code and pressing the run button. No need to make a .vsix or even build the JS if you're just testing the syntaxes.
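The relaxed behavior can be sketched in Python as well, under the same assumption that `\w` and `\s` approximate the POSIX classes. Python's `re` does not accept alternatives of different widths inside one lookbehind, so the alternation is split into two lookbehinds; "preceded by neither A nor B" is logically the same condition.

```python
import re

# Equivalent of (?<!W|WS)\( written as two fixed-width lookbehinds,
# where W is a word character and S is a whitespace character.
RELAXED = re.compile(r"(?<!\w)(?<!\w\s)\(")

samples = [
    "a(b, c) = x",    # word character directly before "("
    "a (b, c) = x",   # word character, then one space
    "a  (b, c) = x",  # word character, then two spaces
    "(b, c) = x",     # "(" at the start of the text
]
for s in samples:
    print(repr(s), bool(RELAXED.search(s)))
```

The first two samples are rejected and the last two are accepted, so the two-space case is where this version diverges from the original variable-length pattern.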
Copilot wasn't able to review any files in this pull request.
Files not reviewed (9)
- package.json: Language not supported
- syntaxes/atd.json: Language not supported
- syntaxes/cram.json: Language not supported
- syntaxes/dune-all.json: Language not supported
- syntaxes/dune.json: Language not supported
- syntaxes/menhir.json: Language not supported
- syntaxes/ocaml.json: Language not supported
- syntaxes/ocamlbuild.json: Language not supported
- syntaxes/ocamllex.json: Language not supported
Bit of a problem now that I've gotten around to testing it ;c. Priority depends on the order of inclusion: some stanzas do work, but only the first one included (e.g. dune) is highlighted correctly; the rest are highlighted only partially. I could probably differentiate dune-project and dune by looking for `(lang ...)` at the top of the file, but I'm not sure about dune-workspace. That would be a less painful way to address it, although not the best. cc @mnxn
Is there no way for linguist to use the filenames to determine which grammar is used?
That's what I tried initially; unfortunately, each language can only use one grammar.
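The `(lang ...)` check mentioned above could look roughly like the following Python sketch. The function name `looks_like_dune_project` and the exact pattern are hypothetical, not linguist's actual heuristic format, and the sketch assumes the stanza is the first thing in the file (no leading comments).

```python
import re

# Hypothetical heuristic: the file content starts with a (lang ...) stanza,
# optionally preceded by whitespace.
LANG_STANZA = re.compile(r"\A\s*\(lang\s")


def looks_like_dune_project(text: str) -> bool:
    return LANG_STANZA.search(text) is not None


print(looks_like_dune_project("(lang dune 3.0)\n(name foo)\n"))
print(looks_like_dune_project("(library\n (name foo))\n"))
```

By itself this would not tell dune-project apart from any other file that also opens with `(lang ...)`, which is consistent with the uncertainty about dune-workspace.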
blocking github-linguist/linguist#7126
P.S. Ensured formatting using biome.