Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace ANTLR with hand-rolled parser #917

Merged
merged 69 commits into from
Feb 12, 2025

Conversation

stackoverflow
Copy link
Contributor

@stackoverflow stackoverflow commented Jan 29, 2025

Fixes #906
Fixes #888
Fixes #927
Fixes #931
Fixes #932

Incompatible changes:

  • Object entries in the same line now require a semicolon (properties and elements are unaffected):
// not valid anymore
foo { ["1"] = 1 ["2"] = 2 }

// requires ;
foo { ["1"] = 1; ["2"] = 2 }

Reason: not requiring semicolons or newlines requires too much backtracking in the parser, degrading performance. Also both our IntelliJ plugin and the LSP don't allow this syntax and show an error.

Our current ANTLR parser is quite slow, and, depending on the codebase, can be the slowest part of running Pkl.

Some results from parsing all snippet tests in pkl-core showing ~100x performance improvement:
ANTLR elapsed: 7122ms (7.1s)
New parser elapsed: 78.73ms

This is still a draft, many tests are still failing and repl parsing is still using the old parser.


@Override
public ExpressionNode visitAmendsExpr(Amends expr) {
// parentExpr is always New, Amends or Parenthesized. The parser makes sure of it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this comment, but it would be useful to link through to the method where this behaviour is expected in the parser.

@sgammon sgammon mentioned this pull request Feb 3, 2025
@sgammon

This comment was marked as resolved.

Copy link
Contributor

@bioball bioball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did another pass!

Comment on lines 487 to 493
public Span getStartDelimiterSpan() {
return startDelimiterSpan;
}

public Span getEndDelimiterSpan() {
return endDelimiterSpan;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are ununsed. Remove?

Alternatively, maybe add @SuppressWarnings("unused")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning is gone for me after the refactor.


public Span getStartDelimiterSpan() {
return startDelimiterSpan;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unused. Remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

import org.pkl.core.util.Nullable;

public final class ObjectBody implements Node {
private final List<Parameter> parameters;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be consistent and either get rid of org.pkl.core.parser.cst.ParameterList, or make this also a ParameterList.

If we choose to get rid of ParameterList, we should also get rid of TypeParameterList, and ArgumentList.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The *List classes keep around the span of their delimiters which we use for error reporting.
This cannot be a ParameterList because this is not surrounded by parentheses.

import org.pkl.core.parser.antlr.PklParser

@Execution(ExecutionMode.CONCURRENT)
interface ParserComparisonTestInterface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create an interface? These methods can just go directly into ParserComparisonTest.

interface ParserComparisonTestInterface {
@Test
@Execution(ExecutionMode.CONCURRENT)
fun compareSnippetTests() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this would be a good candidate for @ParameterizedTest.

But, this is temporary code anyways, so, it's not a big deal.

Copy link
Contributor

@bioball bioball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: we should also remove "Grammar Definition" from the language spec, because after this change, ANTLR is no longer the source of truth. But, we also don't really have another spec to point people at yet.

Copy link
Contributor

@bioball bioball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work!! This is going to be a massive improvement for users.

@stackoverflow stackoverflow merged commit b526902 into apple:main Feb 12, 2025
4 checks passed
@stackoverflow stackoverflow deleted the new-parser branch February 12, 2025 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants