
fix: Improved semantic highlighting performance for huge files #828

Merged · 1 commit · Feb 19, 2025

Conversation

@FalsePattern (Contributor) commented Feb 10, 2025

I encountered a consistent stutter (2-3 second freezes) when working with huge generated files in Zig. After analysing with a profiler, I've pinpointed the source to the LSPSemanticTokensHighlightVisitor.highlightSemanticTokens method, which blocks the EDT while it adds every single HighlightInfo for the entire file to the holder, an expensive operation due to internal checks.
Additionally, for very large files (>50k lines), the ProgressManager.checkCanceled() call inside SemanticTokensData.highlight, along with the HighlightInfo creations, also became a major contributor (>30% of sampler time) to the total freeze duration.

This pull request attempts to resolve these issues using the following steps:

  1. HighlightInfo inside SemanticTokensData.highlight is replaced with a LazyHighlightInfo record, which stores the bare minimum information required to create an actual HighlightInfo on demand.
  2. semanticTokens.highlight inside the visitor is no longer passed holder::add directly; instead, the lazy highlight infos are stored in a lookup array.
  3. This lookup array is iterated over inside the visit() method for each leaf PSI element, which, for languages with existing PSI structures, prevents the IDE from freezing, because the platform internally pumps EDT events after every few hundred PSI elements.
  4. Because the highlight infos are no longer resolved and added inside the SemanticTokensData.highlight method, ProgressManager.checkCanceled() is only called once every 100 data elements, effectively nullifying its overhead while still checking frequently enough that cancellation is not noticeably delayed.
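The four steps above can be sketched as a small, self-contained model in plain Java. The LazyHighlightInfo record, its fields, and the String-returning resolve() here are simplified stand-ins for the PR's actual classes, which build real HighlightInfo objects against the IntelliJ API:

```java
import java.util.ArrayList;
import java.util.List;

public class LazyHighlightSketch {
    // Step 1: store only the bare minimum needed to build a real highlight later.
    // (Simplified stand-in for the PR's LazyHighlightInfo record.)
    public record LazyHighlightInfo(int start, int end, String colorKey) {
        public String resolve() {
            // The real code builds a HighlightInfo on demand here.
            return colorKey + "[" + start + "," + end + ")";
        }
    }

    public static void main(String[] args) {
        int fileLength = 40;
        // Step 2: instead of holder::add, populate a lookup array indexed by offset.
        LazyHighlightInfo[] lazyInfos = new LazyHighlightInfo[fileLength];
        int[][] tokens = {{0, 5}, {10, 14}, {20, 28}};
        for (int i = 0; i < tokens.length; i++) {
            // Step 4: check for cancellation only every 100th element.
            if (i % 100 == 0) { /* ProgressManager.checkCanceled() in the real code */ }
            lazyInfos[tokens[i][0]] =
                    new LazyHighlightInfo(tokens[i][0], tokens[i][1], "KEYWORD");
        }
        // Step 3: visit() iterates the slice covered by each leaf PSI element,
        // resolving lazily; the platform pumps EDT events between elements.
        List<String> resolved = new ArrayList<>();
        for (LazyHighlightInfo info : lazyInfos) {
            if (info != null) resolved.add(info.resolve());
        }
        System.out.println(resolved); // [KEYWORD[0,5), KEYWORD[10,14), KEYWORD[20,28)]
    }
}
```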

@FalsePattern (Contributor, author) commented Feb 10, 2025

The only case where this did not noticeably improve the overall performance is when the entire file is a single LeafElement (plaintext files with no plugin providing a lexer+parser for them).

@FalsePattern (Contributor, author):

Storing data in fields of the HighlightVisitor instance at the start of analyze(), calling action.run(), reading those fields in visit(), and then cleaning them up in a try/finally block seems to be a common pattern in many other highlighters in the IntelliJ IDEA codebase.
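A minimal model of that lifecycle in plain Java; the field and method names are illustrative stand-ins, not the actual HighlightVisitor API:

```java
import java.util.ArrayList;
import java.util.List;

public class VisitorLifecycleSketch {
    // Per-pass state: written in analyze(), read in visit(), cleared in finally.
    private String[] lazyInfos;
    private final List<String> applied = new ArrayList<>();

    public boolean analyze(Runnable action) {
        try {
            lazyInfos = new String[]{"KEYWORD", null, "STRING"}; // "initialize"
            action.run(); // the platform drives visit() calls from inside here
        } finally {
            lazyInfos = null; // cleanup: no stale state leaks between passes
        }
        return true;
    }

    public void visit(int offset) {
        // Reads the state set up in analyze().
        if (lazyInfos != null && lazyInfos[offset] != null) {
            applied.add(lazyInfos[offset]);
        }
    }

    public List<String> applied() { return applied; }

    public static void main(String[] args) {
        VisitorLifecycleSketch v = new VisitorLifecycleSketch();
        v.analyze(() -> { v.visit(0); v.visit(1); v.visit(2); });
        System.out.println(v.applied()); // [KEYWORD, STRING]
    }
}
```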

@FalsePattern (Contributor, author) commented Feb 10, 2025

Memory-wise, with a 2 MB generated Zig file (Vulkan bindings), the allocated lazyInfos array had 2 million elements (8 MB, assuming 4-byte JVM pointers), of which 144k were populated. With a worst-case 16-byte object header, plus 2 ints and an object reference in the record (+12 bytes, 28 total, rounded up to 32 bytes), that is about 4 MB on top.
In total that's about 12 MB for a 2 MB file, or roughly 6x the raw file size while analysis is running, after which it is immediately discarded.
In the absolute worst case with every element populated (every single character having a separate highlight info), it would be 72 MB of heap for a 2 MB file, about 36x the raw file size, but a range of 4-8x is realistic for most languages.
This should not cause any problems: IntelliJ applies syntax highlighting to files one by one, and the highlighter cleans up the allocated arrays in a try/finally block, so a leak can never happen.
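The arithmetic above can be restated as a runnable check. The element counts are the ones quoted in this comment, the 32-byte per-record figure is the worst-case estimate described, and megabytes are decimal:

```java
public class MemoryEstimate {
    public static void main(String[] args) {
        long slots = 2_000_000L;       // lazyInfos array length for the 2 MB file
        long pointerBytes = 4;         // compressed JVM pointers
        long arrayMB = slots * pointerBytes / 1_000_000;      // 8 MB for the array
        long populated = 144_000L;
        long recordBytes = 32;         // 16 B header + 2 ints + 1 ref, rounded up
        long recordsMB = populated * recordBytes / 1_000_000; // ~4 MB of records
        System.out.println("array: " + arrayMB + " MB, records: ~" + recordsMB
                + " MB, total: ~" + (arrayMB + recordsMB) + " MB");
        // Absolute worst case: every slot holds its own record.
        long worstMB = (slots * pointerBytes + slots * recordBytes) / 1_000_000;
        System.out.println("worst case: " + worstMB + " MB ("
                + worstMB / 2 + "x a 2 MB file)");
    }
}
```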

@angelozerr (Contributor):

Thanks so much @FalsePattern for your contribution.

I would like to create the 0.10.0 release tomorrow; can we wait for 0.11.0 to merge your PR?

@angelozerr (Contributor):

The only case where this did not noticeably improve the overall performance is when the entire file is a single LeafElement (plaintext files with no plugin providing a lexer+parser for them).

I think we should keep today's behavior for TextMate + the TEXT language. You can use SimpleLanguageUtils.isSupported(language) to manage that.

@FalsePattern (Contributor, author):

Thanks so much @FalsePattern for your contribution.

I would like to create the 0.10.0 release tomorrow; can we wait for 0.11.0 to merge your PR?

Yep, this is not urgent, it can wait until 0.11.0

@FalsePattern force-pushed the main branch 2 times, most recently from 7e413f9 to cdc1e80 on February 10, 2025 20:47
@FalsePattern (Contributor, author) commented Feb 10, 2025

I think we should keep today's behavior for TextMate + the TEXT language. You can use SimpleLanguageUtils.isSupported(language) to manage that.

Done, the latest push has an extra check so that simple languages skip the lazy-array-based highlighting and instead push highlight infos directly to holder.add like before, and visit() returns instantly.
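A sketch of what that dispatch might look like, with a stub standing in for SimpleLanguageUtils.isSupported; only that method name comes from the discussion, everything else here is illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class SimpleLanguageDispatch {
    // Stand-in for SimpleLanguageUtils.isSupported(language).
    static boolean isSimpleLanguage(String language) {
        return language.equals("TEXT") || language.equals("textmate");
    }

    static void highlight(String language, Consumer<String> holderAdd,
                          List<String> lazyInfos) {
        if (isSimpleLanguage(language)) {
            // Single giant leaf: the lazy lookup buys nothing,
            // so push highlight infos straight to the holder like before.
            holderAdd.accept("info");
        } else {
            // Real PSI tree: store lazily; visit() resolves per leaf element.
            lazyInfos.add("lazy-info");
        }
    }

    public static void main(String[] args) {
        List<String> direct = new ArrayList<>();
        List<String> lazy = new ArrayList<>();
        highlight("TEXT", direct::add, lazy);
        highlight("Zig", direct::add, lazy);
        System.out.println(direct.size() + " direct, " + lazy.size() + " lazy");
    }
}
```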

@angelozerr (Contributor):

@FalsePattern I wonder if it would be possible to add your Zig PsiFile to the LSP4IJ tests and write tests with it? Do you think it could be doable?

@FalsePattern (Contributor, author) commented Feb 11, 2025

@FalsePattern I wonder if it would be possible to add your Zig PsiFile to the LSP4IJ tests and write tests with it? Do you think it could be doable?

How do we do that without also pulling in the full parser from ZigBrains as a dependency?

@FalsePattern (Contributor, author) commented Feb 11, 2025

Also, there is another major bottleneck I found: the LSPInlayHintsProvider.doCollect method blocks inside a read action, and for that large file the LSP itself takes about half a second to complete, which also contributes to the stutter:
[profiler screenshot]
This is a much easier fix, though: just make the inlay hint future itself "pending" unconditionally if it's not finished, and rely on the caching behaviour of LSPInlayHintsSupport to provide the same future once refreshEditorFeatureWhenAllDone triggers another inlay hint pass. This change, combined with this pull request, completely fixed the large-file editing stutter for me.
[profiler screenshot]
I didn't add this change to the PR because it's a very dirty workaround.

Maybe we could add a boolean waitUntilDoneOrTimeout(future, psiFile, int timeoutMillis, List<CompletableFuture<?>> pendingFutures) method to CompletableFutures so that inlay hints that complete fast (<50ms) get resolved synchronously, but ones that take longer take the async path?
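A rough sketch of the proposed helper using only java.util.concurrent. The name and rough signature come from the suggestion above; the psiFile parameter is dropped because it is IntelliJ-specific, and this is not the actual LSP4IJ CompletableFutures API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CompletableFuturesSketch {
    /**
     * Waits up to timeoutMillis for the future. If it completes in time,
     * returns true (synchronous fast path). Otherwise the future is added
     * to pendingFutures so the caller can refresh when it completes later.
     */
    public static boolean waitUntilDoneOrTimeout(CompletableFuture<?> future,
                                                 int timeoutMillis,
                                                 List<CompletableFuture<?>> pendingFutures) {
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            pendingFutures.add(future); // slow hint: take the async path
            return false;
        } catch (InterruptedException | ExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<CompletableFuture<?>> pending = new ArrayList<>();
        // Fast future: completes immediately, handled synchronously.
        System.out.println(waitUntilDoneOrTimeout(
                CompletableFuture.completedFuture("hints"), 50, pending)); // true
        // Slow future: never completes here, so it goes on the pending list.
        System.out.println(waitUntilDoneOrTimeout(
                new CompletableFuture<>(), 50, pending)); // false
        System.out.println("pending: " + pending.size()); // pending: 1
    }
}
```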

@angelozerr (Contributor):

Also, there is another major bottleneck I found: the LSPInlayHintsProvider.doCollect method blocks inside a read action, and for that large file the LSP itself takes about half a second to complete, which also contributes to the stutter:
[profiler screenshot]
This is a much easier fix, though: just make the inlay hint future itself "pending" unconditionally if it's not finished, and rely on the caching behaviour of LSPInlayHintsSupport to provide the same future once refreshEditorFeatureWhenAllDone triggers another inlay hint pass. This change, combined with this pull request, completely fixed the large-file editing stutter for me.
[profiler screenshot]
I didn't add this change to the PR because it's a very dirty workaround.

Maybe we could add a boolean waitUntilDoneOrTimeout(future, psiFile, int timeoutMillis, List<CompletableFuture<?>> pendingFutures) method to CompletableFutures so that inlay hints that complete fast (<50ms) get resolved synchronously, but ones that take longer take the async path?

Indeed, I have tried to avoid using timeouts. Please create a new issue for the inlay hint topic.

@angelozerr (Contributor):

@FalsePattern I wonder if it would be possible to add your Zig PsiFile to the LSP4IJ tests and write tests with it? Do you think it could be doable?

How do we do that without also pulling in the full parser from ZigBrains as a dependency?

No, my idea is just to copy/paste your parser inside lsp4ij.

We have no custom PsiFile in our tests, and the plugin and language server that you use seem very advanced.

We could also write other tests, like completion, based on your copied Zig PsiFile.

What do you think about that?

@FalsePattern (Contributor, author) commented Feb 11, 2025

Indeed, I have tried to avoid using timeouts. Please create a new issue for the inlay hint topic.

#835

No, my idea is just to copy/paste your parser inside lsp4ij.

We have no custom PsiFile in our tests, and the plugin and language server that you use seem very advanced.

We could also write other tests, like completion, based on your copied Zig PsiFile.

What do you think about that?

I'm fine with that. I'll dual-license the PSI lexer+parser code under EPLv2, and that way it can be used in lsp4ij freely.

The "large Zig file" I've been performance testing on is just an autogenerated Vulkan bindings file, generated from the Vulkan registry (MIT / Apache 2.0 licensed) using vulkan-zig (MIT licensed), so it should be fine to include that file as a whole, and I can provide JSON dumps of the LSP message traces from the project I use that file in for the test cases.

@angelozerr (Contributor):

Indeed, I have tried to avoid using timeouts. Please create a new issue for the inlay hint topic.

#835

No, my idea is just to copy/paste your parser inside lsp4ij.
We have no custom PsiFile in our tests, and the plugin and language server that you use seem very advanced.
We could also write other tests, like completion, based on your copied Zig PsiFile.
What do you think about that?

I'm fine with that. I'll dual-license the PSI lexer+parser code under EPLv2, and that way it can be used in lsp4ij freely.

That's super news! After that, we could add other tests (completion, CodeLens, etc.) with a real custom PsiFile.

The "large Zig file" I've been performance testing on is just an autogenerated Vulkan bindings file, generated from the Vulkan registry (MIT / Apache 2.0 licensed) using vulkan-zig (MIT licensed), so it should be fine to include that file as a whole, and I can provide JSON dumps of the LSP message traces from the project I use that file in for the test cases.

Great!

@angelozerr (Contributor):

@ericdallo could you please test this PR with your plugin?

@ericdallo (Contributor):

Will do

@ericdallo (Contributor):

@angelozerr I tested it and didn't notice any problems so far.

@angelozerr (Contributor):

Thanks! And do you see a performance improvement when your PsiFile is big?

@ericdallo (Contributor):

I didn't notice perf issues, but Clojure files are mostly small; it's rare to have Clojure files bigger than 2k lines.

@angelozerr (Contributor):

@ericdallo thanks for your feedback!

@CppCXY could you please test this PR, since I know you have a custom PsiFile, and give us feedback (whether you see any problems, and whether it improves performance with large files)? Thanks!

@CppCXY (Contributor) commented Feb 14, 2025

This is a version that enables the plugin with its custom PSI and custom rendering together with lsp4ij:
(performance1 recording)
This one disables the plugin, enabling only lsp4ij:
(performance2 recording)

It can be seen that the rendering is still slow. I think we might keep the previous rendering result until the new LSP result is returned and rendering is complete.

@CppCXY (Contributor) commented Feb 14, 2025

As a comparison, this is the performance of my language server in VS Code. As you can see, rendering is extremely fast, without the prolonged period of colorlessness seen in IntelliJ.
(performance3 recording)

@angelozerr (Contributor):

Thanks @CppCXY for your feedback. I think this PR avoids blocking the EDT with large files, but it doesn't improve the speed of the renderer.

Do you see any blocking issue without this PR?

@angelozerr (Contributor):

As a comparison, this is the performance of my language server in VS Code. As you can see, rendering is extremely fast, without the prolonged period of colorlessness seen in IntelliJ.

I wonder if in VS Code you consume SemanticTokensRange? If yes, that could explain the performance issue in IJ, because LSP4IJ doesn't support it (it supports only SemanticTokensFull).

@CppCXY (Contributor) commented Feb 14, 2025

I wonder if in VS Code you consume SemanticTokensRange? If yes, that could explain the performance issue in IJ, because LSP4IJ doesn't support it (it supports only SemanticTokensFull).

No, I don't implement SemanticTokensRange.

@CppCXY (Contributor) commented Feb 14, 2025

I wonder if in VS Code you consume SemanticTokensRange? If yes, that could explain the performance issue in IJ, because LSP4IJ doesn't support it (it supports only SemanticTokensFull).

No, I don't implement SemanticTokensRange.

But I remember that in VS Code, after editing code, it doesn't immediately send semanticTokens/full (there might be some debouncing), and newly entered characters first inherit the color of the character to their left.

@angelozerr (Contributor):

I wonder if in VS Code you consume SemanticTokensRange? If yes, that could explain the performance issue in IJ, because LSP4IJ doesn't support it (it supports only SemanticTokensFull).

No, I don't implement SemanticTokensRange.

But I remember that in VS Code, after editing code, it doesn't immediately send semanticTokens/full (there might be some debouncing), and newly entered characters first inherit the color of the character to their left.

OK, I think that is another issue (please create it). This PR seems to avoid freezing the IDE. Do you see that problem without this PR with large files?

@CppCXY (Contributor) commented Feb 14, 2025

I wonder if in VS Code you consume SemanticTokensRange? If yes, that could explain the performance issue in IJ, because LSP4IJ doesn't support it (it supports only SemanticTokensFull).

No, I don't implement SemanticTokensRange.

But I remember that in VS Code, after editing code, it doesn't immediately send semanticTokens/full (there might be some debouncing), and newly entered characters first inherit the color of the character to their left.

OK, I think that is another issue (please create it). This PR seems to avoid freezing the IDE. Do you see that problem without this PR with large files?

Regardless of whether this patch is applied, I don't see any difference.

@CppCXY (Contributor) commented Feb 14, 2025

I observed a lag: when I continuously hold down the Enter key, the IDE freezes for a long time, regardless of the version.
(performance1 recording)

@FalsePattern (Contributor, author):

I observed a lag: when I continuously hold down the Enter key, the IDE freezes for a long time, regardless of the version.

Disable inlay hints while you're testing this PR; they're also a lag source. I have a separate PR for those.

this.lazyInfos = highlightSemanticTokens(file, null);
this.holder = holder;
}
action.run();
Review comment (Contributor):

Is there any reason why action.run() is called at the end, although before it was called first?


public static HighlightInfo resolve(int start, int end, TextAttributesKey colorKey) {
return HighlightInfo
.newHighlightInfo(RAINBOW_ELEMENT)
Review comment (Contributor):

Why RAINBOW_ELEMENT?

Reply (Contributor):

Oh sorry, it was like this.

@angelozerr (Contributor):

Additionally, for very large files (>50k lines), the ProgressManager.checkCanceled() call inside SemanticTokensData.highlight, along with the HighlightInfo creations, also became a major contributor (>30% of sampler time) to the total freeze duration.

You mean that just calling ProgressManager.checkCanceled() can be expensive?

@angelozerr merged commit 8974e19 into redhat-developer:main on Feb 19, 2025
6 checks passed
@angelozerr (Contributor):

Great improvement. Thanks @FalsePattern !

@FalsePattern (Contributor, author) commented Feb 20, 2025

You mean that just calling ProgressManager.checkCanceled() can be expensive?

It's because it was called for every single integer in the semantic highlighting payload, which for that file was approximately 600k times, and at such high call counts even low-overhead function calls start to add up. The per-100-element check reduces that by two orders of magnitude while still checking frequently enough that a triggered cancel does not cause a noticeable stall.
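The per-100 throttling is just a modulo guard. A tiny model of the effect (checkCanceled() here is a counting stub, not the real ProgressManager):

```java
public class ThrottledCancelCheck {
    public static int checks = 0;

    // Stand-in for ProgressManager.checkCanceled(); just counts invocations.
    static void checkCanceled() { checks++; }

    public static void main(String[] args) {
        int tokenInts = 600_000; // roughly the payload size quoted for the 2 MB file
        for (int i = 0; i < tokenInts; i++) {
            if (i % 100 == 0) checkCanceled(); // throttled to every 100th element
        }
        // 600,000 candidate calls become 6,000: two orders of magnitude fewer,
        // yet a cancel is still noticed within ~100 elements of work.
        System.out.println(checks + " checks for " + tokenInts + " elements");
    }
}
```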
