fix: Improved semantic highlighting performance for huge files #828
The only case where this did not noticeably improve the overall performance is when the entire file is a single |
The approach of storing data in the fields of the |
Memory-usage wise, with a 2MB generated zig file (Vulkan bindings), the allocated |
Thanks so much @FalsePattern for your contribution. I would like to create the 0.10.0 release tomorrow; can we wait for 0.11.0 to merge your PR?
I think we should keep today's behavior for TextMate and TEXT languages. You can use SimpleLanguageUtils.isSupported(language) to manage that.
Yep, this is not urgent, it can wait until 0.11.0.
Force-pushed: 7e413f9 to cdc1e80
Done, the latest push has an extra check so that simple languages skip the lazy array lookup-based highlighting and instead push the highlight infos directly to holder.add like before, and visit() returns instantly.
@FalsePattern I wonder if it is possible to add your Zig PsiFile in the LSP4IJ test and write tests with your Zig PsiFile? Do you think it could be doable? |
How do we do that without also pulling in the full parser from ZigBrains as a dependency? |
No, my idea is just to copy and paste your parser into lsp4ij. We have no custom PsiFile in our tests, and your plugin and the language server that you use seem very advanced. We could also write other tests, such as completion, based on your copied Zig PsiFile. What do you think about that?
I'm fine with that, I'll dual-license the psi lexer+parser code under EPLv2 and that way it can be used in lsp4ij freely. The "large zig file" i've been performance testing on is just an autogenerated vulkan bindings file generated from the vulkan registry (MIT / Apache 2.0 license) using vulkan-zig (MIT license) so it should be fine to include that file as a whole, and I can provide json dumps of the LSP message traces from the project i use that file in for the test cases. |
That is great news! After that, we could add other tests for completion, codelens, etc. with a real custom PsiFile.
Great!
@ericdallo could you please test this PR with your plugin?
Will do |
@angelozerr I tested and didn't notice any problems so far |
Thanks, and do you see a performance improvement when your PsiFile is big?
I didn't notice perf issues, but mostly clojure files are small, it's rare to have clojure files bigger than 2k lines |
@ericdallo thanks for your feedback! @CppCXY could you please test this PR, since I know you have a custom PsiFile, and give us feedback (whether you see any problems, and whether it improves performance with large files)? Thanks!
Thanks @CppCXY for your feedback. I think this PR avoids blocking the EDT with large files, but doesn't improve the speed of the renderer. Do you see any blocking issue without this PR?
No, I don't implement SemanticTokensRange |
But I remember that in VS Code, after editing code it doesn't immediately send semanticTokenFull—there might be some debouncing—and the newly entered characters will first inherit the color of the character to the left. |
OK, I think that is a separate issue (please create one). This PR seems to avoid freezing the IDE. Do you see this problem with large files without this PR?
Regardless of whether this patch is applied, I don't see any difference |
Disable inlay hints while you're testing this PR, they're also a lag source, I have a separate PR for those. |
this.lazyInfos = highlightSemanticTokens(file, null);
this.holder = holder;
}
action.run();
Is there any reason why action.run() is now called at the end, when before it was called first?
public static HighlightInfo resolve(int start, int end, TextAttributesKey colorKey) {
return HighlightInfo
.newHighlightInfo(RAINBOW_ELEMENT)
Why RAINBOW_ELEMENT?
Oh sorry it was like this.
You mean that just calling |
Great improvement. Thanks @FalsePattern ! |
It's because it was called for every single integer in the semantic highlighting payload, which for that file was approximately 600k times, and at such high call counts even low-overhead function calls start to add up. The per-100-element check reduces the call count by two orders of magnitude while still being more than frequent enough not to cause a noticeable stall when a cancel is triggered.
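To make the arithmetic concrete, here is a minimal sketch of a throttled cancellation check. It is not the PR's code: the class name, the `process` method, and the `Runnable` standing in for `ProgressManager.checkCanceled()` are all hypothetical, chosen so the example runs without the IntelliJ Platform on the classpath.

```java
/**
 * Sketch only: hypothetical names, plain Java.
 * The Runnable stands in for ProgressManager.checkCanceled().
 */
final class ThrottledCancelCheck {
    /** One check per 100 data elements, as described in the comment above. */
    static final int CHECK_INTERVAL = 100;

    /**
     * Walks the payload and runs the cancellation check once per
     * CHECK_INTERVAL elements. Returns how many times the check ran.
     */
    static int process(int[] data, Runnable checkCanceled) {
        int checks = 0;
        for (int i = 0; i < data.length; i++) {
            if (i % CHECK_INTERVAL == 0) {
                // May throw to abort the loop, like the real check does.
                checkCanceled.run();
                checks++;
            }
            // ... decode data[i] into token information here ...
        }
        return checks;
    }

    public static void main(String[] args) {
        int[] payload = new int[600_000]; // roughly the payload size mentioned above
        System.out.println(process(payload, () -> {})); // prints 6000
    }
}
```

With about 600k integers, the check runs 6,000 times instead of 600,000, and a cancel request still waits for at most 100 elements before it is noticed.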
I encountered a consistent stutter (2-3 second freezes) when working with huge generated files in Zig. After analysing with a profiler, I pinpointed the source to the LSPSemanticTokensHighlightVisitor.highlightSemanticTokens method, which blocks the EDT while it adds every single HighlightInfo to the holder for the entire file; this is an expensive operation due to internal checks.
Additionally, for very large files (>50k lines), the ProgressManager.checkCanceled() call inside SemanticTokensData.highlight, along with the HighlightInfo creations, also became a major contributor (>30% of sampler time) to the total freeze duration.
This pull request attempts to resolve these issues using the following steps:
1. The HighlightInfo creation inside SemanticTokensData.highlight is replaced with a LazyHighlightInfo record, which stores the bare minimum information required to create an actual HighlightInfo on demand.
2. semanticTokens.highlight inside the visitor is no longer passed holder::add directly; instead, the lazy highlight infos are stored in a lookup array.
3. In the SemanticTokensData.highlight method, ProgressManager.checkCanceled() is only called once every 100 data elements, which makes its overhead negligible while keeping the delay between checks short.
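The lazy-record idea from the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `ResolvedInfo` is a simplified stand-in for the platform's `HighlightInfo`, the color key is a plain `String` instead of `TextAttributesKey`, and indexing the lookup array by start offset is a guess at one plausible layout. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: hypothetical names and layout, plain Java (16+ for records). */
final class LazyHighlightSketch {
    /** Simplified stand-in for the platform's HighlightInfo. */
    record ResolvedInfo(int start, int end, String colorKey) {}

    /** Bare minimum needed to build a ResolvedInfo later. */
    record LazyInfo(int end, String colorKey) {
        ResolvedInfo resolve(int start) {
            return new ResolvedInfo(start, end, colorKey);
        }
    }

    /**
     * Cheap phase: one small record per token, stored at its start offset
     * (assumed layout). No expensive objects are created here.
     * Assumes every start offset is within [0, fileLength).
     */
    static LazyInfo[] collect(int fileLength, int[] starts, int[] ends, String[] keys) {
        LazyInfo[] lookup = new LazyInfo[fileLength];
        for (int i = 0; i < starts.length; i++) {
            lookup[starts[i]] = new LazyInfo(ends[i], keys[i]);
        }
        return lookup;
    }

    /** On-demand phase: resolve only tokens starting in [from, to). */
    static List<ResolvedInfo> resolveRange(LazyInfo[] lookup, int from, int to) {
        List<ResolvedInfo> out = new ArrayList<>();
        int limit = Math.min(to, lookup.length);
        for (int offset = Math.max(0, from); offset < limit; offset++) {
            LazyInfo lazy = lookup[offset];
            if (lazy != null) {
                out.add(lazy.resolve(offset));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        LazyInfo[] lookup = collect(20,
                new int[] {0, 5, 12},
                new int[] {3, 9, 15},
                new String[] {"KEYWORD", "FUNCTION", "TYPE"});
        // Only the tokens starting at offsets 5 and 12 fall inside [4, 13).
        System.out.println(resolveRange(lookup, 4, 13).size()); // prints 2
    }
}
```

The point of the split is that the full-file pass only allocates small records, while the costly object is built only for the ranges that are actually requested.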