Check walkdir performance under windows #9847
Comments
cc @Veykril, you use Windows IIRC, you might want to take a look
How would one even go about checking the I/O performance here? 🤔
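One rough way to put a number on this, as a minimal sketch rather than anything rust-analyzer actually does: time a plain sequential walk that reads every file under a directory, using only the standard library. The names `WalkStats`, `walk_and_read` and `time_walk` are made up for illustration, and symlinks are skipped to keep the sketch short.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, Instant};

/// Totals gathered from one sequential pass over a directory tree.
#[derive(Debug, Default)]
struct WalkStats {
    files: u64,
    bytes: u64,
}

/// Recursively reads every regular file under `dir`, one at a time.
/// Symlinks are neither followed nor read.
fn walk_and_read(dir: &Path, stats: &mut WalkStats) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            walk_and_read(&entry.path(), stats)?;
        } else if file_type.is_file() {
            stats.bytes += fs::read(entry.path())?.len() as u64;
            stats.files += 1;
        }
    }
    Ok(())
}

/// Times one full pass over `root` and returns the totals with the elapsed time.
fn time_walk(root: &Path) -> io::Result<(WalkStats, Duration)> {
    let start = Instant::now();
    let mut stats = WalkStats::default();
    walk_and_read(root, &mut stats)?;
    Ok((stats, start.elapsed()))
}
```

Calling `time_walk` twice on the same checkout gives a crude comparison: the first run may hit a cold cache, while the second is likely served from the OS file cache.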
I've duplicated below a comment I made on reddit. Unfortunately I can't give access to this particular machine, but I'm happy to run any tests or development code if that's at all helpful. In my experience, Rust performance (and rust-analyzer performance to some extent) is predominantly impacted by disk IO speed, more than CPU speed. For example, the sandbox repo linked above:
On a Windows machine with a fast local SSD and 8 cores, it takes 46 seconds to do a clean
On a Linux machine with a very slow network file system and 12 cores, it takes 229 seconds to do a clean
I've now realized that reading files sequentially is pretty bad architecturally, and that we might want to fix this regardless of Windows performance. There are cases where reading N files sequentially is very slow, while reading them in parallel is very fast. Consider a distributed file system, where the set of files is spread across a fleet of machines. A sequential read means N round trips, while a fully parallel read would be just one round trip.
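To make the sequential vs. parallel point concrete, here is a minimal sketch of reading a known list of files from several threads using only the standard library (`std::thread::scope`, stable since Rust 1.63). This is not the vfs-notify code; `read_files_parallel` and its `n_threads` parameter are hypothetical names, and a real loader would also need to handle exclusions and directory recursion.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::thread;

/// Reads every file in `paths`, spreading the work over up to `n_threads`
/// worker threads. Returns one result per input path, in input order.
fn read_files_parallel(paths: &[PathBuf], n_threads: usize) -> Vec<io::Result<Vec<u8>>> {
    if paths.is_empty() {
        return Vec::new();
    }
    let n_threads = n_threads.max(1);
    // Ceiling division so that every path lands in exactly one chunk.
    let chunk_len = (paths.len() + n_threads - 1) / n_threads;

    thread::scope(|scope| {
        // Spawn all workers first so their reads overlap.
        let handles: Vec<_> = paths
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|p| fs::read(p)).collect::<Vec<_>>())
            })
            .collect();

        // Join in spawn order, which preserves the order of `paths`.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("reader thread panicked"))
            .collect()
    })
}
```

With N files on a high-latency file system, each worker still reads its own chunk sequentially, so the round-trip cost drops by roughly the thread count rather than to a single round trip; a fully parallel read would need one in-flight request per file.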
There's a jwalk crate for parallel directory iteration, but I'm not sure we can use it (because we need to support exclusions and we do our own recursion anyway).
Tried to assess severity here. I've run
Note that So, it is true that the database loading phase is 3x-ish slower on Windows. It's unclear which part of that is "cargo on Windows is slower" and which is our reading of all source files from disk. Next steps here:
This would be low priority for me personally, as I don't have a convenient Windows dev setup. But folks running Windows might want to look into this (link to the code in the top comment). cc @rylev 😉 Removing broken windows label, as it is at worst a moderate perf regression.
I wouldn't read too much into the "Database loaded" time. I often see 2-3 seconds on Linux.
Yeah, that's the first part: today it often includes "run
On a warm cache, I'd expect Windows to be closer to Linux. The (worse) problems are on a cold cache, network drive etc. I still suspect parallel file reading and directory iteration are worth it, but it's not a big problem in common cases.
Someone on reddit reports slow startup time on Windows with an HDD:
https://old.reddit.com/r/rust/comments/p0z0fa/rustanalyzer_changelog_89/h8dt110/
I don't think we should optimize for HDDs, but it makes sense to make sure that we don't do something stupid on Windows, as its file system is known to be exciting.
The main disk IO should happen in this function:
https://github.com/matklad/rust-analyzer/blob/5d5d5182c137df293e983d3c3d48869d1c21092c/crates/vfs-notify/src/lib.rs#L175-L213
Curiously, we don't do any parallelism whatsoever o_O ?