Profiling/isolating performance issues #2824
eyelidlessness started this discussion in General
Replies: 2 comments 2 replies
-
You're right, there's not much in the way of tracing AVA's behavior. It could be interesting to add some log output using https://www.npmjs.com/package/debug — we have a bit of that in the watcher code. Your issue could be within AVA, or perhaps in the file globbing, or perhaps in loading the code-under-test itself.
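For reference, a minimal sketch of what that kind of tracing could look like, assuming a hypothetical `ava:api` namespace (whatever namespaces AVA might actually adopt are a separate question):

```js
// Hypothetical tracing with the `debug` package; the `ava:api`
// namespace is an illustration, not something AVA exposes today.
const createDebug = require('debug');
const debug = createDebug('ava:api');

async function runTests() {
  const start = Date.now();
  debug('starting test run');
  // ...kick off the run...
  debug('first worker event after %d ms', Date.now() - start);
}
```

Output is opt-in via the usual `DEBUG` environment variable, e.g. `DEBUG=ava:* npx ava`.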
-
@eyelidlessness Did you ever discover anything related to that "fixed cost"?
-
Apologies for any formatting issues, typing this on mobile so I could get some time away from the desk.
I’m working on an ESM loader for esbuild (yes, I know there are several; they’re all fairly incomplete, and I decided building my own would be more productive than trying to contribute such a large proportion of functionality to existing projects).
I want to use this loader with AVA for fast, on-the-fly compilation of TS/JSX (preferably without hitting the filesystem). This is working! And I intend to release a companion package that sets this up for ease of use. But I’m trying to track down some unexpected performance issues.
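For context, the loader itself is roughly this shape (a minimal sketch against Node's then-experimental ESM loader hooks; the exact hook signatures vary between Node versions, and the real implementation handles more cases):

```js
// loader.mjs — illustrative sketch only. A matching `resolve` hook
// (to handle extension-less TS imports) is omitted for brevity.
import { readFile } from 'node:fs/promises';
import { fileURLToPath } from 'node:url';
import { transform } from 'esbuild';

export async function load(url, context, nextLoad) {
  if (/\.tsx?$/.test(url)) {
    const source = await readFile(fileURLToPath(url), 'utf8');
    const { code } = await transform(source, {
      loader: url.endsWith('.tsx') ? 'tsx' : 'ts',
      format: 'esm',
      sourcemap: 'inline',
    });
    // Compile in memory; nothing is written to the filesystem.
    return { format: 'module', source: code, shortCircuit: true };
  }
  return nextLoad(url, context);
}
```

It gets wired up with something like `node --experimental-loader=./loader.mjs`, or via AVA's `nodeArguments` option.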
Potentially relevant: I also want to attach a debugger (and, ideally, combine that with watch mode). As of AVA 3.x the CLI prevents this, even though it’s technically possible with the VS Code debugger’s auto-attach. So… I did what any ~~sane developer~~ hacker would do and reverse-engineered calls into AVA’s internal API, default reporter, and watcher to get to a proof of concept (more on that below). This is also working!

(Aside: I understand from #1628 that these interfaces are considered internal/unstable, which I’d assumed, but also that there was some interest in stabilizing them for external use at some point in the future. Based on the experience detailed below, I suspect they’re now stable enough that this might be worth revisiting.)
As I’ve tried to move from proof of concept toward something that can be released, I’ve been nagged by a fairly constant test run overhead in my own codebase. My project currently builds in under 1ms (thanks esbuild!). The fastest test run I’ve been able to achieve is just shy of 2 seconds. This isn’t a huge test suite (of course it will grow as I address the various cases I want to fill in that aren’t handled by other loaders).
At first I thought this was probably inefficiencies in my own implementation, so I’ve been optimizing everything I can.
Next I thought this might be the cost of forking child processes and a V8 cold start. So I went even further afield: I’m now using the latest alpha release with `workerThreads: true`. This did make a big difference! Roughly 2s with all of my other successful optimizations, versus 3s.

(This is where I’m not sure my reverse engineering is fully or even partially necessary; I don’t know which restrictions on debug/watch have been relaxed with the 4.0 alpha.)
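For anyone following along, that lives in the AVA config; mine boils down to something like this (the loader flag refers to the sketch above, so treat the exact values as illustrative):

```js
// ava.config.js — `workerThreads: true` is the 4.0 alpha option
// referenced above; the nodeArguments value is from my own setup.
export default {
  workerThreads: true,
  nodeArguments: ['--experimental-loader=./loader.mjs'],
};
```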
Next I thought I could squeeze out more perf by taking advantage of esbuild’s incremental builds, so I built a shared worker which handles compilation for all of the test workers. This had no measurable impact, and in hindsight I should have known it wouldn’t, as esbuild itself is so fast starting cold.
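The incremental setup amounts to roughly this (a sketch against the esbuild API as it existed at the time, where `incremental: true` and `rebuild()` were the mechanism; the entry point is made up):

```js
// Shared compile service sketch; reuses esbuild's incremental state
// across requests. The entry point is hypothetical.
const esbuild = require('esbuild');

let result;

async function compile() {
  if (result) {
    // Subsequent builds reuse work from the previous one.
    return result.rebuild();
  }
  result = await esbuild.build({
    entryPoints: ['test/index.test.ts'],
    bundle: true,
    write: false,
    incremental: true,
  });
  return result;
}
```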
I’ve measured everything that happens in my own code, and can’t find where this fixed overhead could be. I’ve been growing more and more sure it’s not in my own code at all. To be sure, I removed my loader from the equation and ran tests from only plain JS modules. Same overhead.
So I started trying to track it down in AVA. And this is a long way into an introduction to get to the point, but… it’s not clear how to profile with either worker threads or forked child processes. VS Code has profiling facilities, but they’re fairly manual. Connecting with the Chrome remote debugger is possible, but it can’t attach to forks, and it appears to treat work done in threads as idle time.
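One avenue I haven’t fully explored (and I don’t know whether it reaches into worker threads at all) is asking Node itself to write CPU profiles for each worker, via AVA’s `nodeArguments` and Node’s `--cpu-prof` flag:

```js
// ava.config.js — each worker process should write a .cpuprofile on
// exit; coverage of worker threads is untested on my side.
export default {
  nodeArguments: ['--cpu-prof', '--cpu-prof-dir=./.profiles'],
};
```

The resulting `.cpuprofile` files can then be loaded into Chrome DevTools for inspection.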
So I’ve been trying to measure manually (hence all of the manual steps I’ve described above). And the most obvious (to me) entry point for measurement is `api.run`’s `stateChange` event.

At first I thought this had yielded positive results: the vast majority of test run time occurred before the first event, `{ type: 'dependencies', ... }`, where the dependency is in node_modules. This turns out to be a red herring. I’ve been able to eliminate this dependency resolution with a combination of the `bundle: true, write: false` esbuild options, and the same first-event overhead showed up on the new first event, `{ type: 'stats', … }`. Pretty much exactly the same amount of time before the first event (though in more focused measurements `stats` events definitely have the highest perf impact overall, presumably because those are sent after each test run, and tests themselves are the next bottleneck).

So I’m sitting there staring at… there’s some fixed cost to just running tests, and it’s happening somewhere between starting a test run and the first event from the first test run. This implicates the whole test runner cycle unless/until it can be narrowed down, and I don’t know where to start profiling, or how, without ingesting the entire AVA runtime into my brain.
And I’m happy to do that deep dive if it could help address a general problem. But I’m coming up for air and asking for guidance on how AVA maintainers/contributors trace these things. Right now I have “the entirety of the codebase” as my investigation scope, and “I can’t use tools because they don’t work” as my constraint.
I’d love to contribute anything I can to help here but I’d also really appreciate any guidance on how to measure across process/thread boundaries.