Profiling/isolating performance issues #2824

eyelidlessness · 2021-09-06T01:12:53Z

eyelidlessness
Sep 6, 2021

Apologies for any formatting issues, typing this on mobile so I could get some time away from the desk.

I’m working on an ESM loader for esbuild (yes I know there are several, they’re all fairly incomplete; I decided building my own would be more productive than trying to contribute such a large proportion of functionality to existing projects).

I want to use this loader with Ava for fast on the fly compilation of TS/JSX (preferably without hitting the filesystem). This is working! And I intend to release a companion package setting that up for ease of use. But I’m trying to track down some unexpected performance issues.

Potentially relevant, I also want to:

Debug in VSCode
In watch mode
Across multiple test modules

As of Ava 3.x, attempting to do using the Ava CLI prevents that, even though it’s technically possible with VSCode debugger auto-attach. So… I did what any ~~sane developer~~ hacker would do and reverse-engineered calls into api/default reporter/watcher to get to a proof of concept (more on that below). This is also working!

(Aside: I understand from #1628 that these interfaces are considered internal/unstable which I’d assumed, but that there was some interest in stabilizing them for external use at some point in the future. Based on experience detailed below, I suspect they’re now stable enough that this might be worth revisiting.)

As I’ve tried to move from proof of concept toward something that can be released, I’ve been nagged by a fairly constant test run overhead in my own codebase. My project currently builds in under 1ms (thanks esbuild!). The fastest test run I’ve been able to achieve is just shy of 2 seconds. This isn’t a huge test suite (of course it will grow as I address the various cases I want to fill in that aren’t handled by other loaders).

At first I thought this was probably inefficiencies in my own implementation, so I’ve been optimizing everything I can.

Next I thought this might be the cost of forking child processes and a V8 cold start. So I went even further afield, I’m now using the latest alpha release with workerThreads: true. This did make a big difference! Roughly 2s with all of my other successful optimizations versus 3s.

(This is where I’m not sure my reverse engineering is fully or even partially necessary; I don’t know what restrictions on debug/watch have been relaxed with 4.0 alpha.)

Next I thought I could squeeze more perf by taking advantage of esbuild’s incremental builds, so I built a shared worker which handles compilation for all workers. This has no measurable impact, and in hindsight I should have known it would, as esbuild itself is so fast starting cold.

I’ve measured everything that happens in my own code, and can’t find where this fixed overhead could be. I’ve been growing more and more sure it’s not in my own code at all. To be sure, I removed my loader from the equation and ran tests from only plain JS modules. Same overhead.

So I started trying to track it down in Ava. And this is a long way into this introduction to get to the point but… it’s not clear how to profile with either worker threads or forked child processes. VSCode has profiling facilities that are fairly manual. Connecting with Chrome remote debugger is possible but can’t connect to forks, and appears to treat work done in threads as idle time.

So I’ve been trying to measure manually (hence all of the manual steps I’ve described above). And the most obvious (to me) entry point for measurement is:

Start timing before api.run
Measure each stateChange event

At first I thought this had yielded positive results: the vast majority of test run time occurred before the first event: { type: 'dependencies', ... } where the dependency is in node_modules.

This turns out to be a red herring. I’ve been able to eliminate this dependency resolution with a combination of bundle: true, write: false esbuild options, and the same first event overhead showed up on the new first event, with { type: 'stats', … }. Pretty much exactly the same amount of time before first event (though in more focused measurements stats events definitely have the highest perf impact overall, presumably because those are sent after each test run and tests themselves are the next bottleneck).

So I’m sitting there staring at… there’s some fixed cost to just running tests, and it’s happening somewhere between starting a test run and the first event from the first test run. This implicates the whole test runner cycle unless/until it can be narrowed down and I don’t know where to start profiling or how, without ingesting the entire Ava runtime into my brain.

And I’m happy to do that if it could help address a general problem. But I’m coming up for air and asking for guidance on how Ava maintainers/contributors trace these things. Right now I have “the entirety of the codebase” as my investigation scope, and “I can’t use tools because they don’t work” as my constraint.

I’d love to contribute anything I can to help here but I’d also really appreciate any guidance on how to measure across process/thread boundaries.

novemberborn · 2021-09-12T14:18:34Z

novemberborn
Sep 12, 2021
Maintainer

You're right there's not much in the way of tracing AVA's behavior. It could be interesting to add some log output using https://www.npmjs.com/package/debug — we have a bit of that in the watcher code.

Your issue could be within AVA, or perhaps in the file globbing, or perhaps in loading the code-under-test itself.

1 reply

eyelidlessness Sep 12, 2021
Author

Thanks! I’ll take a look. I’ve also been peppering my own console.time* and performance.now throughout, and still haven’t found the slowdown.

I think I’ve ruled out globbing (though I did need to take extra steps to exclude node_modules from the chokidar). And I’m almost certain I’ve ruled out loading my own code, as I still see a fixed delay without my ESM loader and only running one plain JS ESM test with no imports besides ava itself. It’s slightly faster, but that’s to be expected because it’s executing less code, and the speed up isn’t noticeable without measuring.

And for thoroughness, I also ran that same test with and without my ESM loader through the standard CLI with a simple config. I see the same delay there.

I’ll keep digging, I’d love to find the cause and offer a perf improvement if I can.

mhassan1 · 2022-07-15T11:22:56Z

mhassan1
Jul 15, 2022

@eyelidlessness Did you ever discover anything related to that "fixed cost?"

1 reply

eyelidlessness Jul 15, 2022
Author

@eyelidlessness Did you ever discover anything related to that "fixed cost?"

Quite a lot in fact! 😅 (It took me so deeply into yak shaving that I have a half-baked test framework of my own, now gathering dust as I've been focused on other things.)

TL;DR: cold start of worker threads + cold start of any --require/--experimental-loader + whatever other runtime startup.

Ava's per thread cold start alone is ~100ms for me, my particular setup around ~300ms. These times are of course not reported in diagnostics, as they're incurred before Ava starts measuring.

A pretty reasonable "good enough" solution would be to eat the cost on first startup, then spin up a pool of idle threads for subsequent runs in watch mode. I went quite a bit further than this in my yak shaving adventures, but I'm not sure how much value there is in the various other things I tried besides looking good on silly artificial benchmarks.

Of the other things I explored, probably the most promising from a real DX perspective was optimizing reporter I/O. Amusingly, when everything else is fast (threads are ready, keep postMessage payloads light which is fairly trivial), the most noticeable lag for my otherwise-fast tests was just reporting them to the terminal. Even so, this is such a silly micro-optimization that it's probably not worth the complexity.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Profiling/isolating performance issues #2824

Uh oh!

{{title}}

Uh oh!

Replies: 2 comments 2 replies

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Profiling/isolating performance issues #2824

Uh oh!

eyelidlessness Sep 6, 2021

Replies: 2 comments · 2 replies

Uh oh!

novemberborn Sep 12, 2021 Maintainer

Uh oh!

eyelidlessness Sep 12, 2021 Author

Uh oh!

mhassan1 Jul 15, 2022

Uh oh!

eyelidlessness Jul 15, 2022 Author

eyelidlessness
Sep 6, 2021

Replies: 2 comments 2 replies

novemberborn
Sep 12, 2021
Maintainer

eyelidlessness Sep 12, 2021
Author

mhassan1
Jul 15, 2022

eyelidlessness Jul 15, 2022
Author