Improve `create_input` and `create_output` #115

kylewlacy · 2024-08-26T05:55:04Z

Resolves #111

This PR overhauls the implementation of create_input. Previously, it was a straightforward recursive function that walked a directory and created an Artifact for each entry. The trouble comes with packed executables, which include extra resources. In a large directory structure, many files can reference the same resource (e.g. the interpreter ld-linux.so, which is referenced by every dynamically linked executable). Because of the recursion, we ended up calling create_input again for every reference to a resource, even if that resource had already been processed.

The solution I ended up with was to first build a graph from the directory structure, then traverse the graph to build the artifacts. I originally tried to just memoize the recursive function, but found it really tricky to structure in a way where the memoization would actually help, so doing it in two passes using a graph ended up being easier.
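As a rough illustration of the two-pass idea (not the actual code in this PR), here is a minimal Rust sketch. It works on an in-memory map instead of the real filesystem, and `Entry`, `Fs`, `Graph`, `build_graph`, `build_order`, and `example_fs` are all hypothetical names made up for the example. The first pass records each reachable path once, and the second pass yields each path once, after the things it depends on, which is the order in which artifacts could then be created.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical stand-in for an on-disk entry (not Brioche's real types).
#[derive(Debug, Clone)]
enum Entry {
    /// A file plus the paths of the resources it references.
    File { resources: Vec<&'static str> },
    /// A directory plus the paths of its children.
    Directory { children: Vec<&'static str> },
}

type Fs = BTreeMap<&'static str, Entry>;
type Graph = BTreeMap<&'static str, Vec<&'static str>>;

/// Pass 1: discover every path reachable from `root`, planning each one once.
fn build_graph(fs: &Fs, root: &'static str) -> Graph {
    let mut graph = Graph::new();
    let mut pending = vec![root];
    while let Some(path) = pending.pop() {
        if graph.contains_key(path) {
            continue; // already planned, e.g. a shared resource
        }
        let deps = match fs.get(path) {
            Some(Entry::File { resources }) => resources.clone(),
            Some(Entry::Directory { children }) => children.clone(),
            None => Vec::new(), // assumption: missing paths are treated as leaves
        };
        pending.extend(deps.iter().copied());
        graph.insert(path, deps);
    }
    graph
}

/// Pass 2: emit each path exactly once, after everything it depends on.
fn build_order(graph: &Graph, root: &'static str) -> Vec<&'static str> {
    fn visit(
        graph: &Graph,
        path: &'static str,
        done: &mut HashSet<&'static str>,
        order: &mut Vec<&'static str>,
    ) {
        if !done.insert(path) {
            return; // already emitted (or currently being visited)
        }
        for &dep in &graph[path] {
            visit(graph, dep, done, order);
        }
        order.push(path);
    }

    let mut done = HashSet::new();
    let mut order = Vec::new();
    visit(graph, root, &mut done, &mut order);
    order
}

/// Two executables that share the same interpreter resource.
fn example_fs() -> Fs {
    Fs::from([
        ("root", Entry::Directory { children: vec!["bin/a", "bin/b"] }),
        ("bin/a", Entry::File { resources: vec!["ld-linux.so"] }),
        ("bin/b", Entry::File { resources: vec!["ld-linux.so"] }),
        ("ld-linux.so", Entry::File { resources: vec![] }),
    ])
}
```

With `example_fs()`, building the graph from `"root"` and then calling `build_order` yields four paths, with `ld-linux.so` appearing once and ahead of both executables, even though two files reference it.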

Also as part of this PR, I did some refactoring around tests, added more benchmarks, added a new profiling Cargo profile, and also improved create_output (create_output should be faster now, but I think it may also need to be revisited).

The original cause of this issue was an experimental change I was making on the std package, which added lots of resources for lots of scripts. With this PR, the call to create_input finishes in a relatively reasonable amount of time (~90 seconds), but create_output is still too slow (I let it run for 40 minutes, and it still hadn't finished). Even 90 seconds is pretty painful though, so even if the create_output side were about as fast, I want to revisit my approach to that std change to cut back on how many resources get used.

One downside of the new implementation is that it's slower in simple cases (i.e. without any resources), but it's much faster in complex cases. See below for performance numbers and details.

Performance

Here are the results of running cargo bench "test_input" on my machine, both with the original implementation and the new implementation.

New implementation

Original implementation

As you can see, several of the benchmarks are twice as slow with the new implementation. However, the benchmarks under bench_input_with_shared_resources are about twice as fast. Those benchmarks exercise the edge case this PR fixes, where create_input is called on a directory structure with lots of shared resources. I'm hoping the slower benchmarks can be made faster over time, but I think fixing the extreme slowness caused by the edge case is worth the trade-off for the time being.

kylewlacy added 30 commits August 10, 2024 12:23

Replace brioche_test module with brioche_test_support crate

e231c2d

Rewrite benchmarks from Criterion to Divan

eb2e2a6

Add more test cases around Directory

e38a072

Add benchmark for Directory::insert

cfba5b2

Tweak blob saving to use a mut ref for the permit

26f3648

Add benchmark for saving blobs

19c4374

Add input benchmark

71df65d

Add profiling Cargo profile

df5c9ce

Allow reusing buffers when creating blobs

fda0f0d

Cache input paths to speed up create_input with resources

5b9e9b2

Update save_blob_from_file to use a blocking task for reading/hashing

daf6237

Refactor resource creation in create_input_inner with a separate function

7ba9cd5

Add new benchmark for resources with common ancestor

e076fde

Refactor create_input to use a graph for building artifacts

9babadc

Update create_input to avoid redundant graph traversals

f8bb0e9

Update create_input to avoid traversing redundant resources

ade367d

Update create_input to create blobs in parallel

b8deb41

Fix create_input removing resource files

6e247b6

Add more log messages to create_input

308a2d5

Remove unused functions in input module

618b404

Remove unused buffer param from create_input_inner

b863200

Update create_output to avoid trying to write resources multiple times

41f5244

Fix handling of broken symlinks in create_input

2bce49c

Improve symlink handling in create_input

7e14040

Fix create_input when a resource could not be found

b03241e

Update create_input to remove input path if remove_input is set

c5f75a6

Fix test failures from create_output changes

6790c37

Add extra logging to create_output

6b2e555

Improve how create_output skips existing outputs

29bdea4

Combine create_input_inner into create_input

f98a029

Add some comments to create_input

e8cf3d4

kylewlacy mentioned this pull request Aug 26, 2024

Fix slowness in create_input when processing lots of resources #111

Closed

kylewlacy merged commit 0b8cbd6 into main Aug 26, 2024
5 checks passed

kylewlacy deleted the improve-create-input-and-create-output branch August 26, 2024 08:06

kylewlacy mentioned this pull request Sep 6, 2024

Revert create_output memoization #120

Merged

kylewlacy mentioned this pull request Sep 15, 2024

Tweak symlinks in create_input #125

Merged
