diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index 477b9a707..c33d15f3c 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -85,6 +85,7 @@
 - [Debugging LLVM](./codegen/debugging.md)
 - [Emitting Diagnostics](./diag.md)
 - [JSON diagnostic format](./diag/json-format.md)
+- [Profile-guided Optimization](./profile-guided-optimization.md)
 
 ---
 
diff --git a/src/profile-guided-optimization.md b/src/profile-guided-optimization.md
new file mode 100644
index 000000000..fb897e901
--- /dev/null
+++ b/src/profile-guided-optimization.md
@@ -0,0 +1,132 @@
+# Profile-Guided Optimization
+
+`rustc` supports profile-guided optimization (PGO).
+This chapter describes what PGO is and how the support for it is
+implemented in `rustc`.
+
+## What Is Profile-Guided Optimization?
+
+The basic concept of PGO is to collect data about the typical execution of
+a program (e.g. which branches it is likely to take) and then use this data
+to inform optimizations such as inlining, machine-code layout,
+register allocation, etc.
+
+There are different ways of collecting data about a program's execution.
+One is to run the program inside a profiler (such as `perf`) and another
+is to create an instrumented binary, that is, a binary that has data
+collection built into it, and run that.
+The latter usually provides more accurate data.
+
+## How Is PGO Implemented in `rustc`?
+
+`rustc`'s current PGO implementation relies entirely on LLVM.
+LLVM actually [supports multiple forms][clang-pgo] of PGO:
+
+[clang-pgo]: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+- Sampling-based PGO, where an external profiling tool like `perf` is used
+  to collect data about a program's execution.
+- GCOV-based profiling, where code coverage infrastructure is used to collect
+  profiling information.
+- Front-end based instrumentation, where the compiler front-end (e.g. Clang)
+  inserts instrumentation intrinsics into the LLVM IR it generates.
+- IR-level instrumentation, where LLVM inserts the instrumentation intrinsics
+  itself during optimization passes.
+
+`rustc` supports only the last approach, IR-level instrumentation, mainly
+because it is almost exclusively implemented in LLVM and needs little
+maintenance on the Rust side. Fortunately, it is also the most modern approach,
+yielding the best results.
+
+So, we are dealing with an instrumentation-based approach, i.e. profiling data
+is generated by a specially instrumented version of the program that's being
+optimized. Instrumentation-based PGO has two components: a compile-time
+component and a run-time component, and one needs to understand the overall
+workflow to see how they interact.
+
+### Overall Workflow
+
+Generating a PGO-optimized program involves the following four steps:
+
+1. Compile the program with instrumentation enabled (e.g. `rustc -Cprofile-generate main.rs`).
+2. Run the instrumented program (e.g. `./main`), which generates a `default-.profraw` file.
+3. Convert the `.profraw` file into a `.profdata` file using LLVM's `llvm-profdata` tool.
+4. Compile the program again, this time making use of the profiling data
+   (e.g. `rustc -Cprofile-use=merged.profdata main.rs`).
+
+### Compile-Time Aspects
+
+Depending on which step in the above workflow we are in, two different things
+can happen at compile time:
+
+#### Create Binaries with Instrumentation
+
+As mentioned above, the profiling instrumentation is added by LLVM.
+`rustc` instructs LLVM to do so [by setting the appropriate][pgo-gen-passmanager]
+flags when creating LLVM `PassManager`s:
+
+```C
+    // `PMBR` is an `LLVMPassManagerBuilderRef`
+    unwrap(PMBR)->EnablePGOInstrGen = true;
+    // Instrumented binaries have a default output path for the `.profraw` file
+    // hard-coded into them:
+    unwrap(PMBR)->PGOInstrGen = PGOGenPath;
+```
+
+`rustc` also has to make sure that some of the symbols from LLVM's profiling
+runtime are not removed [by marking them with the right export level][pgo-gen-symbols].
+
+[pgo-gen-passmanager]: https://github.com/rust-lang/rust/blob/1.34.1/src/rustllvm/PassWrapper.cpp#L412-L416
+[pgo-gen-symbols]: https://github.com/rust-lang/rust/blob/1.34.1/src/librustc_codegen_ssa/back/symbol_export.rs#L212-L225
+
+
+#### Compile Binaries Where Optimizations Make Use Of Profiling Data
+
+In the final step of the workflow described above, the program is compiled
+again, with the compiler using the gathered profiling data in order to drive
+optimization decisions. `rustc` again leaves most of the work to LLVM here,
+basically [just telling][pgo-use-passmanager] the LLVM `PassManagerBuilder`
+where the profiling data can be found:
+
+```C
+    unwrap(PMBR)->PGOInstrUse = PGOUsePath;
+```
+
+[pgo-use-passmanager]: https://github.com/rust-lang/rust/blob/1.34.1/src/rustllvm/PassWrapper.cpp#L417-L420
+
+LLVM does the rest (e.g. setting branch weights, marking functions with
+`cold` or `inlinehint`, etc.).
+
+
+### Runtime Aspects
+
+Instrumentation-based approaches always also have a runtime component, i.e.
+once we have an instrumented program, that program needs to be run in order
+to generate profiling data, and collecting and persisting this profiling
+data needs some infrastructure in place.
+
+In the case of LLVM, these runtime components are implemented in
+[compiler-rt][compiler-rt-profile] and statically linked into any instrumented
+binaries.
+The `rustc` version of this can be found in `src/libprofiler_builtins`, which
+basically packs the C code from `compiler-rt` into a Rust crate.
+
+In order for `libprofiler_builtins` to be built, `profiler = true` must be set
+in `rustc`'s `config.toml`.
+
+[compiler-rt-profile]: https://github.com/llvm/llvm-project/tree/master/compiler-rt/lib/profile
+
+## Testing PGO
+
+Since the PGO workflow spans multiple compiler invocations, most testing happens
+in [run-make tests][rmake-tests] (the relevant tests have `pgo` in their name).
+There is also a [codegen test][codegen-test] that checks that some expected
+instrumentation artifacts show up in LLVM IR.
+
+[rmake-tests]: https://github.com/rust-lang/rust/tree/master/src/test/run-make-fulldeps
+[codegen-test]: https://github.com/rust-lang/rust/blob/master/src/test/codegen/pgo-instrumentation.rs
+
+## Additional Information
+
+Clang's documentation contains a good overview of PGO in LLVM here:
+https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization