| 1 | +# Reducing Variance |
| 2 | + |
| 3 | +<a name="disabling-cpu-frequency-scaling" /> |
| 4 | + |
| 5 | +## Disabling CPU Frequency Scaling |
| 6 | + |
| 7 | +If you see this warning: |
| 8 | + |
| 9 | +``` |
| 10 | +***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. |
| 11 | +``` |
| 12 | + |
| 13 | +you might want to disable CPU frequency scaling while running the |
| 14 | +benchmark, and consider other ways to stabilize the performance of |
| 15 | +your system while benchmarking. |
| 16 | + |
| 17 | +See [Reducing Variance in Benchmarks](#reducing-variance) below for more information. |
| 18 | + |
| 19 | +Exactly how to do this depends on the Linux distribution, |
| 20 | +desktop environment, and installed programs. Specific details are a moving |
| 21 | +target, so we will not attempt to exhaustively document them here. |
| 22 | + |
| 23 | +One simple option is to use the `cpupower` program to set the CPU |
| 24 | +frequency governor to "performance". This tool is maintained along with |
| 25 | +the Linux kernel and provided by your distribution. |
| 26 | + |
| 27 | +It must be run as root, like this: |
| 28 | + |
| 29 | +```bash |
| 30 | +sudo cpupower frequency-set --governor performance |
| 31 | +``` |
| 32 | + |
| 33 | +After this you can verify that all CPUs are using the performance governor |
| 34 | +by running this command: |
| 35 | + |
| 36 | +```bash |
| 37 | +cpupower frequency-info -o proc |
| 38 | +``` |
| 39 | + |
| 40 | +Benchmarks you run afterwards should show less variance. |
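
The governor can also be checked by reading sysfs directly. The following is a minimal sketch, assuming the standard Linux `cpufreq` sysfs layout; the function name `check_governors` and its optional root-directory argument are hypothetical and exist only for illustration and testing.

```sh
#!/bin/sh
# Sketch: list CPUs whose cpufreq governor is not "performance".
# The optional first argument is a hypothetical override of the sysfs
# CPU root; the default is the standard Linux location.
check_governors() {
  root="${1:-/sys/devices/system/cpu}"
  bad=0
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
    [ -r "$f" ] || continue    # glob did not match, or no cpufreq support
    gov=$(cat "$f")
    if [ "$gov" != "performance" ]; then
      cpu=${f#"$root"/}        # strip the root prefix
      echo "${cpu%%/*}: $gov"  # e.g. "cpu3: powersave"
      bad=1
    fi
  done
  return "$bad"                # 0 if every readable governor is "performance"
}
```

For example, `check_governors && echo "all CPUs use the performance governor"` prints the message only when no CPU reports a different governor.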
| 41 | + |
| 42 | +<a name="reducing-variance" /> |
| 43 | + |
| 44 | +## Reducing Variance in Benchmarks |
| 45 | + |
| 46 | +The Linux CPU frequency governor [discussed |
| 47 | +above](#disabling-cpu-frequency-scaling) is not the only source |
| 48 | +of noise in benchmarks. Other sources of variance include, but are not |
| 49 | +limited to: |
| 50 | + |
| 51 | +1. On multi-core machines, not all CPUs/CPU cores/CPU threads run at the |
| 52 | +   same speed, so running a benchmark twice may give different results |
| 53 | +   depending on which CPU it ran on. |
| 54 | +2. CPU scaling features that run on the CPU, like Intel's Turbo Boost and |
| 55 | +   AMD's Turbo Core and Precision Boost, can temporarily change the CPU |
| 56 | +   frequency even when using the "performance" governor on Linux. |
| 57 | +3. Context switching between CPUs, or scheduling competition on the CPU the |
| 58 | +   benchmark is running on. |
| 59 | +4. Intel Hyper-Threading or AMD SMT, which causes the same competition as above. |
| 60 | +5. Cache effects caused by code running on other CPUs. |
| 61 | +6. Non-uniform memory architectures (NUMA). |
| 62 | + |
| 63 | +These can cause variance in benchmark results within a single run |
| 64 | +(`--benchmark_repetitions=N`) or across multiple runs of the benchmark |
| 65 | +program. |
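
To judge whether a mitigation helped, it can be useful to quantify the spread of repeated timings. The following is a minimal sketch, assuming you have already extracted one positive timing per line from your benchmark output; the function name `timing_stats` and the input format are hypothetical and not part of the benchmark library.

```sh
#!/bin/sh
# Sketch: print the mean, sample standard deviation, and coefficient of
# variation (stddev / mean) of numbers read from stdin, one per line.
# Assumes positive timings; blank lines are skipped.
timing_stats() {
  awk '
    NF { x[n++] = $1; sum += $1 }
    END {
      if (n < 2) { print "need at least 2 samples"; exit 1 }
      mean = sum / n
      for (i = 0; i < n; i++) ss += (x[i] - mean) ^ 2
      sd = sqrt(ss / (n - 1))
      printf "mean=%.2f stddev=%.2f cv=%.2f%%\n", mean, sd, 100 * sd / mean
    }'
}
```

For example, `printf '10\n12\n14\n' | timing_stats` prints `mean=12.00 stddev=2.00 cv=16.67%`. A lower coefficient of variation after a change suggests that the change reduced noise.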
| 66 | + |
| 67 | +Reducing sources of variance is OS and architecture dependent, which is one |
| 68 | +reason some companies maintain machines dedicated to performance testing. |
| 69 | + |
| 70 | +Some of the easier and more effective ways of reducing variance on a typical |
| 71 | +Linux workstation are: |
| 72 | + |
| 73 | +1. Use the performance governor as [discussed |
| 74 | +   above](#disabling-cpu-frequency-scaling). |
| 75 | +2. Disable processor boosting with: |
| 76 | +   ```sh |
| 77 | +   echo 0 | sudo tee /sys/devices/system/cpu/cpufreq/boost |
| 78 | +   ``` |
| 79 | +   See the Linux kernel's |
| 80 | +   [boost.txt](https://www.kernel.org/doc/Documentation/cpu-freq/boost.txt) |
| 81 | +   for more information. |
| 82 | +3. Set the benchmark program's task affinity to a fixed CPU. For example: |
| 83 | +   ```sh |
| 84 | +   taskset -c 0 ./mybenchmark |
| 85 | +   ``` |
| 86 | +4. Disable Hyper-Threading/SMT. This can be done in the BIOS or using the |
| 87 | +   `/sys` file system (see the LLVM project's [Benchmarking |
| 88 | +   tips](https://llvm.org/docs/Benchmarking.html)). |
| 89 | +5. Close other programs that do non-trivial things based on timers, such as |
| 90 | +   your web browser, desktop environment, etc. |
| 91 | +6. Reduce the working set of your benchmark to fit within the L1 cache, but |
| 92 | +   be aware that this may lead you to optimize for an unrealistic |
| 93 | +   situation. |
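
Before a benchmarking session it can help to confirm which of these settings are actually in effect. The following is a minimal read-only sketch, assuming the `cpufreq/boost` file mentioned above and the kernel's `smt/control` file under the same sysfs root; not every driver or kernel exposes these files, so missing ones are reported as `unknown`. The function name `report_cpu_state` and its optional root-directory argument are hypothetical.

```sh
#!/bin/sh
# Sketch: report boost and SMT state from sysfs without changing anything.
# The optional first argument is a hypothetical override of the sysfs CPU
# root; files that do not exist are reported as "unknown".
report_cpu_state() {
  root="${1:-/sys/devices/system/cpu}"
  for entry in "boost:cpufreq/boost" "smt:smt/control"; do
    name=${entry%%:*}          # label before the colon
    path="$root/${entry#*:}"   # relative sysfs path after the colon
    if [ -r "$path" ]; then
      echo "$name=$(cat "$path")"
    else
      echo "$name=unknown"
    fi
  done
}
```

Running `report_cpu_state` with no arguments reads the live system. A result of `boost=0` means boosting is disabled, matching the `echo 0` command in the list above.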
| 94 | + |
| 95 | +Further resources on this topic: |
| 96 | + |
| 97 | +1. The LLVM project's [Benchmarking |
| 98 | +   tips](https://llvm.org/docs/Benchmarking.html). |
| 99 | +2. The Arch Wiki [CPU frequency |
| 100 | +   scaling](https://wiki.archlinux.org/title/CPU_frequency_scaling) page. |