jounathaen

I found major performance issues with the current talc allocator. This PR replaces it with galloc.

Just like #1935, this PR is for now only intended for testing purposes. Finding the right allocator requires deeper investigation and benchmarking.


github-actions bot left a comment


Benchmark Results

| Benchmark | Current: ec4f7e8 | Previous: 40e0c6e | Performance Ratio |
| --- | --- | --- | --- |
| startup_benchmark Build Time | 136.18 s | 136.14 s | 1.00 |
| startup_benchmark File Size | 0.91 MB | 0.90 MB | 1.01 |
| Startup Time - 1 core | 0.93 s (±0.01 s) | 0.94 s (±0.02 s) | 1.00 |
| Startup Time - 2 cores | 0.93 s (±0.01 s) | 0.92 s (±0.03 s) | 1.01 |
| Startup Time - 4 cores | 0.95 s (±0.02 s) | 0.96 s (±0.03 s) | 0.99 |
| multithreaded_benchmark Build Time | 139.93 s | 140.97 s | 0.99 |
| multithreaded_benchmark File Size | 0.96 MB | 1.01 MB | 0.95 |
| Multithreaded Pi Efficiency - 2 Threads | 2.93 % (±14.05 %) | 2.17 % (±10.40 %) | 1.35 |
| Multithreaded Pi Efficiency - 4 Threads | 1.56 % (±7.50 %) | 1.51 % (±7.26 %) | 1.03 |
| Multithreaded Pi Efficiency - 8 Threads | 0.75 % (±3.58 %) | 0.77 % (±3.68 %) | 0.97 |
| micro_benchmarks Build Time | 173.85 s | 171.42 s | 1.01 |
| micro_benchmarks File Size | 0.96 MB | 1.01 MB | 0.95 |
| Scheduling time - 1 thread | 3.54 ticks (±16.97 ticks) | 2.77 ticks (±13.29 ticks) | 1.28 |
| Scheduling time - 2 threads | 1.80 ticks (±8.62 ticks) | 1.75 ticks (±8.39 ticks) | 1.03 |
| Micro - Time for syscall (getpid) | 0.21 ticks (±1.00 ticks) | 0.12 ticks (±0.58 ticks) | 1.72 |
| Memcpy speed - (built_in) block size 4096 | 1795.98 MByte/s (±8620.69 MByte/s) | 1816.86 MByte/s (±8720.93 MByte/s) | 0.99 |
| Memcpy speed - (built_in) block size 1048576 | 544.58 MByte/s (±2613.99 MByte/s) | 745.11 MByte/s (±3576.55 MByte/s) | 0.73 |
| Memcpy speed - (built_in) block size 16777216 | 212.16 MByte/s (±1018.35 MByte/s) | 219.45 MByte/s (±1053.36 MByte/s) | 0.97 |
| Memset speed - (built_in) block size 4096 | 1518.99 MByte/s (±7291.14 MByte/s) | 1875.00 MByte/s (±9000.00 MByte/s) | 0.81 |
| Memset speed - (built_in) block size 1048576 | 1358.14 MByte/s (±6519.08 MByte/s) | 1029.20 MByte/s (±4940.18 MByte/s) | 1.32 |
| Memset speed - (built_in) block size 16777216 | 901.81 MByte/s (±4328.69 MByte/s) | 924.64 MByte/s (±4438.25 MByte/s) | 0.98 |
| Memcpy speed - (rust) block size 4096 | 1395.35 MByte/s (±6697.67 MByte/s) | 1411.76 MByte/s (±6776.47 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 1048576 | 554.89 MByte/s (±2663.49 MByte/s) | 693.90 MByte/s (±3330.73 MByte/s) | 0.80 |
| Memcpy speed - (rust) block size 16777216 | 209.75 MByte/s (±1006.78 MByte/s) | 219.67 MByte/s (±1054.40 MByte/s) | 0.95 |
| Memset speed - (rust) block size 4096 | 1304.35 MByte/s (±6260.87 MByte/s) | 1791.04 MByte/s (±8597.01 MByte/s) | 0.73 |
| Memset speed - (rust) block size 1048576 | 1332.53 MByte/s (±6396.16 MByte/s) | 1105.00 MByte/s (±5304.01 MByte/s) | 1.21 |
| Memset speed - (rust) block size 16777216 | 889.82 MByte/s (±4271.14 MByte/s) | 954.96 MByte/s (±4583.81 MByte/s) | 0.93 |
| alloc_benchmarks Build Time | 160.26 s | 157.27 s | 1.02 |
| alloc_benchmarks File Size | 0.97 MB | 0.97 MB | 1.01 |
| Allocations - Allocation success | 2.00 % (±13.86 %) | 2.00 % (±13.86 %) | 1 |
| Allocations - Deallocation success | 1.39 % (±9.67 %) | 1.40 % (±9.67 %) | 1.00 |
| Allocations - Pre-fail Allocations | 2.00 % (±13.86 %) | 2.00 % (±13.86 %) | 1 |
| Allocations - Average Allocation time | 708.39 Ticks (±4908.88 Ticks) | 262.62 Ticks (±1819.84 Ticks) | 2.70 |
| Allocations - Average Allocation time (no fail) | 708.39 Ticks (±4908.88 Ticks) | 262.62 Ticks (±1819.84 Ticks) | 2.70 |
| Allocations - Average Deallocation time | 28.48 Ticks (±197.35 Ticks) | 17.05 Ticks (±118.18 Ticks) | 1.67 |
| mutex_benchmark Build Time | 157.81 s | 159.65 s | 0.99 |
| mutex_benchmark File Size | 0.97 MB | 1.01 MB | 0.95 |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 0.30 ns (±2.08 ns) | 0.36 ns (±2.49 ns) | 0.83 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 0.34 ns (±2.36 ns) | 0.38 ns (±2.63 ns) | 0.89 |
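
The Performance Ratio column appears to be the current value divided by the previous one. The allocation rows are consistent with that reading: 708.39 / 262.62 ≈ 2.70 and 28.48 / 17.05 ≈ 1.67. A minimal Rust sketch of that calculation follows. The function names `performance_ratio` and `format_ratio` are hypothetical and are not taken from the benchmark workflow. Because the table shows rounded measurements, a few rows (for example the getpid syscall) will not reproduce exactly from the displayed numbers.

```rust
/// Ratio of the current measurement to the previous one.
/// Assumption: the table's "Performance Ratio" is current / previous.
/// Returns `None` if the previous value is zero or an input is not finite.
fn performance_ratio(current: f64, previous: f64) -> Option<f64> {
    if previous == 0.0 || !current.is_finite() || !previous.is_finite() {
        return None;
    }
    Some(current / previous)
}

/// Formats a ratio with two decimal places, matching the table's style.
fn format_ratio(ratio: f64) -> String {
    format!("{ratio:.2}")
}

fn main() {
    // (row label, current, previous), values copied from the table above.
    let rows = [
        ("Average Allocation time", 708.39, 262.62),
        ("Average Deallocation time", 28.48, 17.05),
    ];

    for (name, current, previous) in rows {
        match performance_ratio(current, previous) {
            Some(ratio) => println!("{name}: {}", format_ratio(ratio)),
            None => println!("{name}: n/a"),
        }
    }
}
```

Whether a ratio above 1 is a regression depends on the metric: for times (ticks, seconds) larger is slower, while for throughput (MByte/s) larger is faster.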

This comment was automatically generated by workflow using github-action-benchmark.
