Experiment with thread local pool of audio render quanta #511

Open: wants to merge 11 commits into base: main

orottier
Owner

No description provided.

@orottier
Owner Author

/bench

Benchmark result:


bench_ctor
  Instructions:             2827917 (-11.09170%)
  L1 Accesses:              4275093 (-12.07793%)
  L2 Accesses:                  634 (-19.74684%)
  RAM Accesses:               54210 (-1.893007%)
  Estimated Cycles:         6175613 (-9.185848%)

bench_audio_buffer_decode
  Instructions:             7359267 (-0.027332%)
  L1 Accesses:             10164632 (+0.143862%)
  L2 Accesses:                28246 (+1.374583%)
  RAM Accesses:               31601 (-1.826711%)
  Estimated Cycles:        11411897 (-0.035591%)

bench_constant_source
  Instructions:             7285314 (-5.698699%)
  L1 Accesses:             10768798 (-7.193944%)
  L2 Accesses:                  776 (-16.37931%)
  RAM Accesses:               55808 (-0.975904%)
  Estimated Cycles:        12725958 (-6.293944%)

bench_sine
  Instructions:            60056057 (-0.892727%)
  L1 Accesses:             87826990 (-1.154348%)
  L2 Accesses:               208675 (-3.366598%)
  RAM Accesses:               54643 (-4.244283%)
  Estimated Cycles:        90782870 (-1.247464%)

bench_sine_gain
  Instructions:            63914209 (-1.724634%)
  L1 Accesses:             93606919 (-2.117027%)
  L2 Accesses:               212485 (-20.23178%)
  RAM Accesses:               56628 (+2.405150%)
  Estimated Cycles:        96651324 (-2.272484%)

bench_sine_gain_delay
  Instructions:            95927599 (-1.897279%)
  L1 Accesses:            137944146 (-2.411955%)
  L2 Accesses:               372142 (-15.87939%)
  RAM Accesses:               56874 (-0.819615%)
  Estimated Cycles:       141795446 (-2.594637%)

bench_buffer_src
  Instructions:            17488146 (-3.390438%)
  L1 Accesses:             25451894 (-4.477244%)
  L2 Accesses:                84493 (+1.412694%)
  RAM Accesses:               93232 (-0.604484%)
  Estimated Cycles:        29137479 (-3.977390%)

bench_buffer_src_delay
  Instructions:            73655526 (-2.052707%)
  L1 Accesses:            102505715 (-2.846340%)
  L2 Accesses:               266838 (-3.609087%)
  RAM Accesses:               93362 (-0.640671%)
  Estimated Cycles:       107107575 (-2.790087%)

bench_buffer_src_iir
  Instructions:            80999997 (-1.220262%)
  L1 Accesses:            114452932 (-1.649352%)
  L2 Accesses:                80621 (-5.433240%)
  RAM Accesses:               92839 (-0.609156%)
  Estimated Cycles:       118105402 (-1.634462%)

bench_buffer_src_biquad
  Instructions:            57192358 (-2.493995%)
  L1 Accesses:             77037560 (-3.519797%)
  L2 Accesses:               182601 (+3.099751%)
  RAM Accesses:               92985 (-0.607142%)
  Estimated Cycles:        81205040 (-3.336492%)

bench_stereo_positional
  Instructions:            42776488 (-6.025241%)
  L1 Accesses:             63636479 (-7.774956%)
  L2 Accesses:               427916 (+25.96064%)
  RAM Accesses:               93091 (-1.101691%)
  Estimated Cycles:        69034244 (-6.703408%)

bench_stereo_panning_automation
  Instructions:            28146477 (-3.793936%)
  L1 Accesses:             42034852 (-4.818928%)
  L2 Accesses:                90341 (-9.843820%)
  RAM Accesses:               93347 (-0.100598%)
  Estimated Cycles:        45753702 (-4.549531%)

bench_analyser_node
  Instructions:            35596158 (-4.196323%)
  L1 Accesses:             49196136 (-5.631479%)
  L2 Accesses:               130580 (+2.560478%)
  RAM Accesses:               94292 (-0.099591%)
  Estimated Cycles:        53149256 (-5.212560%)

bench_hrtf_panners
  Instructions:           320284636 (-0.121592%)
  L1 Accesses:            561659237 (-0.125074%)
  L2 Accesses:             10841676 (-0.424856%)
  RAM Accesses:              120159 (-0.558618%)
  Estimated Cycles:       620073182 (-0.154306%)

bench_sine_gain_with_worklet
  Instructions:            65101992 (-1.201544%)
  L1 Accesses:             95387845 (-1.568779%)
  L2 Accesses:               218248 (-2.324103%)
  RAM Accesses:               54932 (-0.984174%)
  Estimated Cycles:        98401705 (-1.565865%)


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
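
For reading these reports, it helps to know how the "Estimated Cycles" line relates to the three access counts. The figures above are consistent with a weighted sum of L1 + 5 × L2 + 35 × RAM. The sketch below reproduces the `bench_ctor` value from this report; the `CacheStats` type and the `estimated_cycles` function are hypothetical names used for illustration only, not part of the benchmark tooling.

```rust
/// Cache access counts as printed for a single benchmark.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy)]
struct CacheStats {
    l1_accesses: u64,
    l2_accesses: u64,
    ram_accesses: u64,
}

/// Weighted sum that matches the reported figures: L1 + 5 * L2 + 35 * RAM.
fn estimated_cycles(stats: CacheStats) -> u64 {
    stats.l1_accesses + 5 * stats.l2_accesses + 35 * stats.ram_accesses
}

fn main() {
    // bench_ctor figures taken from the report above.
    let ctor = CacheStats {
        l1_accesses: 4_275_093,
        l2_accesses: 634,
        ram_accesses: 54_210,
    };
    println!("{}", estimated_cycles(ctor)); // prints 6175613
}
```

Because RAM accesses carry the largest weight, a benchmark whose RAM access count barely moves will show a smaller change in estimated cycles than in L1 accesses.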


@orottier orottier force-pushed the feature/alloc-thread-local branch from a1f0270 to d915944 Compare May 16, 2024 17:35
@orottier
Owner Author

/bench

Benchmark result:


bench_ctor
  Instructions:             2698601 (-15.15733%)
  L1 Accesses:              4043680 (-16.83723%)
  L2 Accesses:                  605 (-23.22335%)
  RAM Accesses:               54658 (-1.082235%)
  Estimated Cycles:         5959735 (-12.36029%)

bench_audio_buffer_decode
  Instructions:             7359263 (-0.027061%)
  L1 Accesses:             10164259 (+0.141430%)
  L2 Accesses:                28626 (+2.396623%)
  RAM Accesses:               31588 (-1.888433%)
  Estimated Cycles:        11412969 (-0.031314%)

bench_constant_source
  Instructions:             6668424 (-13.68374%)
  L1 Accesses:              9683207 (-16.54964%)
  L2 Accesses:                  778 (-15.98272%)
  RAM Accesses:               54109 (-3.990560%)
  Estimated Cycles:        11580912 (-14.72530%)

bench_sine
  Instructions:            59424551 (-1.934867%)
  L1 Accesses:             86676187 (-2.448279%)
  L2 Accesses:               209767 (-3.373255%)
  RAM Accesses:               54621 (-4.272770%)
  Estimated Cycles:        89636757 (-2.498833%)

bench_sine_gain
  Instructions:            62855330 (-3.352781%)
  L1 Accesses:             91683998 (-4.125564%)
  L2 Accesses:               215166 (-19.89412%)
  RAM Accesses:               54731 (-1.016404%)
  Estimated Cycles:        94675413 (-4.278842%)

bench_sine_gain_delay
  Instructions:            98691751 (+0.929549%)
  L1 Accesses:            139662333 (-1.203078%)
  L2 Accesses:               365796 (-15.49831%)
  RAM Accesses:               56860 (-0.835383%)
  Estimated Cycles:       143481413 (-1.410610%)

bench_buffer_src
  Instructions:            16682378 (-7.841631%)
  L1 Accesses:             24005116 (-9.905579%)
  L2 Accesses:                79499 (-5.074687%)
  RAM Accesses:               93696 (-0.107679%)
  Estimated Cycles:        27681971 (-8.778962%)

bench_buffer_src_delay
  Instructions:            81181635 (+7.955548%)
  L1 Accesses:            109492742 (+3.773484%)
  L2 Accesses:               254693 (-7.176101%)
  RAM Accesses:               93400 (-0.599172%)
  Estimated Cycles:       114035207 (+3.506608%)

bench_buffer_src_iir
  Instructions:            79580761 (-2.951038%)
  L1 Accesses:            111954785 (-3.795736%)
  L2 Accesses:                86285 (+0.766096%)
  RAM Accesses:               92839 (-0.610220%)
  Estimated Cycles:       115635575 (-3.692732%)

bench_buffer_src_biquad
  Instructions:            55531167 (-5.326220%)
  L1 Accesses:             74018765 (-7.255639%)
  L2 Accesses:               166285 (-22.93629%)
  RAM Accesses:               92982 (-0.608224%)
  Estimated Cycles:        78104560 (-7.198037%)

bench_stereo_positional
  Instructions:            40338933 (-11.38029%)
  L1 Accesses:             58888047 (-14.55776%)
  L2 Accesses:               460434 (+9.739757%)
  RAM Accesses:               93086 (-1.108054%)
  Estimated Cycles:        64448227 (-13.27559%)

bench_stereo_panning_automation
  Instructions:            26989470 (-7.748568%)
  L1 Accesses:             39783687 (-9.945031%)
  L2 Accesses:                84741 (-1.578397%)
  RAM Accesses:               93348 (-0.100597%)
  Estimated Cycles:        43474572 (-9.197346%)

bench_analyser_node
  Instructions:            34049854 (-8.358142%)
  L1 Accesses:             46474700 (-10.85066%)
  L2 Accesses:               127615 (-0.301560%)
  RAM Accesses:               94284 (-0.110183%)
  Estimated Cycles:        50412715 (-10.09750%)

bench_hrtf_panners
  Instructions:           319927040 (-0.233104%)
  L1 Accesses:            560954226 (-0.251110%)
  L2 Accesses:             10855907 (-0.259451%)
  RAM Accesses:              120146 (-0.575959%)
  Estimated Cycles:       619438871 (-0.254053%)

bench_sine_gain_with_worklet
  Instructions:            64117929 (-2.694953%)
  L1 Accesses:             93607803 (-3.382749%)
  L2 Accesses:               212823 (-13.62002%)
  RAM Accesses:               56779 (+2.356144%)
  Estimated Cycles:        96659183 (-3.397432%)


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s


@orottier
Owner Author

Interesting regression on the delay nodes between versions one and two... 🤔

@orottier
Owner Author

I can't reproduce the regression locally: not with cargo bench, and the example/benchmarks.rs also shows an improvement for the delay test.

Let's try again /bench

Benchmark result:


bench_ctor
  Instructions:             2698601 (-15.15733%)
  L1 Accesses:              4043680 (-16.83723%)
  L2 Accesses:                  605 (-23.22335%)
  RAM Accesses:               54658 (-1.082235%)
  Estimated Cycles:         5959735 (-12.36029%)

bench_audio_buffer_decode
  Instructions:             7359289 (-0.027359%)
  L1 Accesses:             10164286 (+0.141183%)
  L2 Accesses:                28626 (+2.392961%)
  RAM Accesses:               31588 (-1.885386%)
  Estimated Cycles:        11412996 (-0.031270%)

bench_constant_source
  Instructions:             6668424 (-13.68374%)
  L1 Accesses:              9683205 (-16.54966%)
  L2 Accesses:                  780 (-15.76674%)
  RAM Accesses:               54109 (-3.990560%)
  Estimated Cycles:        11580920 (-14.72524%)

bench_sine
  Instructions:            59424551 (-1.934867%)
  L1 Accesses:             86676185 (-2.448281%)
  L2 Accesses:               209769 (-3.372334%)
  RAM Accesses:               54621 (-4.272770%)
  Estimated Cycles:        89636765 (-2.498824%)

bench_sine_gain
  Instructions:            62855330 (-3.352781%)
  L1 Accesses:             91683996 (-4.125568%)
  L2 Accesses:               215168 (-19.89278%)
  RAM Accesses:               54731 (-1.016404%)
  Estimated Cycles:        94675421 (-4.278826%)

bench_sine_gain_delay
  Instructions:            98691764 (+0.929558%)
  L1 Accesses:            139662347 (-1.203073%)
  L2 Accesses:               365796 (-15.49811%)
  RAM Accesses:               56860 (-0.833653%)
  Estimated Cycles:       143481427 (-1.410579%)

bench_buffer_src
  Instructions:            16682376 (-7.841744%)
  L1 Accesses:             24005114 (-9.905641%)
  L2 Accesses:                79499 (-5.076954%)
  RAM Accesses:               93693 (-0.113008%)
  Estimated Cycles:        27681864 (-8.779603%)

bench_buffer_src_delay
  Instructions:            81181619 (+7.955509%)
  L1 Accesses:            109492725 (+3.773454%)
  L2 Accesses:               254693 (-7.176101%)
  RAM Accesses:               93398 (-0.600243%)
  Estimated Cycles:       114035120 (+3.506549%)

bench_buffer_src_iir
  Instructions:            79580749 (-2.951053%)
  L1 Accesses:            111954775 (-3.795744%)
  L2 Accesses:                86285 (+0.766096%)
  RAM Accesses:               92836 (-0.613431%)
  Estimated Cycles:       115635460 (-3.692828%)

bench_buffer_src_biquad
  Instructions:            55531147 (-5.326189%)
  L1 Accesses:             74018742 (-7.255616%)
  L2 Accesses:               166285 (-22.93664%)
  RAM Accesses:               92980 (-0.609300%)
  Estimated Cycles:        78104467 (-7.198065%)

bench_stereo_positional
  Instructions:            40338909 (-11.38027%)
  L1 Accesses:             58888024 (-14.55775%)
  L2 Accesses:               460432 (+9.739803%)
  RAM Accesses:               93085 (-1.108066%)
  Estimated Cycles:        64448159 (-13.27559%)

bench_stereo_panning_automation
  Instructions:            26989470 (-7.748758%)
  L1 Accesses:             39783686 (-9.945168%)
  L2 Accesses:                84741 (-1.580683%)
  RAM Accesses:               93349 (-0.096320%)
  Estimated Cycles:        43474606 (-9.197220%)

bench_analyser_node
  Instructions:            34049850 (-8.358094%)
  L1 Accesses:             46474696 (-10.85063%)
  L2 Accesses:               127613 (-0.303123%)
  RAM Accesses:               94284 (-0.110183%)
  Estimated Cycles:        50412701 (-10.09748%)

bench_hrtf_panners
  Instructions:           319927019 (-0.233108%)
  L1 Accesses:            560954231 (-0.251072%)
  L2 Accesses:             10855876 (-0.261532%)
  RAM Accesses:              120146 (-0.575959%)
  Estimated Cycles:       619438721 (-0.254202%)

bench_sine_gain_with_worklet
  Instructions:            64117929 (-2.694953%)
  L1 Accesses:             93607803 (-3.382749%)
  L2 Accesses:               212823 (-13.62002%)
  RAM Accesses:               56779 (+2.356144%)
  Estimated Cycles:        96659183 (-3.397432%)


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s


@orottier
Owner Author

/bench

Benchmark result:


bench_ctor
  Instructions:             2698601 (-15.15733%)
  L1 Accesses:              4043675 (-16.83730%)
  L2 Accesses:                  602 (-23.60406%)
  RAM Accesses:               54666 (-1.071338%)
  Estimated Cycles:         5959995 (-12.35735%)

bench_audio_buffer_decode
  Instructions:             7359267 (-0.027712%)
  L1 Accesses:             10164286 (+0.141173%)
  L2 Accesses:                28593 (+2.263948%)
  RAM Accesses:               31600 (-1.854210%)
  Estimated Cycles:        11413251 (-0.029790%)

bench_constant_source
  Instructions:             6680815 (-13.52335%)
  L1 Accesses:              9716251 (-16.26488%)
  L2 Accesses:                  763 (-17.42424%)
  RAM Accesses:               54103 (-3.999503%)
  Estimated Cycles:        11613671 (-14.48382%)

bench_sine
  Instructions:            59439577 (-1.910071%)
  L1 Accesses:             86718921 (-2.400788%)
  L2 Accesses:               208325 (-3.793312%)
  RAM Accesses:               54617 (-4.279781%)
  Estimated Cycles:        89672141 (-2.458006%)

bench_sine_gain
  Instructions:            62870376 (-3.329646%)
  L1 Accesses:             91734090 (-4.073188%)
  L2 Accesses:               213895 (-20.36553%)
  RAM Accesses:               54728 (-1.023619%)
  Estimated Cycles:        94719045 (-4.234738%)

bench_sine_gain_delay
  Instructions:            98733389 (+0.972131%)
  L1 Accesses:            139741713 (-1.147912%)
  L2 Accesses:               380687 (-11.77058%)
  RAM Accesses:               56863 (-0.830151%)
  Estimated Cycles:       143635353 (-1.301004%)

bench_buffer_src
  Instructions:            16723671 (-7.613680%)
  L1 Accesses:             24091703 (-9.580715%)
  L2 Accesses:                79221 (-5.408891%)
  RAM Accesses:               93694 (-0.108747%)
  Estimated Cycles:        27767098 (-8.498462%)

bench_buffer_src_delay
  Instructions:            81264523 (+8.065767%)
  L1 Accesses:            109665227 (+3.936952%)
  L2 Accesses:               251596 (-8.304146%)
  RAM Accesses:               93399 (-0.600236%)
  Estimated Cycles:       114192172 (+3.649083%)

bench_buffer_src_iir
  Instructions:            79610801 (-2.914419%)
  L1 Accesses:            112043320 (-3.719669%)
  L2 Accesses:                80319 (-6.201170%)
  RAM Accesses:               92833 (-0.614515%)
  Estimated Cycles:       115694070 (-3.643970%)

bench_buffer_src_biquad
  Instructions:            55561212 (-5.274906%)
  L1 Accesses:             74114642 (-7.135442%)
  L2 Accesses:               167996 (-22.14189%)
  RAM Accesses:               92975 (-0.616769%)
  Estimated Cycles:        78208747 (-7.074199%)

bench_stereo_positional
  Instructions:            40369147 (-11.31386%)
  L1 Accesses:             59049494 (-14.29591%)
  L2 Accesses:               438056 (-0.833526%)
  RAM Accesses:               93081 (-1.113366%)
  Estimated Cycles:        64497609 (-13.31255%)

bench_stereo_panning_automation
  Instructions:            27019591 (-7.645689%)
  L1 Accesses:             39874760 (-9.738930%)
  L2 Accesses:                80087 (-6.989141%)
  RAM Accesses:               93339 (-0.104884%)
  Estimated Cycles:        43542060 (-9.056152%)

bench_analyser_node
  Instructions:            34094913 (-8.236781%)
  L1 Accesses:             46582731 (-10.64337%)
  L2 Accesses:               128412 (+0.321091%)
  RAM Accesses:               94279 (-0.113364%)
  Estimated Cycles:        50524556 (-9.897879%)

bench_hrtf_panners
  Instructions:           319941904 (-0.228470%)
  L1 Accesses:            561034266 (-0.236838%)
  L2 Accesses:             10811052 (-0.673398%)
  RAM Accesses:              120196 (-0.557624%)
  Estimated Cycles:       619296386 (-0.277280%)

bench_sine_gain_with_worklet
  Instructions:            64132975 (-2.672119%)
  L1 Accesses:             93657270 (-3.332099%)
  L2 Accesses:               212180 (-13.73920%)
  RAM Accesses:               56773 (+2.350863%)
  Estimated Cycles:        96705225 (-3.349754%)


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s


@orottier orottier marked this pull request as ready for review May 17, 2024 19:15
@orottier
Owner Author

Hey @b-ma, the idea for this change has been growing on me for a long time, and I'm happy to see some benchmark improvements again. Could you have a look? I'm particularly interested in what cargo bench reports for you on the delay tests (run it first on main, then on this branch). The IAI numbers for those tests look bad, but I don't understand why: both cargo bench and the benchmarks example show improvements for the delay tests too.

@orottier orottier requested a review from b-ma May 17, 2024 19:18
@b-ma
Collaborator

b-ma commented May 18, 2024

Hey, I'm mostly offline this weekend; I'll have a look on Monday.

@b-ma
Collaborator

b-ma commented May 20, 2024

Hey, so on my side:

  1. With the benchmarks example, all numbers seem to improve.
  2. With cargo bench, I see some small regressions, but only on benches that involve a sine, which is a bit weird...
bench_sine              time:   [2.8590 ms 2.8640 ms 2.8685 ms]
                        change: [+2.4613% +3.0625% +3.6813%] (p = 0.00 < 0.05)
                        Performance has regressed.

bench_sine_gain         time:   [3.0442 ms 3.0501 ms 3.0555 ms]
                        change: [+2.7440% +3.4710% +4.2233%] (p = 0.00 < 0.05)
                        Performance has regressed.

Do we agree that I should run cargo bench on main first, and that the reported change is then computed against that baseline? It seems logical, but I'm not actually sure.

}
thread_local! {
    /// Thread local pool of pre-allocated sample slices that can be reused after being dropped
    static POOL: RefCell<Vec<Rc<[f32; RENDER_QUANTUM_SIZE]>>> = RefCell::new(Vec::with_capacity(32));
Collaborator

@b-ma b-ma May 20, 2024

I wonder whether we should have a larger initial pool. A stereo delay of 1 second (the default max_delay_time) uses 375 * 2 AudioRenderQuantumChannel buffers, so something like 1024 preallocated buffers would guarantee that we don't allocate, at least when creating a single delay.
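
The 375 * 2 figure can be checked with a little arithmetic. The sketch below assumes a 48 kHz sample rate and a render quantum of 128 frames (48000 / 128 = 375). The function `quanta_for_delay` is a hypothetical helper for illustration, and the real delay node may round or pad differently.

```rust
/// Frames per render quantum (assumed to be 128 here).
const RENDER_QUANTUM_SIZE: usize = 128;

/// Number of single-channel render quanta needed to hold `max_delay_time`
/// seconds of audio for `channels` channels, rounding up per channel.
/// (Hypothetical helper, not part of the crate.)
fn quanta_for_delay(sample_rate: f64, max_delay_time: f64, channels: usize) -> usize {
    let frames = (sample_rate * max_delay_time).ceil() as usize;
    let per_channel = (frames + RENDER_QUANTUM_SIZE - 1) / RENDER_QUANTUM_SIZE;
    per_channel * channels
}

fn main() {
    // Stereo delay of 1 second at an assumed 48 kHz sample rate.
    let needed = quanta_for_delay(48_000.0, 1.0, 2);
    println!("{needed}"); // prints 750

    // Sample data held by a pool of 1024 buffers, ignoring Rc bookkeeping.
    let bytes = 1024 * RENDER_QUANTUM_SIZE * std::mem::size_of::<f32>();
    println!("{bytes}"); // prints 524288
}
```

Under these assumptions a pool of 1024 buffers covers one stereo delay with room to spare, at a cost of 512 KiB of sample data per thread that touches the pool.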

Owner Author

Good point. We could definitely do that, but we should keep in mind that this could happen on any thread and the memory is never reclaimed, so being a bit conservative would not hurt.
I'm having some further thoughts about this setup, though: with async rendering, the OfflineAudioContext can move to any thread of the executor, and we may end up leaking a lot of memory.
I could specialize the pool size for offline and online rendering. Starting with a small pool for offline rendering is fine, because we can safely allocate more during rendering.
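
To make the trade-off concrete, here is a minimal sketch of a thread-local buffer pool whose initial size is chosen by the caller. It is not the code of this PR: `warm_up`, `acquire`, `release` and `pool_len` are hypothetical helpers, and the pool type only mirrors the declaration quoted in the review.

```rust
use std::cell::RefCell;
use std::rc::Rc;

const RENDER_QUANTUM_SIZE: usize = 128;

type Samples = Rc<[f32; RENDER_QUANTUM_SIZE]>;

thread_local! {
    /// Per-thread stack of spare buffers; every thread gets its own instance.
    static POOL: RefCell<Vec<Samples>> = RefCell::new(Vec::new());
}

/// Hypothetical helper: fill the current thread's pool up to `count` buffers,
/// e.g. a large count for realtime rendering and a small one for offline.
fn warm_up(count: usize) {
    POOL.with(|pool| {
        let mut pool = pool.borrow_mut();
        while pool.len() < count {
            pool.push(Rc::new([0.0_f32; RENDER_QUANTUM_SIZE]));
        }
    });
}

/// Take a buffer from the pool, allocating only when the pool is empty.
/// Recycled buffers may contain stale samples.
fn acquire() -> Samples {
    POOL.with(|pool| pool.borrow_mut().pop())
        .unwrap_or_else(|| Rc::new([0.0_f32; RENDER_QUANTUM_SIZE]))
}

/// Hand a buffer back to the pool of the current thread.
fn release(buffer: Samples) {
    // Only recycle uniquely owned buffers; shared ones are still in use.
    if Rc::strong_count(&buffer) == 1 {
        POOL.with(|pool| pool.borrow_mut().push(buffer));
    }
}

/// Number of spare buffers currently held by this thread's pool.
fn pool_len() -> usize {
    POOL.with(|pool| pool.borrow().len())
}

fn main() {
    warm_up(4);
    let buffer = acquire();
    println!("{}", pool_len()); // prints 3
    release(buffer);
    println!("{}", pool_len()); // prints 4

    // A different thread starts with its own, empty pool.
    let other = std::thread::spawn(pool_len).join().unwrap();
    println!("{other}"); // prints 0
}
```

In this sketch the pool never shrinks while its thread is alive, so every thread that renders keeps its own set of buffers. That is the reason a conservative initial size matters when a context can hop between executor threads.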

@orottier
Owner Author

Do we agree that I should run cargo bench on main first, and that the reported change is then computed against that baseline? It seems logical, but I'm not actually sure.

Yup that's how it works. Weird that your sines seem to be affected because none of this PR should matter there..

@b-ma
Collaborator

b-ma commented May 22, 2024

Weird that your sines seem to be affected because none of this PR should matter there..

Yup I agree... this is quite weird

@orottier
Owner Author

/bench

Benchmark result:


bench_ctor
  Instructions:             2698601 (-15.15733%)
  L1 Accesses:              4043688 (-16.83718%)
  L2 Accesses:                  591 (-24.52107%)
  RAM Accesses:               54664 (-1.067796%)
  Estimated Cycles:         5959883 (-12.35698%)

bench_audio_buffer_decode
  Instructions:             7359239 (-0.027495%)
  L1 Accesses:             10164831 (+0.146641%)
  L2 Accesses:                28021 (+0.340185%)
  RAM Accesses:               31595 (-1.863643%)
  Estimated Cycles:        11410761 (-0.049411%)

bench_constant_source
  Instructions:             6742919 (-12.71947%)
  L1 Accesses:              9799089 (-15.55105%)
  L2 Accesses:                 1131 (+23.20261%)
  RAM Accesses:               55180 (-2.083259%)
  Estimated Cycles:        11736044 (-13.58194%)

bench_sine
  Instructions:            59501821 (-1.807353%)
  L1 Accesses:             86803033 (-2.302172%)
  L2 Accesses:               207492 (-5.739868%)
  RAM Accesses:               55684 (-2.418337%)
  Estimated Cycles:        89789433 (-2.345846%)

bench_sine_gain
  Instructions:            62932418 (-3.234249%)
  L1 Accesses:             91796990 (-4.015710%)
  L2 Accesses:               234010 (-10.10852%)
  RAM Accesses:               55779 (+0.868008%)
  Estimated Cycles:        94919305 (-4.000319%)

bench_sine_gain_delay
  Instructions:            98743617 (+0.982591%)
  L1 Accesses:            139745845 (-1.138707%)
  L2 Accesses:               387777 (-11.95891%)
  RAM Accesses:               56844 (-0.875388%)
  Estimated Cycles:       143674270 (-1.298775%)

bench_buffer_src
  Instructions:            16791604 (-7.238275%)
  L1 Accesses:             24180991 (-9.245511%)
  L2 Accesses:                81634 (-2.528895%)
  RAM Accesses:               92758 (-1.107711%)
  Estimated Cycles:        27835691 (-8.272462%)

bench_buffer_src_delay
  Instructions:            81280143 (+8.086544%)
  L1 Accesses:            109672677 (+3.947820%)
  L2 Accesses:               259788 (-6.631349%)
  RAM Accesses:               93427 (-0.571495%)
  Estimated Cycles:       114241562 (+3.679364%)

bench_buffer_src_iir
  Instructions:            79678323 (-2.832052%)
  L1 Accesses:            112130795 (-3.644606%)
  L2 Accesses:                83631 (-2.161935%)
  RAM Accesses:               92835 (-0.614502%)
  Estimated Cycles:       115798175 (-3.556823%)

bench_buffer_src_biquad
  Instructions:            55625750 (-5.164864%)
  L1 Accesses:             74190978 (-7.064934%)
  L2 Accesses:               178329 (-8.158315%)
  RAM Accesses:               93915 (+0.386946%)
  Estimated Cycles:        78369648 (-6.787345%)

bench_stereo_positional
  Instructions:            40434469 (-11.17037%)
  L1 Accesses:             59175968 (-14.18677%)
  L2 Accesses:               405469 (+6.146496%)
  RAM Accesses:               94024 (-0.110488%)
  Estimated Cycles:        64494153 (-13.03783%)

bench_stereo_panning_automation
  Instructions:            27086734 (-7.416229%)
  L1 Accesses:             39958635 (-9.549281%)
  L2 Accesses:                86675 (+0.777853%)
  RAM Accesses:               93345 (-0.108084%)
  Estimated Cycles:        43659085 (-8.811582%)

bench_analyser_node
  Instructions:            34180922 (-8.005296%)
  L1 Accesses:             46676908 (-10.46196%)
  L2 Accesses:               144365 (+12.39441%)
  RAM Accesses:               93353 (-1.091298%)
  Estimated Cycles:        50666088 (-9.648178%)

bench_hrtf_panners
  Instructions:           319998961 (-0.210682%)
  L1 Accesses:            561071372 (-0.229956%)
  L2 Accesses:             10850545 (-0.325648%)
  RAM Accesses:              120965 (+0.106756%)
  Estimated Cycles:       619557872 (-0.236050%)

bench_sine_gain_with_worklet
  Instructions:            64194679 (-2.578477%)
  L1 Accesses:             93736388 (-3.273193%)
  L2 Accesses:               215521 (-3.431326%)
  RAM Accesses:               57832 (+4.254399%)
  Estimated Cycles:        96838113 (-3.128759%)


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s


With the new design of 'silent quantum' we can also instantiate silent
buffers on the control thread.
@orottier
Owner Author

/bench

Benchmark result:


bench_ctor
  Instructions:             2700496 (-15.09775%)
  L1 Accesses:              4046526 (-16.77882%)
  L2 Accesses:                  594 (-24.13793%)
  RAM Accesses:               53732 (-2.754552%)
  Estimated Cycles:         5930116 (-12.79472%)

bench_audio_buffer_decode
  Instructions:             7359287 (-0.026952%)
  L1 Accesses:             10164914 (+0.147419%)
  L2 Accesses:                28005 (+0.279300%)
  RAM Accesses:               31593 (-1.875951%)
  Estimated Cycles:        11410694 (-0.050690%)

bench_constant_source
  Instructions:             6743115 (-12.71694%)
  L1 Accesses:              9799379 (-15.54855%)
  L2 Accesses:                 1135 (+23.63834%)
  RAM Accesses:               55085 (-2.251837%)
  Estimated Cycles:        11733029 (-13.60414%)

bench_sine
  Instructions:            59501808 (-1.807374%)
  L1 Accesses:             86788858 (-2.318126%)
  L2 Accesses:               221639 (+0.686876%)
  RAM Accesses:               55687 (-2.413080%)
  Estimated Cycles:        89846098 (-2.284218%)

bench_sine_gain
  Instructions:            62932512 (-3.234105%)
  L1 Accesses:             91795499 (-4.017271%)
  L2 Accesses:               235654 (-9.476304%)
  RAM Accesses:               55779 (+0.868008%)
  Estimated Cycles:        94926034 (-3.993506%)

bench_sine_gain_delay
  Instructions:            98454324 (+0.686738%)
  L1 Accesses:            139341680 (-1.424628%)
  L2 Accesses:               371325 (-15.69418%)
  RAM Accesses:               56820 (-0.917239%)
  Estimated Cycles:       143187005 (-1.633515%)

bench_buffer_src
  Instructions:            16791593 (-7.238562%)
  L1 Accesses:             24181318 (-9.244457%)
  L2 Accesses:                81292 (-2.937243%)
  RAM Accesses:               92764 (-1.101315%)
  Estimated Cycles:        27834518 (-8.276481%)

bench_buffer_src_delay
  Instructions:            80991007 (+7.702068%)
  L1 Accesses:            109261887 (+3.558487%)
  L2 Accesses:               250231 (-10.06649%)
  RAM Accesses:               93420 (-0.578945%)
  Estimated Cycles:       113782742 (+3.262972%)

bench_buffer_src_iir
  Instructions:            79678314 (-2.832072%)
  L1 Accesses:            112131623 (-3.643900%)
  L2 Accesses:                82759 (-3.182068%)
  RAM Accesses:               92840 (-0.610213%)
  Estimated Cycles:       115794818 (-3.559652%)

bench_buffer_src_biquad
  Instructions:            55625740 (-5.164881%)
  L1 Accesses:             74242514 (-7.000381%)
  L2 Accesses:               126738 (-34.72833%)
  RAM Accesses:               93928 (+0.404062%)
  Estimated Cycles:        78163684 (-7.032205%)

bench_stereo_positional
  Instructions:            40441833 (-11.15420%)
  L1 Accesses:             59196884 (-14.15644%)
  L2 Accesses:               444293 (+16.31103%)
  RAM Accesses:               94029 (-0.105176%)
  Estimated Cycles:        64709364 (-12.74763%)

bench_stereo_panning_automation
  Instructions:            27086718 (-7.416207%)
  L1 Accesses:             39961012 (-9.543847%)
  L2 Accesses:                84241 (-2.051044%)
  RAM Accesses:               93352 (-0.101662%)
  Estimated Cycles:        43649537 (-8.831532%)

bench_analyser_node
  Instructions:            34180898 (-8.005390%)
  L1 Accesses:             46680352 (-10.45537%)
  L2 Accesses:               140861 (+9.668956%)
  RAM Accesses:               93356 (-1.092311%)
  Estimated Cycles:        50652117 (-9.673313%)

bench_hrtf_panners
  Instructions:           320000238 (-0.210280%)
  L1 Accesses:            561123163 (-0.220779%)
  L2 Accesses:             10810325 (-0.693271%)
  RAM Accesses:              120939 (+0.086068%)
  Estimated Cycles:       619407653 (-0.260101%)

bench_sine_gain_with_worklet
  Instructions:            64194665 (-2.578499%)
  L1 Accesses:             93733858 (-3.275944%)
  L2 Accesses:               218022 (-2.248944%)
  RAM Accesses:               57834 (+4.256125%)
  Estimated Cycles:        96848158 (-3.118197%)


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

