-
Notifications
You must be signed in to change notification settings - Fork 4
add #[inline] to small and single-use functions #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Checking on x86_64. |
|
Can you fix the bencher source code please? Markdown broken apparently. ^^; |
|
Fixed, by adding newlines around |
|
Thanks! Here are the results:
|
|
Surprisingly, |
| Ok(b) | ||
| } | ||
|
|
||
| /// Converts the result of a unicode escape to the unit type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please keep the doc comment, always useful. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The doc comment is still there on the function def in the trait, so repeating it 3 times seems silly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, then please ignore my comment.
Not to mention unfortunate :(. How are the results without the inline on Check::check_raw? |
|
Without
I didn't expect the difference to be this big. O.o |
eb3294d to
73aae9e
Compare
|
|
src/lib.rs
Outdated
| /// which are returned via `callback`. | ||
| /// | ||
| /// NOTE: Does no escaping, but produces errors for bare carriage return ('\r'). | ||
| #[cfg_attr(any(target_arch = "aarch64", target_arch = "arm"), inline)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of having a cfg_attr, what are the results without this attribute on ARM? If they're not too bad, I think it'd be better to not start having conditional inlining for potential performance improvement (which might change any time).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically this is the only inline that generates all the performance wins on arm...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Ok, then please add a comment explaining why we have this attribute and what to check if it needs to be updated.
In the meantime, runnning benches with this new change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, maybe I misremembered slightly, as there are some small wins:
| bench name | current | new | diff |
|---|---|---|---|
| bench_check_raw_byte_str_ascii | 44628.4 ns/iter (+/- 102.2) | 44629.32 ns/iter (+/- 73.66) | 0.0% |
| bench_check_raw_c_str_ascii | 41234.96 ns/iter (+/- 121.9) | 41071.26 ns/iter (+/- 47.96) | -0.4% |
| bench_check_raw_c_str_non_ascii | 52474.5 ns/iter (+/- 72.77) | 52578.81 ns/iter (+/- 115.4) | 0.2% |
| bench_check_raw_c_str_unicode | 174637.42 ns/iter (+/- 1521.31) | 173704.05 ns/iter (+/- 1150.51) | -0.5% |
| bench_check_raw_str_ascii | 40644.14 ns/iter (+/- 93.49) | 40668.03 ns/iter (+/- 91.28) | 0.1% |
| bench_check_raw_str_non_ascii | 50437.34 ns/iter (+/- 81.88) | 50439.37 ns/iter (+/- 92.4) | 0.0% |
| bench_check_raw_str_unicode | 167964.36 ns/iter (+/- 613.95) | 168088.36 ns/iter (+/- 368.93) | 0.1% |
| bench_skip_ascii_whitespace | 13081.61 ns/iter (+/- 31.3) | 13095.27 ns/iter (+/- 14.95) | 0.1% |
| bench_unescape_byte_str_ascii | 68916.92 ns/iter (+/- 168.16) | 69179.61 ns/iter (+/- 393.16) | 0.4% |
| bench_unescape_byte_str_ascii_escape | 81236.48 ns/iter (+/- 177.29) | 81247.42 ns/iter (+/- 180.38) | 0.0% |
| bench_unescape_byte_str_hex_escape | 125728.99 ns/iter (+/- 256.18) | 125697.59 ns/iter (+/- 253.15) | -0.0% |
| bench_unescape_byte_str_mixed_escape | 348869.9 ns/iter (+/- 2290.38) | 348345.55 ns/iter (+/- 1578.4) | -0.2% |
| bench_unescape_c_str_ascii | 81575.56 ns/iter (+/- 180.46) | 76510.93 ns/iter (+/- 237.2) | -6.2% |
| bench_unescape_c_str_ascii_escape | 96108.78 ns/iter (+/- 1281.04) | 97554.84 ns/iter (+/- 160.5) | 1.5% |
| bench_unescape_c_str_hex_escape_ascii | 161725.35 ns/iter (+/- 1096.9) | 153620.12 ns/iter (+/- 295.55) | -5.0% |
| bench_unescape_c_str_hex_escape_byte | 162388.73 ns/iter (+/- 504.69) | 149612.35 ns/iter (+/- 236.66) | -7.9% |
| bench_unescape_c_str_mixed_escape | 1045466.2 ns/iter (+/- 5675.85) | 1016333.3 ns/iter (+/- 2407.72) | -2.8% |
| bench_unescape_c_str_non_ascii | 110547.5 ns/iter (+/- 327.76) | 95382.77 ns/iter (+/- 314.99) | -13.7% |
| bench_unescape_c_str_unicode | 385332.7 ns/iter (+/- 2524.07) | 341268.3 ns/iter (+/- 1450.59) | -11.4% |
| bench_unescape_c_str_unicode_escape | 308526.52 ns/iter (+/- 1321.49) | 310357.6 ns/iter (+/- 709.15) | 0.6% |
| bench_unescape_str_ascii | 77139.01 ns/iter (+/- 175.61) | 73224.7 ns/iter (+/- 185.46) | -5.1% |
| bench_unescape_str_ascii_escape | 105477.82 ns/iter (+/- 313.16) | 103074.02 ns/iter (+/- 1257.79) | -2.3% |
| bench_unescape_str_hex_escape | 165270.16 ns/iter (+/- 577.06) | 165128.7 ns/iter (+/- 629.91) | -0.1% |
| bench_unescape_str_mixed_escape | 906241.1 ns/iter (+/- 2452.01) | 905221.2 ns/iter (+/- 2022.47) | -0.1% |
| bench_unescape_str_non_ascii | 98002.09 ns/iter (+/- 341.61) | 97892.88 ns/iter (+/- 267.58) | -0.1% |
| bench_unescape_str_unicode | 344989.9 ns/iter (+/- 651.18) | 343551.35 ns/iter (+/- 1453.84) | -0.4% |
| bench_unescape_str_unicode_escape | 596409.25 ns/iter (+/- 2117.97) | 595745.65 ns/iter (+/- 2342.47) | -0.1% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are the results:
| bench name | current | new | diff |
|---|---|---|---|
| bench_check_raw_byte_str_ascii | 20597.79 ns/iter (+/- 111.91) | 21108.64 ns/iter (+/- 682.89) | 2.5% |
| bench_check_raw_c_str_ascii | 20242.44 ns/iter (+/- 35.66) | 20105.08 ns/iter (+/- 72.01) | -0.7% |
| bench_check_raw_c_str_non_ascii | 37288.94 ns/iter (+/- 60.57) | 37305.4 ns/iter (+/- 67.56) | 0.0% |
| bench_check_raw_c_str_unicode | 115247.8 ns/iter (+/- 267.7) | 115218.52 ns/iter (+/- 222.48) | -0.0% |
| bench_check_raw_str_ascii | 19811.75 ns/iter (+/- 34.42) | 19775.9 ns/iter (+/- 38.97) | -0.2% |
| bench_check_raw_str_non_ascii | 35882.4 ns/iter (+/- 120.81) | 35862.94 ns/iter (+/- 37.65) | -0.1% |
| bench_check_raw_str_unicode | 109795.61 ns/iter (+/- 318.02) | 109668.07 ns/iter (+/- 252.03) | -0.1% |
| bench_skip_ascii_whitespace | 6310.7 ns/iter (+/- 29.53) | 6316.3 ns/iter (+/- 18.24) | 0.1% |
| bench_unescape_byte_str_ascii | 43636.65 ns/iter (+/- 193.89) | 43529.61 ns/iter (+/- 168.8) | -0.2% |
| bench_unescape_byte_str_ascii_escape | 49115.14 ns/iter (+/- 81.32) | 49232.31 ns/iter (+/- 355.52) | 0.2% |
| bench_unescape_byte_str_hex_escape | 78711.55 ns/iter (+/- 203.76) | 77679.72 ns/iter (+/- 126.33) | -1.3% |
| bench_unescape_byte_str_mixed_escape | 217949.42 ns/iter (+/- 396.08) | 182727.73 ns/iter (+/- 5340.48) | -16.2% |
| bench_unescape_c_str_ascii | 54202.44 ns/iter (+/- 213.68) | 37968.36 ns/iter (+/- 1634.7) | -30.0% |
| bench_unescape_c_str_ascii_escape | 66313.15 ns/iter (+/- 295.46) | 59751.49 ns/iter (+/- 347.79) | -9.9% |
| bench_unescape_c_str_hex_escape_ascii | 105983.81 ns/iter (+/- 316.38) | 85106.36 ns/iter (+/- 643.56) | -19.7% |
| bench_unescape_c_str_hex_escape_byte | 106426.81 ns/iter (+/- 285.65) | 85051.97 ns/iter (+/- 328.0) | -20.1% |
| bench_unescape_c_str_mixed_escape | 649798.95 ns/iter (+/- 2989.2) | 557485.15 ns/iter (+/- 10110.31) | -14.2% |
| bench_unescape_c_str_non_ascii | 76784.48 ns/iter (+/- 356.41) | 62791.62 ns/iter (+/- 244.16) | -18.2% |
| bench_unescape_c_str_unicode | 262566.75 ns/iter (+/- 3679.59) | 214002.85 ns/iter (+/- 3932.69) | -18.5% |
| bench_unescape_c_str_unicode_escape | 185377.66 ns/iter (+/- 621.7) | 163651.8 ns/iter (+/- 935.65) | -11.7% |
| bench_unescape_str_ascii | 44971.66 ns/iter (+/- 349.02) | 45062.01 ns/iter (+/- 312.33) | 0.2% |
| bench_unescape_str_ascii_escape | 66951.3 ns/iter (+/- 347.23) | 66292.56 ns/iter (+/- 137.83) | -1.0% |
| bench_unescape_str_hex_escape | 102903.64 ns/iter (+/- 173.36) | 101440.36 ns/iter (+/- 282.0) | -1.4% |
| bench_unescape_str_mixed_escape | 743504.7 ns/iter (+/- 2975.47) | 732506.6 ns/iter (+/- 5604.11) | -1.5% |
| bench_unescape_str_non_ascii | 63212.57 ns/iter (+/- 183.07) | 63535.35 ns/iter (+/- 103.64) | 0.5% |
| bench_unescape_str_unicode | 217130.85 ns/iter (+/- 427.54) | 217462.52 ns/iter (+/- 837.24) | 0.2% |
| bench_unescape_str_unicode_escape | 348727.47 ns/iter (+/- 569.06) | 349065.45 ns/iter (+/- 4820.3) | 0.1% |
Not many changes, so all good. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just saw your comment. Well now all platforms have comparable numbers, that's good! Let me write a summary at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken this out for now, to leave only uncontroversial cases of inline.
73aae9e to
a8211d0
Compare
|
Summary:
Seems all good to me, thanks for the performance improvement! |
|
cc @Urgau :) |
|
Or I can try @estebank. Can you merge please? :) |
Urgau
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine.
On ARM:
Someone please check x86_64!
BTW I made some small improvements (I hope) to the bench script (mostly in
get_good_results):