
Upgrade Helix and some refactor #2

Merged (4 commits into lautis:master, Oct 17, 2017)
Conversation

chancancode (Contributor)

Hello, this is a very cool project and a great use case for Helix!

I pulled the code to play with it a little bit locally. While I was at it, I found some small improvements, so I thought I should submit a PR.

I chunked up the changes and included a lot of details and benchmark results in the commit message. You may find it easier to review by reading the commits one-by-one. Feel free to cherry pick or edit them as you see fit.

Let me know if you have any questions. By the way, if you have any questions or suggestions for Helix, please feel free to let me know (or open an issue). I would also like to know what missing features you might need for this library so we know how to prioritize them.

Thanks for trying Helix!!

- Upgrade to new coercion protocol (tildeio/helix#122)
- `throw!` -> `raise_panic!` (tildeio/helix#122)
- Remove custom coercion for `Vec` and friends (tildeio/helix#126)
- Remove custom bindings for `rb_ary_*` (tildeio/helix#126)

There is no noticeable difference in performance based on my testing.

===================== MASTER ====================

Warming up --------------------------------------
            Ruby CSV    10.000  i/100ms
               rscsv    43.000  i/100ms
Calculating -------------------------------------
            Ruby CSV    106.761  (± 1.9%) i/s -    540.000  in   5.059882s
               rscsv    464.264  (± 3.7%) i/s -      2.322k in   5.008913s

Comparison:
               rscsv:      464.3 i/s
            Ruby CSV:      106.8 i/s - 4.35x  slower

Warming up --------------------------------------
            Ruby CSV     8.000  i/100ms
               rscsv    26.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     87.078  (± 4.6%) i/s -    440.000  in   5.066875s
               rscsv    265.640  (± 2.6%) i/s -      1.352k in   5.093080s

Comparison:
               rscsv:      265.6 i/s
            Ruby CSV:       87.1 i/s - 3.05x  slower

===================== AFTER =====================

Warming up --------------------------------------
            Ruby CSV    10.000  i/100ms
               rscsv    41.000  i/100ms
Calculating -------------------------------------
            Ruby CSV    106.004  (± 1.9%) i/s -    530.000  in   5.001587s
               rscsv    463.825  (± 3.9%) i/s -      2.337k in   5.047190s

Comparison:
               rscsv:      463.8 i/s
            Ruby CSV:      106.0 i/s - 4.38x  slower

Warming up --------------------------------------
            Ruby CSV     8.000  i/100ms
               rscsv    25.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     86.197  (± 4.6%) i/s -    432.000  in   5.022468s
               rscsv    267.183  (± 3.4%) i/s -      1.350k in   5.058534s

Comparison:
               rscsv:      267.2 i/s
            Ruby CSV:       86.2 i/s - 3.10x  slower

This is possible since tildeio/helix#122. It is more idiomatic in
Rust to return `Result` for fallible operations instead of using
`panic` as the previous code did. Either style results in an
exception when crossing back into the Ruby side, so this does not
change behavior.
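
To illustrate the difference, here is a minimal plain-Rust sketch (the
function names and the error check are made up for illustration and are
not the actual rscsv code):

```rust
// Panicking style: the failure unwinds, and the binding layer turns the
// panic into a Ruby exception at the FFI boundary.
fn parse_line_panicking(line: &str) -> Vec<String> {
    if line.contains('\0') {
        panic!("invalid CSV input");
    }
    line.split(',').map(|cell| cell.to_string()).collect()
}

// Result style: the failure is an ordinary value, and the binding layer
// decides how to raise it on the Ruby side.
fn parse_line(line: &str) -> Result<Vec<String>, String> {
    if line.contains('\0') {
        return Err("invalid CSV input".to_string());
    }
    Ok(line.split(',').map(|cell| cell.to_string()).collect())
}
```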

Again, any performance difference is well within the margins:

===================== MASTER ====================

Warming up --------------------------------------
            Ruby CSV    10.000  i/100ms
               rscsv    44.000  i/100ms
Calculating -------------------------------------
            Ruby CSV    106.872  (± 1.9%) i/s -    540.000  in   5.055408s
               rscsv    479.503  (± 4.2%) i/s -      2.420k in   5.056628s

Comparison:
               rscsv:      479.5 i/s
            Ruby CSV:      106.9 i/s - 4.49x  slower

Warming up --------------------------------------
            Ruby CSV     8.000  i/100ms
               rscsv    26.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     89.902  (± 3.3%) i/s -    456.000  in   5.079214s
               rscsv    260.477  (± 1.5%) i/s -      1.326k in   5.091759s

Comparison:
               rscsv:      260.5 i/s
            Ruby CSV:       89.9 i/s - 2.90x  slower

==================== AFTER ======================

Warming up --------------------------------------
            Ruby CSV    10.000  i/100ms
               rscsv    42.000  i/100ms
Calculating -------------------------------------
            Ruby CSV    107.340  (± 1.9%) i/s -    540.000  in   5.032154s
               rscsv    464.876  (± 4.1%) i/s -      2.352k in   5.069027s

Comparison:
               rscsv:      464.9 i/s
            Ruby CSV:      107.3 i/s - 4.33x  slower

Warming up --------------------------------------
            Ruby CSV     8.000  i/100ms
               rscsv    27.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     89.271  (± 3.4%) i/s -    448.000  in   5.024682s
               rscsv    266.167  (± 3.4%) i/s -      1.350k in   5.077559s

Comparison:
               rscsv:      266.2 i/s
            Ruby CSV:       89.3 i/s - 2.98x  slower

Instead of implementing the custom coercion with unsafe code, we
could simply rely on the coercion for `Vec` and `String` to do the
job.

Unfortunately, this regresses performance noticeably (by around a
third). Presumably, this is due to the extra heap allocations (one
extra Rust `Vec` per row and one extra Rust heap `String` per cell).

That being said, the Rust version is still easily over twice as
fast, so this might still be an acceptable tradeoff depending on the
goals. Notably, unlike Ruby allocations, these Rust allocations do
not need to be GC'ed (so there is no marking/sweeping cost) and do
not contribute to bloating the Ruby heap.
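
For reference, the coercion-free shape looks roughly like this (a sketch
against the csv crate's 1.x API; the names are illustrative rather than
the actual rscsv code):

```rust
use csv::ReaderBuilder;

// Parse into owned rows and let Helix's built-in Vec/String coercion copy
// them into Ruby: one Rust String per cell, one Rust Vec per row.
fn parse(data: &str) -> Result<Vec<Vec<String>>, csv::Error> {
    let mut reader = ReaderBuilder::new()
        .has_headers(false)
        .from_reader(data.as_bytes());

    let mut rows = Vec::new();
    for record in reader.records() {
        let record = record?;
        rows.push(record.iter().map(|cell| cell.to_string()).collect());
    }
    Ok(rows)
}
```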

===================== MASTER ====================

Warming up --------------------------------------
            Ruby CSV     9.000  i/100ms
               rscsv    40.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     99.473  (± 4.0%) i/s -    504.000  in   5.074626s
               rscsv    440.352  (± 5.5%) i/s -      2.200k in   5.010808s

Comparison:
               rscsv:      440.4 i/s
            Ruby CSV:       99.5 i/s - 4.43x  slower

Warming up --------------------------------------
            Ruby CSV     8.000  i/100ms
               rscsv    24.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     81.908  (± 4.9%) i/s -    416.000  in   5.088703s
               rscsv    248.341  (± 3.2%) i/s -      1.248k in   5.031124s

Comparison:
               rscsv:      248.3 i/s
            Ruby CSV:       81.9 i/s - 3.03x  slower

===================== AFTER =====================

Warming up --------------------------------------
            Ruby CSV     9.000  i/100ms
               rscsv    39.000  i/100ms
Calculating -------------------------------------
            Ruby CSV    101.557  (± 3.0%) i/s -    513.000  in   5.055869s
               rscsv    401.208  (± 4.2%) i/s -      2.028k in   5.063240s

Comparison:
               rscsv:      401.2 i/s
            Ruby CSV:      101.6 i/s - 3.95x  slower

Warming up --------------------------------------
            Ruby CSV     8.000  i/100ms
               rscsv    18.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     82.065  (± 3.7%) i/s -    416.000  in   5.076569s
               rscsv    181.157  (± 3.3%) i/s -    918.000  in   5.072312s

Comparison:
               rscsv:      181.2 i/s
            Ruby CSV:       82.1 i/s - 2.21x  slower

This is a strawman proposal on top of the previous commit to show a
plausible reshuffle that avoids the `String` allocation (one per cell).

The main constraint is that something has to own the backing storage
of an `&str` (the `csv::Reader`), therefore the parsing and the
"copying into Ruby's heap" must happen inside the same stack frame
(i.e. we cannot just return `Vec<Vec<&'a str>>` as there is no valid
`'a` that would live long enough).

This is just the smallest possible delta that does the trick. It is
somewhat inelegant for several reasons (e.g. it breaks the previous
abstraction boundaries, leaks Ruby semantics into the Rust code, and
does not handle the possible `ToRuby` failure). There are more clever
ways to structure it to avoid some of the downsides, but this is just
to show that it _can_ be done (without writing unsafe coercion code).

More importantly, this shows the performance impact of the `String`
allocations. Surprisingly, it appears that this change is sufficient
to recoup all of the lost performance from the previous commit.

Thinking about it, this makes sense. With the right `size_hint()`,
allocating and filling the `Vec`s is quite cheap and easily
optimized, and that operation only occurs once per row. The `String`
allocations are much more costly (since they cannot be pre-sized)
and have to happen once per cell. It therefore makes sense that
eliminating the `String` allocations would make a huge impact.
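
A minimal sketch of that constraint (again with illustrative names,
assuming the csv crate, and not the actual rscsv code): the borrowed
cells only live inside the loop body, so the copy into Ruby's heap has
to happen right there, represented here by a caller-supplied closure.

```rust
use csv::{ReaderBuilder, StringRecord};

// The &str cells borrow from `record`, which borrows from the reader's
// buffer, so they cannot escape this function; `emit` must do the
// Ruby-side copy (e.g. build and fill the Ruby Array for the row)
// within this frame.
fn for_each_row<F>(data: &str, mut emit: F) -> Result<(), csv::Error>
where
    F: FnMut(&[&str]),
{
    let mut reader = ReaderBuilder::new()
        .has_headers(false)
        .from_reader(data.as_bytes());

    let mut record = StringRecord::new();
    while reader.read_record(&mut record)? {
        let cells: Vec<&str> = record.iter().collect();
        emit(&cells);
    }
    Ok(())
}
```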

===================== MASTER ====================

Warming up --------------------------------------
            Ruby CSV    10.000  i/100ms
               rscsv    44.000  i/100ms
Calculating -------------------------------------
            Ruby CSV    106.381  (± 3.8%) i/s -    540.000  in   5.083809s
               rscsv    445.078  (± 3.8%) i/s -      2.244k in   5.050089s

Comparison:
               rscsv:      445.1 i/s
            Ruby CSV:      106.4 i/s - 4.18x  slower

Warming up --------------------------------------
            Ruby CSV     8.000  i/100ms
               rscsv    26.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     86.912  (± 4.6%) i/s -    440.000  in   5.074026s
               rscsv    264.238  (± 4.5%) i/s -      1.326k in   5.029925s

Comparison:
               rscsv:      264.2 i/s
            Ruby CSV:       86.9 i/s - 3.04x  slower

===================== AFTER =====================

Warming up --------------------------------------
            Ruby CSV    10.000  i/100ms
               rscsv    40.000  i/100ms
Calculating -------------------------------------
            Ruby CSV    107.672  (± 2.8%) i/s -    540.000  in   5.019673s
               rscsv    460.184  (± 5.9%) i/s -      2.320k in   5.064426s

Comparison:
               rscsv:      460.2 i/s
            Ruby CSV:      107.7 i/s - 4.27x  slower

Warming up --------------------------------------
            Ruby CSV     8.000  i/100ms
               rscsv    26.000  i/100ms
Calculating -------------------------------------
            Ruby CSV     88.117  (± 3.4%) i/s -    440.000  in   5.000851s
               rscsv    268.792  (± 4.1%) i/s -      1.352k in   5.039892s

Comparison:
               rscsv:      268.8 i/s
            Ruby CSV:       88.1 i/s - 3.05x  slower

@chancancode (Author)

By the way, I am pretty sure the `bytes_record_to_ruby` code can be rewritten to lean on the slice/vec/str coercion in Helix with basically no performance difference (especially if you rewrite it to use `StringRecord` instead of `BytesRecord`). However, I don't understand that part of the code (and the csv library) fully, so I left it untouched.

Furthermore, I suspect the split between `each` (defined in Ruby) and `each_internal` is not "necessary" and doesn't quite do what you think. Calling `rb_funcall` and `rb_yield` without `rb_protect` is highly unsafe and undefined behavior in Rust, as any Ruby exception (not only the `break` you attempted to handle) will jump through the Rust stack directly back into Ruby, skipping any necessary Rust unwinding. If that happens, it will cause serious memory corruption, turning the Ruby process into a segfault time bomb, so rescuing the `StopIteration` from Ruby at that point probably doesn't add much value.

We are working on making a completely safe version of that API, but since you seem perfectly capable of figuring out the raw C API, you can probably just do the necessary `rb_protect` dance within the Rust code and drop the `each` vs `each_internal` split. 😄
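
For reference, the `rb_protect` dance looks roughly like this in raw FFI terms (a hand-rolled sketch rather than Helix's actual bindings; the `VALUE` alias is a placeholder and the extern declarations mirror the C API):

```rust
use std::os::raw::c_int;

// Placeholder for Ruby's VALUE; real bindings come from the Ruby headers.
type VALUE = usize;

extern "C" {
    fn rb_yield(val: VALUE) -> VALUE;
    fn rb_protect(func: extern "C" fn(VALUE) -> VALUE, arg: VALUE, state: *mut c_int) -> VALUE;
}

extern "C" fn yield_row(row: VALUE) -> VALUE {
    unsafe { rb_yield(row) }
}

// Yield to the Ruby block without letting a Ruby exception longjmp through
// the Rust frames: rb_protect catches it and reports a non-zero tag, so the
// Rust side can clean up first and re-raise afterwards (e.g. rb_jump_tag).
fn protected_yield(row: VALUE) -> Result<VALUE, c_int> {
    let mut state: c_int = 0;
    let result = unsafe { rb_protect(yield_row, row, &mut state) };
    if state == 0 {
        Ok(result)
    } else {
        Err(state)
    }
}
```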

@lautis (Owner) commented Oct 17, 2017

Thanks! I'll need to look into `rb_protect`. For exception handling, my needs have been quite limited so far. 😊

Regarding Helix features, binary distribution would be great. I assume there will be more gems like this around once you don't need to have rustc installed or to depend on the exact Helix version in all gems.

lautis merged commit 67aea6a into lautis:master on Oct 17, 2017
@chancancode (Author)

Thanks for the quick review!

For your reference – I mentioned in the commit message that I thought the last commit was somewhat inelegant (although the ownership issue is fundamental). I came up with an alternative closer to the original, which contains/limits the coercion leakage but still avoids writing unsafe code: chancancode@d6c4f60

In my quick testing it seems marginally slower, and I couldn't figure out why. In theory, it should be the same amount of work as the approach in this PR. Perhaps I am just not juggling some things correctly. I'll leave that here for you to look into, if you are interested.

@chancancode (Author)

@lautis I forgot to mention – for binary distribution, check out the thread on Fosslim/license_matcher#10

We need to smooth out the process and include it by default in the generators, but I am sure you can figure out how to replicate it in the meantime. I think something like that plus what you did here (requiring the Rust compiler in the extconf for “other platforms”) is the way to go.
