Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply recent C optimizations to Java encoder #725

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

headius
Copy link
Contributor

@headius headius commented Jan 9, 2025

Just catching up with all of @byroot's excellent optimization work.

This is new specialized logic to reduce overhead when appending
ASCII-only strings to the generated JSON.

Original code by @byroot

See ruby#620
@headius
Copy link
Contributor Author

headius commented Jan 9, 2025

As part of this I'll be trying to align more of the Java code with the C equivalents, with comments to indicate how they sync up. This should make it easier to keep them in sync in the future.

Also includes updated logic for generate (generate_json_string)
based on current C code.

Original code by @byroot

See ruby#620
Lots of surrounding state so just take the hit of a Set and
Iterator rather than a big visitor object.
This change duplicates some code from JRuby to allow rendering the
fixnum value to a shared byte array rather than allocating new for
each value. Since fixnum dumping is a leaf operation, only one is
needed per session.
@headius
Copy link
Contributor Author

headius commented Jan 9, 2025

I jumped down the optimization well and continued past the recent string optimizations on to the other types of dumpable objects. Strings are now faster than in CRuby, as well as a few other cases in the encoder.rb benchmark, but many cases are still slower... sometimes less than half the performance.

Tracking results here: https://gist.github.com/headius/3e56d80656543bf2343f4b26f00bc446

Anonymous classes show up as unnamed, numbered classes in profiles
which makes them difficult to read.
Rather than allocating a buffer to hold N copies of arrayNL, just
write it N times. We're buffering into a stream anyway.

This makes array dumping zero-alloc other than buffer growth.
Since there's a fixed number of types we have special dumping logic
for, this abstraction just introduces overhead we don't need. This
patch starts moving away from indirecting all dumps through the
Handler abstraction and directly generating from the type switch.
This also aligns better with the main loop of the C code and should
inline and optimize better.
The byte[] output stream used here extended ByteArrayOutputStream
from the JDK, which sychronizes all mutation operations (like
writes). Since this is only going to be used once within a given
call stack, it needs no synchronization.

This change more than triples the performance of a benchmark of
dumping an array of empty arrays and should increase performance
of all dump forms.
* Return incoming array if only one repeat is needed and array is
  exact size.
* Only retrieve ByteList fields once for repeat writes.
@headius
Copy link
Contributor Author

headius commented Jan 15, 2025

A nice discovery: the default ByteArrayOutputStream we extend for our ByteList version uses synchronized on its write methods, and removing that (by avoiding the JDK impl) improves performance of dumping substantially.

@headius headius changed the title Apply recent C optimizations to Java extension Apply recent C optimizations to Java encoder Jan 29, 2025
@headius headius marked this pull request as ready for review January 29, 2025 03:54
@headius
Copy link
Contributor Author

headius commented Jan 29, 2025

In the interest of getting this part merged I've narrowed scope to the encoder. I'll need to incorporate a handful of recent changes, but they're not too extensive.

@byroot
Copy link
Member

byroot commented Jan 29, 2025

I've pre-merged your changes to the benchmark, so it should be easier for you to rebase.

Feel free to merge whenever you fell like it's ready, as previously said, as far as I'm concerned, you are the main maintainer of the Java parts.

Please just try to have a clean git history before merging.

@headius
Copy link
Contributor Author

headius commented Jan 29, 2025

Please just try to have a clean git history before merging.

What do you mean by "clean git history"?

@byroot
Copy link
Member

byroot commented Jan 29, 2025

What do you mean by "clean git history"?

Well for instance in 4d37e9f you made some changes and forgot to remove some unused imports, and then 3 commits later on there is 98cb785 which remove the unused import.

Both of this should be a single commit really.

But if you don't care about the git history, just squash it all I guess.

@headius
Copy link
Contributor Author

headius commented Jan 29, 2025

I care about the git history and don't want to squash, but I'm not going to rewrite 24 commits just to regroup cleanup changes. Import cleanup (like code reformatting or whitespace changes) has no effect on the build or on run-time functionality. I always try to keep things clean but this will happen from time to time.

@byroot
Copy link
Member

byroot commented Jan 29, 2025

I'm not going to rewrite 24 commits just to regroup cleanup changes.

Then squash.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants