Benchmark #1
Comments
More benchmark data: |
There are also some good benchmarks in the Python performance suite. https://github.com/python/performance/tree/master/performance/benchmarks (see bm_json_loads.py and bm_json_dumps.py). |
Nice find @ethanhs. One would have to check which cases we already cover and add the remaining ones to our benchmark. If there's anything in particular we aren't testing yet, we would be happy to merge PRs that add it. |
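For anyone who wants to check a missing case locally before opening a PR, here is a minimal timing sketch. It is not the project's benchmark script: it uses only the standard library `json` and `timeit` modules, and the payload shape and the helper names (`make_payload`, `best_time`, `run_benchmarks`) are made up for illustration. The `loads`/`dumps` parameters are there so another library's functions can be passed in, assuming they accept the same basic call.

```python
import json
import timeit


def make_payload(n=1000):
    """Build a hypothetical payload: a list of small, flat dicts."""
    return [
        {"id": i, "name": f"user{i}", "score": i * 0.5, "tags": ["a", "b"]}
        for i in range(n)
    ]


def best_time(func, arg, repeat=5, number=20):
    """Return the best observed per-call wall time in seconds."""
    timer = timeit.Timer(lambda: func(arg))
    # Taking the minimum over several repeats reduces noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


def run_benchmarks(loads=json.loads, dumps=json.dumps, n=1000):
    """Time one encoder/decoder pair on the same payload."""
    payload = make_payload(n)
    encoded = dumps(payload)
    return {
        "dumps": best_time(dumps, payload),
        "loads": best_time(loads, encoded),
    }


if __name__ == "__main__":
    for name, seconds in run_benchmarks().items():
        print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

Absolute numbers from a sketch like this depend heavily on the machine and the payload, so they are only useful for relative comparisons on the same setup.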
I'll take a look at that some time this weekend. FWIW, it seems that there is a huge performance hit on Windows. I ran
|
@ethanhs thanks for your tests. We made quite a few improvements recently, so it might be worth trying again. You can simply install with |
Sorry for the delay. I ended up having to build from source since I am running Python 3.7, and I had to figure out that yajl doesn't support 3.7 (Pipenv was particularly unhelpful here). Anyway, here are the results of my latest run (I switched processors from an i7-5820k to a Ryzen Threadripper 1950X):
All around things look much much better. I'm impressed with how well hyperjson performs at the moment, even beating ujson several times! |
While there are still some outliers, this is beginning to look competitive. Does anybody know of a web service for profiling code? I was already thinking about getting a Hetzner Cloud instance and connecting it to GitHub with some API... |
Hi! I recently switched from Edit: I just saw #64, so I can see it is already integrated in your scripts, just not visible in the figures from the README |
Sure thing. @alexprengere would you be interested in creating a PR for adding it? |
As I wrote in the edit, I think this is already done 😉 with 02e1884. |
Would for sure be interesting to run |
So here are the plots from my Fedora 33 VirtualBox VM. I can post the full reports if necessary. A couple of pain points I encountered when running those:
|
Thanks for running the benchmark. Not gonna lie, that looks freakin' impressive. I wonder if we could get some bare-metal benchmarks as well but I doubt there will be any relative difference. |
I guess at some point we have to talk about the future roadmap for hyperjson. It doesn't make much sense to maintain a crate which is obviously so much slower than a similar crate. (Even though hyperjson came first and I like to think that it paved the way for what came after.) There is an entire discussion to be had about JSON serialization for Python and other dynamic languages. The point is, raw serialization is rarely the bottleneck; it's usually the creation of many Python objects. (Note that this statement is only based on prior experimentation. I don't have any profiling data to prove it.) On that note, I was thinking about an optional "lazy" object creation layer that would only create objects if they are used in Python. It would obviously be quite a bit of work, and I wonder if it can even be easily done with the current codebase, but it's an idea worth pursuing nevertheless. If anyone wants to tackle that, feel free to try or ping me. I can open a new issue if people react with a 👍. |
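To make the "lazy" idea a bit more concrete, here is a rough pure-Python sketch. It is not hyperjson code and says nothing about how this would be done in the Rust layer: it only defers decoding per record, not per value, and it assumes newline-delimited input where each line is one JSON object. `LazyRecord` and `lazy_records` are hypothetical names.

```python
import json
from collections.abc import Mapping


class LazyRecord(Mapping):
    """Keep the raw JSON text and decode it only on first access."""

    def __init__(self, raw):
        self._raw = raw
        self._data = None  # decoded dict, filled in lazily

    @property
    def materialized(self):
        """True once the Python objects for this record have been created."""
        return self._data is not None

    def _load(self):
        if self._data is None:
            self._data = json.loads(self._raw)
        return self._data

    def __getitem__(self, key):
        return self._load()[key]

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())


def lazy_records(lines):
    """Wrap each JSON line without decoding it (assumes one object per line)."""
    return [LazyRecord(line) for line in lines]


if __name__ == "__main__":
    records = lazy_records(['{"id": 1, "name": "a"}', '{"id": 2, "name": "b"}'])
    print(records[1]["name"])       # only the second record is decoded
    print(records[0].materialized)  # the first one has not been touched
```

Records that are never read never pay for object creation, which is the effect described above. A real implementation would presumably need to be lazy at a finer granularity and would have to decide what to do about malformed input that is only discovered on access.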
We should test hyperjson with the benchmark data provided in gojay.
Thanks to @arnecls for the link.