-
Notifications
You must be signed in to change notification settings - Fork 381
Faster implementation of long and int rotation #10187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Thomas May <[email protected]>
| i = i << 1 | ((i < 0) ? 1 : 0); | ||
| } | ||
| return i; | ||
| int rotateValue = distance % SIZE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like you should try shifting by any value outside the range this guarantees (which isn't 0 to 31, by the way, it's -31 to 31) in the browser's JS Console. You shouldn't just take my word for it, but the modulus is totally unnecessary here; shifts are implicitly done with a mask of 31 in JS, and in Java for ints.
I would not be at all surprised if performing modulus on JS Number types is slower than the loop before, and it's completely unnecessary. A mask would arguably be more clear, and wouldn't permit negative rotateValues if that's a concern. A mask isn't needed either.
| i = i << 1 | ((i < 0) ? 1 : 0); | ||
| } | ||
| return i; | ||
| int rotateValue = distance % SIZE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use distance & 63 here, if it's needed at all. GWT may already abide by the JLS and implicitly mask shifts on longs by 63, but I can't confirm it. This is internally a floating-point modulus because GWT int is JS Number (IEEE double-precision float that JS sometimes treats like a 32-bit integer), and floating-point modulus is the slowest basic arithmetic operation according to https://www.agner.org/optimize/ ; on Intel architectures, x87 FPREM takes reliably twice as long as FDIV, and there isn't a vector floating-point remainder operation that I can see. There doesn't appear to be an integer remainder operation there either, so it's probably produced by DIV or IDIV, which isn't much faster, unless a mask can be substituted in (which it probably can't, because if distance is negative, the result is different for distance % 64 and distance & 63, even if the shifts for negative rotateValue behave the same).
tommyettinger
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! All the expensive operations are gone and we have correct bitwise rotations, AKA barrel shifts. They were called barrel shifts long before Java existed. I wonder a little about what JS gets emitted here and what common JS engines compile it down to, but I'm sure the handful of loop-less, branch-less operations here will be much better.
Fixes #9563
I picked up the implementation by @cogman from gerrit and added a few testcases.