Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect UQRSHL implementation. #1272

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Syonyk
Copy link
Contributor

@Syonyk Syonyk commented Jan 30, 2025

UQRSHL was incorrect in several ways, and did not match hardware. Shift checks for saturating were not modified from the signed version, leading to incorrect behavior at (untested) edge cases.

Further, for the 32-bit and 64-bit versions, the rounding math was incorrect with a -32 or -64 shift, and did not match hardware. This has been special cased to do the proper thing at the edge case. The core issue is overflowing the stock 32-bit or 64-bit types used at the limit.

Tests have been modified with substantially increased test cases to properly exercise these changes. The new tests should cause failures in the unsigned cases without the modified code, and should pass with the changes. Additional signed tests were added to handle the edge cases, though no incorrect behavior was observed on those.

Test cases generated on a Google C4a ARMv8.4 machine.

UQRSHL was incorrect in several ways, and did not match hardware.
Shift checks for saturating were not modified from the signed
version, leading to incorrect behavior at (untested) edge cases.

Further, for the 32-bit and 64-bit versions, the rounding math was
incorrect with a -32 or -64 shift, and did not match hardware.  This
has been special cased to do the proper thing at the edge case.  The
core issue is overflowing the stock 32-bit or 64-bit types used at
the limit.

Tests have been modified with substantially increased test cases to
properly exercise these changes.  The new tests should cause failures
in the unsigned cases without the modified code, and should pass with
the changes.  Additional signed tests were added to handle the edge
cases, though no incorrect behavior was observed on those.

Test cases generated on a Google C4a ARMv8.4 machine.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant