This repository has been archived by the owner on May 27, 2024. It is now read-only.
Different vector units handle alignment in interesting ways. ARM/NEON either fixes up unaligned accesses at runtime or traps on them, depending on the instruction's alignment specifier. PowerPC/AltiVec, however, silently loads/stores somewhere slightly different (lvx/stvx ignore the low four bits of the address, so the access lands on the preceding 16-byte boundary), which causes all manner of problems if an unaligned address makes it to the AltiVec load/store unit.
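To make the AltiVec behaviour concrete, here is a minimal sketch of the address a 16-byte lvx/stvx actually touches. The helper name `altivec_effective_address` is made up for illustration; it only models the address arithmetic and does not issue any vector instruction.

```c
#include <stdint.h>

/* Model of the address used by a 16-byte AltiVec lvx/stvx:
   the low four bits of the requested address are dropped, so the
   access silently moves down to the preceding 16-byte boundary.
   (Hypothetical helper, for illustration only.) */
static inline uintptr_t altivec_effective_address(uintptr_t addr) {
    return addr & ~(uintptr_t)15;
}
```

So a request for `0x1003` would be served from `0x1000`, with no trap and no fix-up, which is how wrong data ends up in the vector register.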
Currently, GCC 9 on PowerPC ignores the AltiVec execution unit entirely when using PSIMD. This appears to be because aligned(N) has N set to at most 4 bytes, but the AltiVec unit requires 16-byte alignment before it can safely load/store vectors. Increasing N to 16 results in AltiVec instructions being used; however, much code using PSIMD isn't written to align its memory. For example, NNPACK fails the convolution tests.
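As a rough illustration of what the aligned(N) setting changes, the sketch below declares two vector types with the GCC/Clang vector extension. The typedef names are hypothetical and are not copied from PSIMD; they only show how the alignment the compiler is allowed to assume differs.

```c
/* Hypothetical typedefs, not PSIMD's own definitions.
   Both describe 4 x float (16 bytes); only the promised alignment differs. */

/* Compiler may only assume 4-byte alignment for objects of this type. */
typedef float example_f32x4_a4  __attribute__((vector_size(16), aligned(4)));

/* Compiler may assume 16-byte alignment, which is what lets it use
   vector loads/stores that require aligned addresses. */
typedef float example_f32x4_a16 __attribute__((vector_size(16), aligned(16)));
```

With the 4-byte variant the compiler has to be conservative about every access; with the 16-byte variant it trusts the caller, so any buffer that is not really 16-byte aligned becomes a correctness problem.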
I think this is because the psimd_(load|store)_ family of functions isn't alignment-aware and loads from/stores to native C types, which may not be aligned to the requirements of a vector unit.
I'm not sure what the right solution would be; perhaps some combination of:
adding a platform-recommended alignment that API callers can align their buffers to
adding some aligned load/store functions
fixing up the current load/store functions to handle unaligned accesses in software on platforms that require it
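A rough sketch of how the three options might look together is below. Every name here (`EXAMPLE_RECOMMENDED_ALIGNMENT`, `example_f32x4`, `example_alloc_f32`, `example_load_aligned_f32`, `example_load_unaligned_f32`) is hypothetical and not part of PSIMD; the value 16 is taken from the AltiVec requirement described above. It assumes GCC or Clang (vector extension and `__builtin_assume_aligned`) and a C11 library providing `aligned_alloc`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Option 1: a recommended alignment callers can align buffers to.
   Hypothetical constant; 16 matches the AltiVec requirement. */
#define EXAMPLE_RECOMMENDED_ALIGNMENT 16

/* 4 x float vector via the GCC/Clang vector extension (hypothetical type). */
typedef float example_f32x4 __attribute__((vector_size(16)));

/* Allocate a float buffer at the recommended alignment.
   aligned_alloc expects the size to be a multiple of the alignment,
   so the byte count is rounded up. Release with free(). */
static float *example_alloc_f32(size_t count) {
    const size_t a = EXAMPLE_RECOMMENDED_ALIGNMENT;
    size_t bytes = count * sizeof(float);
    bytes = (bytes + a - 1) / a * a;
    if (bytes == 0) {
        bytes = a;
    }
    return (float *)aligned_alloc(a, bytes);
}

/* Option 2: aligned load. The caller promises p is 16-byte aligned,
   so the compiler is free to use an alignment-requiring vector load. */
static inline example_f32x4 example_load_aligned_f32(const float *p) {
    example_f32x4 v;
    memcpy(&v, __builtin_assume_aligned(p, EXAMPLE_RECOMMENDED_ALIGNMENT),
           sizeof v);
    return v;
}

/* Option 3: unaligned load handled in software. No alignment promise
   is made, so the compiler must generate code that is correct for any
   address (for example a byte-wise copy or an unaligned-safe sequence). */
static inline example_f32x4 example_load_unaligned_f32(const float *p) {
    example_f32x4 v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

A caller would allocate with `example_alloc_f32`, use `example_load_aligned_f32` on addresses known to be multiples of 16 bytes, and fall back to `example_load_unaligned_f32` for anything else, such as `buf + 1`. Stores would follow the same pattern with the `memcpy` direction reversed.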