Commit 5dcb8cc

Use siphash on architectures that support misaligned accesses (#825)
Python uses siphash (siphash13 in 3.11+, siphash24 on older versions) as the default internal hashing algorithm, but only on architectures that support misaligned accesses, i.e., reads/writes of integers at a memory address that is not a multiple of the integer size. On other architectures it uses fnv, which is not supported by Numba and raises a warning.

The distinction between architectures is made by a configure-time check that executes code, which does not work on our cross builds, including our x86_64_vN microarchitecture builds (see #599), so Python defaults to assuming misaligned accesses are not supported. Hard-code a list of platforms that are known to support misaligned accesses just fine.

Credit to https://blog.vitlabuda.cz/2025/01/22/unaligned-memory-access-on-various-cpu-architectures.html for pointing out that the Linux kernel has this pretty well documented in Kconfig.

Note that loongarch and riscv have optional support for misaligned access, and it's quite possible that the hardware that people actually use has support for it (or that we are targeting a limited hardware profile anyway for some reason that implies support for misaligned access). I've left them out for now but we can add them later.

Fixes #683.
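To see which hash algorithm a given interpreter build ended up with, a minimal sketch along the following lines can be run against it. It relies only on the standard `sys.hash_info` attribute; the helper names `describe_hash_algorithm` and `uses_siphash` are hypothetical and not part of this commit.

```python
import sys


def describe_hash_algorithm() -> str:
    """Summarize the str/bytes hash algorithm this interpreter was built with."""
    info = sys.hash_info
    return f"{info.algorithm} (hash_bits={info.hash_bits}, seed_bits={info.seed_bits})"


def uses_siphash() -> bool:
    """True for any siphash variant (for example siphash13 or siphash24)."""
    return sys.hash_info.algorithm.startswith("siphash")


if __name__ == "__main__":
    print(describe_hash_algorithm())
    print("siphash in use:", uses_siphash())
```

On a build where configure decided that aligned access is required, the reported algorithm would be expected to be fnv rather than a siphash variant.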
1 parent bd52271 commit 5dcb8cc

2 files changed: +19 -1 lines changed

cpython-unix/build-cpython.sh

Lines changed: 13 additions & 1 deletion
@@ -673,7 +673,19 @@ if [ -n "${CROSS_COMPILING}" ]; then
     # default on relatively modern compilers.
     CONFIGURE_FLAGS="${CONFIGURE_FLAGS} ac_cv_pthread=yes"
 
-    # TODO: There are probably more of these, see #399.
+    # Also, it cannot detect whether misaligned memory accesses should
+    # be avoided, and conservatively defaults to yes, which makes it
+    # pick the 'fnv' hash instead of 'siphash', which numba does not
+    # like (#683, see also comment in cpython/configure.ac). These
+    # answers are taken from the Linux kernel source's Kconfig files,
+    # search for HAVE_EFFICIENT_UNALIGNED_ACCESS.
+    case "${TARGET_TRIPLE}" in
+        arm64*|aarch64*|armv7*|thumb7*|ppc64*|s390*|x86*)
+            CONFIGURE_FLAGS="${CONFIGURE_FLAGS} ac_cv_aligned_required=no"
+            ;;
+    esac
+
+    # TODO: There are probably more of these, see #599.
 fi
 
 # We patched configure.ac above. Reflect those changes.
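
The glob matching done by the `case` statement above can be sketched in Python to check which target triples would receive the override. This is an illustration only, not code from the commit: the function name `aligned_required_override` is hypothetical, and the example triples are assumptions chosen to exercise the patterns.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Same glob patterns as the shell `case` arm in build-cpython.sh.
MISALIGNED_OK_PATTERNS = (
    "arm64*", "aarch64*", "armv7*", "thumb7*", "ppc64*", "s390*", "x86*",
)


def aligned_required_override(target_triple: str) -> Optional[str]:
    """Return the configure cache override for a triple, or None if no pattern matches."""
    for pattern in MISALIGNED_OK_PATTERNS:
        # fnmatchcase is case-sensitive, like shell `case` pattern matching.
        if fnmatchcase(target_triple, pattern):
            return "ac_cv_aligned_required=no"
    return None


if __name__ == "__main__":
    # Illustrative triples only; the real set of build targets may differ.
    for triple in (
        "x86_64_v3-unknown-linux-gnu",
        "aarch64-unknown-linux-gnu",
        "riscv64-unknown-linux-gnu",
    ):
        print(triple, "->", aligned_required_override(triple))
```

Triples that match no pattern, such as the riscv one here, get no override, so configure keeps its conservative default for them.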

src/verify_distribution.py

Lines changed: 6 additions & 0 deletions
@@ -249,6 +249,12 @@ def say_hi(self):
         root = tk.Tk()
         Application(master=root)
 
+    def test_hash_algorithm(self):
+        self.assertTrue(
+            sys.hash_info.algorithm.startswith("siphash"),
+            msg=f"{sys.hash_info.algorithm=!r} is not siphash",
+        )
+
 
 if __name__ == "__main__":
     unittest.main()
