Merged
73 commits
39a8a26
feat: added the skeleton structure of the x86 module
madhav-madhusoodanan Aug 2, 2025
acf67c2
feat: added the XML intrinsic parser for x86
madhav-madhusoodanan Aug 2, 2025
21798b1
feat: updated intrinsics creation
madhav-madhusoodanan Aug 3, 2025
bdc5801
feat: update building C code for x86 architecture.
madhav-madhusoodanan Aug 3, 2025
adcd629
fix: code cleanup
madhav-madhusoodanan Aug 3, 2025
23b3ff9
chore: added Regex crate, updated the structure of X86IntrinsicType
madhav-madhusoodanan Aug 5, 2025
b3292c3
feat: implemented build_rust_file of `x86` module
madhav-madhusoodanan Aug 5, 2025
75ff313
feat: implemented compare_outputs of `x86` module
madhav-madhusoodanan Aug 5, 2025
1c82e7f
feat: implement `print_result_c` for `Intrinsic<X86IntrinsicType>`
madhav-madhusoodanan Aug 5, 2025
3fddae0
feat: Added x86 to CI pipeline
madhav-madhusoodanan Aug 5, 2025
08541d8
fix: update arch flags being sent to the x86 compilation command
madhav-madhusoodanan Aug 5, 2025
ccdba42
fix: set default value for varname and type fields of the
madhav-madhusoodanan Aug 5, 2025
a447a1d
fix: correcting semantic logic for setting vec_len
madhav-madhusoodanan Aug 5, 2025
781135b
fix: more support for Mask types
madhav-madhusoodanan Sep 5, 2025
a232857
fix: remove unused imports
madhav-madhusoodanan Sep 6, 2025
71d4636
feat: implemented print_result_c in the case the target type is
madhav-madhusoodanan Sep 7, 2025
372d615
feat: implemented get_lane_function for x86
madhav-madhusoodanan Sep 7, 2025
a3de32e
chore: update c_prefix for mask and print_result_c for vector type
madhav-madhusoodanan Sep 7, 2025
13733a7
feat: handled extraction for 64-bit vector elements
madhav-madhusoodanan Sep 8, 2025
dcca506
feat: add 8x8 case for get_lane_function for 64-bit vector
madhav-madhusoodanan Sep 8, 2025
180d6f0
debug: printing self in case print_result_c fails.
madhav-madhusoodanan Sep 9, 2025
838e925
chore: update x86 module, removed intrinsicDefinition trait, formatting
madhav-madhusoodanan Sep 10, 2025
e3ad0e4
fixed errors that caused errors with cpp file generation (un-handled
madhav-madhusoodanan Sep 13, 2025
2e05d59
feat: correcting errors with generated C artifacts
madhav-madhusoodanan Sep 14, 2025
e977009
fix: vec_len -> simd_len (an error was present due to setting vec_len…
madhav-madhusoodanan Sep 14, 2025
c1d9710
chore: revert default target
madhav-madhusoodanan Sep 16, 2025
91eb331
chore: adding comments about memory alignment of variables and bash s…
madhav-madhusoodanan Sep 17, 2025
1cf7c54
chore: add compilation flags
madhav-madhusoodanan Sep 17, 2025
5b6f2f1
chore: add better error handling when writing and compiling mod_{i}.cpp,
madhav-madhusoodanan Sep 18, 2025
d9ff321
feat: Fixed FP16 errors, made the loading function generation more
madhav-madhusoodanan Sep 20, 2025
48c6ed1
chore: Ensuring "const" appears for constant arguments to intrinsics.
madhav-madhusoodanan Sep 24, 2025
45d097f
chore: allowing cast() function to allow implicit type conversion for
madhav-madhusoodanan Sep 24, 2025
5553fdd
feat: matching the expected number of elements for array to load
madhav-madhusoodanan Sep 24, 2025
39a0e45
feat: updated with debug printing and ostream implementation for vector
madhav-madhusoodanan Sep 24, 2025
2742d33
chore: corrected the legal range of values for constrained arguments
madhav-madhusoodanan Sep 24, 2025
18caf69
feat: filter for duplicates in the definition of intrinsics
madhav-madhusoodanan Sep 24, 2025
6702469
chore: vector types cannot be the type of an individual element in an
madhav-madhusoodanan Sep 24, 2025
4cba53a
chore: accommodate for `immwidth` field for constraints
madhav-madhusoodanan Sep 24, 2025
a428415
feat: defined more load functions that are natively not defined (such as
madhav-madhusoodanan Sep 24, 2025
c57e9a2
chore: corrected the imm-width correction location for _mm_mpsadbw_epu8
madhav-madhusoodanan Sep 24, 2025
f58777f
feat: added exclusion list to intrinsic-test CI pipeline
madhav-madhusoodanan Sep 24, 2025
d9be63f
chore: clean up unused variables
madhav-madhusoodanan Sep 24, 2025
d6f7ca8
feat: moved cast<T1, T2> to architecture-specific definitions
madhav-madhusoodanan Sep 27, 2025
ff776c4
fix: remove extra brackets for cast definition in arm/config.rs
madhav-madhusoodanan Sep 27, 2025
1047f81
make `std::ostream& operator<<(std::ostream& os, float16_t value);`
madhav-madhusoodanan Sep 27, 2025
cc28ab0
feat: add missing_x86.txt to filter out intrinsics that cannot be tested
madhav-madhusoodanan Sep 27, 2025
153191f
feat: added custom helper functions (that helped load intrinsic
madhav-madhusoodanan Sep 27, 2025
527addd
chore: add more compiler flags for compiling x86 intrinsics in C++
madhav-madhusoodanan Sep 28, 2025
525249f
chore: add verbose cli option to C++ compiler
madhav-madhusoodanan Sep 28, 2025
d9f8159
feat: add clang to dockerfile and change clang++-19 to clang++
madhav-madhusoodanan Sep 28, 2025
3f3e3c4
fix: add `libstdc++-dev` to fix `iostream not found` error
madhav-madhusoodanan Sep 28, 2025
72750f7
fix: making compilation step run one by one to prevent the process from
madhav-madhusoodanan Sep 29, 2025
f188d95
feat: attempting compilation of smaller chunks for faster parallel
madhav-madhusoodanan Sep 29, 2025
4cb1470
feat: add c_programs to PATH and increase chunk size to 400
madhav-madhusoodanan Sep 30, 2025
41263b4
feat: display __mmask8 values so that non-utf8 values are not displayed
madhav-madhusoodanan Oct 2, 2025
b68d557
feat: add formatting for __m128i, __m256i, __m512i types that is similar
madhav-madhusoodanan Oct 3, 2025
e139299
feat: make the debug_i16 into a generic debug_as function that adapts to
madhav-madhusoodanan Oct 5, 2025
c2294ff
feat: casting the results of the lane function by preserving the bits
madhav-madhusoodanan Oct 8, 2025
717a3ad
fix: update the display of uint8_t type in C++
madhav-madhusoodanan Oct 8, 2025
88921ac
Explicitly cast bits instead of allowing C++ to automatically cast the
madhav-madhusoodanan Oct 8, 2025
ab9e103
feat: update cast<> function to reduce spurious cast functions (cases
madhav-madhusoodanan Oct 9, 2025
b3acb49
Feat: Compile C++ testfiles using C++23 standard
madhav-madhusoodanan Oct 9, 2025
cc31993
Feat: allow downcasting (useful for certain cases where uint32_t needs
madhav-madhusoodanan Oct 9, 2025
400be7f
feat: explicitly casting the result of the lane function to unsigned
madhav-madhusoodanan Oct 10, 2025
52091b5
feat: updated exclusion list with more intrinsics, that can be fixed
madhav-madhusoodanan Oct 11, 2025
7b80a1f
chore: remove x86-intel.xml from `stdarch-verify` crate
madhav-madhusoodanan Oct 15, 2025
11375b9
chore: move from random testing to testing only the first N intrinsics
madhav-madhusoodanan Oct 15, 2025
2ecda3b
chore: convert println! logging to trace! logging during compilation
madhav-madhusoodanan Oct 15, 2025
106b510
feat: code cleanup 1. changing array bracket prefixes from &'static str
madhav-madhusoodanan Oct 15, 2025
19a6292
chore: make names in config.rs files uniform across architectures
madhav-madhusoodanan Oct 16, 2025
a80eff1
fix: remove the PATH update in ci/run.sh
madhav-madhusoodanan Oct 17, 2025
ce179da
feat: fixing Rust's print mechanism for _mm512_conj_pch
madhav-madhusoodanan Oct 23, 2025
41357a0
feat: added x86_64-unknown-linux-gnu to the test matrix of
madhav-madhusoodanan Oct 26, 2025
1 change: 1 addition & 0 deletions .github/workflows/main.yml
@@ -260,6 +260,7 @@ jobs:
 - aarch64_be-unknown-linux-gnu
 - armv7-unknown-linux-gnueabihf
 - arm-unknown-linux-gnueabihf
+- x86_64-unknown-linux-gnu
 profile: [dev, release]
 include:
 - target: aarch64_be-unknown-linux-gnu
53 changes: 52 additions & 1 deletion Cargo.lock


6 changes: 5 additions & 1 deletion ci/docker/x86_64-unknown-linux-gnu/Dockerfile
@@ -6,7 +6,11 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
 make \
 ca-certificates \
 wget \
-xz-utils
+xz-utils \
+clang \
+libstdc++-14-dev \
+build-essential \
+lld

 RUN wget http://ci-mirrors.rust-lang.org/stdarch/sde-external-9.58.0-2025-06-16-lin.tar.xz -O sde.tar.xz
 RUN mkdir intel-sde
24 changes: 24 additions & 0 deletions ci/intrinsic-test.sh
@@ -66,6 +66,14 @@ case ${TARGET} in
 TEST_CXX_COMPILER="clang++"
 TEST_RUNNER="${CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_RUNNER}"
 ;;
+
+x86_64-unknown-linux-gnu*)
+TEST_CPPFLAGS="-fuse-ld=lld -I/usr/include/x86_64-linux-gnu/"
+TEST_CXX_COMPILER="clang++"
+TEST_RUNNER="${CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER}"
+TEST_SKIP_INTRINSICS=crates/intrinsic-test/missing_x86.txt
+TEST_SAMPLE_INTRINSICS_PERCENTAGE=5
+;;
 *)
 ;;
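The new arm above relies on the trailing `*` in the `case` pattern, so any target name that begins with `x86_64-unknown-linux-gnu` selects the same settings. A minimal sketch of that matching behaviour follows; the function name `pick_compiler` and the sample target strings are hypothetical and are not part of `ci/intrinsic-test.sh`.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): maps a target triple to a compiler
# name, mirroring the shape of the `case` statement in the diff.
pick_compiler() {
    case "$1" in
        x86_64-unknown-linux-gnu*)
            # The trailing `*` is a glob, so suffixed names match this arm too.
            echo "clang++"
            ;;
        *)
            # Fallback arm: nothing is configured for other targets here.
            echo "unset"
            ;;
    esac
}

# Sample inputs chosen for illustration only.
pick_compiler "x86_64-unknown-linux-gnu"
pick_compiler "x86_64-unknown-linux-gnu-suffix"
pick_compiler "riscv64gc-unknown-linux-gnu"
```

The first two calls take the x86 arm and the third falls through to the catch-all arm.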

@@ -94,6 +102,22 @@ case "${TARGET}" in
 --linker "${CARGO_TARGET_AARCH64_BE_UNKNOWN_LINUX_GNU_LINKER}" \
 --cxx-toolchain-dir "${AARCH64_BE_TOOLCHAIN}"
 ;;
+
+x86_64-unknown-linux-gnu*)
+# `CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER` is not necessary for `intrinsic-test`
+# because the binary needs to run directly on the host.
+# Hence the use of `env -u`.
+env -u CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER \
+CPPFLAGS="${TEST_CPPFLAGS}" RUSTFLAGS="${HOST_RUSTFLAGS}" \
+RUST_LOG=warn RUST_BACKTRACE=1 \
+cargo run "${INTRINSIC_TEST}" "${PROFILE}" \
+--bin intrinsic-test -- intrinsics_data/x86-intel.xml \
+--runner "${TEST_RUNNER}" \
+--skip "${TEST_SKIP_INTRINSICS}" \
+--cppcompiler "${TEST_CXX_COMPILER}" \
+--target "${TARGET}" \
+--sample-percentage "${TEST_SAMPLE_INTRINSICS_PERCENTAGE}"
+;;
 *)
 ;;
 esac
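The x86 arm above combines two things: `env -u` removes the runner variable from the environment of the `cargo run` process, while `TEST_RUNNER`, which was assigned from that same variable earlier in the script, still carries the value to `--runner`. The sketch below illustrates only that shell behaviour. `DEMO_RUNNER` and `SAVED_RUNNER` are hypothetical stand-ins, and `env -u` is assumed to be available (it is provided by GNU coreutils and the BSDs, but is not required by POSIX).

```shell
#!/bin/sh
# DEMO_RUNNER is a hypothetical stand-in for
# CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER.
export DEMO_RUNNER="some-wrapper"

# Copy the value into a separate variable first, as the script does
# with TEST_RUNNER.
SAVED_RUNNER="${DEMO_RUNNER}"

# `env -u NAME cmd` runs cmd with NAME removed from its environment;
# the calling shell's own copy is left untouched.
child_view=$(env -u DEMO_RUNNER sh -c 'echo "${DEMO_RUNNER:-<unset>}"')

echo "child sees:  ${child_view}"
echo "parent sees: ${DEMO_RUNNER}"
echo "saved copy:  ${SAVED_RUNNER}"
```

The child process reports the variable as unset, while the parent shell and the saved copy keep the original value, which is why the value can still be passed explicitly as a command-line argument.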
3 changes: 3 additions & 0 deletions crates/intrinsic-test/Cargo.toml
@@ -19,3 +19,6 @@ pretty_env_logger = "0.5.0"
 rayon = "1.5.0"
 diff = "0.1.12"
 itertools = "0.14.0"
+quick-xml = { version = "0.37.5", features = ["serialize", "overlapped-lists"] }
+serde-xml-rs = "0.8.0"
+regex = "1.11.1"