-
Notifications
You must be signed in to change notification settings - Fork 109
Implement vec256/512 #3966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement vec256/512 #3966
Conversation
Review:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed the directories assigned to me and they look fine, modulo my comments above - mainly I think we need to document clearly somewhere all the bits that are missing before we can consider moving this out of beta.
The |
New review assignment: I will read the Cmm parts, @xclerc will read the remainder of the backend. |
I've now read |
Responded to review - now need to fix the asan job & some failure in stack checks + effects |
The stack checks failure is due to #4078; also added a wrong asan case to fix in a following pr. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only reviewed the runtime side of this. It's broadly OK; various minor suggestions. In addition to these, I found some of the macrology in amd64.S hard to follow, so have made a suggested improvement in a commit here: https://github.com/NickBarnes/oxcaml/tree/avx-macrology
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The runtime stuff looks good to me now.
Adds support for 256 and 512 bit vectors throughout the compiler. Generally has exactly the same semantics as 128 bit vectors, but larger. About half of the diff (+5k loc) is from enabling AVX/AVX2 in
simdgen
.simd_beta
. The basic tests for 128-bit have been duplicated for 256, but none of the operations in the other tests are implemented yet. AVX and AVX2 architecture extensions have been added and are enabled by default, since we already enable other Haswell extensions.simd_alpha
. They do not yet have tests and their usage will trigger a fatal error in emit. The AVX512F extension is also disabled by default.I think the only changes that need careful review are those related to preserving/aligning wide registers (mainly
amd64.S
/emit
):caml_c_call_stack_args_avx{512}
, which aligns the C stack before copying the arguments from the OCaml stack. We also do this in runtime4.For later PRs: