sync : ggml #12732
Conversation
@cmdr2 Could you take a look at the arm build failures?
@ggerganov Sure, taking a look
tl;dr - Maybe a difference in the strictness of the C++ compiler (vs compiling the C file)?

Interestingly, this isn't new behavior. Previous CI runs for this runner also raised this diagnostic (as a warning), i.e. without this PR's change - https://github.com/ggml-org/llama.cpp/actions/runs/14237462102/job/39899497124#step:9:87

But now that these lines are compiled as C++, the CI runner raises it as an error as well as a warning. Maybe a difference in the strictness of the C++ compiler (vs compiling the C file)? Continuing to investigate.
tl;dr - This warning is justified, and worth fixing anyway. Rather than trying to coerce the compiler into letting this through, it's probably worth investigating this (erstwhile) warning.

Details: llama.cpp/ggml/src/ggml-cpu/ggml-cpu-impl.h, lines 85 to 99 at 193c3e0

It very clearly DOES NOT define that symbol, so I'm not sure where it's coming from. Will look at the arm header file. Continuing to investigate..
Interesting, I see a note about this in our code.

But we're not running on x86, yet we're using that path. @ggerganov @slaren Shouldn't we also explicitly check for x86, along with the MSVC check, before defining it?
I'm a bit confused now. Arm NEON isn't x86, so why does it get set?

llama.cpp/ggml/src/ggml-cpu/ggml-cpu-impl.h, lines 79 to 95 at 193c3e0

We're also doing this in llama.cpp/ggml/src/ggml-impl.h, lines 314 to 320 at 193c3e0
I'm not sure either. I think the history of this is #5404 and the references therein. It appears that MSVC on Arm64 had some issues (maybe it still has them) and this required the hacky workaround.
Thanks, digging further in that direction..
Submitted a PR for this - ggml-org/ggml#1176

Yeah, it looks like we need to do things in this order. So maybe the solution is to just tell the compiler that it's okay. I have no idea about ARM or SIMD code, so please feel free to suggest alternatives :) Thanks!
@ggerganov What if we keep the arm header include the same (i.e. if not MUSA), but continue allowing the fp32 conversion function declarations for MUSA?

Basically, bringing back these lines: e638450#diff-1f56ac82eed1293d4aa7c35aef0bc19e831cdb24dcb6af43582143936eb7eae4L19-L25, and removing the corresponding check.

Thanks
Or maybe that's completely unrelated to the CI failure?
Yes, it is due to the CUDA BF16 change.
The MUSA build is simple to fix, but I am not able to fix the HIP build. I think HIP does not support it. @JohannesGaessler Do you have suggestions on how to fix this?
I pushed a fix for the HIP compilation failure. From what I can tell, the problem is that the HIP header does not define the type. I will not be able to test the code on actual AMD hardware for a few hours.
Sorry, I missed the error messages about the HIP types. The documentation also lists the corresponding types as supported for GEMM, so it should be possible to use the code for HIP by just changing the vendor headers in ggml.
I had this change, which fixes MUSA: 5ef588b

But HIP still fails like this: https://github.com/ggml-org/llama.cpp/actions/runs/14306928482/job/40092833687#step:6:128
Not sure why the type isn't found.
Because they're calling the type by a different name.
Ok, let me push a fix now.
… (ggml/1167)

* cpu: refactor SIMD mappings and vectorized op functions into separate files
* Fix warning for ggml_float to float
* Fix warnings
* cpu: move all the operations (except mul_mat) to a separate c++ file
* fix whitespace
* Update ggml/src/ggml-cpu/vec.h

Co-authored-by: Diego Devesa <[email protected]>

* Fix PR comments - use GGML_UNUSED, use cassert in ops.cpp
* Reverse the order of import for ops.h and vec.h, to match what was present in ggml-cpu.c previously

---------

Co-authored-by: Diego Devesa <[email protected]>
* add bf16 support
* use convert_from_bf16_cuda instead of convert_unary_cuda for f32
* revert 7ec5085
* move functionality into convert_unary with constexpr
* ggml : simplify Arm fp16 CPU logic

ggml-ci

* cont : bring back CUDA/MUSA checks

ggml-ci
ggml-ci
Fix ggml-org#12732:

* Remove incorrect inclusion of "arm_neon.h" for CUDA versions ≥ 12