
Intel oneAPI (oneMKL) CMAKE_ARGS enabled? #1133

Open
@emulated24

Description


Expected Behavior

Passing the oneMKL flags via CMAKE_ARGS and installing llama-cpp-python with pip should finish successfully, since these flags are supported by llama.cpp:
https://github.com/ggerganov/llama.cpp#intel-onemkl
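
For reference, the full flow being attempted looks roughly like the sketch below. This is a hedged outline, not a confirmed-working recipe: it assumes the oneAPI environment is sourced first via setvars.sh, the same step used for the direct llama.cpp build further down.

# Sketch of the intended install flow (assumes oneAPI env is sourced first)
source /opt/intel/oneapi/setvars.sh
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON" \
  FORCE_CMAKE=1 pip install llama-cpp-python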

Current Behavior

Passing the same flags via CMAKE_ARGS to the pip installation produces an error:

CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON" FORCE_CMAKE=1 \
  pip install llama-cpp-python

[20/22] : && /opt/intel/oneapi/compiler/2024.0/bin/icpx -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libllava.so -o vendor/llama.cpp/examples/llava/libllava.so vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-backend.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-quants.c.o -Wl,-rpath,/tmp/tmp0dmd36dn/build/vendor/llama.cpp: vendor/llama.cpp/libllama.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so /opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so -lm -ldl && :
FAILED: vendor/llama.cpp/examples/llava/libllava.so
: && /opt/intel/oneapi/compiler/2024.0/bin/icpx -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libllava.so -o vendor/llama.cpp/examples/llava/libllava.so vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-backend.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-quants.c.o -Wl,-rpath,/tmp/tmp0dmd36dn/build/vendor/llama.cpp: vendor/llama.cpp/libllama.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so /opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so -lm -ldl && :
vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o: file not recognized: file format not recognized
icpx: error: linker command failed with exit code 1 (use -v to see invocation)
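
The ggml.c.o object that the icpx link step rejects may be a stale artifact left over from an earlier build with a different toolchain, though that is only a guess. As a hedged sanity check, using nothing beyond standard pip flags, the install can be retried with caching disabled and verbose build output:

# Sanity check: retry without pip's cache and with verbose build output
source /opt/intel/oneapi/setvars.sh
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON" \
  FORCE_CMAKE=1 pip install --no-cache-dir --verbose llama-cpp-python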

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:

$ lscpu

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               GenuineIntel
  Model name:            13th Gen Intel(R) Core(TM) i5-1340P
    CPU family:          6
    Model:               186
    Thread(s) per core:  1
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            2
    BogoMIPS:            4377.60
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_
                         known_freq pni pclmulqdq vmx ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibr
                         s ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetb
                         v1 xsaves avx_vnni arat vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):
  L1d:                   384 KiB (12 instances)
  L1i:                   384 KiB (12 instances)
  L2:                    48 MiB (12 instances)
  L3:                    16 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-11
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected
  • Operating System, e.g. for Linux:

$ uname -a

Linux ladex 6.7.2-1.el9.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jan 25 23:07:22 EST 2024 x86_64 x86_64 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:
$ python3 --version
Python 3.11.7
$ make --version
GNU Make 4.3
$ g++ --version
g++ (GCC) 12.2.1 20221121 (Red Hat 12.2.1-7)
$ cmake --version
cmake version 3.20.2
$ icx --version
Intel(R) oneAPI DPC++/C++ Compiler 2024.0.2 (2024.0.2.20231213)

Building llama.cpp directly with oneAPI works fine and performs about 2x better than with BLIS and roughly 2.8x better than a clean (non-customized) build installed via "pip install llama-cpp-python".

Commands used to build llama.cpp directly with oneAPI:

source /opt/intel/oneapi/setvars.sh
cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON
cmake --build . --config Release
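
To confirm the direct build actually links against MKL, the system_info line that llama.cpp prints at startup should report BLAS = 1. A minimal check is sketched below; the model path is a placeholder, and the binary name/location may differ between llama.cpp versions.

# Placeholder model path; any GGUF model works for this check
./bin/main -m /path/to/model.gguf -p "hello" -n 8 2>&1 | grep system_info
# the output should include "BLAS = 1" when MKL is linked in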
