Improve DLL/shared library stability by replacing static initializers #16717
gianmichelebertoni started this conversation in Ideas
Hi llama.cpp team,
I'm writing to propose a couple of small changes to improve the library's stability when it is loaded as a DLL/shared library, particularly from non-C++ host environments such as the JVM (via JNA/JNI) or Python (via ctypes).
The Problem: Unreliable Static Initialization in Foreign Hosts
When a C++ DLL is loaded into a "foreign" process (like java.exe), the C++ Runtime's automatic static initialization mechanism can be unreliable. This is a known issue in complex plugin architectures.
The result is that global static objects can be left in an uninitialized state. For llama.cpp, this can lead to crashes when, for example, a static std::mutex is locked before its constructor has run. I have verified this behavior by loading the library from a JVM environment, which results in a memory access violation.
Proposed Solutions
I've identified a couple of areas where the problematic implicit static initialization can be replaced with more robust, explicit patterns that work reliably in any environment.
Case 1: Lazy, Thread-Safe Initialization with std::call_once
Location: ggml/src/ggml-cuda/ggml-cuda.cu lines 3848-3853
Current Code:
Proposed Change:
Rationale: This pattern moves initialization from the unreliable library load time to a thread-safe, on-demand call the first time the resource is needed. It completely bypasses the static initialization problem.
Case 2: Safely Initializing a Global Lock
A static std::mutex is a common point of failure. A more robust approach is to use a function-local static, which is guaranteed to be initialized in a thread-safe manner since C++11 (the "Construct on First Use Idiom").
Location: ggml/src/ggml-threading.cpp
Current Code:
Proposed Change:
Rationale: This modern C++ idiom ensures the mutex is constructed correctly, exactly once, the first time it is needed. It is clean, efficient, and completely avoids the global static initialization issue. For older compilers, std::call_once could also be used to initialize a static pointer to the mutex. The Windows-specific code is needed because of a known bug in the thread-safe statics implementation of some CRTs bundled with, for example, recent JVMs (the bugfix came with VS2022 17.10).
DeepWiki problem analysis:
You're experiencing crashes due to static initialization order fiasco (SIOF) across DLL boundaries, specifically with static std::mutex objects in ggml-threading.cpp and ggml-cuda.cu.
Problem Analysis
The crashes occur at these locations:
ggml-threading.cpp: The global ggml_critical_section_mutex (ggml-threading.cpp:4) is a static mutex that may not be initialized when accessed across DLL boundaries (ggml-threading.cpp:6-8).
ggml-cuda.cu: Multiple static mutexes are used in backend registration functions:
ggml_cuda_lock (ggml-cuda.cu:526)
Static mutex in ggml_backend_cuda_buffer_type() (ggml-cuda.cu:734-735)
Static mutex in ggml_backend_cuda_split_buffer_type() (ggml-cuda.cu:1051-1052)
Static mutex in ggml_backend_cuda_reg() (ggml-cuda.cu:3851-3852)
Similar patterns exist in other backends such as CANN (ggml-cann.cpp:2972-2973), SYCL (ggml-sycl.cpp:4575-4576), and OpenCL (ggml-opencl.cpp:4022-4025).
Thanks for considering these suggestions. I believe they would make llama.cpp even more robust and versatile.
Best regards,
GM