Git commit
Patch a Vulkan regression (only needed for build b9418-ish)
At the time of writing, the default branch had merged a chunk of upstream Vulkan changes that dropped SET_ROWS support for the turbo KV cache types. Loading a model with --cache-type-v turbo2 (or turbo3 / turbo4) crashes early with:
pre-allocated tensor (cache_v_l3 (view)) in a buffer (Vulkan0) that cannot run the operation (SET_ROWS)
If you see this, two files need a small patch.
4.1 ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp
Add the turbo and TQ types to the set_rows_* shader generation loop:
// before
for (std::string t : {"f32", "f16", "bf16", "q1_0", "q4_0", "q4_1", "q5_0", "q5_1", "q8_0", "iq4_nl"}) {
    string_to_spv("set_rows_" + t + "_i32", ...);
    string_to_spv("set_rows_" + t + "_i64", ...);
}
// after
for (std::string t : {"f32", "f16", "bf16", "q1_0", "q4_0", "q4_1", "q5_0", "q5_1", "q8_0", "iq4_nl",
                      "turbo2_0", "turbo3_0", "turbo4_0", "tq4_1s"}) {
    string_to_spv("set_rows_" + t + "_i32", ...);
    string_to_spv("set_rows_" + t + "_i64", ...);
}
The compute shader (copy_to_quant.comp) and types.glsl already handle these types; only the generator loop was missing them.
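To double-check which shader names the patched loop should request, here is a minimal standalone sketch. It is not part of vulkan-shaders-gen.cpp: the helper name set_rows_shader_names is hypothetical, and it only builds the name strings from the type list shown above without compiling anything.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not in the real generator): returns the shader names
// the patched loop would pass to string_to_spv, in loop order.
std::vector<std::string> set_rows_shader_names() {
    const std::vector<std::string> types = {
        "f32", "f16", "bf16", "q1_0", "q4_0", "q4_1", "q5_0", "q5_1", "q8_0", "iq4_nl",
        "turbo2_0", "turbo3_0", "turbo4_0", "tq4_1s",  // the four types added by the patch
    };
    std::vector<std::string> names;
    for (const std::string & t : types) {
        // One variant per index width, mirroring the two calls in the loop body.
        names.push_back("set_rows_" + t + "_i32");
        names.push_back("set_rows_" + t + "_i64");
    }
    return names;
}
```

With the four extra types the list grows from 20 to 28 names, so a rebuilt shader set should contain eight new set_rows_* entries.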
4.2 ggml/src/ggml-vulkan/ggml-vulkan.cpp
Two edits:
a) Register the pipelines. Inside the SET_ROWS(itype) macro that lists pipeline_set_rows ## itype [GGML_TYPE_*], append four more lines after the IQ4_NL registration so TURBO2_0, TURBO3_0, TURBO4_0, and TQ4_1S are also wired up. Match the existing line shape exactly (just substitute the type name everywhere).
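The registration step relies on preprocessor token pasting, which can be illustrated with a toy model. Everything below is a stand-in and an assumption: the TOY_* names, the string arrays, and the macro bodies are invented for illustration and do not reproduce the real lines in ggml-vulkan.cpp, which you should copy from the existing IQ4_NL entry instead.

```cpp
#include <array>
#include <string>

// Stand-in enum; the real type values are defined by ggml.
enum toy_type {
    TOY_TYPE_IQ4_NL,
    TOY_TYPE_TURBO2_0,
    TOY_TYPE_TURBO3_0,
    TOY_TYPE_TURBO4_0,
    TOY_TYPE_TQ4_1S,
    TOY_TYPE_COUNT
};

// Stand-in pipeline tables: one slot per type, one table per index width.
std::array<std::string, TOY_TYPE_COUNT> pipeline_set_rows_i32;
std::array<std::string, TOY_TYPE_COUNT> pipeline_set_rows_i64;

// "pipeline_set_rows ## itype" pastes the suffix onto the table name,
// so itype = _i32 selects pipeline_set_rows_i32.
#define TOY_REGISTER(itype, TYPE, name) \
    pipeline_set_rows ## itype[TOY_TYPE_ ## TYPE] = std::string("set_rows_") + name + #itype;

#define TOY_SET_ROWS(itype)                  \
    TOY_REGISTER(itype, IQ4_NL,   "iq4_nl")   \
    TOY_REGISTER(itype, TURBO2_0, "turbo2_0") \
    TOY_REGISTER(itype, TURBO3_0, "turbo3_0") \
    TOY_REGISTER(itype, TURBO4_0, "turbo4_0") \
    TOY_REGISTER(itype, TQ4_1S,   "tq4_1s")

// Expands the macro once per index width, filling both tables.
void register_toy_pipelines() {
    TOY_SET_ROWS(_i32)
    TOY_SET_ROWS(_i64)
}
```

The point of the sketch is that a type missing from the macro leaves its slot empty in both tables, which is why each new type needs one line inside the macro rather than one line per index width.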
b) Tell supports_op the backend can do it. Find the case GGML_OP_SET_ROWS block in the supports_op switch and add the same four cases:
case GGML_OP_SET_ROWS:
    {
        switch (op->type) {
            case GGML_TYPE_F32:
            case GGML_TYPE_F16:
            case GGML_TYPE_BF16:
            case GGML_TYPE_Q1_0:
            case GGML_TYPE_Q4_0:
            case GGML_TYPE_Q4_1:
            case GGML_TYPE_Q5_0:
            case GGML_TYPE_Q5_1:
            case GGML_TYPE_Q8_0:
            case GGML_TYPE_IQ4_NL:
            case GGML_TYPE_TURBO2_0: // added
            case GGML_TYPE_TURBO3_0: // added
            case GGML_TYPE_TURBO4_0: // added
            case GGML_TYPE_TQ4_1S:   // added
                return true;
            default:
                return false;
        }
    }
After both files are edited, the next cmake --build will recompile only the affected shaders and the Vulkan backend (~1 minute on a recent machine).
If your clone is a few weeks newer or older, this regression might already be fixed. You can tell by building without the patch and loading a model with one of the turbo cache types: if the SET_ROWS runtime error does not appear, the patch is not needed.
Operating systems
Windows
GGML backends
Vulkan
Problem description & steps to reproduce
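Building the Vulkan backend from the newest commit leaves SET_ROWS broken for the turbo KV cache types: loading a model with --cache-type-v turbo2 (or turbo3 / turbo4) crashes with the error quoted above.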
First Bad Commit
No response
Compile command
cmake -S . -B build -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DGGML_VULKAN=ON -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_TOOLS=ON && cmake --build build --config Release -j
Relevant log output