Compile bug: #151

@Doorman11991

Git commit

Patch a Vulkan regression (only needed for build b9418-ish)

At the time of writing, the default branch had merged a chunk of upstream Vulkan changes that dropped SET_ROWS support for the turbo KV cache types. Loading a model with --cache-type-v turbo2 (or turbo3 / turbo4) crashes early with:

pre-allocated tensor (cache_v_l3 (view)) in a buffer (Vulkan0) that cannot run the operation (SET_ROWS)

If you see this, two files need a small patch.

4.1 ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp

Add the turbo and TQ types to the set_rows_* shader generation loop:

// before
for (std::string t : {"f32", "f16", "bf16", "q1_0", "q4_0", "q4_1", "q5_0", "q5_1", "q8_0", "iq4_nl"}) {
    string_to_spv("set_rows_" + t + "_i32", ...);
    string_to_spv("set_rows_" + t + "_i64", ...);
}

// after
for (std::string t : {"f32", "f16", "bf16", "q1_0", "q4_0", "q4_1", "q5_0", "q5_1", "q8_0", "iq4_nl",
                       "turbo2_0", "turbo3_0", "turbo4_0", "tq4_1s"}) {
    string_to_spv("set_rows_" + t + "_i32", ...);
    string_to_spv("set_rows_" + t + "_i64", ...);
}

The compute shader (copy_to_quant.comp) and types.glsl already handle these types; only the generator loop was missing them.
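To see which shader names the patched loop produces, here is a minimal self-contained sketch. It is not the real generator: set_rows_shader_names, patched_set_rows_types, has_shader, and check_turbo_shaders_present are hypothetical helpers, and the real string_to_spv call (whose arguments are elided above) is replaced by simply collecting the names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the generator loop: instead of calling
// string_to_spv(), it only collects the shader names the loop would emit.
std::vector<std::string> set_rows_shader_names(const std::vector<std::string> & types) {
    std::vector<std::string> names;
    for (const std::string & t : types) {
        names.push_back("set_rows_" + t + "_i32");
        names.push_back("set_rows_" + t + "_i64");
    }
    return names;
}

// Type list copied from the patched ("after") loop.
std::vector<std::string> patched_set_rows_types() {
    return {"f32", "f16", "bf16", "q1_0", "q4_0", "q4_1", "q5_0", "q5_1", "q8_0", "iq4_nl",
            "turbo2_0", "turbo3_0", "turbo4_0", "tq4_1s"};
}

// True if the given shader name is in the collected list.
bool has_shader(const std::vector<std::string> & names, const std::string & name) {
    return std::find(names.begin(), names.end(), name) != names.end();
}

// Sanity check: every turbo/TQ type gets both the _i32 and _i64 variant.
void check_turbo_shaders_present() {
    const std::vector<std::string> names = set_rows_shader_names(patched_set_rows_types());
    for (const char * t : {"turbo2_0", "turbo3_0", "turbo4_0", "tq4_1s"}) {
        assert(has_shader(names, std::string("set_rows_") + t + "_i32"));
        assert(has_shader(names, std::string("set_rows_") + t + "_i64"));
    }
}
```

Calling check_turbo_shaders_present() from a small main is enough to confirm that the four added types yield eight extra shader names (28 in total for the 14 types listed).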

4.2 ggml/src/ggml-vulkan/ggml-vulkan.cpp

Two edits:

a) Register the pipelines. Inside the SET_ROWS(itype) macro that lists pipeline_set_rows ## itype [GGML_TYPE_*], append four more lines after the IQ4_NL registration so TURBO2_0, TURBO3_0, TURBO4_0, and TQ4_1S are also wired up. Match the existing line shape exactly (just substitute the type name everywhere).

b) Tell supports_op the backend can do it. Find the case GGML_OP_SET_ROWS block in the supports_op switch and add the same four cases:

case GGML_OP_SET_ROWS:
    {
        switch (op->type) {
            case GGML_TYPE_F32:
            case GGML_TYPE_F16:
            case GGML_TYPE_BF16:
            case GGML_TYPE_Q1_0:
            case GGML_TYPE_Q4_0:
            case GGML_TYPE_Q4_1:
            case GGML_TYPE_Q5_0:
            case GGML_TYPE_Q5_1:
            case GGML_TYPE_Q8_0:
            case GGML_TYPE_IQ4_NL:
            case GGML_TYPE_TURBO2_0:    // added
            case GGML_TYPE_TURBO3_0:    // added
            case GGML_TYPE_TURBO4_0:    // added
            case GGML_TYPE_TQ4_1S:      // added
                return true;
            default:
                return false;
        }
    }
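
Edits a) and b) have to stay in sync: supports_op should only claim a type for which pipelines were registered. The following is a toy model of that relationship, not the real ggml-vulkan code. Every name in it (mock_type, mock_device, MOCK_SET_ROWS, register_set_rows, mock_supports_set_rows, claims_match_pipelines) is hypothetical, and the real macro takes different arguments; only the token-pasting shape pipeline_set_rows ## itype [TYPE] is borrowed from the description in a).

```cpp
#include <array>
#include <cassert>
#include <string>

// Mock subset of types; the real enum is ggml_type. MOCK_UNSUPPORTED stands
// for any type that has no set_rows pipeline.
enum mock_type { MOCK_F32, MOCK_IQ4_NL, MOCK_TURBO2_0, MOCK_TURBO3_0,
                 MOCK_TURBO4_0, MOCK_TQ4_1S, MOCK_UNSUPPORTED, MOCK_COUNT };

struct mock_device {
    // Shader name per destination type; an empty string means "not registered".
    std::array<std::string, MOCK_COUNT> pipeline_set_rows_i32;
    std::array<std::string, MOCK_COUNT> pipeline_set_rows_i64;
};

// Token pasting picks the _i32 or _i64 table; #itype appends the same suffix
// to the shader name.
#define MOCK_SET_ROWS(dev, itype, TYPE, name) \
    (dev).pipeline_set_rows ## itype [TYPE] = std::string("set_rows_") + (name) + #itype

// Edit a): register one line per type and per index variant.
void register_set_rows(mock_device & dev, bool with_patch) {
    MOCK_SET_ROWS(dev, _i32, MOCK_F32,    "f32");     MOCK_SET_ROWS(dev, _i64, MOCK_F32,    "f32");
    MOCK_SET_ROWS(dev, _i32, MOCK_IQ4_NL, "iq4_nl");  MOCK_SET_ROWS(dev, _i64, MOCK_IQ4_NL, "iq4_nl");
    if (!with_patch) {
        return;
    }
    // The four registrations added after IQ4_NL.
    MOCK_SET_ROWS(dev, _i32, MOCK_TURBO2_0, "turbo2_0"); MOCK_SET_ROWS(dev, _i64, MOCK_TURBO2_0, "turbo2_0");
    MOCK_SET_ROWS(dev, _i32, MOCK_TURBO3_0, "turbo3_0"); MOCK_SET_ROWS(dev, _i64, MOCK_TURBO3_0, "turbo3_0");
    MOCK_SET_ROWS(dev, _i32, MOCK_TURBO4_0, "turbo4_0"); MOCK_SET_ROWS(dev, _i64, MOCK_TURBO4_0, "turbo4_0");
    MOCK_SET_ROWS(dev, _i32, MOCK_TQ4_1S,   "tq4_1s");   MOCK_SET_ROWS(dev, _i64, MOCK_TQ4_1S,   "tq4_1s");
}

// Edit b): the supports_op side, reduced to the SET_ROWS type switch.
bool mock_supports_set_rows(mock_type t, bool with_patch) {
    switch (t) {
        case MOCK_F32:
        case MOCK_IQ4_NL:
            return true;
        case MOCK_TURBO2_0:
        case MOCK_TURBO3_0:
        case MOCK_TURBO4_0:
        case MOCK_TQ4_1S:
            return with_patch;   // only claimed once the cases are added
        default:
            return false;
    }
}

// True if every type the backend claims also has both pipelines registered.
bool claims_match_pipelines(const mock_device & dev, bool with_patch) {
    for (int t = 0; t < MOCK_COUNT; ++t) {
        const bool registered = !dev.pipeline_set_rows_i32[t].empty() &&
                                !dev.pipeline_set_rows_i64[t].empty();
        if (mock_supports_set_rows(static_cast<mock_type>(t), with_patch) && !registered) {
            return false;
        }
    }
    return true;
}
```

In this model, applying only b) without a) makes claims_match_pipelines return false, which is the situation to avoid in the real backend: the op would be accepted for a type that has no pipeline behind it.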

After both files are edited, the next cmake --build will recompile only the affected shaders and the Vulkan backend (~1 minute on a recent machine).

If your clone is a few weeks newer or older, the regression may not be present (not yet merged, or already fixed). You can tell because an unpatched build loads the model without the SET_ROWS runtime error.

Operating systems

Windows

GGML backends

Vulkan

Problem description & steps to reproduce

Compiling the Vulkan shaders with the newest build leaves set_rows broken for the turbo KV cache types.

First Bad Commit

No response

Compile command

cmake -S . -B build -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DGGML_VULKAN=ON -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_TOOLS=ON && cmake --build build --config Release -j

Relevant log output

No logs, sorry.
