embedding: add raw option for --embd-output-format #16541

SamMalayek · 2025-10-12T19:57:25Z

This adds support for a new `--embd-output-format raw` option, which outputs embeddings as plain space-separated floats, without JSON formatting or 'embedding N:' prefixes.

This is useful for downstream vector pipelines and scripting, e.g. when piping directly into NumPy or other vector processing tools.

Existing formats (json, json+, etc.) remain unchanged.
Default behavior is unaffected.
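
As a rough illustration of the scripting use case, here is a minimal sketch of a downstream consumer. It assumes that the raw output puts one embedding on each line (the description only states that the floats are space-separated), and the function name `parse_raw_embeddings` and the sample values are hypothetical, not part of this PR.

```python
def parse_raw_embeddings(text):
    """Parse whitespace-separated floats, one embedding per non-empty line.

    Assumption: each embedding occupies its own line in the raw output.
    """
    embeddings = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines between or after embeddings
        embeddings.append([float(x) for x in fields])
    return embeddings


# Made-up sample standing in for captured stdout of the embedding example.
sample = "0.125 -0.5 0.25\n0.0 1.0 -1.0\n"
vectors = parse_raw_embeddings(sample)
print(len(vectors), len(vectors[0]))  # prints: 2 3
```

With NumPy available, the same text saved to a file could probably be read with `numpy.loadtxt`, since it is plain whitespace-delimited numeric data.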

SamMalayek · 2025-10-27T20:02:35Z

@ggerganov gentle ping

SamMalayek · 2025-10-27T21:47:20Z

The changes are essentially non-functional, but using printf was odd (it now uses LOG), and the placement of the conditional is much better now. Done. Many thanks for the reviews!

It looks like I need someone with write access to run CI. @danbev, gentle ping.

CISC · 2025-10-28T09:01:58Z

Please update this

llama.cpp/examples/embedding/README.md

Lines 33 to 40 in 280d97b

    
    ### --embd-output-format $'string'$
    | $'string'$ | description                  |  |
    |------------|------------------------------|--|
    | ''         | same as before               | (default)
    | 'array'    | single embeddings            | $[[x_1,...,x_n]]$
    |            | multiple embeddings          | $[[x_1,...,x_n],[x_1,...,x_n],...,[x_1,...,x_n]]$
    | 'json'     | openai style                 |
    | 'json+'    | add cosine similarity matrix |

and this

llama.cpp/common/arg.cpp

Lines 3249 to 3255 in 280d97b

    
    add_opt(common_arg(
        {"--embd-output-format"}, "FORMAT",
        "empty = default, \"array\" = [[],[]...], \"json\" = openai style, \"json+\" = same \"json\" + cosine similarity matrix",
        [](common_params & params, const std::string & value) {
            params.embd_out = value;
        }
    ).set_examples({LLAMA_EXAMPLE_EMBEDDING}));

SamMalayek · 2025-10-28T09:19:40Z

Updated docs. Thanks for pointing this out! I should have looked around the codebase and tooling more, but I actually use this raw flag for my project and wanted it pushed quickly.

SamMalayek · 2025-10-28T10:08:48Z

One unrelated CI test — test_ctx_shift_disabled_short_prompt[-1-120-True] — failed with assert 248 == 120, which appears to be a nondeterministic failure in the context-shift tests (something I may look into for my second contribution to this project).
Could you please re-run the CI when convenient? Everything else appears to be passing cleanly.

CISC · 2025-10-28T10:48:28Z

~~Don't bother, this seems to be some ccache issue, it has leaked from another branch.~~ Nvm, model changed. :)

SamMalayek requested a review from ggerganov as a code owner October 12, 2025 19:57

github-actions bot added the examples label Oct 12, 2025

Add --embd-output-format raw for plain numeric embedding output

cd96be7

This new option outputs embeddings as raw space-separated floats, without JSON or 'embedding N:' prefixes. Useful for downstream vector pipelines and scripting.

SamMalayek force-pushed the feature/raw-embedding-output branch from 0d10ee4 to cd96be7 Compare October 12, 2025 21:44

danbev reviewed Oct 13, 2025

Review comment on examples/embedding/embedding.cpp (outdated, resolved)

Move raw output handling into format handling section

c667120

SamMalayek requested a review from danbev October 13, 2025 23:21

danbev approved these changes Oct 22, 2025

Review comment on examples/embedding/embedding.cpp (outdated, resolved)

Move raw output handling into else-if block with other format handlers

883e07a

SamMalayek force-pushed the feature/raw-embedding-output branch from 24b850b to 883e07a Compare October 22, 2025 06:21

ggerganov reviewed Oct 27, 2025

Review comment on examples/embedding/embedding.cpp (outdated, resolved)

Use LOG instead of printf for raw embedding output

ce7b187

SamMalayek force-pushed the feature/raw-embedding-output branch from 7b99865 to ce7b187 Compare October 27, 2025 20:15

SamMalayek added 2 commits October 28, 2025 02:06

Merge branch 'master' into feature/raw-embedding-output

3696e28

docs: document 'raw' embedding output format in arg.cpp and README

252563d

CISC approved these changes Oct 28, 2025

ggerganov merged commit 1c1409e into ggml-org:master Oct 28, 2025
66 of 67 checks passed
