MTP & Desktop Support + More

The README says you are mainly targeting iOS and Android.
What about macOS, Windows, and Linux?
I am building a single chat app that runs on both mobile and desktop, so support for all of these platforms is essential.

Additionally, I would love native built-in support for embedding models, which are a key part of many RAG pipelines.
I would also like to see support for MTP (multi-token prediction), which was recently added to llama.cpp. It can speed up generation through speculative decoding, which could be a great help on mobile.
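
For illustration, here is a toy sketch of the draft-then-verify loop behind speculative decoding, in its greedy form. Nothing here is this library's or llama.cpp's API: `speculative_step`, `toy_target`, and `toy_draft` are hypothetical names, and each "model" is just a function from a token prefix to its next token. With MTP the proposals would, as far as I understand it, come from the model's own extra prediction heads rather than a separate draft model.

```python
from typing import Callable, List

# Hypothetical stand-in: a "model" maps a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     prefix: List[int], k: int = 4) -> List[int]:
    """Return the tokens accepted in one draft-then-verify round (greedy variant)."""
    # 1. The cheap proposer guesses k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The target checks each guess. A real runtime would score all k
    #    positions in one batched forward pass; this calls the target once
    #    per position for clarity.
    accepted: List[int] = []
    for tok in proposed:
        expected = target(prefix + accepted)
        if expected != tok:
            # First mismatch: keep the target's own token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every guess matched, so the target contributes one extra token.
    accepted.append(target(prefix + accepted))
    return accepted


def toy_target(prefix: List[int]) -> int:
    # Toy "large model": counts upward modulo 10.
    return (prefix[-1] + 1) % 10


def toy_draft(prefix: List[int]) -> int:
    # Toy "draft": agrees with the target except right after token 3.
    return 0 if prefix[-1] == 3 else (prefix[-1] + 1) % 10


print(speculative_step(toy_target, toy_draft, [0], k=4))  # [1, 2, 3, 4]
print(speculative_step(toy_target, toy_draft, [5], k=2))  # [6, 7, 8]
```

In both calls one verification round yields several tokens, and the output always matches what the target alone would have produced. That is where the speedup comes from when the proposer is cheap and usually right.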

I would also really like native support for batching multiple sequences at once for higher parallelism, and potentially (though less urgently) server-style continuous batching.
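
To make the continuous batching request concrete, here is a minimal scheduling sketch. It is not how llama.cpp or this library implements it: `Request` and `run_continuous_batching` are hypothetical names, and one "decode step" simply produces one token for every active sequence. The point is that a waiting sequence takes over a slot as soon as another sequence finishes, instead of waiting for the whole batch to drain.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class Request:
    name: str
    tokens_left: int  # tokens this sequence still needs (assumed >= 1)


def run_continuous_batching(requests: List[Request],
                            max_slots: int) -> List[List[str]]:
    """Return, for each decode step, the names of the sequences in the batch."""
    waiting: Deque[Request] = deque(requests)
    active: List[Request] = []
    schedule: List[List[str]] = []

    while waiting or active:
        # Refill free slots before every step; this is the "continuous" part.
        while waiting and len(active) < max_slots:
            active.append(waiting.popleft())

        schedule.append([r.name for r in active])

        # One decode step: every active sequence produces one token.
        for r in active:
            r.tokens_left -= 1

        # Finished sequences release their slots immediately.
        active = [r for r in active if r.tokens_left > 0]

    return schedule


reqs = [Request("A", 3), Request("B", 1), Request("C", 2)]
for step, batch in enumerate(run_continuous_batching(reqs, max_slots=2), 1):
    print(step, batch)
```

With two slots, `C` joins as soon as `B` finishes, so all three sequences complete in 3 steps. Running `A` and `B` to completion as a fixed batch and only then starting `C` would take 5.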

I believe most of these features are already supported in llama.cpp itself and just need to be exposed and optimized in this library.

MTP & Desktop Support + More #101
