MTP & Desktop Support + More #101

@hg0428

In the README, you say you are mainly targeting iOS and Android.
What about macOS, Windows, and Linux?
I am trying to build a single chat app that works across both mobile and desktop, so support for all of these platforms is a key requirement.

Additionally, I would love native, built-in support for embedding models, which are a key part of many RAG pipelines.
I would also like to see support for MTP (multi-token prediction), which was recently added to llama.cpp. It can increase generation speed through speculative decoding, which could be a great help on mobile.
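
To make the speculative decoding request concrete, here is a toy sketch of the greedy draft-and-verify loop in plain Python. It is not llama.cpp's API or this library's: `target_model`, `draft_model`, `speculative_step`, and `generate` are hypothetical names, and the "models" are trivial functions over integer tokens. As I understand it, with MTP the draft tokens would come from the model's own extra prediction heads rather than from a separate draft model, but the accept/reject logic has the same shape.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]

def target_model(ctx: List[int]) -> int:
    # Hypothetical "large" model: always predicts previous token + 1 (mod 10).
    return (ctx[-1] + 1) % 10

def draft_model(ctx: List[int]) -> int:
    # Hypothetical cheap drafter: agrees with the target except right after a 4.
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10

def speculative_step(target: NextToken, draft: NextToken,
                     context: List[int], k: int) -> List[int]:
    """Draft k tokens, keep the prefix the target agrees with, then add one target token."""
    proposed: List[int] = []
    ctx = list(context)
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    accepted: List[int] = []
    ctx = list(context)
    for tok in proposed:
        # A real engine would score all k positions in one batched forward pass;
        # here the target is simply called once per position.
        expected = target(ctx)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)
        ctx.append(tok)
    accepted.append(target(ctx))  # every draft accepted: one extra token from the target
    return accepted

def generate(target: NextToken, draft: NextToken,
             prompt: List[int], n_tokens: int, k: int = 4) -> List[int]:
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(target, draft, out, k))
    return out[:len(prompt) + n_tokens]

def greedy(target: NextToken, prompt: List[int], n_tokens: int) -> List[int]:
    # Baseline: ordinary one-token-at-a-time greedy decoding with the target only.
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(target(out))
    return out
```

In this greedy form the output is identical to decoding with the target alone; the speedup comes from the target checking several drafted tokens per pass instead of producing one.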

Another thing I would really love to have native support for is batching multiple sequences at once for higher parallelism, and potentially (though less urgently) server-style continuous batching.
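
By continuous batching I mean roughly the scheduling pattern below: a fixed number of sequence slots, where a waiting request takes over a slot as soon as another sequence finishes, rather than waiting for the whole batch to drain. This is only a toy scheduler under my own assumptions; `Request` and `run_continuous_batching` are hypothetical names, and incrementing a counter stands in for one batched decode call.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List

@dataclass
class Request:
    req_id: str
    tokens_needed: int   # total tokens this request has to generate
    generated: int = 0

def run_continuous_batching(requests: List[Request], n_slots: int) -> Dict[str, int]:
    """Return the decode step at which each request finished."""
    waiting: Deque[Request] = deque(requests)
    active: List[Request] = []
    finished_at: Dict[str, int] = {}
    step = 0
    while waiting or active:
        # Refill free slots before every decode step.
        while waiting and len(active) < n_slots:
            active.append(waiting.popleft())
        step += 1
        for req in active:
            req.generated += 1  # stands in for one token from a batched decode
        still_running: List[Request] = []
        for req in active:
            if req.generated >= req.tokens_needed:
                finished_at[req.req_id] = step
            else:
                still_running.append(req)
        active = still_running
    return finished_at
```

With two slots and requests needing 2, 5, and 3 tokens, the third request starts at step 3, right after the first one finishes, instead of waiting until step 6 as it would with a static batch.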

I believe most of these features are already supported in llama.cpp itself and just need to be exposed and optimized in this library.