🖼️ v2.13.0 - Model gallery edition
Hello folks, Ettore here - I'm happy to announce the v2.13.0 LocalAI release is out, with many features!
Below is a small breakdown of the hottest features introduced in this release; however, there are many other improvements (especially from the community) as well, so don't miss the changelog!
Check out the full changelog below for an overview of all the changes that went into this release (this one is quite packed).
🖼️ Model gallery
This is the first release with a model gallery in the WebUI: you can now see a "Model" button in the WebUI that takes you to a selection of models:
You can now choose among models such as stablediffusion, llama3, tts, embeddings and more! The gallery is growing steadily and is kept up-to-date.
The models are simple YAML files hosted in this repository: https://github.com/mudler/LocalAI/tree/master/gallery - you can host your own repository with your own model index, or contribute to LocalAI.
If you want to contribute new models, you can do so by opening a PR against the gallery directory: https://github.com/mudler/LocalAI/tree/master/gallery.
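If you prefer the API over the WebUI, gallery models can also be listed and installed programmatically. Here is a minimal sketch assuming the default address and the gallery endpoints (`/models/available` and `/models/apply`); the model id below is illustrative and depends on what your configured galleries expose:

```bash
# List models available from the configured galleries (assumes LocalAI on the default port)
curl http://localhost:8080/models/available

# Install a model by its gallery id (the id below is illustrative)
curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{"id": "localai@llama3-8b-instruct"}'
```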
Rerankers
I'm excited to introduce a new backend for rerankers. LocalAI now implements the Jina API (https://jina.ai/reranker/#apiform) as a compatibility layer, so you can point existing Jina clients at the LocalAI address. Under the hood, it uses https://github.com/AnswerDotAI/rerankers.
You can test this by using the container images with python (this does NOT work with the core images) and a model config file like the one below, or by installing cross-encoder from the gallery in the UI:
```yaml
name: jina-reranker-v1-base-en
backend: rerankers
parameters:
  model: cross-encoder
```
and test it with:
```bash
curl http://localhost:8080/v1/rerank \
  -H "Content-Type: application/json" \
  -d '{
  "model": "jina-reranker-v1-base-en",
  "query": "Organic skincare products for sensitive skin",
  "documents": [
    "Eco-friendly kitchenware for modern homes",
    "Biodegradable cleaning supplies for eco-conscious consumers",
    "Organic cotton baby clothes for sensitive skin",
    "Natural organic skincare range for sensitive skin",
    "Tech gadgets for smart homes: 2024 edition",
    "Sustainable gardening tools and compost solutions",
    "Sensitive skin-friendly facial cleansers and toners",
    "Organic food wraps and storage solutions",
    "All-natural pet food for dogs with allergies",
    "Yoga mats made from recycled materials"
  ],
  "top_n": 3
}'
```
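If everything is set up correctly, the endpoint should return a Jina-style response: a `results` array in which each entry references a document by its index and carries a `relevance_score`, trimmed to the `top_n` best matches.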
Parler-tts
There is a new TTS backend available now, parler-tts (https://github.com/huggingface/parler-tts). It is possible to install and configure the model directly from the gallery.
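As a quick smoke test, once a parler-tts model is installed you should be able to call the TTS endpoint directly; a minimal sketch (the model name below is illustrative and should match the one you installed from the gallery):

```bash
# Generate speech and save it to a file
# (model name is illustrative; use the one configured by the gallery install)
curl http://localhost:8080/tts \
  -H "Content-Type: application/json" \
  -d '{
    "model": "parler-tts-mini-v0.1",
    "input": "Hello from LocalAI!"
  }' \
  --output hello.wav
```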
🎈 Lots of small improvements behind the scenes!
Thanks to our outstanding community, we have enhanced the performance and stability of LocalAI across various modules. From backend optimizations to front-end adjustments, every tweak helps make LocalAI smoother and more robust.
📣 Spread the word!
First off, a massive thank you (again!) to each and every one of you who've chipped in to squash bugs and suggest cool new features for LocalAI. Your help, kind words, and brilliant ideas are truly appreciated - more than words can say!
And to those of you who've been heroes, giving up your own time to help out fellow users on Discord and in our repo, you're absolutely amazing. We couldn't have asked for a better community.
Just so you know, LocalAI doesn't have the luxury of big corporate sponsors behind it. It's all us, folks. So, if you've found value in what we're building together and want to keep the momentum going, consider showing your support. A little shoutout on your favorite social platforms using @LocalAI_OSS and @mudler_it or joining our sponsors can make a big difference.
Also, if you haven't yet joined our Discord, come on over! Here's the link: https://discord.gg/uJAeKSAGDy
Every bit of support, every mention, and every star adds up and helps us keep this ship sailing. Let's keep making LocalAI awesome together!
Thanks a ton, and here's to more exciting times ahead with LocalAI!
What's Changed
Bug fixes 🐛
- fix(autogptq): do not use_triton with qwen-vl by @thiner in #1985
- fix: respect concurrency from parent build parameters when building GRPC by @cryptk in #2023
- ci: fix release pipeline missing dependencies by @mudler in #2025
- fix: remove build path from help text documentation by @cryptk in #2037
- fix: previous CLI rework broke debug logging by @cryptk in #2036
- fix(fncall): fix regression introduced in #1963 by @mudler in #2048
- fix: adjust some sources names to match the naming of their repositories by @cryptk in #2061
- fix: move the GRPC cache generation workflow into it's own concurrency group by @cryptk in #2071
- fix(llama.cpp): set -1 as default for max tokens by @mudler in #2087
- fix(llama.cpp-ggml): fixup `max_tokens` for old backend by @mudler in #2094
- fix missing TrustRemoteCode in OpenVINO model load by @fakezeta in #2114
- Incl ocv pkg for diffsusers utils by @jtwolfe in #2115
Exciting New Features 🎉
- feat: kong cli refactor fixes #1955 by @cryptk in #1974
- feat: add flash-attn in nvidia and rocm envs by @golgeek in #1995
- feat: use tokenizer.apply_chat_template() in vLLM by @golgeek in #1990
- feat(gallery): support ConfigURLs by @mudler in #2012
- fix: dont commit generated files to git by @cryptk in #1993
- feat(parler-tts): Add new backend by @mudler in #2027
- feat(grpc): return consumed token count and update response accordingly by @mudler in #2035
- feat(store): add Golang client by @mudler in #1977
- feat(functions): support models with no grammar, add tests by @mudler in #2068
- refactor(template): isolate and add tests by @mudler in #2069
- feat: fiber logs with zerlog and add trace level by @cryptk in #2082
- models(gallery): add gallery by @mudler in #2078
- Add tensor_parallel_size setting to vllm setting items by @Taikono-Himazin in #2085
- Transformer Backend: Implementing use_tokenizer_template and stop_prompts options by @fakezeta in #2090
- feat: Galleries UI by @mudler in #2104
- Transformers Backend: max_tokens adherence to OpenAI API by @fakezeta in #2108
- Fix cleanup sonarqube findings by @cryptk in #2106
- feat(models-ui): minor visual enhancements by @mudler in #2109
- fix(gallery): show a fake image if no there is no icon by @mudler in #2111
- feat(rerankers): Add new backend, support jina rerankers API by @mudler in #2121
🧠 Models
- models(llama3): add llama3 to embedded models by @mudler in #2074
- feat(gallery): add llama3, hermes, phi-3, and others by @mudler in #2110
- models(gallery): add new models to the gallery by @mudler in #2124
- models(gallery): add more models by @mudler in #2129
📖 Documentation and examples
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #1988
- docs: fix stores link by @adrienbrault in #2044
- AMD/ROCm Documentation update + formatting fix by @jtwolfe in #2100
👒 Dependencies
- deps: Update version of vLLM to add support of Cohere Command_R model in vLLM inference by @holyCowMp3 in #1975
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #1991
- build(deps): bump google.golang.org/protobuf from 1.31.0 to 1.33.0 by @dependabot in #1998
- build(deps): bump github.com/docker/docker from 20.10.7+incompatible to 24.0.9+incompatible by @dependabot in #1999
- build(deps): bump github.com/gofiber/fiber/v2 from 2.52.0 to 2.52.1 by @dependabot in #2001
- build(deps): bump actions/checkout from 3 to 4 by @dependabot in #2002
- build(deps): bump actions/setup-go from 4 to 5 by @dependabot in #2003
- build(deps): bump peter-evans/create-pull-request from 5 to 6 by @dependabot in #2005
- build(deps): bump actions/cache from 3 to 4 by @dependabot in #2006
- build(deps): bump actions/upload-artifact from 3 to 4 by @dependabot in #2007
- build(deps): bump github.com/charmbracelet/glamour from 0.6.0 to 0.7.0 by @dependabot in #2004
- build(deps): bump github.com/gofiber/fiber/v2 from 2.52.0 to 2.52.4 by @dependabot in #2008
- build(deps): bump github.com/opencontainers/runc from 1.1.5 to 1.1.12 by @dependabot in #2000
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2014
- build(deps): bump the pip group across 4 directories with 8 updates by @dependabot in #2017
- build(deps): bump follow-redirects from 1.15.2 to 1.15.6 in /examples/langchain/langchainjs-localai-example by @dependabot in #2020
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2024
- build(deps): bump softprops/action-gh-release from 1 to 2 by @dependabot in #2039
- build(deps): bump dependabot/fetch-metadata from 1.3.4 to 2.0.0 by @dependabot in #2040
- build(deps): bump github/codeql-action from 2 to 3 by @dependabot in #2041
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2043
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2042
- build(deps): bump the pip group across 4 directories with 8 updates by @dependabot in #2049
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2050
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2060
- build(deps): bump aiohttp from 3.9.2 to 3.9.4 in /examples/langchain/langchainpy-localai-example in the pip group across 1 directory by @dependabot in #2067
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2089
- deps(llama.cpp): update, use better model for function call tests by @mudler in #2119
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2122
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2123
- build(deps): bump pydantic from 1.10.7 to 1.10.13 in /examples/langchain/langchainpy-localai-example in the pip group across 1 directory by @dependabot in #2125
- feat(swagger): update swagger by @localai-bot in #2128
Other Changes
- ci: try to build on macos14 by @mudler in #2011
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #2013
- refactor: backend/service split, channel-based llm flow by @dave-gray101 in #1963
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2028
- fix - correct checkout versions by @dave-gray101 in #2029
- Revert "build(deps): bump the pip group across 4 directories with 8 updates" by @mudler in #2030
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #2032
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2033
- fix: action-tmate back to upstream, dead code removal by @dave-gray101 in #2038
- Revert #1963 by @mudler in #2056
- feat: refactor the dynamic json configs for api_keys and external_backends by @cryptk in #2055
- tests: add template tests by @mudler in #2063
- feat: better control of GRPC docker cache by @cryptk in #2070
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2051
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2080
- feat: enable polling configs for systems with broken fsnotify (docker volumes on windows) by @cryptk in #2081
- fix: action-tmate: use connect-timeout-sections and limit-access-to-actor by @dave-gray101 in #2083
- refactor(routes): split routes registration by @mudler in #2077
- fix: action-tmate detached by @dave-gray101 in #2092
- fix: rename fiber entrypoint from http/api to http/app by @mudler in #2096
- fix: typo in models.go by @eltociear in #2099
- Update text-generation.md by @Taikono-Himazin in #2095
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #2105
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #2113
New Contributors
- @holyCowMp3 made their first contribution in #1975
- @dependabot made their first contribution in #1998
- @adrienbrault made their first contribution in #2044
- @Taikono-Himazin made their first contribution in #2085
- @eltociear made their first contribution in #2099
- @jtwolfe made their first contribution in #2100
Full Changelog: v2.12.4...v2.13.0