Support for new frontier TTS model #196
@bachittle Interesting, thanks! I'll add it to my (already very long :D) list of models we should look into.
I know there are a lot of TTS models out there; I'm curious which other models you were considering. This one seems the most promising trade-off between efficiency and quality, and it came out recently. Coqui TTS was the only other one I know of that was as good, but they went defunct. Anyway, it's understandable if you don't have the time. I will update this issue if I find any implementations or end up trying it myself (I may use this repo as a reference 😉).
Seems like another model came out that the llama.cpp team is interested in 👀 ggerganov/llama.cpp#10173
Even more models worth exploring. I haven't had time to port any new models either; maybe I will make it a Christmas project. There are now a lot of even more promising options to work with, mostly because they generate coherent speech for inputs longer than a minute and do well on leaderboards.
If the architecture is different, it could be its own repository. Just thought I'd mention that this open-source TTS is pretty cool: https://github.com/SWivid/F5-TTS. It is already designed to be efficient, so it could be improved even further if ported to a ggml model. Someone already did an MLX implementation: https://github.com/lucasnewman/f5-tts-mlx.
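For anyone wondering what a ggml port would start with, below is a minimal sketch of exporting a PyTorch checkpoint's weights into a GGUF file using the `gguf` Python package, which a ggml-based runtime could then load. The checkpoint filename, the `f5-tts` architecture tag, and the metadata key are illustrative assumptions, not an official format from F5-TTS or this repo.

```python
# Minimal sketch: dump a PyTorch TTS checkpoint into a GGUF file.
# The filename, the "f5-tts" architecture tag, and the metadata key
# are assumptions for illustration only.
import torch
from gguf import GGUFWriter

ckpt = torch.load("model_1200000.pt", map_location="cpu")  # hypothetical checkpoint path
state_dict = ckpt.get("model_state_dict", ckpt)            # unwrap if the weights are nested

writer = GGUFWriter("f5-tts-f16.gguf", "f5-tts")
writer.add_name("F5-TTS")
writer.add_uint32("f5-tts.sample_rate", 24000)             # assumed output sampling rate

for name, tensor in state_dict.items():
    data = tensor.to(torch.float16).numpy()                # store weights as f16
    writer.add_tensor(name, data)

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```

The inference side (graph construction in C/C++ with ggml) is the larger job, but having the weights in GGUF first makes it easy to iterate on the compute graph against known tensor shapes.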