A local voice gateway that runs the full ASR → LLM → TTS pipeline on Apple Silicon using MLX models. Designed for the Clawfinger Android app — the phone connects via ADB reverse port forwarding, keeping everything on localhost with zero cloud dependencies.
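As a rough illustration of the ADB reverse forwarding mentioned above, the following sketch builds and runs the standard `adb reverse` command from Python. The port number `8000` and the helper names are assumptions made for this example; use whatever port the gateway is actually configured to listen on.

```python
import subprocess
from typing import List

# Hypothetical port, chosen only for illustration.
GATEWAY_PORT = 8000


def adb_reverse_command(port: int) -> List[str]:
    """Build the adb command that maps the phone's localhost:<port>
    to the same port on the host machine."""
    # `adb reverse <device-side> <host-side>`
    return ["adb", "reverse", f"tcp:{port}", f"tcp:{port}"]


def forward_gateway_port(port: int = GATEWAY_PORT) -> None:
    """Run the command; raises CalledProcessError if adb reports failure."""
    subprocess.run(adb_reverse_command(port), check=True)


if __name__ == "__main__":
    print(" ".join(adb_reverse_command(GATEWAY_PORT)))
```

With the reverse mapping in place, requests the phone makes to its own `localhost` on that port reach the gateway on the host over USB, so no traffic leaves the two devices.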
How It Works
Phone sends audio over HTTP to the gateway
ASR (Parakeet via mlx_audio) transcribes the caller's speech
LLM (Qwen 1.5B via mlx-lm, or any OpenAI-compatible endpoint) generates a reply
TTS (Kokoro via mlx_audio for English, Piper Thorsten for German) synthesizes the reply as speech
Audio is returned to the phone as base64 WAV
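The steps above can be sketched as a single turn handler. This is a minimal illustration and not the gateway's actual code: the `transcribe`, `generate`, and `synthesize` callables stand in for the ASR, LLM, and TTS stages, and the response field names, the 16-bit mono PCM format, and the 24 kHz sample rate are all assumptions. Only the Python standard library is used.

```python
import base64
import io
import wave
from typing import Callable, Dict, Tuple


def pcm_to_base64_wav(pcm: bytes, sample_rate: int = 24000) -> str:
    """Wrap 16-bit mono PCM in a WAV container and base64-encode it."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)        # mono (assumed)
        wav.setsampwidth(2)        # 16-bit samples (assumed)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return base64.b64encode(buf.getvalue()).decode("ascii")


def base64_wav_to_pcm(payload: str) -> Tuple[bytes, int]:
    """Inverse of pcm_to_base64_wav: return (raw PCM frames, sample rate)."""
    with wave.open(io.BytesIO(base64.b64decode(payload)), "rb") as wav:
        return wav.readframes(wav.getnframes()), wav.getframerate()


def handle_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # ASR stage placeholder
    generate: Callable[[str], str],       # LLM stage placeholder
    synthesize: Callable[[str], bytes],   # TTS stage placeholder, returns PCM
) -> Dict[str, str]:
    """Run one ASR -> LLM -> TTS turn and package the reply audio."""
    transcript = transcribe(audio)
    reply = generate(transcript)
    pcm = synthesize(reply)
    # Field names here are hypothetical, not the gateway's real schema.
    return {
        "transcript": transcript,
        "reply": reply,
        "audio_b64": pcm_to_base64_wav(pcm),
    }
```

Passing the stages in as callables keeps the sketch independent of any particular model library; on the phone side, `base64_wav_to_pcm` shows how such a payload could be decoded back into playable samples.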
The gateway also provides:
Control Center UI — web dashboard for live call monitoring, instruction editing, LLM parameter tuning, TTS voice selection, and session logs
Agent Interface — WebSocket protocol for external agents to observe calls, take over LLM generation, inject TTS messages, inject context knowledge, and query call state
TTS Voice Selection — Kokoro voices for English (50+), Piper Thorsten voices for German (10 incl. emotional variants), with live preview and language switching