Voice typing for Linux that doesn't suck
(Local) Voice typing on Linux either doesn't work or was made in the last century. How has Android been beating Linux at this for the past 10 years? I can't let that slide.
- Single binary - The backend is Python, but I managed to embed everything in a single binary, which means it runs anywhere with no setup (well, at least on Linux)
- Performance modes - Faster or more accurate modes based on your hardware
- One toggle - yap toggle pauses, resumes, or starts (control via CLI)
- TCP server - It can serve its state over TCP for status bars, widgets, AI Girlfriends, etc.
- Output file - Write transcriptions to a file that other scripts can read for automation
- Actually private - Literally Whisper (Wow!)
- Configurable - Change models, languages, devices, and all that stuff
curl -sSL https://raw.githubusercontent.com/DeprecatedLuar/yappers-of-linux/main/install.sh | bash
Other Install Methods
Manual Install
# Download binary from releases (amd64)
wget https://github.com/DeprecatedLuar/yappers-of-linux/releases/latest/download/yap-linux-amd64
chmod +x yap-linux-amd64
sudo mv yap-linux-amd64 /usr/local/bin/yap
yap start # Auto-installs everything
# Or for arm64
wget https://github.com/DeprecatedLuar/yappers-of-linux/releases/latest/download/yap-linux-arm64
chmod +x yap-linux-arm64
sudo mv yap-linux-arm64 /usr/local/bin/yap
Build From Source
git clone https://github.com/DeprecatedLuar/yappers-of-linux.git
cd yappers-of-linux
go build -o yap cmd/main.go
./yap start
System Requirements:
- python3 (3.10+)
- portaudio19-dev (for mic access)
- ydotool + ydotoold (Wayland) OR xdotool (X11)
First run takes ~2 minutes to download and set up everything. After that, it's instant.
| Command | Arguments | Description |
|---|---|---|
| start | [options] | Start voice typing |
| stop | | Stop voice typing |
| toggle | | Smart pause/resume/start |
| pause | | Pause listening |
| resume | | Resume listening |
| output | | View output file (aliases: log, cat, show) |
| models | | Show installed models |
| config | | Open config in editor |
| help | [topic] | Show help information |
Flags
| Flag | Description |
|---|---|
| --model MODEL | Choose model (tiny/base/small/medium/large) |
| --device DEVICE | Use cpu or cuda |
| --language LANG | Set language (en/es/fr/etc) |
| --tcp [PORT] | Enable TCP server (default: 12322) |
| --fast | Fast mode (int8, less accurate) |
| --no-typing | Print to terminal only, don't type |
yap start # Start listening
yap start --model small # Use better model
yap start --fast # Faster but less accurate
yap toggle # Pause/resume/start
yap stop # Stop
Available Models
Models auto-download on first use:
| Model | Size | Speed | Accuracy |
|---|---|---|---|
| tiny | ~75MB | Fastest | Basic |
| base | ~150MB | Fast | Good |
| small | ~500MB | Balanced | Better |
| medium | ~1.5GB | Slow | Great |
| large | ~3GB | Slowest | Best |
More stuff you can do
yap start --device cuda # Use GPU instead
yap start --language es # Spanish (or any other language)
yap start --tcp # Enable state server on port 12322
yap start --no-typing # Just prints to terminal, doesn't type
yap models # See what models you have
yap config # Open config in your editor
Configuration
Config file lives at ~/.config/yappers-of-linux/config.toml and gets created on first run.
notifications = "start,urgent" # When to notify you
model = "tiny" # Which model to use
device = "cpu" # cpu or cuda
language = "en" # What language you're speaking
fast_mode = false # Trade accuracy for speed
enable_typing = true # Type into active window
output_file = false # Write to output.txt for piping/automation
Run yap help config if you want all the details.
State Monitoring (if you're into that)
Want to plug this into your status bar or a desktop widget?
yap start --tcp # Starts on port 12322
nc 127.0.0.1 12322 # Test it out
Spits out JSON with the current state. Inspired by Kanata's TCP port.
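For a status bar script, the same check can be done from Python instead of nc. This is a rough sketch under assumptions: the README only says the server sends JSON, so the framing (one JSON object sent soon after connecting, ended by a newline or by the connection closing) is a guess, and `read_state` and `decode_state` are hypothetical helper names. The fields inside the JSON are not documented here, so the sketch returns the object as-is.

```python
import json
import socket


def decode_state(raw: bytes) -> dict:
    """Decode the first JSON value found in the bytes received from the server."""
    text = raw.decode("utf-8").lstrip()
    if not text:
        raise ValueError("state server sent no data")
    # raw_decode parses one JSON value and ignores anything that follows it.
    obj, _end = json.JSONDecoder().raw_decode(text)
    return obj


def read_state(host: str = "127.0.0.1", port: int = 12322, timeout: float = 2.0) -> dict:
    """Connect to the state server and return the first JSON object it sends.

    Assumption: a message ends with a newline or with the server closing
    the connection. Adjust if the real framing differs.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            try:
                data = sock.recv(4096)
            except socket.timeout:
                break  # nothing more arrived within the timeout
            if not data:
                break  # server closed the connection
            chunks.append(data)
            if b"\n" in data:
                break  # treat a newline as the end of one message
    return decode_state(b"".join(chunks))
```

With yap start --tcp running, `print(read_state())` should show whatever state object the server reports.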
Output File (for automation)
Enable output_file = true in config to write transcriptions to ~/.config/yappers-of-linux/output.txt.
How it works:
- File is ephemeral - deleted on each yap start (fresh session)
- Each transcription is separated by a blank line (paragraph style)
- View anytime with yap output (or yap log, yap cat, yap show)
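Because transcriptions are separated by blank lines, a consumer has to split on paragraphs rather than on single lines. Here is a small sketch of that, assuming a transcription may wrap across several lines; `split_transcriptions` and `read_transcriptions` are hypothetical helpers, not part of yap.

```python
from pathlib import Path

# Path taken from the README.
OUTPUT_PATH = Path.home() / ".config" / "yappers-of-linux" / "output.txt"


def split_transcriptions(text: str) -> list[str]:
    """Split output-file text into transcriptions separated by blank lines."""
    blocks = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            # A blank line closes the transcription collected so far.
            blocks.append(" ".join(current))
            current = []
    if current:
        blocks.append(" ".join(current))
    return blocks


def read_transcriptions(path: Path = OUTPUT_PATH) -> list[str]:
    """Return all transcriptions so far, or [] if the file does not exist."""
    if not path.exists():
        return []
    return split_transcriptions(path.read_text(encoding="utf-8"))
```

Since the file is deleted on each yap start, `read_transcriptions()` only ever sees the current session.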
Use cases:
# View the output file
yap output
# Pipe to another script
tail -f ~/.config/yappers-of-linux/output.txt | your-script.sh
# Process with jq/awk/whatever
cat ~/.config/yappers-of-linux/output.txt | process-commands
# Voice-controlled automation
while IFS= read -r line; do handle_command "$line"; done < ~/.config/yappers-of-linux/output.txt
Constantly records audio to a 1.5-second circular buffer. When VAD (voice activity detection) detects speech, it captures that buffer plus whatever you continue saying. After 0.8 seconds of silence, it sends everything to Whisper for transcription and types the result. This means your opening words are never lost.
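The pre-roll idea can be sketched in a few lines. This is an illustration of the technique rather than yap's actual code: the 1.5 s and 0.8 s values come from the description above, while `segment_speech`, the `is_speech` callback (standing in for the VAD), and the fixed frame length are assumptions.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List


def segment_speech(
    frames: Iterable[bytes],
    is_speech: Callable[[bytes], bool],
    frame_seconds: float,
    preroll_seconds: float = 1.5,
    silence_seconds: float = 0.8,
) -> Iterator[List[bytes]]:
    """Yield one list of frames per utterance, including the pre-roll before it."""
    preroll_len = max(1, round(preroll_seconds / frame_seconds))
    silence_limit = max(1, round(silence_seconds / frame_seconds))

    preroll = deque(maxlen=preroll_len)  # circular buffer: oldest frames fall off
    utterance: List[bytes] = []
    silent_run = 0
    recording = False

    for frame in frames:
        if not recording:
            preroll.append(frame)
            if is_speech(frame):
                # Speech started: keep the buffered audio leading up to it.
                utterance = list(preroll)
                preroll.clear()
                silent_run = 0
                recording = True
            continue

        utterance.append(frame)
        silent_run = 0 if is_speech(frame) else silent_run + 1
        if silent_run >= silence_limit:
            yield utterance  # enough silence: hand off to the transcriber
            utterance = []
            recording = False

    if recording and utterance:
        yield utterance  # flush what was captured when the stream ended
```

Each yielded list starts with up to 1.5 seconds of audio from before the VAD fired, which is why the first word survives even if detection is a little late.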
Filters out hallucinations using confidence thresholds - keyboard clicks and background noise won't turn into random words.
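One way such a filter can look, as a sketch only: Whisper implementations typically report an average log-probability and a no-speech probability per segment, and low-confidence segments can simply be dropped. The `Segment` class and `keep_confident` are hypothetical, and the thresholds below are the defaults from openai-whisper's `logprob_threshold` and `no_speech_threshold`, not necessarily the values yap uses.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Segment:
    """Hypothetical stand-in for one transcribed segment."""
    text: str
    avg_logprob: float     # mean token log-probability (closer to 0 is more confident)
    no_speech_prob: float  # model's estimate that the audio contains no speech


def keep_confident(
    segments: Iterable[Segment],
    min_avg_logprob: float = -1.0,    # assumed threshold, illustrative only
    max_no_speech_prob: float = 0.6,  # assumed threshold, illustrative only
) -> List[str]:
    """Return the text of segments that pass both confidence checks."""
    kept = []
    for seg in segments:
        if seg.avg_logprob < min_avg_logprob:
            continue  # the model was unsure about the words themselves
        if seg.no_speech_prob > max_no_speech_prob:
            continue  # the audio was probably noise, not speech
        text = seg.text.strip()
        if text:
            kept.append(text)
    return kept
```

Tightening either threshold trades missed quiet speech for fewer phantom words, so the right values depend on the microphone and the room.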
The entire engine (Python + dependencies) is embedded in a single Go binary that self-extracts and manages its own environment. First run takes ~2 minutes to set up; after that it's instant.

