Replace Google's cloud voice typing with NVIDIA Parakeet TDT 0.6B v2 running fully on your phone. Tap the mic, speak, text appears. Works in airplane mode. Nothing leaves your device.
- Engine: NVIDIA Parakeet TDT 0.6B v2 (int8 quantized) via sherpa-onnx
- Auto-stop: Silero VAD detects when you pause — no second tap needed
- Two ways to use: Gboard's mic button (recommended) or as a full voice keyboard
- Tested on: Samsung Galaxy S24 Ultra (Snapdragon 8 Gen 3). Should work on any modern arm64 Android 7+ device.
Grab app-debug.apk from the latest release.
Easy way (no PC): Email the APK to yourself or drop it in Google Drive → open it on your phone → tap Install (you may need to allow installs from your browser/file manager once).
ADB way:
adb install -r app-debug.apk
- Open Parakeet Voice from the app drawer.
- Tap Grant microphone permission → allow.
- Tap Download model (~600 MB) — Wi-Fi recommended. One-time download from Hugging Face into the app's private storage.
- Pick one or both of the two ways to use it (below).
You keep Gboard for typing. Only voice input changes — Gboard's mic now transcribes locally through Parakeet instead of Google's cloud.
In the app, tap Set as default voice input → in Voice input / Speech recognition, pick Parakeet Voice.
Where to find this on Samsung One UI: Settings → General management → Keyboard list and default → Default voice input → Parakeet Voice.
That's it. Open any text field, tap Gboard's mic icon, speak — Parakeet transcribes on-device.
Use Parakeet as its own keyboard with a big green mic button. Useful if you don't have Gboard or want a dedicated voice-only mode.
In the app, tap Enable Parakeet Voice in system settings → toggle it on. Then tap Pick Parakeet Voice as input method. In any text field:
- Open the keyboard switcher → pick Parakeet Voice.
- Tap the green mic → speak → pause (~600 ms) → text auto-commits.
- Tap mic again mid-utterance to force-stop early.
- Tap the small icon top-right to switch back to Gboard.
Turn on airplane mode, then dictate. If you get text, it's local. (Google's cloud STT cannot work without a network connection, and Gboard's own offline dictation only works if you previously installed its optional offline pack.)
Requires Android SDK with platform-34 + build-tools 34.0.0 and JDK 17.
echo "sdk.dir=/path/to/android-sdk" > local.properties
./gradlew assembleDebug
# Output: app/build/outputs/apk/debug/app-debug.apk
The APK currently builds for arm64-v8a only (~21 MB). To support other
architectures, add the ABIs to abiFilters in app/build.gradle and copy the
matching .so files from the sherpa-onnx Android release
into app/src/main/jniLibs/<abi>/.
- app/src/main/java/com/parakeetime/ime/ParakeetInputMethodService.kt: the full-screen voice IME (mic UI, AudioRecord, sherpa-onnx call, commitText).
- app/src/main/java/com/parakeetime/ime/ParakeetRecognitionService.kt: the android.speech.RecognitionService so Gboard's mic routes here.
- app/src/main/java/com/parakeetime/ime/ModelManager.kt: first-run download with HTTP resume + lazy OfflineRecognizer cache.
- app/src/main/java/com/parakeetime/ime/MainActivity.kt: setup screen.
- app/src/main/java/com/k2fsa/sherpa/onnx/*.kt: vendored sherpa-onnx Kotlin wrappers (from k2-fsa/sherpa-onnx@master).
- app/src/main/jniLibs/arm64-v8a/*.so: sherpa-onnx + ONNX Runtime native libs (~34 MB), from sherpa-onnx v1.12.39.
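The ModelManager.kt entry names two behaviours: HTTP resume and a lazily built recognizer cache. The following is a minimal Kotlin sketch of both ideas, not the code in this repository. `rangeHeader`, `downloadWithResume`, and `LazyCache` are hypothetical names, and the sketch assumes the server honours standard `Range` requests by answering 206 Partial Content.

```kotlin
import java.io.File
import java.io.FileOutputStream
import java.io.IOException
import java.net.HttpURLConnection
import java.net.URL

/** Range header for resuming at [offset] bytes, or null for a fresh download. */
fun rangeHeader(offset: Long): String? = if (offset > 0) "bytes=$offset-" else null

/** Hypothetical helper: downloads [url] into [dest], continuing a partial file if one exists. */
fun downloadWithResume(url: String, dest: File) {
    val offset = if (dest.exists()) dest.length() else 0L
    val conn = URL(url).openConnection() as HttpURLConnection
    try {
        rangeHeader(offset)?.let { conn.setRequestProperty("Range", it) }
        val append = when (conn.responseCode) {
            HttpURLConnection.HTTP_PARTIAL -> true  // 206: server resumed at our offset
            HttpURLConnection.HTTP_OK -> false      // 200: Range ignored, start over
            else -> throw IOException("Unexpected HTTP ${conn.responseCode}")
        }
        conn.inputStream.use { input ->
            FileOutputStream(dest, append).use { output -> input.copyTo(output) }
        }
    } finally {
        conn.disconnect()
    }
}

/** Builds an expensive object once, on first access, then reuses it. */
class LazyCache<T>(build: () -> T) {
    val value: T by lazy(build)  // Kotlin's lazy is synchronized by default
}
```

A recognizer could be wrapped as `LazyCache { buildRecognizer() }`, where `buildRecognizer` stands in for the sherpa-onnx setup. Note that this sketch does not track the expected file size, so a request for an already complete file would typically fail with HTTP 416; a real implementation would need to handle that case.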
- English only. Parakeet TDT v2 is English-only. To swap in the multilingual v3 model (~25 languages, ~3% accuracy hit on English), change MODEL_DIR_NAME in ModelManager.kt:20 to sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8.
- First model load takes ~3-5 s. Recognizer is built lazily on first use after process start, so the keyboard appears instantly but the very first mic tap may have a small delay.
- No on-device punctuation model. Parakeet TDT v2 emits punctuation inline, generally well. If quality disappoints, sherpa-onnx ships a separate punctuation model you can chain.
- VAD silence threshold is fixed at 600 ms in ModelManager.createVad(). Lower it for snappier auto-stop; raise it if it cuts you off when you speak slowly.
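To make the trade-off behind the 600 ms threshold concrete, here is a small Kotlin sketch of trailing-silence endpointing. It is an illustration only: in the app, the speech/non-speech decision and the silence timer are handled by Silero VAD, and `SilenceEndpointer`, `onFrame`, and the 30 ms frame length used in the note below are hypothetical.

```kotlin
/**
 * Hypothetical illustration of auto-stop: signal the end of an utterance once
 * speech has been heard and is followed by at least [minSilenceMs] of silence.
 */
class SilenceEndpointer(private val minSilenceMs: Int = 600) {
    private var heardSpeech = false
    private var silenceMs = 0

    /** Feed one audio frame; returns true once the utterance should be committed. */
    fun onFrame(isSpeech: Boolean, frameMs: Int): Boolean {
        if (isSpeech) {
            heardSpeech = true
            silenceMs = 0            // any speech resets the silence timer
            return false
        }
        if (!heardSpeech) return false  // leading silence never triggers a stop
        silenceMs += frameMs
        return silenceMs >= minSilenceMs
    }
}
```

With 30 ms frames and the default of 600 ms, the stop fires on the 20th consecutive silent frame after speech. Lowering the threshold to 300 ms would fire on the 10th, which is snappier but would also commit early on a 300 ms mid-sentence pause.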
- NVIDIA Parakeet TDT 0.6B v2 — the ASR model
- sherpa-onnx by k2-fsa — the on-device runtime
- Silero VAD — voice activity detection
- int8 model packaging by csukuangfj on Hugging Face
MIT licensed. See LICENSE for third-party component licenses.