feat(web): add push-to-talk, VAD continuous listening, and voice settings #303
Open
P2Chill wants to merge 2 commits into moltis-org:main from
…ings

Add two new voice input modes alongside the existing toggle:

Push-to-Talk (PTT):
- Configurable hotkey (default F13, stored in localStorage)
- Hold to record, release to send
- Function keys work even when focused in text inputs
- BroadcastChannel tab coordination prevents dual-tab recording

Voice Activity Detection (VAD):
- Energy-based continuous listening with conversation mode button
- Exponential sensitivity curve (0-100%) configurable in settings
- Auto-sends after 2.5s silence, 30s max recording safety valve
- Mutes during TTS playback, auto-resumes after with echo settle delay
- AudioContext health monitoring with auto-resume on browser suspension
- MediaStream track health check with automatic reacquisition
- Race condition guards (vadTranscribing flag) prevent recorder restart storms during async transcription fetches
- EBML header validation catches corrupt WebM blobs before API submission
- 15s fetch timeout prevents stuck transcription state

Voice Settings UI:
- PTT key picker (click to listen, press any key to rebind)
- VAD sensitivity slider with real-time threshold preview
- Waveform icon button with CSS states (listening glow, speech pulse)

Also adds i18n keys for en/fr/zh locales.
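For reviewers who want a feel for the VAD behaviour described above, here is a minimal sketch of an energy-based detector with an exponential sensitivity curve and the 2.5s silence / 30s max-recording rules. It is not the code from this PR: the function names, the threshold constants, and the exact shape of the curve are assumptions chosen for illustration.

```javascript
// Illustrative threshold range (assumed, not taken from the PR).
const MIN_THRESHOLD = 0.005; // used at 100% sensitivity
const MAX_THRESHOLD = 0.2;   // used at 0% sensitivity

// Hypothetical helper: map a 0-100% slider value to an RMS threshold.
// Higher sensitivity gives a lower threshold, interpolated exponentially.
function sensitivityToThreshold(percent) {
  const p = Math.min(100, Math.max(0, percent)) / 100;
  return MAX_THRESHOLD * Math.pow(MIN_THRESHOLD / MAX_THRESHOLD, p);
}

// Root-mean-square energy of one frame of PCM samples in [-1, 1].
function rmsEnergy(samples) {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (const s of samples) sum += s * s;
  return Math.sqrt(sum / samples.length);
}

// Hypothetical tracker: feed it frames plus a timestamp in milliseconds.
// Returns "idle" before speech, "recording" while capturing, and "send"
// once silence has lasted silenceMs or the recording has reached maxMs.
function createVadTracker({ threshold, silenceMs = 2500, maxMs = 30000 }) {
  let speechStart = null;
  let lastSpeech = null;
  return function onFrame(samples, nowMs) {
    const speaking = rmsEnergy(samples) >= threshold;
    if (speechStart === null) {
      if (!speaking) return "idle";
      speechStart = nowMs;
      lastSpeech = nowMs;
      return "recording";
    }
    if (speaking) lastSpeech = nowMs;
    const silentTooLong = nowMs - lastSpeech >= silenceMs;
    const recordingTooLong = nowMs - speechStart >= maxMs;
    if (silentTooLong || recordingTooLong) {
      speechStart = null;
      lastSpeech = null;
      return "send";
    }
    return "recording";
  };
}
```

In a browser the frames would typically come from an AnalyserNode or AudioWorklet; the tracker takes explicit timestamps so it can be exercised without audio hardware.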
Sets commit statuses (local/lint, local/test, etc.) that the upstream local-validation jobs poll for. Required because upstream ci.yml skips actual checks on pull_request events from forks.
Add two new voice input modes alongside the existing toggle:
Push-to-Talk (PTT)
Voice Activity Detection (VAD)
Race condition guards (vadTranscribing flag) prevent recorder restart storms during async transcription fetches
Voice Settings UI
Also adds i18n keys for en/fr/zh locales.
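As a rough illustration of the EBML header check and the fetch timeout mentioned in the commit message, the sketch below tests a blob for the four-byte EBML magic number (0x1A 0x45 0xDF 0xA3, which every WebM/Matroska file starts with) and aborts the upload after 15 seconds. The function names and the `url` parameter are hypothetical; the PR's actual implementation may differ.

```javascript
// EBML magic number that opens every WebM/Matroska file.
const EBML_MAGIC = [0x1a, 0x45, 0xdf, 0xa3];

// Returns true when the byte array starts with the EBML magic number.
function hasEbmlHeader(bytes) {
  if (bytes.length < EBML_MAGIC.length) return false;
  return EBML_MAGIC.every((b, i) => bytes[i] === b);
}

// Reads only the first four bytes of the blob before checking them.
async function blobHasEbmlHeader(blob) {
  const head = new Uint8Array(await blob.slice(0, 4).arrayBuffer());
  return hasEbmlHeader(head);
}

// Hypothetical submit helper: rejects corrupt blobs up front and aborts
// the request if it has not completed within timeoutMs.
async function submitForTranscription(url, blob, timeoutMs = 15000) {
  if (!(await blobHasEbmlHeader(blob))) {
    throw new Error("Blob does not start with an EBML header");
  }
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, {
      method: "POST",
      body: blob,
      signal: controller.signal,
    });
  } finally {
    clearTimeout(timer); // always clear so the timer cannot fire later
  }
}
```

Checking only the magic number catches empty or truncated recordings cheaply; it does not prove that the rest of the container is valid.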
Notes
The .github/workflows/local-ci.yml file is fork-specific CI infrastructure (sets local/* commit statuses for the upstream local-validation polling jobs, since upstream ci.yml skips checks on fork PRs). Happy to drop it if you prefer.