fix(VoiceServer): cross-platform audio playback in playAudio() #1061
Open
MHoroszowski wants to merge 1 commit into danielmiessler:main from
playAudio() hardcoded /usr/bin/afplay, which is macOS-only. On Linux, every TTS notification fails with ENOENT and the voice server appears to work but produces no audio (the failure is swallowed by the fire-and-forget curl pattern used at the call sites).

Extract player resolution into getAudioPlayer():
- darwin → afplay (unchanged)
- linux + ffplay → ffplay -nodisp -autoexit -volume 0..100
- linux + mpg123 → mpg123 -f 0..32768 (PCM scale)
- neither → throw with an actionable install hint

ffplay is preferred because ffmpeg is widely preinstalled; mpg123 is the lightweight fallback. Both route through PulseAudio, so this works on native Linux and on Windows via WSL2 + WSLg out of the box.

Verified on Ubuntu 24.04 / WSL2 (Windows 11): TTS audio plays through WSLg PulseAudio to Windows speakers with no additional configuration.

Addresses the audio-playback half of danielmiessler#855. Complementary to danielmiessler#1030, which covers the desktop-notification half (osascript → notify-send) without overlap.
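The resolution order and volume scaling described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `AudioPlayer` shape, the injected `platform` and `exists` parameters, the clamping of `volume`, the 0..1 input scale, and the error wording are all assumptions made here so the sketch stays self-contained. Only the player paths, the flags, and the target volume ranges come from the description.

```typescript
// Hypothetical return shape; not taken from VoiceServer/server.ts.
interface AudioPlayer {
  command: string;
  args: string[];
}

// Sketch of the resolution order: afplay on macOS, then ffplay, then mpg123 on Linux.
// `platform` stands in for process.platform and `exists` for a file-existence
// check such as fs.existsSync; both are injected (an assumption) for testability.
function getAudioPlayer(
  platform: string,
  exists: (path: string) => boolean,
  file: string,
  volume: number, // assumed to be on afplay's 0..1 scale
): AudioPlayer {
  // Clamping is an assumption of this sketch, not something the PR states.
  const v = Math.min(1, Math.max(0, volume));

  if (platform === "darwin") {
    // afplay -v takes a 0..1 float.
    return { command: "/usr/bin/afplay", args: ["-v", String(v), file] };
  }

  if (platform === "linux") {
    if (exists("/usr/bin/ffplay")) {
      // ffplay -volume takes a 0..100 integer.
      return {
        command: "/usr/bin/ffplay",
        args: ["-nodisp", "-autoexit", "-volume", String(Math.round(v * 100)), file],
      };
    }
    if (exists("/usr/bin/mpg123")) {
      // mpg123 -f takes a 0..32768 PCM scale factor; -q silences its banner.
      return {
        command: "/usr/bin/mpg123",
        args: ["-q", "-f", String(Math.round(v * 32768)), file],
      };
    }
  }

  // Illustrative message only; the PR's actual hint text is not shown in the source.
  throw new Error("No audio player found. Install ffmpeg (for ffplay) or mpg123.");
}
```

Under these assumptions, a caller could pass `process.platform` and `existsSync` from `node:fs`, then hand `command` and `args` to `spawn`. Keeping the lookup free of direct filesystem access makes each branch easy to exercise without the binaries installed.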
Summary
VoiceServer/server.ts:playAudio() hardcodes /usr/bin/afplay, which is macOS-only. On Linux every TTS notification fails with ENOENT and the voice server appears to work but produces no audio. The failure is swallowed by the fire-and-forget curl pattern at the call sites, so users see ✅ success in their terminal and hear nothing from their speakers.

This PR makes audio playback cross-platform. It is complementary to #1030 (which covers the desktop-notification half via notify-send) and addresses the audio-playback half of #855; the two PRs do not touch overlapping file regions.

Change
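Extract player resolution into a small getAudioPlayer() helper, then call it from playAudio():

- darwin → /usr/bin/afplay (unchanged)
- linux, ffplay present → /usr/bin/ffplay (preferred; ffmpeg is widely preinstalled)
- linux, mpg123 present → /usr/bin/mpg123 (lightweight fallback)
- neither found (the ENOENT case) → throws with an actionable install hint

Volume is preserved across players: afplay -v (0..1 float), ffplay -volume (0..100 int), mpg123 -f (0..32768 PCM scale).

Why ffplay first, mpg123 second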
ffplay ships with ffmpeg, which is already a dependency on most modern dev boxes; mpg123 is the well-known minimal fallback called out in #855. Trying both gives users a graceful path on minimal containers/distros without forcing a heavy install.

Test plan
- […] mpg123 -q -f 32768 /tmp/voice-*.mp3 → exit 0, audio plays)
- […] afplay -v invocation
- […] playAudio() and the new helper

Scope
- playAudio() only
- start.sh / stop.sh / restart.sh (launchctl → systemd-user): separate PR, deserves its own discussion

References