docs: update mkdocs yml, add source refs to models

rachwalk · rachwalk · commit 50b75efb72ad · 2025-05-19T16:29:11.000+02:00
diff --git a/docs/speech_to_speech/agents/asr.md b/docs/speech_to_speech/agents/asr.md
@@ -57,23 +57,6 @@ Adds a custom VAD model to a processing pipeline.
 
     The `'stop'` pipeline is present for forward compatibility. It currently doesn't affect Agent's functioning.
 
-### `_on_new_sample()`
-
-Callback function triggered for each new audio sample. Determines:
-
--   If recording should start
--   Whether to continue buffering
--   If grace period has ended
--   When to start transcription threads
-
-### `_transcription_thread(identifier)`
-
-Handles transcription for a given buffer in a background thread. Uses locks to ensure safe access to transcription model.
-
-### `_should_record(audio_data, input_parameters)`
-
-Evaluates the `should_record_pipeline` models to determine if recording should begin.
-
 ## Best Practices
 
 1. **Graceful Shutdown**: Always call `stop()` to ensure transcription threads complete.
diff --git a/docs/speech_to_speech/models/overview.md b/docs/speech_to_speech/models/overview.md
@@ -42,35 +42,59 @@ This method takes raw audio data encoded as 2-byte integers and returns the corr
 -   No additional setup required
 -   Returns a confidence value indicating the presence of speech in the audio
 
+??? info "SileroVAD"
+
+    ::: rai_s2s.asr.models.silero_vad.SileroVAD
+
 ### OpenWakeWord
 
 -   Open source project: [GitHub](https://github.com/dscripka/openWakeWord)
 -   Supports predefined and custom wake words
 -   Returns `True` when the specified wake word is detected in the audio
 
+??? info "OpenWakeWord"
+
+    ::: rai_s2s.asr.models.open_wake_word.OpenWakeWord
+
 ### OpenAIWhisper
 
 -   Cloud-based transcription model: [Documentation](https://platform.openai.com/docs/guides/speech-to-text)
 -   Requires setting the `OPEN_API_KEY` environment variable
 -   Offers language and model customization via the API
 
+??? info "OpenAIWhisper"
+
+    ::: rai_s2s.asr.models.open_ai_whisper.OpenAIWhisper
+
 ### LocalWhisper
 
 -   Local deployment of OpenAI Whisper: [GitHub](https://github.com/openai/whisper)
 -   Supports GPU acceleration
 -   Same configuration interface as OpenAIWhisper
 
+??? info "LocalWhisper"
+
+    ::: rai_s2s.asr.models.local_whisper.LocalWhisper
+
 ### FasterWhisper
 
 -   Optimized Whisper variant: [GitHub](https://github.com/SYSTRAN/faster-whisper)
 -   Designed for high speed and low memory usage
 -   Follows the same API as Whisper models
 
+??? info "FasterWhisper"
+
+    ::: rai_s2s.asr.models.local_whisper.FasterWhisper
+
 ### ElevenLabs
 
 -   Cloud-based TTS model: [Website](https://elevenlabs.io/)
 -   Requires the environment variable `ELEVENLABS_API_KEY` with a valid key
 
+??? info "ElevenLabs"
+
+    ::: rai_s2s.tts.models.elevenlabs_tts.ElevenLabsTTS
+
 ### OpenTTS
 
 -   Open source TTS solution: [GitHub](https://github.com/synesthesiam/opentts)
@@ -83,6 +107,10 @@ This method takes raw audio data encoded as 2-byte integers and returns the corr
 -   Provides a TTS server running on port 5500
 -   Supports multiple voices and configurations
 
+??? info "OpenTTS"
+
+    ::: rai_s2s.tts.models.open_tts.OpenTTS
+
 ## Custom Models
 
 ### Voice Detection Models
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -120,6 +120,8 @@ nav:
     - Overview: speech_to_speech/overview.md
     - Agents:
         - Overview: speech_to_speech/agents/overview.md
+        - Automatic Speech Recognition: speech_to_speech/agents/asr.md
+        - Text To Speech: speech_to_speech/agents/tts.md
     - Models:
         - Overview: speech_to_speech/models/overview.md
     - SoundDevice Connector: speech_to_speech/sounddevice.md