STT: ElevenLabs STTv2 (Scribe v2 Realtime) support #3954

varghesepaul · 2025-11-16T23:28:51Z

Summary

Adds support for ElevenLabs Scribe v2 Realtime streaming STT with ~150ms latency.

Features

WebSocket-based streaming transcription
Configurable commit strategies (VAD/manual)
Word-level timestamp support
Automatic reconnection handling
Comprehensive error handling

API Options

model_id: Model selection (default: scribe_v2_realtime)
language_code: Language support (optional)
commit_strategy: "vad" (default) or "manual"
include_timestamps: Enable word-level timestamps
VAD parameters: threshold, silence duration, speech duration
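
As a rough illustration of how the options above fit together, here is a minimal sketch. The `ScribeRealtimeOptions` class and its `to_params()` helper are hypothetical and are not part of this PR or of the ElevenLabs SDK. Only the field names and defaults are taken from the list above, and sending them as flat string parameters is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScribeRealtimeOptions:
    """Hypothetical container; field names mirror the API options listed above."""

    model_id: str = "scribe_v2_realtime"
    language_code: Optional[str] = None  # optional, omitted when unset
    commit_strategy: str = "vad"         # "vad" (default) or "manual"
    include_timestamps: bool = False

    def __post_init__(self) -> None:
        # Reject anything other than the two strategies named in the PR.
        if self.commit_strategy not in ("vad", "manual"):
            raise ValueError(f"unknown commit_strategy: {self.commit_strategy!r}")

    def to_params(self) -> dict[str, str]:
        """Flatten the options into string parameters, skipping unset ones."""
        params = {
            "model_id": self.model_id,
            "commit_strategy": self.commit_strategy,
            "include_timestamps": str(self.include_timestamps).lower(),
        }
        if self.language_code is not None:
            params["language_code"] = self.language_code
        return params
```

Validating `commit_strategy` up front is a design choice of this sketch: a typo would otherwise only surface once the connection is open.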

Implementation Details

Follows Deepgram STTv2 pattern for consistency
Uses RecognizeStream base class (modern API)
Proper usage tracking via RECOGNITION_USAGE events
Session override support via update_options()

Known Issues

ElevenLabs API currently returns duplicate transcripts in some scenarios. I've reported this to ElevenLabs
(elevenlabs/elevenlabs-python#686). No explicit deduplication logic added as it risks removing valid repeated content.
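
To illustrate why no deduplication was added, here is a small sketch of the most obvious approach, dropping consecutive identical transcripts. The helper `drop_consecutive_duplicates` is hypothetical and not part of this PR. It shows that a text-only filter cannot tell an API-side duplicate from a speaker who really repeated a word.

```python
def drop_consecutive_duplicates(transcripts: list[str]) -> list[str]:
    """Naive filter: keep a transcript only if it differs from the previous one."""
    kept: list[str] = []
    for text in transcripts:
        if kept and kept[-1] == text:
            continue  # looks like a duplicate, but may be real speech
        kept.append(text)
    return kept


# Case 1: the API delivered the same commit twice (the bug described above).
api_duplicate = ["turn left", "turn left"]

# Case 2: the speaker genuinely said "no" twice before saying "stop".
genuine_repeat = ["no", "no", "stop"]

print(drop_consecutive_duplicates(api_duplicate))
print(drop_consecutive_duplicates(genuine_repeat))
```

Both inputs lose their second entry, so the second case silently drops valid content.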

Documentation

STT - Realtime:
https://elevenlabs.io/docs/cookbooks/speech-to-text/streaming
https://elevenlabs.io/docs/api-reference/speech-to-text/v-1-speech-to-text-realtime

CLAassistant · 2025-11-16T23:28:57Z

All committers have signed the CLA.

- Add STTv2 class with full Scribe v2 Realtime API support
- Support word-level timestamps (include_timestamps parameter)
- Support both VAD and manual commit strategies
- Emit INTERIM_TRANSCRIPT events for real-time UI feedback
- Handle committed_transcript_with_timestamps events
- Add update_options() method for dynamic reconfiguration
- Comprehensive error handling and logging
- Full docstrings with examples
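
A rough sketch of the event mapping this commit describes follows. It rests on several assumptions: the JSON field names `message_type` and `text`, the `partial_transcript` and `committed_transcript` message names, and the `FINAL_TRANSCRIPT` label are guesses for illustration. Only `committed_transcript_with_timestamps` and `INTERIM_TRANSCRIPT` appear in the commit message, and `classify_message` is a hypothetical helper, not code from `stt_v2.py`.

```python
import json
from typing import Optional


def classify_message(raw: str) -> Optional[tuple[str, str]]:
    """Map one server message to an (event_name, text) pair, or None if ignored.

    The payload shape used here is an assumption made for this sketch.
    """
    msg = json.loads(raw)
    kind = msg.get("message_type")
    text = msg.get("text", "")
    if kind == "partial_transcript":
        # Interim results are meant for real-time UI feedback only.
        return ("INTERIM_TRANSCRIPT", text)
    if kind in ("committed_transcript", "committed_transcript_with_timestamps"):
        # A commit (VAD-triggered or manual) finalizes the segment.
        return ("FINAL_TRANSCRIPT", text)
    return None  # unrelated message types are skipped in this sketch
```

For example, `classify_message('{"message_type": "partial_transcript", "text": "hel"}')` yields an interim event, while an unknown message type yields `None`.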

varghesepaul · 2025-11-18T01:22:39Z

ElevenLabs has fixed the issue elevenlabs/elevenlabs-python#686, and the latest test results look good.

livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt_v2.py

longcw · 2025-11-18T09:20:22Z

+        audio_format: STTAudioFormat = "pcm_16000",
+        commit_strategy: str = "vad",
+        include_timestamps: bool = False,
+        vad_silence_threshold_secs: float = 1.5,


What is the default value from 11labs? Can we use NOT_GIVEN as the default? Also, during my testing, the transcripts didn't get committed after 1.5s. Is that related to the configuration?

varghesepaul · 2025-11-18

Yeah, this ended up being a config issue.

ElevenLabs’ default commit_strategy="manual" basically turns off all the VAD settings. So when we tested with the API defaults, vad_silence_threshold_secs wasn’t doing anything — which is why transcripts weren’t auto-committing.

The PR fixes that by switching the default to commit_strategy="vad", which brings back the 1.5s auto-commit behavior.

We tried a few settings in prod to make things snappier, especially for quick phrases like “Hello?”:
vad_silence_threshold_secs = 0.6 (much faster than the 1.5s default)
min_silence_duration_ms = 150 (more stable than 100ms)

The old 1.5s threshold was causing single-word phrases to stall for up to 22 seconds because background noise blocked clean silence detection. Dropping it to 0.6s fixed the lag without hurting accuracy.
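
To make the stall easier to picture, here is a toy model of silence-based committing. It is only a sketch and not ElevenLabs' actual VAD: the function `first_commit_ms`, the fixed 100 ms frames, and the synthetic speech probabilities are all assumptions. It shows that a noise burst once per second keeps resetting a 1.5s silence timer, while a 0.6s timer still fires.

```python
from typing import Optional, Sequence


def first_commit_ms(
    speech_probs: Sequence[float],
    frame_ms: int,
    vad_threshold: float,
    silence_threshold_ms: int,
) -> Optional[int]:
    """Return the time (ms) of the first commit in this toy model, or None."""
    silence_ms = 0
    heard_speech = False
    for index, prob in enumerate(speech_probs):
        if prob >= vad_threshold:
            # Frame counts as speech (or loud noise): restart the silence timer.
            heard_speech = True
            silence_ms = 0
        elif heard_speech:
            silence_ms += frame_ms
            if silence_ms >= silence_threshold_ms:
                return (index + 1) * frame_ms
    return None


# "Hello?" lasts 0.5 s, followed by 20 s of quiet with a noise burst every second.
utterance = [0.9] * 5
noisy_tail = ([0.1] * 9 + [0.5]) * 20
frames = utterance + noisy_tail

slow = first_commit_ms(frames, frame_ms=100, vad_threshold=0.4, silence_threshold_ms=1500)
fast = first_commit_ms(frames, frame_ms=100, vad_threshold=0.4, silence_threshold_ms=600)
print(slow, fast)
```

In this model the 1.5s setting never commits within the 20 s window, while the 0.6s setting commits 1.1 s after the start of the audio.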

Also — should we keep the default at 0.6 in the PR? We can either match ElevenLabs’ default, use NOT_GIVEN, or just go with 0.6 since that aligns with the vad strategy and seems to work best in practice. I’m leaning toward setting it to 0.6 by default.
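
For reference, the NOT_GIVEN option could look roughly like the sketch below. The sentinel is defined locally as a stand-in so the snippet runs without livekit installed, and `build_params` is a hypothetical helper, not the plugin's real code. The idea is that an unset value is simply left out of the request, so whatever default the server applies takes effect.

```python
from typing import Any


class _NotGiven:
    """Local stand-in for a NOT_GIVEN sentinel (assumption for this sketch)."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


def build_params(
    *,
    commit_strategy: str = "vad",
    vad_silence_threshold_secs: Any = NOT_GIVEN,
) -> dict[str, Any]:
    """Include only the options the caller explicitly set."""
    params: dict[str, Any] = {"commit_strategy": commit_strategy}
    if vad_silence_threshold_secs is not NOT_GIVEN:
        params["vad_silence_threshold_secs"] = vad_silence_threshold_secs
    return params


print(build_params())                                # server default applies
print(build_params(vad_silence_threshold_secs=0.6))  # explicit override
```

The trade-off is the one raised above: with NOT_GIVEN the plugin tracks the provider's default, whereas a hard-coded 0.6 pins the behaviour that worked in testing.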

Our stt config:

from livekit.plugins import elevenlabs

stt_instance = elevenlabs.STTv2(
    model_id="scribe_v2_realtime",
    vad_silence_threshold_secs=0.6,
    vad_threshold=0.4,
    min_silence_duration_ms=150,
)

varghesepaul force-pushed the elevenlabs-scribeV2-realtime branch 2 times, most recently from 75f1045 to 5567d85 Compare November 17, 2025 01:52

varghesepaul force-pushed the elevenlabs-scribeV2-realtime branch from 5567d85 to e976fb8 Compare November 17, 2025 01:59

longcw reviewed Nov 18, 2025

varghesepaul and others added 2 commits November 18, 2025 09:20

Update livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/ele…

5d02c65

…venlabs/stt_v2.py Co-authored-by: Long Chen <[email protected]>

Fix Literal import

5d99999

varghesepaul requested a review from longcw November 18, 2025 20:43

varghesepaul force-pushed the elevenlabs-scribeV2-realtime branch from 14f8bb2 to 5d99999 Compare November 19, 2025 06:11

3 participants
