@longcw (Contributor) commented Nov 21, 2025

clean up #3909
close #3881

@longcw longcw requested a review from a team November 21, 2025 03:35

class VADOptions(TypedDict, total=False):
    vad_silence_threshold_secs: float | None
    """Silence threshold in seconds for VAD. Default to 1.5"""
Member

this seems kind of long for realtime? does it mean that it'll mark the end of speech after 1.5s?

Contributor Author

1.5s is the default value from ElevenLabs, and yeah, I think it's too long. Actually, their server VAD is very sensitive to background noise; it tended to never stop during my testing...

Their default commit strategy is manual, so I think the proper way to use their realtime model is to combine it with a local VAD (#4043).
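
The local-VAD-plus-manual-commit idea above can be sketched roughly as follows. This is only an illustration, not the code from #4043: `SilenceCommitter`, its fields, and the 500 ms threshold are hypothetical names and values chosen for the example, and the per-frame speech decision is assumed to come from whatever local VAD is in use.

```python
from dataclasses import dataclass


@dataclass
class SilenceCommitter:
    """Hypothetical helper that decides when to send a manual commit."""

    # Assumed value for illustration; shorter than the 1.5 s server default.
    silence_threshold_ms: int = 500
    _silence_ms: int = 0
    _in_speech: bool = False

    def push(self, is_speech: bool, frame_ms: int) -> bool:
        """Feed one local VAD decision; return True when a commit is due."""
        if is_speech:
            # Speech resets the silence counter and opens a segment.
            self._in_speech = True
            self._silence_ms = 0
            return False
        if not self._in_speech:
            # Leading silence: nothing has been said, so nothing to commit.
            return False
        self._silence_ms += frame_ms
        if self._silence_ms >= self.silence_threshold_ms:
            # Enough trailing silence: close the segment and request a commit.
            self._in_speech = False
            self._silence_ms = 0
            return True
        return False


committer = SilenceCommitter(silence_threshold_ms=500)
# Two speech frames followed by six silent frames of 100 ms each.
frames = [True, True, False, False, False, False, False, False]
commits = [committer.push(is_speech, frame_ms=100) for is_speech in frames]
```

In a real integration, the point where `push` returns True is where the client would send its commit message to the realtime API, rather than waiting for the server VAD.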

Member

so to be clear, they do not send the final transcript until VAD is clear?

Contributor Author

yes

        *,
        language: NotGivenOr[str] = NOT_GIVEN,
        conn_options: APIConnectOptions = DEFAULT_API_CONNECT_OPTIONS,
    ) -> SpeechStream:
Member

should we throw here if use_realtime isn't set to true?

Contributor Author

stt.capabilities.streaming will be False if use_realtime isn't set, so ideally the agent framework won't call stt.stream in this case
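
A caller-side check along these lines is one way to read that answer. This is a minimal sketch: `Capabilities`, `FakeSTT`, and `transcribe_mode` are hypothetical stand-ins written for this example, not the plugin's or the framework's actual classes. Only the `capabilities.streaming` and `use_realtime` names come from the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical capability flags exposed by an STT instance."""

    streaming: bool


class FakeSTT:
    """Hypothetical stand-in for the plugin's STT class."""

    def __init__(self, use_realtime: bool = False) -> None:
        # Streaming is only advertised when the realtime model is enabled.
        self.capabilities = Capabilities(streaming=use_realtime)


def transcribe_mode(stt: FakeSTT) -> str:
    """Pick a transcription path based on the advertised capability."""
    return "streaming" if stt.capabilities.streaming else "batch"


print(transcribe_mode(FakeSTT(use_realtime=True)))
print(transcribe_mode(FakeSTT()))
```

With this pattern the streaming path is never taken unless the flag is set, which is why an explicit exception inside the stream method may be unnecessary.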

Contributor

@chenghao-mou chenghao-mou left a comment

LGTM. Tested locally. Though I did see a bunch of hallucinated transcripts.

@longcw longcw merged commit 0263b0c into main Nov 22, 2025
17 of 18 checks passed
@longcw longcw deleted the longc/11labs-stt-realtime branch November 22, 2025 00:52
Development

Successfully merging this pull request may close these issues.

ElevenLabs Scribe v2 Realtime

4 participants