Replaced deprecated amazon-transcribe SDK with new aws-sdk-transcribe-streaming #4111

pabloFuente · 2025-11-27T13:42:49Z

The AWS Python SDK used by the livekit-plugins-aws STT implementation (amazon-transcribe) has a known critical CPU bottleneck and has been deprecated by AWS in favor of aws-sdk-transcribe-streaming.

Link to related amazon-transcribe issue: [Bug] BufferableByteStream causes 100% CPU usage in non-blocking scenarios awslabs/amazon-transcribe-streaming-sdk#121
Link to related livekit/agents issue: AWS STT plugin extremely slow when using multi-user transcription #3739

This PR replaces the old, deprecated, low-performance AWS SDK with the new official AWS SDK. I tested the problematic scenario described in #3739, and it no longer occurs with this PR. Transcribing multiple users from the same LiveKit Agent using the livekit-plugins-aws STT is now possible with no delays. The scope of the commit is quite small, so I do not expect any significant side effects.

…-streaming

- Replace `amazon-transcribe` dependency with `aws_sdk_transcribe_streaming` in pyproject.toml.
- Bump minimum Python version to 3.12, as required by the official AWS SDK.
- Update `STT` implementation to use the new SDK client and models.
- Refactor `SpeechStream._run` to use `TranscribeStreamingClient` from the new SDK.
- Update audio input streaming to use `AudioStreamAudioEvent`.
- Update transcript event processing to match the new SDK's event structure.
- Remove custom credential resolver logic in favor of `EnvironmentCredentialsResolver`.

Prevent race conditions and unhandled exceptions during participant disconnects.

Changes:

- Implement a graceful shutdown sequence: close input stream first, then wait for output stream to finish.
- Use `asyncio.shield` to protect inner tasks from immediate cancellation, allowing for proper cleanup.
- Suppress `concurrent.futures.InvalidStateError` in `handle_transcript_events` to avoid noise from AWS CRT bindings during shutdown.
- Ensure `gather_future` is awaited to prevent "exception never retrieved" warnings.
- Use `contextlib.suppress` for cleaner exception handling during resource cleanup.
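
The shutdown sequence listed above can be sketched roughly as follows. This is a minimal illustration using only `asyncio` and `contextlib`, not the code from this PR: the queue stands in for the SDK's input stream, and `send_audio`, `handle_events`, `run_stream`, and the `_END` sentinel are hypothetical names. Only `gather_future`, `asyncio.shield`, and `contextlib.suppress` are taken from the description.

```python
import asyncio
import contextlib

_END = object()  # hypothetical sentinel meaning "input stream closed"


async def send_audio(chunks, input_stream: asyncio.Queue) -> None:
    """Forward audio chunks, then close the input side."""
    try:
        for chunk in chunks:
            await input_stream.put(chunk)
            await asyncio.sleep(0)  # yield so the receiver can make progress
    finally:
        # Closing the input first tells the receiver no more audio will arrive.
        input_stream.put_nowait(_END)


async def handle_events(input_stream: asyncio.Queue, results: list) -> None:
    """Stand-in for transcript handling: drain until the end marker."""
    while True:
        item = await input_stream.get()
        if item is _END:
            break
        results.append(item.upper())


async def run_stream(chunks) -> list:
    input_stream: asyncio.Queue = asyncio.Queue()
    results: list = []
    sender = asyncio.create_task(send_audio(chunks, input_stream))
    receiver = asyncio.create_task(handle_events(input_stream, results))
    gather_future = asyncio.gather(sender, receiver)
    try:
        # shield: cancelling run_stream does not immediately cancel the inner tasks.
        await asyncio.shield(gather_future)
    except asyncio.CancelledError:
        # Graceful path: stop the sender (its finally block closes the input),
        # then give the receiver a bounded amount of time to finish.
        sender.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await sender
        with contextlib.suppress(asyncio.CancelledError, asyncio.TimeoutError):
            await asyncio.wait_for(receiver, timeout=1.0)
        raise
    finally:
        # Always retrieve the gathered outcome so asyncio does not log
        # an "exception was never retrieved" warning.
        with contextlib.suppress(asyncio.CancelledError, Exception):
            await gather_future
    return results


if __name__ == "__main__":
    print(asyncio.run(run_stream(["hello", "world"])))
```

The ordering is the point of the sketch: the input side is closed before the output side is awaited, so the receiver can exit on its own instead of being cancelled mid-event.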

pabloFuente · 2025-11-27T16:34:52Z

Unit test tests/test_chat_ctx.py is failing with an authorization error:

Error code: 401 - {'error': {'message': "You didn't provide an API key. You need to provide your API key in an Authorization header..."}

I don't think this has anything to do with the changes in my PR.

longcw

Thanks for creating the PR! Looks good to me, just a few nits:

livekit-plugins/livekit-plugins-aws/pyproject.toml

longcw

lgtm

pabloFuente mentioned this pull request Nov 27, 2025

AWS STT plugin extremely slow when using multi-user transcription #3739

Closed

pabloFuente added 3 commits November 27, 2025 16:44

Fix formatting issues

865df2a

More format fixes

590795d

longcw reviewed Nov 28, 2025

livekit-plugins/livekit-plugins-aws/pyproject.toml (outdated, resolved)

livekit-plugins/livekit-plugins-aws/pyproject.toml (outdated, resolved)

pabloFuente added 2 commits November 28, 2025 11:15

Support Python >=3.9.0. Raise error from STT plugin if dependencies not satisfied

c25967a
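
A dependency guard of the kind this commit title describes might look like the sketch below. It is an assumption about the general approach, not the code in c25967a: the helper name `check_stt_dependencies` and the error messages are hypothetical, while the import name `aws_sdk_transcribe_streaming` and the 3.12 floor come from the PR description.

```python
import importlib.util
import sys

# Taken from the PR description; everything else here is illustrative.
REQUIRED_MODULE = "aws_sdk_transcribe_streaming"
MIN_PYTHON = (3, 12)


def check_stt_dependencies(
    module_name: str = REQUIRED_MODULE,
    min_python: tuple = MIN_PYTHON,
    version_info: tuple = None,
) -> None:
    """Raise a descriptive error if the STT dependencies are not satisfied."""
    if version_info is None:
        version_info = sys.version_info
    current = tuple(version_info[:2])
    if current < tuple(min_python):
        raise RuntimeError(
            f"STT requires Python >= {min_python[0]}.{min_python[1]}, "
            f"found {current[0]}.{current[1]}"
        )
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(f"STT requires the '{module_name}' package")
```

Running the check when the STT class is constructed, rather than at import time, would let the rest of the plugin keep working on older interpreters.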

Fix formatting error. Update uv.lock

977d355

longcw approved these changes Nov 28, 2025

davidzhao merged commit 391bbdd into livekit:main Nov 28, 2025
11 of 13 checks passed
