Python scripts for stress testing Deepgram Speech-to-Text (STT) endpoints deployed on AWS SageMaker. Two input modes are supported:
- stt_microphone_stress.py: streams live microphone audio for real-time transcription
- stt_wav_stress.py: streams a WAV file or sends batch HTTP requests; supports multiple simultaneous connections for load testing
- Python 3.14+
- uv package manager
- AWS credentials configured (CLI, environment variables, or IAM role)
- A deployed Amazon SageMaker endpoint running a Deepgram STT model
cd python-stt
uv sync

On macOS, microphone support requires PortAudio:
brew install portaudio
uv sync

stt_microphone_stress.py streams live microphone audio to Deepgram on SageMaker for real-time transcription. It supports multiple simultaneous connections for load testing.
Basic usage (single connection):
uv run stt_microphone_stress.py your-endpoint-name

With a specific AWS region:
uv run stt_microphone_stress.py your-endpoint-name --region us-west-2

Multiple simultaneous connections (load testing):
uv run stt_microphone_stress.py your-endpoint-name --connections 5

With speaker diarization:
uv run stt_microphone_stress.py your-endpoint-name --diarize true

With a different model and language:
uv run stt_microphone_stress.py your-endpoint-name --model nova-2 --language es

With keyword boosting (nova-2 only):
uv run stt_microphone_stress.py your-endpoint-name --keywords "Deepgram:5,SageMaker:10,transcription:3"

Run for a fixed duration (useful for automated tests):
uv run stt_microphone_stress.py your-endpoint-name --duration 30

Timed load test with multiple connections:
uv run stt_microphone_stress.py your-endpoint-name --connections 5 --duration 120

Full example with all options:
uv run stt_microphone_stress.py your-endpoint-name \
--connections 3 \
--model nova-2 \
--language en \
--diarize true \
--punctuate true \
--keywords "hello:5,world:10" \
--duration 60 \
--region us-east-1 \
--log-level DEBUG

| Option | Description | Default |
|---|---|---|
| endpoint_name | SageMaker endpoint name (required) | none |
| --connections N | Number of simultaneous streaming connections | 1 |
| --model MODEL | Deepgram model | nova-3 |
| --language LANG | Language code | en |
| --diarize true\|false | Enable speaker diarization | false |
| --punctuate true\|false | Enable punctuation | true |
| --keywords WORD:N,... | Keyword boosting with intensity, e.g. hello:5,world:10 (nova-2 only) | none |
| --duration SECONDS | Stop after this many seconds | run until Ctrl+C |
| --region REGION | AWS region | us-east-1 |
| --log-level LEVEL | DEBUG, INFO, WARNING, ERROR, CRITICAL | INFO |
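
The `--keywords` value is a comma-separated list of `WORD:N` pairs. How the script parses it internally is not shown in this README; the following is only a minimal sketch of decoding that format, assuming each pair is split on its last colon. The helper name `parse_keywords` is hypothetical.

```python
from typing import List, Tuple


def parse_keywords(spec: str) -> List[Tuple[str, int]]:
    """Split 'word:5,other:10' into [('word', 5), ('other', 10)]."""
    pairs = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray or trailing commas
        # Split on the last colon so the intensity is always the final field.
        word, sep, intensity = item.rpartition(":")
        if not sep or not word:
            raise ValueError(f"expected WORD:N, got {item!r}")
        pairs.append((word, int(intensity)))
    return pairs


print(parse_keywords("hello:5,world:10"))
```

Validating the string before opening any connections makes a malformed pair fail fast instead of surfacing as a rejected request partway through a load test.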
stt_wav_stress.py supports two sub-commands: stream and batch.
Requirements: The WAV file must be 16-bit PCM (linear16). To convert any audio file, use ffmpeg:
ffmpeg -i input.mp3 -ar 16000 -ac 1 -sample_fmt s16 output.wav
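
Before starting a long run, it can help to confirm that a file really is 16-bit PCM. The sketch below uses only Python's standard `wave` module; the helper names `describe_wav` and `is_linear16` are illustrative and are not part of the scripts.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the basic format fields from a WAV header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def is_linear16(path: str) -> bool:
    """True when the file opens as PCM with 2-byte (16-bit) samples."""
    try:
        return describe_wav(path)["sample_width_bytes"] == 2
    except (wave.Error, EOFError):
        # wave.open rejects files that are not valid WAV data.
        return False


# Example (path is a placeholder):
# print(describe_wav("audio.wav"), is_linear16("audio.wav"))
```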
The stream sub-command streams the WAV file to Deepgram on SageMaker in real time, paced to match the file's actual sample rate. It behaves like a live microphone source, enabling repeatable and automated load testing without requiring a physical microphone.
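
Pacing to the sample rate means each chunk is sent no faster than it would take to play. The following is a rough sketch of that idea, not the script's actual implementation: the 100 ms chunk size, the function names, and the `send` callable (a stand-in for whatever writes audio to the connection) are all assumptions.

```python
import time
import wave
from typing import Callable, Iterator, Tuple


def paced_chunks(path: str, chunk_ms: int = 100) -> Iterator[Tuple[bytes, float]]:
    """Yield (pcm_bytes, seconds_of_audio) pairs read from a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bytes_per_frame = wav.getsampwidth() * wav.getnchannels()
        frames_per_chunk = max(1, rate * chunk_ms // 1000)
        while True:
            data = wav.readframes(frames_per_chunk)
            if not data:
                break
            yield data, (len(data) // bytes_per_frame) / rate


def stream_in_real_time(
    path: str,
    send: Callable[[bytes], None],
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Send every chunk, sleeping so elapsed time tracks audio duration."""
    start = time.monotonic()
    audio_sent = 0.0
    for data, seconds in paced_chunks(path):
        send(data)
        audio_sent += seconds
        # Sleep until the wall clock catches up with the audio sent so far;
        # pacing against the total avoids drift from per-chunk overhead.
        remaining = audio_sent - (time.monotonic() - start)
        if remaining > 0:
            sleep(remaining)
    return audio_sent
```

Passing `sleep` as a parameter keeps the pacing logic testable without waiting for the audio to play out.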
Basic usage (single connection, play file once):
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav

With a specific AWS region:
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav --region us-west-2

Multiple simultaneous connections (load testing):
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav --connections 5

Loop the file continuously until Ctrl+C:
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav --loop

Loop for a fixed duration (useful for automated tests):
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav --loop --duration 120

Timed load test with multiple connections:
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav \
--connections 10 --loop --duration 300

Gradual ramp-up (open 5 connections at a time, 3 seconds apart):
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav \
--connections 20 --batch-size 5 --batch-delay 3

With speaker diarization:
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav --diarize true

With a different model and language:
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav \
--model nova-2 --language es

With keyword boosting (nova-2 only) or keyterms (nova-3):
# nova-2 keywords
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav \
--keywords "Deepgram:5,SageMaker:10"
# nova-3 keyterms
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav \
--keyterms "Deepgram,SageMaker"

With PII redaction:
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav \
--redact "pii,ssn,email_address"

Full example with all stream options:
uv run stt_wav_stress.py stream your-endpoint-name --file audio.wav \
--connections 20 \
--batch-size 5 \
--batch-delay 3 \
--model nova-3 \
--language en \
--diarize true \
--punctuate true \
--keyterms "Deepgram,SageMaker" \
--redact "pii,ssn" \
--interim-results true \
--loop \
--duration 60 \
--region us-east-2 \
--log-level DEBUG

| Option | Description | Default |
|---|---|---|
| --file WAV_FILE | Path to a 16-bit PCM WAV file (required) | none |
| --connections N | Total number of simultaneous streaming connections | 1 |
| --batch-size N | Connections to open per batch; streaming begins immediately for each batch | all at once |
| --batch-delay SECONDS | Seconds to wait between opening connection batches | 0 |
| --model MODEL | Deepgram model | nova-3 |
| --language LANG | Language code | en |
| --diarize true\|false | Enable speaker diarization | false |
| --punctuate true\|false | Enable punctuation | true |
| --keywords WORD:N,... | Keyword boosting with intensity, e.g. hello:5,world:10 (nova-2 only) | none |
| --keyterms TERM,... | Comma-separated keyterms to boost recognition (nova-3) | none |
| --redact ENTITY,... | Comma-separated entity types to redact, e.g. pii,ssn,email_address | none |
| --interim-results true\|false | Emit interim (partial) transcripts | true |
| --loop | Loop the WAV file until --duration is reached or Ctrl+C | off |
| --duration SECONDS | Stop after this many seconds | play file once |
| --region REGION | AWS region | us-east-2 |
| --log-level LEVEL | DEBUG, INFO, WARNING, ERROR, CRITICAL | INFO |
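
To see how `--connections`, `--batch-size`, and `--batch-delay` interact, the ramp-up can be written out as a schedule. This is an illustrative sketch only; it assumes the delay is measured between the starts of consecutive batches, and `ramp_up_schedule` is a hypothetical helper, not a function from the script.

```python
from typing import List, Optional, Tuple


def ramp_up_schedule(
    connections: int,
    batch_size: Optional[int] = None,
    batch_delay: float = 0.0,
) -> List[Tuple[float, int]]:
    """Return (start_offset_seconds, connections_opened) for each batch."""
    if connections < 1:
        raise ValueError("connections must be at least 1")
    # With no batch size given, every connection opens in a single batch.
    size = connections if batch_size is None else batch_size
    if size < 1:
        raise ValueError("batch_size must be at least 1")
    schedule = []
    opened = 0
    while opened < connections:
        count = min(size, connections - opened)
        schedule.append((len(schedule) * batch_delay, count))
        opened += count
    return schedule


print(ramp_up_schedule(20, batch_size=5, batch_delay=3))
```

For the ramp-up example above (20 connections, batches of 5, 3 seconds apart), this yields four batches starting at 0, 3, 6, and 9 seconds.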
The batch sub-command posts the entire WAV file in a single HTTP request using the SageMaker InvokeEndpoint API. It supports configurable parallelism via --concurrency for throughput and latency stress testing. Each concurrent request runs on its own Python thread with its own boto3 client. After all requests complete, a summary table shows min/avg/p95/max latency, throughput, and success/failure counts.
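
The general shape of such a run can be sketched with the standard library. This is not the script's code: a thread pool stands in for its threading model, `send_request` is a hypothetical callable that would wrap one InvokeEndpoint call, and p95 is computed with the nearest-rank method, which may differ from the script's definition.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def timed_call(send_request: Callable[[], None]) -> Tuple[float, bool]:
    """Run one request and return (latency_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        send_request()
        ok = True
    except Exception:
        ok = False
    return time.perf_counter() - start, ok


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Min/avg/p95/max over a non-empty list of latencies."""
    ordered = sorted(latencies)
    # Nearest-rank p95: ceil(0.95 * n), done in integer arithmetic.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return {
        "min": ordered[0],
        "avg": statistics.fmean(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


def run_batch(send_request: Callable[[], None], concurrency: int,
              total_requests: int) -> Dict[str, float]:
    """Send total_requests calls, at most `concurrency` at a time."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(
            pool.map(lambda _: timed_call(send_request), range(total_requests))
        )
    wall = time.perf_counter() - wall_start
    ok_latencies = [latency for latency, ok in results if ok]
    summary = summarize(ok_latencies) if ok_latencies else {}
    summary["succeeded"] = len(ok_latencies)
    summary["failed"] = total_requests - len(ok_latencies)
    summary["throughput_rps"] = total_requests / wall if wall > 0 else 0.0
    return summary
```

Latency statistics here cover successful requests only, so a burst of fast failures does not make the endpoint look quicker than it is.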
Note: SageMaker InvokeEndpoint has a 6 MB request body limit. For larger files, use stream mode or split the file:
ffmpeg -i input.wav -f segment -segment_time 60 segment_%03d.wav
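
A quick size check shows whether a file fits in one request and how long a segment can be. The sketch below rests on two assumptions: it reads "6 MB" conservatively as 6,000,000 bytes, and it allows 44 bytes for a minimal WAV header. The helper names are illustrative.

```python
import os

# Assumption: "6 MB" is read conservatively as 6,000,000 bytes.
INVOKE_ENDPOINT_LIMIT_BYTES = 6 * 1000 * 1000


def fits_in_single_request(path: str,
                           limit: int = INVOKE_ENDPOINT_LIMIT_BYTES) -> bool:
    """True when the whole file is within the request body limit."""
    return os.path.getsize(path) <= limit


def max_segment_seconds(sample_rate_hz: int, channels: int = 1,
                        sample_width_bytes: int = 2,
                        limit: int = INVOKE_ENDPOINT_LIMIT_BYTES,
                        header_bytes: int = 44) -> float:
    """Longest PCM segment that fits under the limit.

    header_bytes=44 assumes a minimal WAV header; real files may carry more.
    """
    bytes_per_second = sample_rate_hz * channels * sample_width_bytes
    return (limit - header_bytes) / bytes_per_second


print(round(max_segment_seconds(16000), 1))
```

At 16 kHz mono 16-bit (the format produced by the ffmpeg conversion command above), audio consumes 32,000 bytes per second, so a segment of roughly three minutes is the ceiling and the 60-second segments from the ffmpeg split leave ample headroom.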
Basic usage (single request):
uv run stt_wav_stress.py batch your-endpoint-name --file audio.wav

Send 10 concurrent requests (load testing):
uv run stt_wav_stress.py batch your-endpoint-name --file audio.wav --concurrency 10

Send 100 total requests, 10 at a time:
uv run stt_wav_stress.py batch your-endpoint-name --file audio.wav \
--concurrency 10 --requests 100

With a different model and language:
uv run stt_wav_stress.py batch your-endpoint-name --file audio.wav \
--model nova-2 --language es

With keyterms (nova-3):
uv run stt_wav_stress.py batch your-endpoint-name --file audio.wav \
--keyterms "Deepgram,SageMaker"

With speaker diarization and PII redaction:
uv run stt_wav_stress.py batch your-endpoint-name --file audio.wav \
--diarize true --redact "pii,ssn,email_address"

Full example with all batch options:
uv run stt_wav_stress.py batch your-endpoint-name --file audio.wav \
--concurrency 5 \
--requests 50 \
--model nova-3 \
--language en \
--diarize true \
--punctuate true \
--keyterms "Deepgram,SageMaker" \
--redact "pii,ssn" \
--region us-east-2 \
--log-level DEBUG

| Option | Description | Default |
|---|---|---|
| --file WAV_FILE | Path to a 16-bit PCM WAV file, max 6 MB (required) | none |
| --concurrency N | Number of requests to run in parallel | 1 |
| --requests N | Total number of requests to send | same as --concurrency |
| --model MODEL | Deepgram model | nova-3 |
| --language LANG | Language code | en |
| --diarize true\|false | Enable speaker diarization | false |
| --punctuate true\|false | Enable punctuation | true |
| --keyterms TERM,... | Comma-separated keyterms to boost recognition (nova-3) | none |
| --redact ENTITY,... | Comma-separated entity types to redact, e.g. pii,ssn,email_address | none |
| --region REGION | AWS region | us-east-2 |
| --log-level LEVEL | DEBUG, INFO, WARNING, ERROR, CRITICAL | INFO |