Open
131 commits
69281f3
support timestamps for numbers.
Beronx86 Jan 9, 2025
65b2332
make align a bit faster.
Beronx86 Jan 9, 2025
4ebfb07
make no beam consistent with backtrack.
Beronx86 Jan 9, 2025
22a93f2
Merge branch 'main' into main
bfs18 Jan 13, 2025
289eadf
fix a merge error.
Beronx86 Jan 13, 2025
ffbc736
change the docstrings and comments to English
Beronx86 Jan 13, 2025
12604a4
Merge pull request #986 from bfs18/main
m-bain Jan 14, 2025
235536e
Update links to language models in README
MJochim Mar 25, 2024
70c639c
doc: refer to DEFAULT_ALIGN_MODELS_HF for other langs
Barabazs Jan 17, 2025
86e2b3e
chore: remove deprecated VAD_SEGMENTATION_URL
Barabazs Jan 17, 2025
355f8e0
Merge pull request #1003 from Barabazs/chore/remove-aws-url
m-bain Jan 17, 2025
de0d8fe
chore: handle empty segments_list case in silero
tan90xx Jan 19, 2025
2117909
Merge pull request #1005 from tan90xx/main
m-bain Jan 19, 2025
fca563a
Update silero.py
tan90xx Jan 20, 2025
acbeba6
Update silero.py
tan90xx Jan 20, 2025
8bfa121
Merge pull request #1006 from tan90xx/main
m-bain Jan 20, 2025
36d2622
feat: add Latvian align model
slikts Jan 24, 2025
7b3c9ce
Add models_cache_only param
philmcmahon Jan 27, 2025
44e8bf5
Merge pull request #1024 from philmcmahon/local-files-only-param
m-bain Jan 27, 2025
272714e
feat: use uv for building package
Barabazs Jan 16, 2025
63bc190
feat: update Python compatibility workflow to use uv
Barabazs Jan 16, 2025
b41ebd4
chore: add numpy to deps
Barabazs Jan 16, 2025
90256cc
feat: use uv recommended setup
Barabazs Jan 16, 2025
7489ebf
feat: update build and release workflow to use uv for package install…
Barabazs Jan 16, 2025
d2f0e53
chore: remove tmp workflow
Barabazs Jan 17, 2025
f8d11df
docs: Update README example commands with generic audio path
Barabazs Feb 19, 2025
4db8390
feat: add Tagalog (tl - Filipino) Phoneme-based ASR Model (#1067)
mtfugin Feb 23, 2025
0d9807a
feat: add Basque alignment model (#1074)
xezpeleta Mar 4, 2025
8c58c54
Revert "feat: add Basque alignment model (#1074)" (#1077)
Barabazs Mar 5, 2025
3205436
Merge pull request #1002 from Barabazs/feat/uv
m-bain Mar 23, 2025
8e53866
feat: pass hotwords argument to get_prompt (#1073)
jademlc Mar 24, 2025
e7712f4
refactor: update import statements to use explicit module paths acros…
Barabazs Mar 25, 2025
a7564c2
docs: update installation instructions
Barabazs Mar 25, 2025
f10dbf6
fix: update setuptools configuration to include package discovery for…
Barabazs Mar 25, 2025
0aed874
Remove duplicated item
yccheok Apr 11, 2025
cd59f21
fix: downgrade ctranslate2 dependency version
Barabazs May 1, 2025
ac0c8bd
feat: add version and Python version arguments to CLI
Barabazs May 1, 2025
f5b40b5
chore: update version to 3.3.3 in pyproject.toml and uv.lock
Barabazs May 1, 2025
d2a493e
refactor: implement lazy loading for module imports in whisperx
Barabazs May 1, 2025
7d36b83
refactor: update CLI entry point
Barabazs May 1, 2025
36d552c
fix: remove DiarizationPipeline from public API
Barabazs May 2, 2025
b2d50a0
chore: bump version
Barabazs May 3, 2025
108bd0c
chore: add lockfile check step to CI workflows
Barabazs May 3, 2025
5012650
chore: update lockfile
Barabazs May 3, 2025
6fe0a87
docs: add troubleshooting section for libcudnn dependencies in README
Barabazs May 31, 2025
b343241
feat: add diarize_model arg to CLI (#1101)
bgdnvk May 31, 2025
d700b56
docs: add missing torch import to Python usage example in README
hammerill Jun 7, 2025
1631c30
feat: enhance diarization with optional output of speaker embeddings
eek Mar 21, 2025
220fec9
refactor: update type hints in diarization module (PEP 585)
eek Apr 3, 2025
844736e
style: minor code formatting
Barabazs Jun 24, 2025
b93e9b6
chore: bump version to 3.4.0
Barabazs Jun 24, 2025
ffedc5c
fix: speaker embedding bug (#1178)
Barabazs Jun 25, 2025
e0833da
Fix: Ensure integer tensor indexing in get_wildcard_emission()
HowardWhile May 15, 2025
429658d
chore: bump version to 3.4.2
Barabazs Jun 27, 2025
f4261f3
Remove unused code in Vad class
3manifold Mar 7, 2025
2d9ce44
fix(asr): load VAD model on correct CUDA device (#835)
duj12 Jul 2, 2025
83afb81
fix: restrict pyannote-audio version to avoid compatibility issues (#…
Barabazs Oct 1, 2025
c7d3188
Add jr, sr, and ph.d to punkt abbreviations
alexcannan Feb 18, 2025
ed13dc8
recall.ai sponsor
m-bain Oct 2, 2025
bf150e4
feat: update Punkt tokenizer to use pre-trained model and handle miss…
Barabazs Oct 2, 2025
b1c8ac7
Change alignment model for Vietnamese language
nguyenvulebinh Apr 11, 2024
95fecb9
build: upgrade PyTorch to 2.7.1 with CUDA 12.8 and multi-platform sup…
jim60105 Oct 8, 2025
c266ac5
chore: update version to 3.5.0
Barabazs Oct 8, 2025
2663f2e
doc: fix diarize import in example script (#1192)
awan1 Oct 9, 2025
64e307c
chore: remove redundant variable & improve load_model function docume…
3manifold Oct 9, 2025
027ec57
doc: update cpu only example (#1164)
felagund Oct 9, 2025
3b1b9a8
refactor: rename types.py to schema.py to avoid stdlib conflict
Barabazs Oct 9, 2025
a51ae7a
feat: add centralized logging to replace ad-hoc print statements (#1254)
Barabazs Oct 10, 2025
c1c08c4
bump: update version to 3.6.0
Barabazs Oct 10, 2025
d13171c
feat: add support for python 3.13 (#1256)
Barabazs Oct 10, 2025
a58ff9c
bump: update version to 3.7.0
Barabazs Oct 10, 2025
895e5a8
chore: update numpy dependency constraints for Python 3.13 compatibil…
Barabazs Oct 12, 2025
505bd9c
chore: refine triton dependency to restrict installation to x86_64 L…
Barabazs Oct 12, 2025
0fa81b3
feat: add Swedish alignment model (#1110)
Npahlfer Oct 15, 2025
92227e7
fix: lock down torch and torchaudio versions (#1265)
Barabazs Oct 16, 2025
617835d
chore: upgrade torch and torchaudio dependencies to 2.8.0
Barabazs Oct 16, 2025
5925e5f
docs: add cuDNN troubleshooting for common issues (#1266)
Barabazs Oct 16, 2025
c8f7597
feat: add hotwords argument to CLI for improved recognition of rare t…
Barabazs Oct 17, 2025
6e1d1ca
fix: incorrect type annotation in get_writer return value
JulianFP May 13, 2025
db317c3
feat: add language-aware sentence tokenization (#1269)
pplkit Oct 21, 2025
d32ec3e
fix: add missing comma
Barabazs Oct 21, 2025
9de90e2
fix: pin huggingface-hub<1.0.0 for pyannote-audio compatibility (#1327)
Barabazs Jan 27, 2026
6ec4a02
chore: drop python 3.9 support (#1328)
Barabazs Jan 27, 2026
7892a72
Optimize assign_word_speakers with interval tree for 228x speedup
Mr-Neutr0n Feb 7, 2026
66ada29
Merge pull request #1338 from Mr-Neutr0n/perf/interval-tree-speaker-a…
m-bain Feb 8, 2026
2b5ae8c
[BugFix] Type hint fix in decode_batch List[str] not str:
1carlito Feb 9, 2026
570b08b
fix: add no_repeat_ngram_size and repetition_penalty options to Whisp…
Feb 10, 2026
863a986
Merge pull request #1340 from RickSanchez93/main
m-bain Feb 10, 2026
cc6f627
[BugFix] The variable I removed was not being used anyhwere.
1carlito Feb 10, 2026
f2d853a
Merge pull request #1342 from 1carlito/bugs
1carlito Feb 10, 2026
741ab9a
Merge pull request #1343 from m-bain/fix-type-hint-decode-batch
1carlito Feb 10, 2026
c4c1242
fix: derive SRT/VTT cue times from word-level timestamps (#1347)
Barabazs Feb 13, 2026
6187d25
feat: migrate to pyannote-audio v4 with speaker-diarization-community…
Barabazs Feb 13, 2026
9d687e0
fix: propagate --model_dir and --model_cache_only to all model loadin…
MrPrayer Feb 14, 2026
1baf8d2
feat: pass --hf_token to WhisperModel for gated model support
Barabazs Feb 14, 2026
42beab1
chore: bump version to 3.8.1
Barabazs Feb 14, 2026
1430e43
[fix] Batch context is updated each time.
1carlito Feb 14, 2026
e33bb1e
Although the existing commit worked, inital prompt was in the loop no…
1carlito Feb 14, 2026
de5fa65
[modification]
1carlito Feb 14, 2026
1b6a3b7
[feat] First batch wrap around
1carlito Feb 17, 2026
422c44f
Merge pull request #1355 from 1carlito/batch_wrap
1carlito Feb 17, 2026
0e073d4
Revert "Batch wrap"
1carlito Feb 17, 2026
4017efc
Merge pull request #1356 from m-bain/revert-1355-batch_wrap
1carlito Feb 17, 2026
d8a078e
feat: expose avg_logprob per segment from ctranslate2 beam search
claude Feb 14, 2026
064f737
fix: default compute_type to float32 on CPU to avoid float16 ValueError
Feb 22, 2026
f2609a6
fix: revert #986 wildcard alignment that broke word-level timestamps
claude Mar 10, 2026
636f298
fix: use blank_id parameter instead of hardcoded 0 in trellis and bac…
claude Mar 10, 2026
6d3edb1
chore: bump version
Barabazs Mar 10, 2026
d00ec69
feat: add progress_callback to transcribe, align, and diarize
claude Mar 11, 2026
646f511
fix: remove dead model_bytes read that leaked file handle
claude Mar 17, 2026
39aa9f5
fix: restore word-level timestamps for unalignable characters (#1372)
claude Mar 15, 2026
da072d6
test: add regression test for #1372 (digits+comma get no timestamps)
claude Mar 15, 2026
f9a3f8f
ci: add pytest dev dependency and test workflow
claude Mar 15, 2026
94f60aa
chore: bump version to 3.8.3
Barabazs Mar 25, 2026
8efddaa
fix: require faster-whisper>=1.2.0 for use_auth_token support (#1385)
claude Mar 25, 2026
095b36b
chore: bump version to 3.8.4
Barabazs Mar 25, 2026
5efa859
fix: add torchvision~=0.23.0 dep to prevent torch version mismatch
claude Apr 1, 2026
03f5017
fix: cap torchcodec <0.8.0 for torch 2.8 compatibility
claude Apr 1, 2026
4a6477e
chore: bump version to 3.8.5
claude Apr 1, 2026
1c4b23e
feat: add Indonesian language model to alignment (#1400)
aziib Apr 4, 2026
9ccb281
fix: handle 'ignore' interpolation method in interpolate_nans (#1368)
claude May 25, 2026
2b52366
build(deps): bump nltk from 3.9.2 to 3.9.4
dependabot[bot] May 25, 2026
59720ca
ci: add zizmor workflow and harden existing workflows
claude May 25, 2026
1e41377
ci: align actions/checkout to v6.0.2 across all workflows
claude May 25, 2026
e6ad0ca
ci: replace softprops/action-gh-release with gh CLI
claude May 25, 2026
a75cd50
chore(deps): update exclude-newer settings
Barabazs May 25, 2026
4cbcd1a
chore(deps): update uv version to 0.11.6 in workflows
Barabazs May 25, 2026
5d8c276
fix: correct syntax error in uv version specification
Barabazs May 25, 2026
3ccc17b
chore: bump whisperx to 3.8.6
Barabazs May 25, 2026
11bc7de
Fix Windows CUDA detection: include AMD64 in platform markers (#1357)
deekshaNVIDIA May 15, 2026
5f2f9d4
chore: regenerate uv.lock for AMD64 markers
Barabazs Jun 3, 2026
39 changes: 23 additions & 16 deletions .github/workflows/build-and-release.yml
@@ -4,32 +4,39 @@ on:
release:
types: [published]

permissions: {}

jobs:
build:
runs-on: ubuntu-latest
permissions:
contents: write
steps:
- name: Checkout
uses: actions/checkout@v4
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false

- name: Set up Python
uses: actions/setup-python@v5
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5.4.2
with:
python-version: "3.9"
version: "0.11.6"
python-version: "3.10"
enable-cache: false

- name: Install dependencies
run: |
python -m pip install build
- name: Check if lockfile is up to date
run: uv lock --check

- name: Build wheels
run: python -m build --wheel
- name: Build package
run: uv build

- name: Release to Github
uses: softprops/action-gh-release@v2
with:
files: dist/*
run: gh release upload "$RELEASE_TAG" dist/*.whl
env:
GH_TOKEN: ${{ github.token }}
RELEASE_TAG: ${{ github.event.release.tag_name }}

- name: Publish package to PyPi
uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}
run: uv publish
env:
UV_PUBLISH_TOKEN: ${{ secrets.PYPI_API_TOKEN }}
28 changes: 18 additions & 10 deletions .github/workflows/python-compatibility.yml
@@ -5,28 +5,36 @@ on:
branches: [main]
pull_request:
branches: [main]
workflow_dispatch: # Allows manual triggering from GitHub UI
workflow_dispatch: # Allows manual triggering from GitHub UI

permissions: {}

jobs:
test:
runs-on: ubuntu-latest
permissions:
contents: read
strategy:
matrix:
python-version: ["3.9", "3.10", "3.11", "3.12"]
python-version: ["3.10", "3.11", "3.12", "3.13"]

steps:
- uses: actions/checkout@v4
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5.4.2
with:
version: "0.11.6"
python-version: ${{ matrix.python-version }}

- name: Install package
run: |
python -m pip install --upgrade pip
pip install .
- name: Check if lockfile is up to date
run: uv lock --check

- name: Install the project
run: uv sync --all-extras

- name: Test import
run: |
python -c "import whisperx; print('Successfully imported whisperx')"
uv run python -c "import whisperx; print('Successfully imported whisperx')"
36 changes: 36 additions & 0 deletions .github/workflows/tests.yml
@@ -0,0 +1,36 @@
name: Tests

on:
push:
branches: [main]
pull_request:
branches: [main]
workflow_dispatch:

permissions: {}

jobs:
test:
runs-on: ubuntu-latest
permissions:
contents: read
strategy:
matrix:
python-version: ["3.10", "3.11", "3.12", "3.13"]

steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false

- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5.4.2
with:
version: "0.11.6"
python-version: ${{ matrix.python-version }}

- name: Install the project
run: uv sync --all-extras

- name: Run tests
run: uv run pytest tests/ -v
35 changes: 0 additions & 35 deletions .github/workflows/tmp.yml

This file was deleted.

26 changes: 26 additions & 0 deletions .github/workflows/zizmor.yml
@@ -0,0 +1,26 @@
name: GitHub Actions Security Analysis with zizmor

on:
push:
branches: [main]
pull_request:
branches: ["**"]

permissions: {}

jobs:
zizmor:
name: Run zizmor
runs-on: ubuntu-latest
permissions:
security-events: write
contents: read
actions: read
steps:
- name: Checkout repository
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false

- name: Run zizmor
uses: zizmorcore/zizmor-action@5f14fd08f7cf1cb1609c1e344975f152c7ee938d # v0.5.6
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.10
76 changes: 76 additions & 0 deletions CUDNN_TROUBLESHOOTING.md
@@ -0,0 +1,76 @@
# Troubleshooting cuDNN Loading Errors

This guide helps resolve common cuDNN-related errors when running WhisperX on GPU. These issues typically occur when the system can't locate cuDNN libraries or finds conflicting versions.

## Unable to Load cuDNN Libraries

You may see the following error when running WhisperX:

`Unable to load any of {libcudnn_cnn.so.9.1.0, libcudnn_cnn.so.9.1, libcudnn_cnn.so.9, libcudnn_cnn.so}`

This means the cuDNN libraries are installed (via whisperx dependencies) but aren't in a location where the system's dynamic linker can find them.

### Solution 1: Add to LD_LIBRARY_PATH (Recommended)

Add this at the start of your Python script or notebook:

```python
import os

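# Get current LD_LIBRARY_PATH
original = os.environ.get("LD_LIBRARY_PATH", "")

cudnn_path = "/usr/local/lib/python3.12/dist-packages/nvidia/cudnn/lib/"
# Append the cuDNN directory; skip the separator when the variable is empty,
# because an empty entry makes the linker search the current directory
os.environ["LD_LIBRARY_PATH"] = f"{original}:{cudnn_path}" if original else cudnn_path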
```

**Note:** Adjust the Python version (`python3.12`) to match your environment.
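
Rather than hardcoding the Python version, the directory can be looked up at runtime. The following is a minimal sketch, assuming the cuDNN wheel was installed by pip into one of the interpreter's site-packages directories under `nvidia/cudnn/lib`; `find_cudnn_lib_dir` is a hypothetical helper name and is not part of WhisperX.

```python
import site
from pathlib import Path


def find_cudnn_lib_dir(roots=None):
    """Return the first <root>/nvidia/cudnn/lib directory that exists, else None.

    Hypothetical helper: the nvidia/cudnn/lib layout is an assumption based on
    the path used in this guide.
    """
    if roots is None:
        # Directories where pip normally installs packages for this interpreter
        roots = [*site.getsitepackages(), site.getusersitepackages()]
    for root in roots:
        candidate = Path(root) / "nvidia" / "cudnn" / "lib"
        if candidate.is_dir():
            return candidate
    return None
```

If the function returns a path, it can be used in place of the hardcoded `cudnn_path` above; a return value of `None` suggests the libraries live somewhere else and need to be located manually.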

### Solution 2: Symlink to LD_LIBRARY_PATH Directory

If Solution 1 didn't work and you still get the "unable to load" error, symlink the libraries to a directory that's already in your `LD_LIBRARY_PATH`:

1. Check what's in your `LD_LIBRARY_PATH`: `echo "$LD_LIBRARY_PATH"`
2. Assuming only one path is set, symlink the downloaded libcudnn files into that path:
`ln -s /usr/local/lib/python3.12/dist-packages/nvidia/cudnn/lib/libcudnn* "$LD_LIBRARY_PATH"/`

**Note:** If `LD_LIBRARY_PATH` contains multiple paths (separated by `:`), pick one directory and use it instead of `"$LD_LIBRARY_PATH"`. For example: `/usr/lib/x86_64-linux-gnu/`
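
The symlink step can also be done from Python, which avoids shell globbing and makes it easy to skip files that are already present. This is only a sketch: `symlink_cudnn_libs` is a hypothetical helper, and the source and destination directories are whatever applies to your system.

```python
from pathlib import Path


def symlink_cudnn_libs(src_dir, dest_dir):
    """Symlink every libcudnn* file from src_dir into dest_dir.

    Hypothetical helper; returns the list of links that were created.
    """
    created = []
    for lib in sorted(Path(src_dir).glob("libcudnn*")):
        link = Path(dest_dir) / lib.name
        # Leave existing files or links alone instead of overwriting them
        if link.exists() or link.is_symlink():
            continue
        link.symlink_to(lib.resolve())
        created.append(link)
    return created
```

For example, `symlink_cudnn_libs("/usr/local/lib/python3.12/dist-packages/nvidia/cudnn/lib/", "/usr/lib/x86_64-linux-gnu/")` mirrors the `ln -s` command above. Writing into a system directory usually requires elevated permissions.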

## cuDNN Version Incompatibility

You may also see this error:

```
RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (9, 10, 2) but found runtime version (9, 2, 1)
```

This means PyTorch is finding a different cuDNN version than the one it was compiled with. **PyTorch comes bundled with its own cuDNN**, but a conflicting cuDNN in `LD_LIBRARY_PATH` is taking precedence.

### Solution: Remove Conflicting cuDNN from Path

Check if there's a conflicting cuDNN path:

```bash
echo $LD_LIBRARY_PATH
```
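
When the path has many entries, it can help to list only the ones that actually contain cuDNN libraries. A minimal sketch, assuming a Linux-style `:`-separated `LD_LIBRARY_PATH`; `cudnn_dirs_on_library_path` is a hypothetical helper name.

```python
import os
from pathlib import Path


def cudnn_dirs_on_library_path(value=None):
    """Return the LD_LIBRARY_PATH entries that contain at least one libcudnn* file."""
    if value is None:
        value = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in value.split(":"):
        # Skip empty entries and directories that do not exist
        if entry and Path(entry).is_dir() and any(Path(entry).glob("libcudnn*")):
            hits.append(entry)
    return hits
```

Any directory it reports other than the one bundled with your Python environment is a candidate for the conflicting installation.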

If you see paths pointing to older cuDNN installations (e.g., system-installed cuDNN or manually downloaded), try one of these:

**Option 1: Clear LD_LIBRARY_PATH temporarily**

```python
import os
# Let PyTorch use its bundled cuDNN
os.environ.pop('LD_LIBRARY_PATH', None)
```

**Option 2: Set LD_LIBRARY_PATH to only the correct version**

```python
import os
# Point only to the cuDNN that matches PyTorch's compiled version
os.environ['LD_LIBRARY_PATH'] = "/usr/local/lib/python3.12/dist-packages/nvidia/cudnn/lib/"
```

**Note:** This error is unlikely on a clean install. If it occurs anyway, [open an issue](https://github.com/m-bain/whisperX/issues). If you've modified system libraries or CUDA/cuDNN, the options above should help resolve most cases.