Audio file (.wav format).
Example
input: data/demo.wav
(Wav file from https://github.com/pyannote/pyannote-audio/tree/develop/pyannote/audio/sample)
[ 00:00:06.714 --> 00:00:07.003] A speaker91
[ 00:00:07.003 --> 00:00:07.173] B speaker90
[ 00:00:07.580 --> 00:00:08.310] C speaker91
[ 00:00:08.310 --> 00:00:09.923] D speaker90
[ 00:00:09.923 --> 00:00:10.976] E speaker91
[ 00:00:10.466 --> 00:00:14.745] F speaker90
[ 00:00:14.303 --> 00:00:17.886] G speaker91
[ 00:00:18.022 --> 00:00:21.502] H speaker90
[ 00:00:18.157 --> 00:00:18.446] I speaker91
[ 00:00:21.774 --> 00:00:28.531] J speaker91
[ 00:00:27.886 --> 00:00:29.991] K speaker90
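The diarization output above pairs a time span with a speaker label. As a minimal sketch (not part of the script itself), the bracketed lines can be parsed into `(start_seconds, end_seconds, speaker)` tuples like this:

```python
import re

# Matches lines of the form "[ 00:00:06.714 --> 00:00:07.003] A speaker91".
LINE_RE = re.compile(
    r"\[\s*(\d+):(\d+):(\d+\.\d+)\s*-->\s*(\d+):(\d+):(\d+\.\d+)\]\s*\S+\s+(\S+)"
)

def to_seconds(h, m, s):
    # Convert an hh:mm:ss.sss triple into seconds.
    return int(h) * 3600 + int(m) * 60 + float(s)

def parse_diarization(lines):
    # Collect (start, end, speaker) tuples from diarization output lines.
    segments = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            start = to_seconds(*match.groups()[:3])
            end = to_seconds(*match.groups()[3:6])
            segments.append((start, end, match.group(7)))
    return segments

segs = parse_diarization(["[ 00:00:06.714 --> 00:00:07.003] A speaker91"])
print(segs)  # [(6.714, 7.003, 'speaker91')]
```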
This model requires additional modules.
$ pip3 install -r requirements.txt
The onnx and prototxt files are downloaded automatically on the first run. An Internet connection is required during the download.
For the sample
$ python pyannote-audio.py -i ./data/sample.wav
For the sample with plot
$ python pyannote-audio.py -i ./data/sample.wav --plt
For the sample with verification
$ python pyannote-audio.py -i ./data/sample.wav -g ./data/sample.rttm
If you want to specify the audio file, put the file path after the -i or --input option.
$ python pyannote-audio.py -i FILE_PATH
If you want to specify the ground truth file, put the file path after the -ig or --input_ground option.
$ python pyannote-audio.py -ig FILE_PATH
If you want to specify the output file, put the file path after the -o or --output option.
$ python pyannote-audio.py -o FILE_PATH
If you want to specify the output ground truth file, put the file path after the -og or --output_ground option.
$ python pyannote-audio.py -og FILE_PATH
If you know the number of speakers, put the number after the --num or --num_speaker option.
$ python pyannote-audio.py --num 2
If you know the maximum number of speakers, put the number after the --max or --max_speaker option.
$ python pyannote-audio.py --max 4
If you know the minimum number of speakers, put the number after the --min or --min_speaker option.
$ python pyannote-audio.py --min 2
By giving the -e or --error option, you can get the diarization error rate.
$ python pyannote-audio.py -e
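For intuition, here is a deliberately simplified, frame-based illustration of a diarization error rate. It assumes speaker labels are already aligned between reference and hypothesis (a real DER computation, as in pyannote.metrics, also finds the optimal speaker mapping and handles overlap); it is not the metric implementation used by the script.

```python
def frame_labels(segments, duration, step=0.01):
    # Rasterize (start, end, speaker) segments onto 10 ms frames.
    n = int(round(duration / step))
    labels = [None] * n
    for start, end, spk in segments:
        for i in range(int(round(start / step)), min(int(round(end / step)), n)):
            labels[i] = spk
    return labels

def simple_der(reference, hypothesis, duration, step=0.01):
    # Error frames: wrong speaker on a scored frame, or speech
    # hypothesized where the reference has silence (false alarm).
    ref = frame_labels(reference, duration, step)
    hyp = frame_labels(hypothesis, duration, step)
    scored = sum(1 for r in ref if r is not None)
    errors = sum(1 for r, h in zip(ref, hyp) if r is not None and r != h)
    errors += sum(1 for r, h in zip(ref, hyp) if r is None and h is not None)
    return errors / scored if scored else 0.0

ref = [(0.0, 5.0, "A"), (5.0, 10.0, "B")]
hyp = [(0.0, 6.0, "A"), (6.0, 10.0, "B")]
print(round(simple_der(ref, hyp, 10.0), 2))  # 0.1
```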
By giving the --plt option, you can visualize the results.
$ python pyannote-audio.py --plt
By giving the --use_onnx option, you can run inference with ONNX Runtime.
$ python pyannote-audio.py --use_onnx
By giving the --embed option, you can get the embedding vectors for the input file.
$ python pyannote-audio.py --embed
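Speaker embeddings like these are typically compared with cosine similarity for verification. A minimal sketch (the vectors and the 0.7 threshold below are toy illustrations, not values produced or used by the script):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

emb1 = [0.10, 0.90, 0.20]  # hypothetical embedding of utterance 1
emb2 = [0.12, 0.88, 0.19]  # hypothetical embedding of utterance 2
sim = cosine_similarity(emb1, emb2)
print(sim > 0.7)  # high similarity suggests the same speaker
```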
- Pyannote-audio
- Hugging Face - pyannote in speaker-diarization
- Hugging Face - hdbrain in wespeaker-voxceleb-resnet34-LM
- KaldiFeat
PyTorch
ONNX opset=14,17