Open
Labels: bug (Something isn't working)
Description
Thanks @andrusenkoau for the recent phrase boosting features in ASR. I've been trying them for a while but was only able to boost phrases with AED decoding (Canary-1b). It does not really work for Parakeet v2, or I am missing something: WER gets worse as boosting_tree_alpha increases.
os.environ["MODEL_NAME"] = "/home/jovyan/.cache/huggingface/hub/models--nvidia--parakeet-tdt-0.6b-v2/snapshots/f1cd6697a2ec38060af8a36ce206a4ad9c4467bc/parakeet-tdt-0.6b-v2.nemo"
os.environ["BATCH_SIZE"] = "1"
os.environ["BEAM_SIZE"] = "5"
os.environ["KEY_WORDS_LIST"] = cb_list_file
os.environ["HYDRA_FULL_ERROR"] = "1"
# Alternative strategies: greedy, greedy_batch
# Optional override, left out of this run: rnnt_decoding.beam.beam_size=${BEAM_SIZE}
!python examples/asr/speech_to_text_eval.py \
model_path=${MODEL_NAME} \
dataset_manifest=./context_biasing_data/gtc_data_subset_10f.json \
batch_size=${BATCH_SIZE} \
output_filename=boosted_output.txt \
text_processing.do_lowercase=true \
text_processing.rm_punctuation=true \
rnnt_decoding.strategy="malsd_batch" \
rnnt_decoding.beam.boosting_tree.key_phrases_file=${KEY_WORDS_LIST} \
rnnt_decoding.beam.boosting_tree.context_score=1.0 \
rnnt_decoding.beam.boosting_tree.depth_scaling=2.0 \
rnnt_decoding.beam.boosting_tree_alpha=0.6
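For anyone reproducing the sweep over boosting_tree_alpha, here is a minimal sketch that builds one command line per alpha value. It only reuses the overrides from the command above. The helper name `build_eval_args`, the placeholder paths, and the per-alpha output filename are assumptions made for illustration, not part of the original run.

```python
import shlex


def build_eval_args(model_path, manifest, key_phrases_file, alpha,
                    strategy="malsd_batch", batch_size=1):
    """Return the argument list for one speech_to_text_eval.py run.

    Hypothetical helper: it only assembles the overrides quoted in this issue.
    """
    return [
        "python", "examples/asr/speech_to_text_eval.py",
        f"model_path={model_path}",
        f"dataset_manifest={manifest}",
        f"batch_size={batch_size}",
        # Assumption: one output file per alpha so results are not overwritten.
        f"output_filename=boosted_output_alpha_{alpha}.txt",
        "text_processing.do_lowercase=true",
        "text_processing.rm_punctuation=true",
        f"rnnt_decoding.strategy={strategy}",
        f"rnnt_decoding.beam.boosting_tree.key_phrases_file={key_phrases_file}",
        "rnnt_decoding.beam.boosting_tree.context_score=1.0",
        "rnnt_decoding.beam.boosting_tree.depth_scaling=2.0",
        f"rnnt_decoding.beam.boosting_tree_alpha={alpha}",
    ]


# Alpha values taken from the results reported in this issue.
ALPHAS = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1.0, 2.0]

if __name__ == "__main__":
    for alpha in ALPHAS:
        args = build_eval_args(
            model_path="parakeet-tdt-0.6b-v2.nemo",  # placeholder path
            manifest="./context_biasing_data/gtc_data_subset_10f.json",
            key_phrases_file="boost_keywords.txt",
            alpha=alpha,
        )
        # Print the command; pass `args` to subprocess.run(args, check=True)
        # from inside a NeMo checkout to actually execute it.
        print(shlex.join(args))
```

Passing the arguments as a list avoids the shell line-continuation pitfalls that inline comments can cause in a multi-line command.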
Phrase-boosting test data: https://asr-tutorial-data.s3.eu-north-1.amazonaws.com/context_biasing_data.gz from ASR_Context_Biasing.ipynb. By the way, it would be great to have a tutorial for this feature like the current ASR_Context_Biasing.ipynb.
boost_keywords.txt:
gpu
nvidia
nvidia's
nvlink
omniverse
cunumeric
numpy
dgx
dgxs
dlss
cpu
tsmc
culitho
xlabs
tensorrt
tensorflow
pytorch
aws
chatgpt
pcie
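The file above lists one phrase per line. A small sketch for generating such a file from a Python list follows. The helper name `write_key_phrases` is hypothetical, and the lowercasing and de-duplication steps are assumptions chosen to match `text_processing.do_lowercase=true`, not a documented requirement of the boosting tree.

```python
from pathlib import Path


def write_key_phrases(phrases, path):
    """Write phrases one per line; return the cleaned list that was written."""
    seen = set()
    cleaned = []
    for phrase in phrases:
        # Assumption: normalize to lowercase and drop blanks and duplicates.
        normalized = phrase.strip().lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


if __name__ == "__main__":
    written = write_key_phrases(["GPU", "nvidia", "nvidia's", "NVLink", "gpu"],
                                "boost_keywords.txt")
    print(written)
```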
WER and CER results:
| boosting_tree_alpha | Dataset WER (%) | Dataset CER (%) |
|---|---|---|
| 0.0 | 12.06 | 4.96 |
| 0.1 | 13.07 | 5.05 |
| 0.2 | 13.07 | 5.05 |
| 0.3 | 12.06 | 4.96 |
| 0.4 | 12.06 | 4.96 |
| 0.5 | 13.07 | 6.03 |
| 0.6 | 13.07 | 6.03 |
| 0.7 | 14.07 | 6.12 |
| 0.8 | 14.07 | 6.12 |
| 1.0 | 14.07 | 6.38 |
| 2.0 | 22.61 | 14.98 |
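For reference when reading these numbers, a minimal sketch of corpus-level WER/CER as total edit distance divided by total reference length. This is the generic textbook definition; the evaluation script may normalize text differently, and the example sentences are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_error_rate(refs, hyps, unit="word"):
    """Total edits over total reference units; unit is 'word' or 'char'."""
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        r = ref.split() if unit == "word" else list(ref)
        h = hyp.split() if unit == "word" else list(hyp)
        errors += edit_distance(r, h)
        total += len(r)
    return errors / total


if __name__ == "__main__":
    # Made-up example: a key phrase split into two wrong words.
    refs = ["nvidia announced the dgx"]
    hyps = ["in video announced the dgx"]
    print(corpus_error_rate(refs, hyps, unit="word"))
    print(corpus_error_rate(refs, hyps, unit="char"))
```

On a 10-file subset, a single misrecognized phrase can move the corpus WER noticeably, so comparing the per-utterance hypotheses in the output file alongside the aggregate numbers may help.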
Environment details
If NVIDIA docker image is used you don't need to specify these.
Otherwise, please provide:
- OS version: Linux jupyterserver-6f8d46c9c4-6s6nj 4.18.0-553.40.1.el8_10.x86_64
- PyTorch version: '2.7.1+cu126'
- Python version: 3.11
- GPU: L4
- branch: main
Jacob-Bishop