Speech synthesis (TTS) in low-resource languages by training from scratch with Fastpitch and fine-tuning with HifiGan
-
Updated
Dec 5, 2023 - Python
Speech synthesis (TTS) in low-resource languages by training from scratch with Fastpitch and fine-tuning with HifiGan
Free AI & Community powered Learning Experience
Training NVIDIA NeMo Megatron Large Language Model (LLM) using NeMo Framework on Google Kubernetes Engine
Extractive Question-Answering with BERT on SQuAD v2.0 (Stanford Question Answering Dataset) using NVIDIA PyTorch Lightning
📄 SmartSRT is a command-line tool for generating accurate subtitles with per-word timestamps. It uses WhisperAI for speech transcription, NVIDIA NeMo for diarization, and OpenCV for face recognition. The program is good at creating high accuracy subtitles. 🎧💻⚙️
Post-training quantization on Nvidia Nemo ASR model
Automatic transcriber made with the Nvidia NeMo AI toolkit. Used to transcribe speech to text in real-time from any source. Requires CUDA capable GPU to run on the local machine, if setup using virtual audio cables can transcribe the audio that is being played in real-time without any other requirements.
This bootcamp is designed to give NLP researchers an end-to-end overview on the fundamentals of NVIDIA NeMo framework, complete solution for building large language models. It will also have hands-on exercises complimented by tutorials, code snippets, and presentations to help researchers kick-start with NeMo LLM Service and Guardrails.
LLM tutorial materials include but not limited to NVIDIA NeMo, TensorRT-LLM, Triton Inference Server, and NeMo Guardrails.
Training and Tunning a Text to speech model with Nvidia NeMo and Weights and Biases
The simplest & most comprehensible tutorial on speaker identification with NVIDIA's `Nemo`.
Module for russian speech recognition using NVIDIA Nemo.
Audio profanity detector desktop app developed with PyQt5 using NVidia-Nemo tech.
PodcastProject Analytics Toolkit - Project that creates analytics various input data. Exported data is intended to be used in a PodcastProject website
Implementation of a Kazakh Speech-to-Text Model using the NVIDIA NeMo toolkit for efficient transcription of spoken Kazakh speech into text.
Add a description, image, and links to the nvidia-nemo topic page so that developers can more easily learn about it.
To associate your repository with the nvidia-nemo topic, visit your repo's landing page and select "manage topics."