A web application that provides both Speech-to-Text (STT) and Text-to-Speech (TTS) capabilities.
To run the Speech App locally, please make sure your environment meets the following prerequisites:
- Python 3.9 or lower is required to work with the DeepSpeech model.
- The DeepSpeech model file (deepspeech-0.9.3-models.pbmm) must be downloaded manually.
- A virtual environment is recommended for isolating dependencies.
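The prerequisites above can be checked from a small script before starting the app. The following is a minimal sketch, not part of the project: the function names are hypothetical, it assumes the model file sits in the directory you pass in, and it treats Python 2 as unsupported, which is an assumption rather than something stated above.

```python
import sys
from pathlib import Path

MAX_MINOR = 9  # "Python 3.9 or lower", per the prerequisites
MODEL_FILE = "deepspeech-0.9.3-models.pbmm"


def python_ok(version=None):
    """Return True if (major, minor) is Python 3.x with x <= 9."""
    major, minor = (version or sys.version_info)[:2]
    return major == 3 and minor <= MAX_MINOR


def in_virtualenv():
    """Return True when running inside a venv (prefix differs from base)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def model_present(root="."):
    """Return True if the model file exists directly under `root`."""
    return (Path(root) / MODEL_FILE).is_file()


def check_prerequisites(root="."):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if not python_ok():
        problems.append("Python 3.9 or lower is required for DeepSpeech 0.9.3.")
    if not in_virtualenv():
        problems.append("No virtual environment is active (recommended).")
    if not model_present(root):
        problems.append(f"{MODEL_FILE} was not found in {Path(root).resolve()}.")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Running the script from the project root prints one warning per unmet prerequisite and nothing when everything is in place.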
DeepSpeech 0.9.3 requires Python 3.9 or below. You can use pyenv or another version manager to install Python 3.9.
To install Python 3.9, follow the instructions based on your operating system:
- macOS: Install Python using Homebrew
- Windows: Download Python 3.9 from the official website
- Linux: Use apt or yum to install Python 3.9, depending on your distribution.
# Create a virtual environment
python3.9 -m venv venv
# Activate the virtual environment
# For macOS/Linux:
source venv/bin/activate
# For Windows:
venv\Scripts\activate
- Install Dependencies
Once your virtual environment is activated, install the necessary dependencies:
pip install -r requirements.txt
- Download DeepSpeech Model
  - The DeepSpeech 0.9.3 model (deepspeech-0.9.3-models.pbmm) is required for the speech-to-text functionality and must be downloaded manually.
  - Download the DeepSpeech 0.9.3 model from the official DeepSpeech GitHub repository.
  - After downloading, place the model file (deepspeech-0.9.3-models.pbmm) in the root directory of your project, or specify the path to the model in your code.
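As an illustration of pointing your code at the model file, here is a minimal sketch of loading it with the deepspeech Python package and transcribing a WAV file. The helper names `resolve_model_path` and `transcribe` are hypothetical and not taken from this project. The sketch assumes 16-bit mono audio at the model's sample rate and that numpy is installed alongside deepspeech.

```python
import wave
from pathlib import Path

MODEL_FILE = "deepspeech-0.9.3-models.pbmm"


def resolve_model_path(model_path=None, project_root="."):
    """Use an explicit path if given, otherwise look in the project root."""
    path = Path(model_path) if model_path else Path(project_root) / MODEL_FILE
    if not path.is_file():
        raise FileNotFoundError(f"DeepSpeech model not found at {path}")
    return path


def transcribe(wav_path, model_path=None):
    """Return the transcript of a 16-bit mono WAV file."""
    # Imported lazily so resolve_model_path works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(str(resolve_model_path(model_path)))
    with wave.open(str(wav_path), "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("Expected 16-bit mono audio.")
        if wav.getframerate() != model.sampleRate():
            raise ValueError(f"Expected {model.sampleRate()} Hz audio.")
        frames = wav.readframes(wav.getnframes())
    # DeepSpeech expects a 1-D array of signed 16-bit samples.
    audio = np.frombuffer(frames, dtype=np.int16)
    return model.stt(audio)
```

Calling `transcribe("sample.wav")` looks for the model in the current directory; pass `model_path` if you stored it elsewhere. Audio in another format or sample rate would need converting first, which this sketch does not attempt.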
- Running the Application
Once everything is installed and the model is downloaded, run the Django development server:
python manage.py runserver
The application will be available at http://127.0.0.1:8000/.
- Additional Setup for Text-to-Speech (TTS)
To use the Text-to-Speech functionality, you can use Hugging Face’s microsoft/speecht5_tts model. Make sure to install the transformers package if you haven't already:
pip install transformers
You may also need additional dependencies, such as sentencepiece, which is required by the SpeechT5 tokenizer:
pip install sentencepiece
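For orientation, here is a minimal sketch of generating speech with microsoft/speecht5_tts through transformers. It is not this project's implementation: `synthesize` and `write_wav` are hypothetical helpers, and the sketch assumes PyTorch is installed and that the microsoft/speecht5_hifigan vocoder is used to turn the model output into a waveform. The all-zero speaker embedding is only a placeholder and will likely give poor voice quality; a real 512-dimensional x-vector should be substituted.

```python
import struct
import wave


def write_wav(path, samples, sample_rate=16000):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


def synthesize(text, out_path="speech.wav"):
    """Generate speech for `text` and save it to `out_path`."""
    # Imported lazily: these packages are large and download model weights.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
    model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    inputs = processor(text=text, return_tensors="pt")
    # Placeholder speaker embedding (assumption); replace with a real x-vector.
    speaker_embeddings = torch.zeros((1, 512))
    speech = model.generate_speech(
        inputs["input_ids"], speaker_embeddings, vocoder=vocoder
    )
    write_wav(out_path, speech.tolist())  # SpeechT5 produces 16 kHz audio
    return out_path
```

The first call to `synthesize("Hello from the Speech App")` downloads the model weights, so it needs network access and may take a while.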