This repository provides a Dockerized setup for running ChatTTS, a text-to-speech model designed for dialogue applications, with GPU support. The setup is optimized for ease of use and performance.
- Install Docker on Ubuntu 22.04
  Follow the guide to install Docker: How to Install and Use Docker on Ubuntu 22.04
- Perform Post-Installation Steps for Docker
  Ensure you complete the post-installation steps as outlined here: Post-Installation Steps for Docker on Linux
- Install NVIDIA Container Toolkit (For GPU Users)
  If you're using a GPU, install the NVIDIA Container Toolkit by referring to: NVIDIA Container Toolkit Installation Guide
- Install Docker Compose on Ubuntu 22.04
  Set up Docker Compose using the instructions here: How to Install and Use Docker Compose on Ubuntu 22.04
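Before moving on, it can help to confirm that the tools from the list above are actually on your PATH. The following is a minimal sketch and is not part of this repository: the helper name `check_prerequisites` is hypothetical, and it only checks that the `docker` and `nvidia-smi` executables can be found, not that they are configured correctly.

```python
import shutil


def check_prerequisites(tools=("docker", "nvidia-smi")):
    """Return a dict mapping each tool name to True if it is found on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


if __name__ == "__main__":
    # Print a short report; a missing tool means the matching install
    # step above probably still needs to be completed.
    for tool, found in check_prerequisites().items():
        print(f"{tool}: {'found' if found else 'MISSING'}")
```

Note that `nvidia-smi` ships with the NVIDIA driver, so finding it does not by itself prove that the NVIDIA Container Toolkit is installed. Docker Compose may be available either as the `docker compose` plugin or as a separate `docker-compose` binary, so it is left out of the default check.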
You can pull the pre-built Docker image from Docker Hub:

    docker pull naren200/chatts-dockerfile:v1

- Clone this repository:

      git clone https://github.com/your-username/chatTTS-dockerfile.git
      cd chatTTS-dockerfile

- Start the Docker container:

      ./start_docker.sh

  This script will:
  - Start the container in attached mode.

- Once inside the container, you can run the talk.py script to generate speech:

      python3 talk.py

  The output audio files will be saved as basic_output0.wav, basic_output1.wav, etc., in the working directory.
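To check what a run produced, the output files can be listed in numeric order. This is a minimal sketch under the assumption that the files follow the `basic_output<N>.wav` naming described above; the helper name `list_outputs` is hypothetical and is not part of talk.py.

```python
import re
from pathlib import Path


def list_outputs(directory="."):
    """Return basic_output<N>.wav files in `directory`, sorted by N."""
    pattern = re.compile(r"basic_output(\d+)\.wav")
    found = []
    for path in Path(directory).iterdir():
        match = pattern.fullmatch(path.name)
        if match and path.is_file():
            # Keep the numeric index so 10 sorts after 2, not before it.
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


if __name__ == "__main__":
    for path in list_outputs():
        print(f"{path.name}: {path.stat().st_size} bytes")
```

Sorting by the parsed integer rather than by file name keeps the order stable once more than ten clips have been generated.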
To change the text that is converted to speech, edit the talk.py file:
- Open chatTTS-dockerfile/talk.py in your preferred text editor.
- Modify the inputs_en string with your desired text.
- Save the file. The change is automatically reflected inside the Docker container. Then rerun the script:

      python3 talk.py
To stop and remove the Docker container, run:

    ./stop_docker.sh

If you want to build the Docker image from scratch or modify the setup:
- Edit the Dockerfile to add or remove dependencies.
- Build the Docker image:

      docker build -t naren200/chatts-dockerfile:v1 .

- Update the docker-compose.yml file if needed (e.g., to change volume mounts or environment variables).
- The talk.py script is preconfigured to use GPU acceleration. Ensure your system has a compatible NVIDIA GPU.
- The inputs_en string in talk.py supports special tags like [uv_break] for pauses and [laugh] for laughter. Refer to the ChatTTS documentation for more details.
- If you encounter issues with torchaudio, try switching between the two torchaudio.save lines in talk.py.
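As an illustration of the special tags mentioned in the notes above, the sketch below builds a tagged text from separate sentences. It is an assumption-laden example rather than code from talk.py: the helper name `join_with_pauses` is hypothetical, and whether `inputs_en` in your copy of talk.py is a single string (as described here) should be checked before pasting the result in.

```python
def join_with_pauses(sentences, tag="[uv_break]"):
    """Join sentences into one string, separated by a pause tag."""
    # Drop empty entries and surrounding whitespace so tags are not doubled.
    cleaned = [s.strip() for s in sentences if s.strip()]
    return f" {tag} ".join(cleaned)


if __name__ == "__main__":
    # The [laugh] tag is written inline, as the note above describes.
    inputs_en = join_with_pauses([
        "Hello and welcome.",
        "That was quite funny [laugh].",
    ])
    print(inputs_en)
```

The printed string can then be copied into talk.py as the new `inputs_en` value. How the model renders each tag is determined by ChatTTS itself, so refer to its documentation for the supported set.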
This project is licensed under the MIT License. See the LICENSE file for details.