This project provides tools for generating dubbed audio from subtitles (SRT files) and merging the dubbed audio with the original video. It also includes a script for transcribing video content using OpenAI's Whisper model.
- Text-to-Speech (TTS) Dubbing: Generate dubbed audio from SRT files using a TTS model.
- Video Dubbing: Merge dubbed audio with the original video, applying auto-ducking to reduce the volume of the original audio during dubbed segments.
- Video Transcription: Transcribe video content into SRT format using OpenAI's Whisper model.
To install the necessary Python dependencies, run:
pip install -r requirements.txt
The requirements.txt file includes the following dependencies:
torch>=1.10.0
torchaudio>=0.10.0
TTS>=0.0.13  # Coqui TTS (published on PyPI as "TTS")
openai-whisper>=1.0.0
toml>=0.10.2
numpy>=1.21.0
soundfile>=0.12.1
srt>=3.5.0
cached-path>=0.2.0
omegaconf>=2.3.0
- FFmpeg: Required for the mix_audio.sh script. Install it using your system's package manager:
  - Ubuntu/Debian: sudo apt-get install ffmpeg
  - macOS: brew install ffmpeg
  - Windows: Download and install from the official FFmpeg website.
Use the whisper.sh script to transcribe a video into an SRT file:
./scripts/whisper.sh path/to/video.mp4
This will generate a transcript.srt file in the current directory.
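Under the hood, whisper.sh presumably runs OpenAI's Whisper and writes the timed segments out as SRT. The following is a minimal sketch of that conversion step; the segment keys (`start`, `end`, `text`) match what openai-whisper returns, while the commented usage lines and file names are assumptions:

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total = int(seconds)
    ms = int(round((seconds - total) * 1000))
    return f"{total // 3600:02d}:{(total % 3600) // 60:02d}:{total % 60:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Convert Whisper-style segments (dicts with start/end/text) to SRT text."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

# Hypothetical usage with openai-whisper (downloads a model on first run):
# import whisper
# result = whisper.load_model("base").transcribe("path/to/video.mp4")
# with open("transcript.srt", "w") as f:
#     f.write(segments_to_srt(result["segments"]))
```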
Use the srt2voice.py script to generate dubbed audio from an SRT file:
python srt2voice.py -c config.toml
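To synthesize speech per subtitle, the script first has to parse the SRT cues into timed text segments. A minimal, dependency-free sketch of that parsing step (the actual script likely uses the `srt` package from requirements.txt; the `Cue` type here is illustrative):

```python
import re
from typing import NamedTuple

class Cue(NamedTuple):
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str

_TS = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

def parse_timestamp(ts: str) -> float:
    """Convert an SRT timestamp (HH:MM:SS,mmm) to seconds."""
    h, m, s, ms = map(int, _TS.match(ts.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000

def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into a list of timed cues."""
    cues = []
    for block in content.strip().split("\n\n"):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip malformed blocks
        start, _, end = lines[1].partition(" --> ")
        cues.append(Cue(int(lines[0]), parse_timestamp(start),
                        parse_timestamp(end), "\n".join(lines[2:])))
    return cues
```

Each cue's start/end times tell the TTS stage where to place the generated audio on the output timeline.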
The config.toml file specifies the TTS model, reference audio, and other settings. Here's an example configuration:
# F5-TTS | E2-TTS
model = "F5TTS_v1_Base"
ref_audio = "./assets/voice.flac"
ref_text = ""
gen_text = ""
gen_file = "transcript.srt"
remove_silence = false
output_dir = "dubbed"
speed = 1
Use the mix_audio.sh script to merge the dubbed audio with the original video:
./scripts/mix_audio.sh path/to/video.mp4 path/to/dubbed_audio.wav
This script will:
- Extract the original audio from the video.
- Apply auto-ducking to the original audio.
- Merge the dubbed audio with the video.
- Save the final dubbed video in the output_video directory.
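The ducking-and-mix steps above map naturally onto ffmpeg's `sidechaincompress` and `amix` filters. Here is a hedged sketch of how such a command could be assembled from Python; the filter thresholds, output path, and function name are assumptions, not the script's actual settings:

```python
def build_mix_command(video: str, dubbed_audio: str, output: str) -> list[str]:
    """Build an ffmpeg command approximating what mix_audio.sh does:
    duck the original audio under the dubbed track, then mux with the video."""
    filter_graph = (
        # split the dubbed track: one copy drives the compressor's sidechain,
        # the other is mixed into the output
        "[1:a]asplit=2[sc][dub];"
        # lower the original audio (0:a) whenever the dubbed track is loud
        "[0:a][sc]sidechaincompress=threshold=0.05:ratio=10[ducked];"
        # combine the ducked original with the dubbed audio
        "[ducked][dub]amix=inputs=2:duration=first[aout]"
    )
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-i", dubbed_audio,
        "-filter_complex", filter_graph,
        "-map", "0:v", "-map", "[aout]",
        "-c:v", "copy",  # keep the video stream untouched
        output,
    ]

# import subprocess
# subprocess.run(build_mix_command("video.mp4", "dubbed.wav",
#                                  "output_video/video_dubbed.mp4"), check=True)
```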
1. Transcribe the video:
./scripts/whisper.sh path/to/video.mp4
2. Generate the dubbed audio:
python srt2voice.py -c config.toml
3. Merge the dubbed audio with the video:
./scripts/mix_audio.sh path/to/video.mp4 path/to/dubbed_audio.wav
This will produce a final video file with the dubbed audio synchronized with the original video.
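The three-step workflow can also be driven from a single Python helper. This sketch only assembles the commands shown above; the dubbed-audio path depends on your config and is passed in explicitly:

```python
import subprocess

def pipeline_commands(video: str, dubbed_audio: str,
                      config: str = "config.toml") -> list[list[str]]:
    """Return the three pipeline commands in order (paths as shown in this README)."""
    return [
        ["./scripts/whisper.sh", video],
        ["python", "srt2voice.py", "-c", config],
        ["./scripts/mix_audio.sh", video, dubbed_audio],
    ]

def dub_video(video: str, dubbed_audio: str) -> None:
    """Run the full pipeline, stopping on the first failing step."""
    for cmd in pipeline_commands(video, dubbed_audio):
        subprocess.run(cmd, check=True)
```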
The config.toml file allows you to customize the TTS model, reference audio, and other settings. Refer to the story-sample.toml file for an example configuration.
If you'd like to contribute to this project, please fork the repository and submit a pull request. Ensure your changes are well-documented and tested.
This project is licensed under the MIT License. See the LICENSE file for details.