[Bug] Voice cloning issue in coqui-tts 0.25.0 #198

mirfan899 · 2024-12-07T14:16:30Z

Describe the bug

I'm running TTS on Spanish and Hindi. The audio generated by the previous repo is better. The new repo seems to mix a female voice into the generated audio.

Here is the sample audio and the output generated by both repos.

I'm using the same code for both.

import torch
from TTS.api import TTS

# Get device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Init TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

tts.tts_to_file(text="After this, we will talk about a player who will give a lot of respect to Indian fans in this world cup. I will just tell you about this player who announced his entry in International Cricket and did a great performance. From today, we will meet you every day at EAM Cricket World Cup 2007.", speaker_wav="../audio/urdu.wav", language="en", file_path="output.wav")

data.zip

To Reproduce

Install old TTS and new TTS separately and test given code.

import torch
from TTS.api import TTS

# Get device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Init TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

tts.tts_to_file(text="After this, we will talk about a player who will give a lot of respect to Indian fans in this world cup. I will just tell you about this player who announced his entry in International Cricket and did a great performance. From today, we will meet you every day at EAM Cricket World Cup 2007.", speaker_wav="../audio/urdu.wav", language="en", file_path="output.wav")
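When the snippet above is run once per environment, both runs write to the same output.wav, which makes the two results easy to mix up. The following is only a sketch of one way to keep them apart. It assumes the old package is installed under the distribution name TTS and the fork under coqui-tts; the helper names and the file-naming scheme are hypothetical and not part of either package.

```python
from importlib.metadata import PackageNotFoundError, version


def detect_install(candidates=("coqui-tts", "TTS")):
    """Return (distribution name, version) of the first installed candidate, or None."""
    for name in candidates:
        try:
            return name, version(name)
        except PackageNotFoundError:
            continue
    return None


def output_path(install, stem="output"):
    """Build a per-install filename, e.g. 'output_TTS_0.22.0.wav' (hypothetical scheme)."""
    if install is None:
        return f"{stem}_unknown.wav"
    name, ver = install
    return f"{stem}_{name}_{ver}.wav"


if __name__ == "__main__":
    # Pass this path as file_path to tts.tts_to_file(...) in each environment.
    print(output_path(detect_install()))
```

Each environment then produces its own file, so the old and new outputs can be compared side by side.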

Expected behavior

The voice quality of the new repo is not good. It should match the output of the old TTS.

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "Tesla T4"
        ],
        "available": true,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.5.1+cu121",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.10.12",
        "version": "#1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024"
    }
}

Additional context

No response

eginhard · 2024-12-07T18:50:08Z

Thank you for the report! There is indeed an issue with voice cloning and I've identified the source. I should be able to release a fix in ~2 days. In the meantime you can use version 0.24.3 of this fork.

eginhard · 2024-12-09T17:00:45Z

Sorry about this. It should be back to normal now in version 0.25.1.
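To confirm whether an environment still has the affected release, a quick check along these lines may help. It is a minimal sketch: the only version treated as affected is 0.25.0, the one named in this issue, and the helper is_affected is a hypothetical name rather than anything provided by the package.

```python
from importlib.metadata import PackageNotFoundError, version

# Only release named as affected in this issue (assumption: no others are).
AFFECTED_VERSION = "0.25.0"


def is_affected(installed):
    """Return True if the given version string is the release named in this issue."""
    return installed == AFFECTED_VERSION


if __name__ == "__main__":
    try:
        installed = version("coqui-tts")
    except PackageNotFoundError:
        print("coqui-tts is not installed in this environment")
    else:
        state = "affected" if is_affected(installed) else "not the affected release"
        print(f"coqui-tts {installed}: {state}")
```

If the check reports the affected release, moving to 0.25.1 or back to 0.24.3, as mentioned above, should avoid the problem.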

mirfan899 added the bug label Dec 7, 2024

eginhard changed the title ~~[Bug] Old TTS gives better results as compared to Forked TTS~~ [Bug] Voice cloning issue in coqui-tts 0.25.0 Dec 7, 2024

eginhard self-assigned this Dec 7, 2024

eginhard pinned this issue Dec 7, 2024

erew123 mentioned this issue Dec 8, 2024

Really bad inference on newest Alltalk_v2 erew123/alltalk_tts#450

Closed

eginhard mentioned this issue Dec 8, 2024

Fix XTTS voice cloning #199

Merged

eginhard closed this as completed in #199 Dec 9, 2024
