[Bug] Voice cloning issue in coqui-tts 0.25.0 #198

Closed
mirfan899 opened this issue Dec 7, 2024 · 2 comments · Fixed by #199

@mirfan899

Describe the bug

I'm running TTS on Spanish and Hindi. The audio generated by the previous repo is better. The new repo seems to mix a female voice into the generated audio.

Attached are the sample audio and the outputs generated by both repos.

I'm using the same code for both.

import torch
from TTS.api import TTS

# Get device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Init TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

tts.tts_to_file(text="After this, we will talk about a player who will give a lot of respect to Indian fans in this world cup. I will just tell you about this player who announced his entry in International Cricket and did a great performance. From today, we will meet you every day at EAM Cricket World Cup 2007.", speaker_wav="../audio/urdu.wav", language="en", file_path="output.wav")

data.zip

To Reproduce

Install the old TTS and the new TTS separately and run the following code with each.

import torch
from TTS.api import TTS

# Get device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Init TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

tts.tts_to_file(text="After this, we will talk about a player who will give a lot of respect to Indian fans in this world cup. I will just tell you about this player who announced his entry in International Cricket and did a great performance. From today, we will meet you every day at EAM Cricket World Cup 2007.", speaker_wav="../audio/urdu.wav", language="en", file_path="output.wav")
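Because both packages are imported as `TTS`, it can be unclear which one a given environment is actually running. A minimal sketch for checking this, assuming the original package is installed under the PyPI distribution name `TTS` and the fork under `coqui-tts` (the helper name `installed_versions` is hypothetical, not part of either library):

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed PyPI distribution names: "TTS" (original package), "coqui-tts" (fork).
PACKAGES = ("TTS", "coqui-tts")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in each environment before the reproduction script makes it easier to attach the right version number to each output file.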

Expected behavior

The voice quality of the new package should match that of the old one. Currently it is not good.

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "Tesla T4"
        ],
        "available": true,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.5.1+cu121",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.10.12",
        "version": "#1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024"
    }
}

Additional context

No response

mirfan899 added the bug label Dec 7, 2024
eginhard changed the title from "[Bug] Old TTS gives better results as compared to Forked TTS" to "[Bug] Voice cloning issue in coqui-tts 0.25.0" Dec 7, 2024
eginhard self-assigned this Dec 7, 2024
@eginhard
Member

eginhard commented Dec 7, 2024

Thank you for the report! There is indeed an issue with voice cloning and I've identified the source. I should be able to release a fix in ~2 days. In the meantime you can use version 0.24.3 of this fork.

@eginhard
Member

eginhard commented Dec 9, 2024

Sorry about this. It should be back to normal now in version 0.25.1.
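For anyone scripting around this, the version information in this thread can be sketched as a small check. This only encodes what is reported here (0.25.0 affected, 0.24.3 and 0.25.1 usable); the names `AFFECTED_VERSIONS` and `is_affected` are hypothetical, and other versions are simply treated as unaffected, which is an assumption.

```python
# Versions of the fork reported in this issue as having the voice cloning problem.
AFFECTED_VERSIONS = {"0.25.0"}


def is_affected(installed: str) -> bool:
    """Return True if the given version string is one reported as affected."""
    return installed.strip() in AFFECTED_VERSIONS


if __name__ == "__main__":
    for ver in ("0.24.3", "0.25.0", "0.25.1"):
        status = "affected" if is_affected(ver) else "not reported as affected"
        print(f"{ver}: {status}")
```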
