support for alignment output in tsv format #407
I've been trying this out. It looks like with a long text, some of the last words are skipped in the alignment file.
@vytskalt can you provide an example so I can debug/fix it?
Yes, this is the command I'm running:
cat text.txt | piper --sentence-silence 0.5 -m en_US-ryan-high --output_file out.wav --alignment-data alignment.tsv
This is the text (a random Reddit post): text.txt
In alignment.tsv, two of the last words are missing.
OK, it's not the length that is the issue, it's the content. For example, "musical/sport" is spoken as 3 words, and "in the" is mangled into one spoken word. My word/phoneme sync trips over this. It needs to be fixed; I have to find another way to sync.
… or split by "musical/sports". Also fixed missing sentence silence in calculation
Hi, I pulled this pull request and made a build, but the --alignment-data option is not available in the "piper" executable in the install folder. Am I missing something to make it work? Thanks (:
It is only built into the Python script, not the C++ executable.
Makes sense! Thanks (:
Adds support for alignment data output.
Loosely matches issue #364.
Can be used as a base for #391 and #361.
Runs text-to-speech twice: once for the normal audio generation,
and a second time for each word individually.
Since the two passes produce different outputs and timings, a correction is applied.
Not perfect, but "good enough". Both passes re-sync after each sentence, so only slight offsets occur.
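
The description above does not say what the correction is. As a rough sketch of one way it could work (an assumption, not necessarily what this PR implements), the per-word durations from the second pass can be rescaled proportionally so they sum to the duration of the full-sentence audio, with the running offset reset at each sentence boundary. The function names, the (word, start, end) column layout, and the handling of sentence silence below are all hypothetical and chosen only for illustration.

```python
import csv
import io


def align_words(words, word_durations, sentence_duration, start=0.0):
    """Rescale per-word durations so they sum to the sentence duration.

    Assumption: proportional scaling is the correction; the PR may differ.
    """
    total = sum(word_durations)
    scale = sentence_duration / total if total > 0 else 0.0
    rows = []
    t = start
    for word, duration in zip(words, word_durations):
        end = t + duration * scale
        rows.append((word, t, end))
        t = end
    return rows


def align_text(sentences, sentence_silence=0.0):
    """Align several sentences.

    sentences: iterable of (words, word_durations, sentence_duration),
    all times in seconds.
    """
    rows = []
    offset = 0.0
    for words, word_durations, sentence_duration in sentences:
        rows.extend(align_words(words, word_durations, sentence_duration, offset))
        # Re-sync at the sentence boundary: the next sentence starts from the
        # measured sentence length plus the inserted silence, so rounding or
        # scaling drift inside one sentence does not carry over.
        offset += sentence_duration + sentence_silence
    return rows


def to_tsv(rows):
    """Serialize rows as tab-separated word, start, end (hypothetical layout)."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for word, start, end in rows:
        writer.writerow([word, f"{start:.3f}", f"{end:.3f}"])
    return buf.getvalue()


if __name__ == "__main__":
    demo = [
        (["hello", "world"], [0.5, 0.5], 2.0),
        (["bye"], [0.3], 0.6),
    ]
    print(to_tsv(align_text(demo, sentence_silence=0.5)), end="")
```

Because the offset is recomputed from the measured sentence duration rather than from the accumulated word ends, any error stays confined to a single sentence, which is consistent with the "self sync after each sentence" behavior described above.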