The configuration of the mcep-gmm model #6

Open
JingleiSHI opened this issue Jul 27, 2017 · 2 comments

Comments

@JingleiSHI

Hello,

I'm sorry to bother you. I'm interested in your mcep-gmm method for voice conversion, and I have run your program on my computer. However, when I trained one model with 10 Gaussian components and another with 50, the converted versions of 20007.wav were always identical. So I would like to ask: how many Gaussian components and iterations did you use?
Also, should the number of components passed to the vc command be the same as the number passed to the gmm command?

Thank you for your attention
J.SHI

@albertaparicio
Owner

albertaparicio commented Jul 27, 2017

Dear J.SHI,

First of all, I do not understand completely what you have tried to do.

In case it helps: in sptk_vc.sh, at line 67, where I compute the GMM of the data, I use 32 components. Also, I am not aware of any need to keep the number of components the same between gmm and vc.

Let me also note that the GMM method for voice conversion in this project was developed only as a test, to compare our implementation against SPTK's tools. Since this project is based on deep learning, we did not put much effort into obtaining a good GMM model.

If you need any more help, do not hesitate to ask.

@JingleiSHI
Author

Dear Albert,

Sorry for my unclear description. I meant that I ran sptk_vc.sh to convert 20007.wav from the test data directory.

To train the model, I first ran gmm -m 20 and kept the result, then ran gmm -m 50 and compared the two outputs, but I could not find any difference between them. In fact, the transformed sound does not resemble the target voice at all. Is that normal?

For the DNN-LSTM-GRU model: if I want to test my own voice with it, how should I prepare the test data? I have the files of types .mcep, .lf0, and .fv; how do I convert them to the .dat type?
I did not find the code for that in the project.
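(For anyone landing here with the same question: the project's actual .dat layout is not documented in this thread, so the following is only a sketch under an assumption. In SPTK-style pipelines, per-frame features are commonly stored as raw little-endian float32 binaries, and a combined .dat file is just the per-frame concatenation of the individual streams. The function names, the `mcep_dim` parameter, and the frame layout below are hypothetical, not taken from this repository.)

```python
import numpy as np

def load_raw_floats(path, dim):
    """Read a raw little-endian float32 file and reshape it to (frames, dim)."""
    data = np.fromfile(path, dtype='<f4')
    if data.size % dim != 0:
        raise ValueError(f"{path}: {data.size} floats is not a multiple of dim={dim}")
    return data.reshape(-1, dim)

def pack_features(mcep_path, lf0_path, fv_path, out_path, mcep_dim=40):
    """Concatenate per-frame mcep, lf0 and fv vectors and write raw float32.

    Assumes lf0 and fv are one value per frame; mcep_dim must match the
    analysis order used when the .mcep file was extracted (hypothetical here).
    """
    mcep = load_raw_floats(mcep_path, mcep_dim)
    lf0 = load_raw_floats(lf0_path, 1)
    fv = load_raw_floats(fv_path, 1)
    n = min(len(mcep), len(lf0), len(fv))  # guard against off-by-one frame counts
    frames = np.hstack([mcep[:n], lf0[:n], fv[:n]]).astype('<f4')
    frames.tofile(out_path)
    return frames.shape
```

If the repository defines a different frame layout or a header, this sketch would need to be adapted accordingly.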

Thank you for your attention,
J.SHI
