May I pull an user guide to this repo? #3

Open
HudsonHuang opened this issue Mar 31, 2017 · 5 comments
Comments

@HudsonHuang

HudsonHuang commented Mar 31, 2017

Mr. Albert, I love your work so much,
but it doesn't have a friendly user guide,
and... there may be some bugs in the code?
These days I have been trying to run the code, and it took me some time to work out how to use it. May I share my notes, and could you please spend a little time checking them so that we can make the project friendlier?
Here are my notes. Something seems to be wrong (at step 6, when I use seq2seq), but I don't know where the problem is.

This project contains 3 solutions for voice conversion:
An MCEP-GMM based solution built on the SPTK tools (baseline solution of VCC2016: http://vc-challenge.org/summary.html)
A DNN-LSTM-GRU solution converting MVF-logf0-Mel-cepstrum features
A seq2seq based solution that extracts and converts MVF-logf0-Mel-cepstrum features

To run the MCEP-GMM based solution (SPTK tools):
edit TRAIN_FILENAME in sptk_vc.sh to point to the files you need to convert
run sptk_vc.sh
the output wav is in data/training/gmm_vc

To run the DNN-LSTM-GRU conversion:
run lf0_lstm.py, mvf_dnn.py and mcp_gru.py to train the models that convert each feature
[optional] run lf0_post_training.py, mcp_post_training.py, mvf_post_training.py and mvf_plot_curves.py to verify the models
run decode_aho.sh to merge the features into a wav

To run seq2seq:
0. [optional] you can get all the training files in , put them in data/training/
1. apt-get install sox, pip install tensorflow, and so on
2. cp do_columns.pl to /usr/local/bin
3. get tfglib (https://github.com/albertaparicio/tfglib), edit seq2seq_datatable.py (maybe a bug in the parameter nb_classes), and install it
4. install ahocoder from source, and add the file $ to your PATH
5. edit data/test/speakers.list (add more speakers if step 0 was carried out?)
6. run /data/train/seq2seq_align_training.sh and /data/test/seq2seq_align_test.sh
7. run seq2seq.py
(There is a problem here: for some files (like file 200007) the .lf0.dat file wasn't extracted, which may throw an error)
8. run seq2seq_decode_prediction.py
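
Regarding the note on step 7: before running seq2seq.py, it may help to list the utterances whose .lf0.dat file is missing, so they can be skipped or re-extracted. The following is only a minimal sketch. It assumes that each utterance has a .wav file sitting next to its extracted features and that the feature file is named after the same basename with a .lf0.dat suffix. The function name, the .wav assumption and the directory you pass in are illustrative and not part of the repository.

```python
from pathlib import Path


def missing_feature_files(data_dir, source_ext=".wav", feature_ext=".lf0.dat"):
    """Return basenames that have a source file but no matching feature file.

    Assumption (not confirmed by the repository): features are written next to
    the source audio as <basename><feature_ext>.
    """
    root = Path(data_dir)
    missing = []
    # Walk the directory tree and look at every source audio file.
    for source in sorted(root.rglob("*" + source_ext)):
        basename = source.name[: -len(source_ext)]
        feature = source.with_name(basename + feature_ext)
        if not feature.exists():
            missing.append(basename)
    return missing
```

Calling `missing_feature_files("data/training")` would then return a list of basenames (for example one containing "200007" if that file's log-F0 features were never written), which can be removed from the file lists before training.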
@HudsonHuang HudsonHuang changed the title May I pull an using guide to this repo? May I pull an user guide to this repo? Mar 31, 2017
@albertaparicio
Owner

Dear HudsonHuang,

This project currently has no user guide because it is unfinished. The same applies to the possible bugs in the code (it would help if you told me what they are anyway). This repository contains the code of my bachelor's thesis, which is still being developed. Once it is finished (around May-June 2017), it will have a comprehensive user guide explaining how to use the code.

At the time of writing, the DNN-LSTM-GRU model is complete ('baseline' tag), as well as the SPTK model. The seq2seq model is the one under development.

The short guide you have written is useful to give users a fast way to see how to run the finished models. I must tell you, though, that the code in the seq2seq model is bound to change radically (we have been working with Keras, and are about to rewrite the model from scratch with TensorFlow), so this part of the guide will change in the near future.

Regarding the error about file 200007, I remember having found a similar issue. I will take a look at it. Meanwhile, I suggest you skip this file.

Last, but not least, I want to thank you for your interest in my project. Stay tuned for the new developments (I hope there will be some interesting results)

Best wishes,

Albert

@HudsonHuang
Author

Dear Albert,

Thank you for your comment.
My research is about text-to-speech systems with the features of specific people, so I have learned a lot from your project, thanks a lot.
I am looking forward to the completion of the project and am willing to give any help I can to build it together.
Last, but not least, I think your project is the most advanced and complete open source DNN-based voice conversion project. That's great!

Best wishes,

Hudson

@JingleiSHI

@HudsonHuang
Hi, sorry for interrupting. Thank you for your user guide, which helped me understand Albert's code better. I have just one question about the MCEP-GMM based solution: can I use this model to convert my own voice file? I mean, do I just add a voice file to the corresponding directory and then change TRAIN_FILENAME?

Best wishes,
J.SHI

@HudsonHuang
Author

@JingleiSHI

Yes, I think it would work.
Besides, it would be helpful to check out the code here if any error happens.

Best wishes,

Hudson

@JingleiSHI

@HudsonHuang

Thank you for your response, I hope I didn't bother you with my questions. I have run the program, but when I use a GMM model with -m 20, I get a det = 0 problem (the inverse matrix can't be calculated). I also don't understand the command on line 73, vc -m 2: why does m equal 2 and not 32? As for the number of iterations, is 100 iterations enough?

Best wishes,
J.SHI
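
A note on the det = 0 message above. One common way for a covariance matrix to become singular, which may or may not be what happens here, is that a mixture component ends up with fewer frames than feature dimensions. Raising the number of mixtures on a small training set makes that more likely. The sketch below is plain Python, not SPTK code, and the frame values and the floor value are made up for illustration. It shows that such a covariance has a determinant of exactly 0, and that adding a small floor to the diagonal makes it invertible again.

```python
from fractions import Fraction


def covariance(frames):
    """Biased sample covariance of a list of equal-length feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(Fraction(f[j]) for f in frames) / n for j in range(dim)]
    return [
        [
            sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / n
            for j in range(dim)
        ]
        for i in range(dim)
    ]


def determinant(matrix):
    """Exact determinant by Gaussian elimination (entries should be Fractions)."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def add_variance_floor(cov, floor):
    """Return a copy of cov with `floor` added to every diagonal entry."""
    return [
        [v + floor if i == j else v for j, v in enumerate(row)]
        for i, row in enumerate(cov)
    ]


# Hypothetical data: 2 frames of a 3-dimensional feature (fewer frames than dimensions).
frames = [[1, 2, 3], [2, 4, 7]]
cov = covariance(frames)
floored = add_variance_floor(cov, Fraction(1, 1000))
print(determinant(cov) == 0, determinant(floored) > 0)
```

If this is the cause, using fewer mixtures or more training data for the same number of mixtures would be the things to try first.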
