Simple test failure (working but totally wrong result) - need help #34
Comments
Hi Zoltan,
I don't have time to look deeply into your question, but I suspect that you have a UBM training problem.
Building a cross-gender UBM with 3 speakers, 3 files is tricky, and the config files are not designed for that.
(BTW: what is the number of Gaussian components in your UBM?)
Best
JF
From: "Zoltan Somogyi" <[email protected]>
To: "ALIZE-Speaker-Recognition" <[email protected]>
Cc: "Subscribed" <[email protected]>
Sent: Tuesday 21 April 2020 12:38:02
Subject: [ALIZE-Speaker-Recognition/LIA_RAL] Simple test failure (working but totally wrong result) - need help (#34)
All data and config: [ https://github.com/ALIZE-Speaker-Recognition/LIA_RAL/files/4509265/test_project1-G3-lean.zip | test_project1-G3-lean.zip ]
I have run a very simple test with 3 speakers: I train a UBM on all of the speech recordings from the 3 speakers, then adapt (train) a GMM for each speaker with only 3 (GD) distributions (mixtureDistribCount=3). I then test one input utterance (one of the training inputs) against the 3 speaker models and the UBM. The inputs are 2 WAV files from Jennifer Lawrence, 2 from Natalie Portman and 3 from Will Smith. The input for the final identification test is 'test_project1/audio/JenniferLawrence/voice1.wav' (the first recording from Jennifer Lawrence). The ALIZE identification result is FALSE (the input is not recognized), with 'Will Smith' as the best match, which is of course completely wrong. The score is computed with a simple LLK and comes out negative (-15.17):
test_project1/audio/JenniferLawrence/voice1.wav --> test_project1/prm//200421_100649_4c6f.init.prm
Writing to: 200421_100649_4c6f
Total Number of frames in threads: 1809
Total Number of frames in threads: 1809
Total Number of frames in threads: 1809
Total Number of frames in threads: 1809
Total Number of frames in threads: 1809
Total Number of frames in threads: 1809
Total Number of frames in threads: 1809
Total Number of frames in threads: 1809
Writing to: 200421_100649_4c6f
featureCount = 1809
spkCount = 3
UBMLoaded = 1
Identification result: FALSE, score: -15.1764, best matched uId: will_smith
Ready!
All data is included in the zip file. Could one of you please run this simple test and let me know your result? Please also let me know if you find the reason for the wrong results.
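A note on the negative score: an average per-frame log-likelihood (LLK) is often negative for continuous densities, so the sign alone does not show that something failed. The sketch below is not ALIZE or LIA_RAL code; it is a toy diagonal-covariance GMM in plain Python, with made-up 2-D frames and hypothetical models (`spk_a`, `spk_b`, `ubm`), meant only to illustrate how a raw LLK differs from a log-likelihood ratio against the UBM.

```python
import math

def diag_gmm_avg_llk(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            # log N(x; mu, diag(var)) plus the log mixture weight
            log_norm = sum(math.log(2.0 * math.pi * v) for v in var)
            maha = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, var))
            comp_logs.append(math.log(w) - 0.5 * (log_norm + maha))
        # log-sum-exp over components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)

def llr(frames, speaker, ubm):
    """Log-likelihood ratio: speaker-model LLK minus UBM LLK."""
    return diag_gmm_avg_llk(frames, *speaker) - diag_gmm_avg_llk(frames, *ubm)

# Hypothetical toy data: frames lie close to speaker A's mean.
frames = [(0.1, -0.2), (0.0, 0.3), (-0.4, 0.1)]
spk_a = ([1.0], [(0.0, 0.0)], [(1.0, 1.0)])
spk_b = ([1.0], [(5.0, 5.0)], [(1.0, 1.0)])
ubm = ([0.5, 0.5], [(0.0, 0.0), (5.0, 5.0)], [(1.0, 1.0), (1.0, 1.0)])

llk_a = diag_gmm_avg_llk(frames, *spk_a)  # negative even for the right speaker
llr_a = llr(frames, spk_a, ubm)           # positive: A explains the frames better than the UBM
llr_b = llr(frames, spk_b, ubm)           # negative: B explains them worse than the UBM
print(llk_a, llr_a, llr_b)
```

In this toy setup the correct speaker still gets a negative raw LLK, while the ratio against the UBM separates the two speakers. Whether a given score is a raw LLK or a ratio, and which threshold is applied to it, depends on the configuration being used.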
…--
______________________________________________________________________
Jean-Francois BONASTRE
Director of the LIA
LIA/CERI Université d'Avignon
Tel: +33/0 490843514
[email protected]
@jfbonastre
______________________________________________________________________
Hi Jean-Francois, Thank you very much for your answer! Best regards,
I was able to build something that approximates what I wanted to achieve. The problem was the combination of several config values and the example source code in SimpleSpkDetSystem. Thank you! Great work!