Synthesize can only process even multiples of the batch size #438

Open
roedoejet opened this issue May 27, 2024 · 1 comment · May be fixed by #538

@roedoejet (Member) commented May 27, 2024

I provided 34 utterances in a filelist with a batch size of 4 and only 32 outputs were produced. It seems that the dataloader only processes full batches.
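For reference, this symptom is consistent with a PyTorch DataLoader constructed with drop_last=True, which silently discards the trailing partial batch. A minimal sketch of the behaviour (not the project's actual dataloader code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 34 dummy "utterances"
dataset = TensorDataset(torch.arange(34))

# drop_last=True silently discards the final partial batch
loader = DataLoader(dataset, batch_size=4, drop_last=True)
print(sum(batch[0].numel() for batch in loader))  # 32

# drop_last=False keeps the short final batch
loader = DataLoader(dataset, batch_size=4, drop_last=False)
print(sum(batch[0].numel() for batch in loader))  # 34
```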

@roedoejet roedoejet added the bug Something isn't working label May 27, 2024
@roedoejet roedoejet added this to the alpha milestone May 27, 2024
@roedoejet roedoejet self-assigned this Jun 10, 2024
@roedoejet roedoejet modified the milestones: alpha, beta Jun 20, 2024
@wiitt (Collaborator) commented Jun 24, 2024

I've tried synthesizing speech with a specified batch size from filelists in both txt and psv formats (31 utterances, batch size 7). In both cases I got an audio file for every utterance. During training, however, the dataloader does discard incomplete batches.

I developed a sampler that fills an incomplete batch with random samples drawn from other batches. Simply training two models with the different sampling approaches doesn't make the difference obvious, so I restricted the LJ Speech data to only 168 utterances to give the discarded data more weight and to imitate a low-resource language. Even so, I cannot say that one of the models is better than the other.
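For illustration, the idea behind such a sampler might look roughly like the sketch below; the class and parameter names are invented here, and this is not the implementation proposed for #538:

```python
import random
from torch.utils.data import DataLoader, Sampler

class PadLastBatchSampler(Sampler):
    """Batch sampler that yields only full batches: instead of dropping the
    final incomplete batch, it tops it up with random indices taken from
    other batches. Illustrative sketch only, not the sampler from #538."""

    def __init__(self, dataset_len: int, batch_size: int, shuffle: bool = True):
        self.dataset_len = dataset_len
        self.batch_size = batch_size
        self.shuffle = shuffle

    def __iter__(self):
        indices = list(range(self.dataset_len))
        if self.shuffle:
            random.shuffle(indices)
        for start in range(0, self.dataset_len, self.batch_size):
            batch = indices[start:start + self.batch_size]
            if len(batch) < self.batch_size:
                # top up with random samples from earlier (full) batches;
                # fall back to the whole epoch if this is the only batch
                pool = indices[:start] or indices
                batch += random.choices(pool, k=self.batch_size - len(batch))
            yield batch

    def __len__(self):
        # every yielded batch is full, so round up
        return -(-self.dataset_len // self.batch_size)

# It would be passed to the DataLoader via batch_sampler=, which replaces
# the batch_size / shuffle / drop_last arguments:
# loader = DataLoader(dataset, batch_sampler=PadLastBatchSampler(len(dataset), 4))
```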

Do you have any ideas on how to test the effectiveness of keeping an oversampled last batch?
