Synthesize can only process even multiples of the batch size #438

roedoejet · 2024-05-27T16:31:10Z

I provided 34 utterances in a filelist with a batch size of 4 and only 32 outputs were produced. It seems that the dataloader only processes full batches.
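
For reference, the numbers match what a drop-last batching policy would produce. Below is a minimal sketch in plain Python, not the project's dataloader code; `make_batches` and its `drop_last` flag are hypothetical names used only to illustrate the arithmetic.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def make_batches(items: Sequence[T], batch_size: int, drop_last: bool) -> List[List[T]]:
    """Split items into consecutive batches; optionally drop a short final batch.

    Hypothetical helper for illustration only (not the project's API).
    """
    batches = [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()  # the incomplete tail batch is discarded
    return batches


utterances = [f"utt_{i:02d}" for i in range(34)]  # 34 items, as in the report
dropped = make_batches(utterances, batch_size=4, drop_last=True)
kept = make_batches(utterances, batch_size=4, drop_last=False)

print(sum(len(b) for b in dropped))  # 32: the last 2 utterances are lost
print(sum(len(b) for b in kept))     # 34: the short final batch is kept
```

With 34 items and a batch size of 4 there are 8 full batches and a remainder of 2, so dropping the tail yields exactly the 32 outputs observed.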

wiitt · 2024-06-24T22:03:53Z

I tried synthesizing speech with a specified batch size from filelists in both txt and psv formats (31 utterances, batch size 7). In both cases I got audio for every utterance. During training, however, the dataloader does discard incomplete batches.

I developed a sampler that fills an incomplete batch with random samples from other batches. Simply training two models with the two sampling approaches doesn't make the difference obvious. To make the discarded data a larger share of the total and to imitate the case of a low-resource language, I restricted the LJ data to only 168 utterances. Even so, I cannot say that one of these models is better than the other.
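
To make the idea concrete, here is a rough sketch of that kind of sampler in plain Python. It is not the code from the linked commit: the function name `batches_with_filled_tail`, the batch size of 16, the fixed seed, and the choice to draw the fill samples without replacement are all assumptions made for illustration.

```python
import random
from typing import List


def batches_with_filled_tail(n_items: int, batch_size: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices into batches and top up a short last batch.

    Hypothetical sketch: the missing slots in the last batch are filled with
    indices drawn at random from the earlier (full) batches.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    batches = [indices[i:i + batch_size] for i in range(0, n_items, batch_size)]
    shortfall = batch_size - len(batches[-1]) if batches else 0
    if shortfall > 0 and len(batches) > 1:
        # Pool of candidates: every index that already sits in a full batch.
        pool = [idx for batch in batches[:-1] for idx in batch]
        batches[-1] = batches[-1] + rng.sample(pool, shortfall)
    return batches


# 168 utterances as in the experiment; batch size 16 is an assumed value.
batches = batches_with_filled_tail(n_items=168, batch_size=16)
print({len(b) for b in batches})       # {16}: every batch is full
print(sum(len(b) for b in batches))    # 176: 168 originals plus 8 repeats
```

In this sketch every utterance is seen at least once per epoch and a few are seen twice, whereas dropping the tail would leave 8 of the 168 utterances unseen in each epoch.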

Do you have any ideas for how to test the effectiveness of keeping an oversampled last batch?


roedoejet added the bug (Something isn't working) label May 27, 2024

roedoejet added this to the alpha milestone May 27, 2024

roedoejet assigned wiitt Jun 3, 2024

roedoejet self-assigned this Jun 10, 2024

roedoejet modified the milestones: alpha, beta Jun 20, 2024

roedoejet removed their assignment Jul 4, 2024

wiitt added a commit that referenced this issue Aug 30, 2024

fix: updates the training sampling strategy to complete the last batch

ad8cd7f

Fixes #438

wiitt linked a pull request Aug 30, 2024 that will close this issue

[WIP] fix: updates the training sampling strategy to complete the last batch #538

Draft

joanise mentioned this issue Sep 9, 2024

When preprocessing, or synthesizing, the number of outputs doesn't always match the number of items in the filelist #417

Closed
