
fix: preserve speaker_encoder in checkpoints and improve sample rate error#222

Open
haosenwang1018 wants to merge 1 commit into QwenLM:main from haosenwang1018:fix/preserve-speaker-encoder-in-checkpoints

Conversation

@haosenwang1018

Problem

Two issues in the finetuning script (sft_12hz.py / dataset.py):

1. speaker_encoder deleted from checkpoints — resume impossible

Lines 150-153 of sft_12hz.py explicitly delete speaker_encoder weights before saving:

drop_prefix = "speaker_encoder"
keys_to_drop = [k for k in state_dict.keys() if k.startswith(drop_prefix)]
for k in keys_to_drop:
    del state_dict[k]

When resuming from such a checkpoint, the missing weights leave model.speaker_encoder as None, so the next forward pass crashes with a NoneType error.
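The failure mode can be reproduced in a small standalone sketch. Plain dicts stand in for the real state_dict, and the function name save_checkpoint is hypothetical, but the key-dropping logic mirrors the quoted lines of sft_12hz.py:

```python
# Hypothetical minimal reproduction: the checkpoint logic strips every
# speaker_encoder.* entry, so a resumed model has no encoder weights.
def save_checkpoint(state_dict):
    drop_prefix = "speaker_encoder"
    # The problematic logic from sft_12hz.py lines 150-153:
    keys_to_drop = [k for k in state_dict if k.startswith(drop_prefix)]
    for k in keys_to_drop:
        del state_dict[k]
    return state_dict

full = {"speaker_encoder.weight": [1.0], "decoder.weight": [2.0]}
ckpt = save_checkpoint(dict(full))

print("speaker_encoder.weight" in ckpt)  # False
```

Because the saved checkpoint has no speaker_encoder.* keys, a resume has nothing to load for that module, and the forward pass then calls into an uninitialized (None) encoder.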

2. Unhelpful sample rate error message

dataset.py:105 asserts sr == 24000 but doesn't show the detected sample rate or suggest a fix, leaving users confused.

Fix

  1. Remove the speaker_encoder deletion — keep weights in checkpoints so resume works. A comment explains the rationale.
  2. Improve the assertion to show the actual sample rate and suggest the ffmpeg conversion command.
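One possible shape for the improved check, as a hedged sketch (TARGET_SR, check_sample_rate, and the exact error wording are assumptions for illustration, not the PR's literal diff):

```python
# Sketch of a friendlier sample-rate check replacing a bare
# `assert sr == 24000` (names and wording are hypothetical).
TARGET_SR = 24000

def check_sample_rate(sr: int, path: str = "<audio file>") -> None:
    """Raise a descriptive error showing the detected rate and a fix."""
    if sr != TARGET_SR:
        raise ValueError(
            f"Expected {TARGET_SR} Hz audio but got {sr} Hz for {path}. "
            f"Convert it first, e.g.: "
            f"ffmpeg -i {path} -ar {TARGET_SR} output_24k.wav"
        )

check_sample_rate(24000)  # passes silently
```

The error now tells the user both what was detected and how to fix it, instead of a bare AssertionError.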

Fixes #204

…error

Two fixes for the finetuning script:

1. Stop deleting speaker_encoder weights from checkpoints.
   The deletion makes resuming training impossible: the model
   crashes on the next forward pass with a NoneType error.
   A comment explains the rationale for keeping them.

2. Improve the sample rate assertion error message to show the
   actual detected sample rate and suggest the ffmpeg conversion
   command, reducing user confusion.

Fixes QwenLM#204

Signed-off-by: haosenwang1018 <haosenwang1018@users.noreply.github.com>
