
Fix incorrect attention mask truncate in WhisperFlashAttention2 #36477

Open · wants to merge 2 commits into base: main
Conversation


@OliBomby OliBomby commented Feb 28, 2025

What does this PR do?

Fixes an incorrect attention calculation when training Whisper with Flash Attention 2 and passing a decoder_attention_mask with some values set to False.

This error was likely introduced when the mask-truncation code was copied from the other attention implementations: in WhisperFlashAttention2 the tensor dimensions are transposed, so the copied truncation logic reads the wrong dimension.
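Below is a minimal sketch of the failure mode, assuming the common pattern of slicing the mask with a length taken from `key_states.shape[-2]`; the variable names and exact slicing expression are illustrative, not the actual diff in this PR:

```python
import torch

bsz, num_heads, seq_len, head_dim = 2, 8, 100, 64
attention_mask = torch.ones(bsz, seq_len, dtype=torch.bool)

# Eager/SDPA Whisper attention keeps (bsz, num_heads, seq_len, head_dim),
# so shape[-2] is the sequence length and the mask is sliced correctly.
key_states_eager = torch.randn(bsz, num_heads, seq_len, head_dim)
mask_ok = attention_mask[:, : key_states_eager.shape[-2]]     # (2, 100)

# WhisperFlashAttention2 transposes to (bsz, seq_len, num_heads, head_dim)
# for flash attention, so the same shape[-2] now reads num_heads and the
# mask is silently truncated to 8 positions.
key_states_flash = key_states_eager.transpose(1, 2)
mask_wrong = attention_mask[:, : key_states_flash.shape[-2]]  # (2, 8)

# Taking the length from the actual sequence axis restores the intended slice.
mask_fixed = attention_mask[:, : key_states_flash.shape[1]]   # (2, 100)

print(mask_ok.shape, mask_wrong.shape, mask_fixed.shape)
```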

cc @sanchit-gandhi @ylacombe

@github-actions github-actions bot marked this pull request as draft February 28, 2025 14:24

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@OliBomby OliBomby marked this pull request as ready for review February 28, 2025 14:25