The attention scores are always `None` in `CachedMultiHeadAttention` #2055
Labels
type:Bug
Something isn't working
Describe the bug
The variable `attention_scores` introduced at line 111 is always `None`.
To Reproduce
Since it is an internal variable, I copied the subclass CMHA in this script:
https://colab.research.google.com/drive/1ZUS4mjDQktovKiJ8TQ7zYtm4PGjesXvG?usp=sharing
Expected behavior
The variable `attention_scores` should contain the cross-correlation between query and key, which is useful for debugging a model IMHO.
Additional context
In recent Keras versions, the parent class `MultiHeadAttention` saves the argument `return_attention_scores` in `self._return_attention_scores`. Then, the method `_compute_attention` checks this private property to decide whether or not to return the scores. Since this state is not updated in `CachedMultiHeadAttention.call`, the attention scores will never be returned.
I'll also submit an issue to Keras to turn the attribute `_return_attention_scores` into an argument.
Would you like to help us fix it?
Yes, I have two potential fixes:
`_return_attention_scores` accordingly
WDYT?
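To illustrate the mechanism described under Additional context, here is a toy sketch in plain Python. The classes `ToyMultiHeadAttention`, `BuggyCachedAttention` and `FixedCachedAttention` are hypothetical stand-ins written for this example, not the real Keras or KerasHub code. They only mimic the pattern of a parent that stores a private flag and a subclass that overrides `call` without updating it. The fix shown, setting the flag inside `call`, is an assumption about what one of the fixes could look like.

```python
class ToyMultiHeadAttention:
    """Toy stand-in for the parent layer (NOT the real Keras class)."""

    def __init__(self):
        # The parent keeps the caller's choice as private state.
        self._return_attention_scores = False

    def _compute_attention(self, query, key, value):
        # Raw dot-product scores between every query and key vector
        # (softmax and scaling are omitted to keep the toy short).
        scores = [[sum(q * k for q, k in zip(q_vec, k_vec)) for k_vec in key]
                  for q_vec in query]
        # Score-weighted sum of the value vectors.
        output = [[sum(w * v_vec[j] for w, v_vec in zip(row, value))
                   for j in range(len(value[0]))] for row in scores]
        # The private flag, not an argument, decides what is returned.
        if self._return_attention_scores:
            return output, scores
        return output, None

    def call(self, query, key, value, return_attention_scores=False):
        # The parent's call updates the flag before computing attention.
        self._return_attention_scores = return_attention_scores
        return self._compute_attention(query, key, value)


class BuggyCachedAttention(ToyMultiHeadAttention):
    def call(self, query, key, value):
        # Overrides call without touching the flag, so it keeps its
        # default value and the scores are always None.
        output, attention_scores = self._compute_attention(query, key, value)
        return output, attention_scores


class FixedCachedAttention(ToyMultiHeadAttention):
    def call(self, query, key, value):
        # Hypothetical fix: set the flag before delegating.
        self._return_attention_scores = True
        return self._compute_attention(query, key, value)
```

With any small `query`, `key` and `value` given as lists of vectors, `BuggyCachedAttention().call(...)` returns `None` as its second element, while `FixedCachedAttention().call(...)` returns the score matrix.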