Hi, I want to pass a softmax_scale factor to ring-attention (to use with context parallelism, cp), but it is currently not accepted by the patched _ring_compute_attention and not properly forwarded in ring_fa3_varlen_func.
Desirable behaviour:
_ring_compute_attention should accept a softmax_scale argument.
ring_fa3_varlen_func should accept a softmax_scale argument as well.
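
To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is not the library's code: `resolve_softmax_scale` and `ring_compute_attention_sketch` are hypothetical names, the attention is a toy single-query version in pure Python, and the fallback to 1/sqrt(head_dim) when no scale is given is my assumption about the expected default.

```python
import math
from typing import List, Optional


def resolve_softmax_scale(softmax_scale: Optional[float], head_dim: int) -> float:
    """Use the caller's scale if given, else fall back to 1/sqrt(head_dim) (assumed default)."""
    if softmax_scale is None:
        return 1.0 / math.sqrt(head_dim)
    return float(softmax_scale)


def ring_compute_attention_sketch(
    q: List[float],
    k: List[List[float]],
    v: List[List[float]],
    softmax_scale: Optional[float] = None,
) -> List[float]:
    """Toy single-query attention; hypothetical stand-in for the patched function."""
    scale = resolve_softmax_scale(softmax_scale, head_dim=len(q))
    # Scaled dot-product scores between the query and every key.
    scores = [scale * sum(qi * ki for qi, ki in zip(q, key)) for key in k]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value vectors.
    out_dim = len(v[0])
    return [sum(w * val[d] for w, val in zip(weights, v)) for d in range(out_dim)]
```

The point is only that the scale is an optional keyword that is passed through unchanged, so existing callers that omit it keep the current behaviour.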
Thanks!