Attention vs Add in LKA #32

Open
iumyx2612 opened this issue Nov 30, 2022 · 1 comment
Comments

@iumyx2612
In Table 3, changing attention (mul) to add reduces VAN's performance from 75.4 to 74.6, which I think is a really large gap. However, in the ablation study you state that "Besides, replacing attention with adding operation is also not achieving a lower accuracy". Is it okay to phrase it like that, given that the performance drop is 0.8 points?

Can't we treat add as a type of attention function? In Attention Mechanisms in Computer Vision: A Survey, we have the formula:

Attention = f(g(x), x)

I can treat the function f here as an addition operation, can't I?
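For concreteness, here is a minimal PyTorch-style sketch of the mul-vs-add choice being discussed, assuming a simplified LKA block; the class name `LKASketch` and the `use_add` flag are hypothetical, added only to illustrate the ablation, and this is not the repository's actual code:

```python
import torch
import torch.nn as nn

class LKASketch(nn.Module):
    """Simplified Large Kernel Attention block (illustrative sketch only)."""
    def __init__(self, dim, use_add=False):
        super().__init__()
        # Decomposed large-kernel convolution:
        # depth-wise 5x5, dilated depth-wise 7x7 (dilation 3), point-wise 1x1
        self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        self.conv_spatial = nn.Conv2d(dim, dim, 7, padding=9, groups=dim, dilation=3)
        self.conv1 = nn.Conv2d(dim, dim, 1)
        self.use_add = use_add  # hypothetical switch: attention (mul) vs the "add" ablation

    def forward(self, x):
        attn = self.conv1(self.conv_spatial(self.conv0(x)))  # g(x): generated attention map
        if self.use_add:
            return x + attn  # f(g(x), x) = g(x) + x, the "add" variant from Table 3
        return x * attn      # f(g(x), x) = g(x) * x, attention via element-wise multiplication


# Usage example
x = torch.randn(1, 32, 56, 56)
mul_out = LKASketch(32, use_add=False)(x)
add_out = LKASketch(32, use_add=True)(x)
```

In terms of the survey's formulation, both variants keep the same g(x); only the combining function f changes, which is why one might ask whether the add variant still counts as attention.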

@iumyx2612
Author

@MenghaoGuo Hello, can you explain this?
