
Why is sigmoid activation for LRP not allowed? #1361

Open

CloseChoice opened this issue Oct 1, 2024 · 1 comment

@CloseChoice
❓ Questions and Help

I tried to run a small model with a sigmoid activation, but it is actually tested here that this does not work. Is there a specific reason for that? IMO, since sigmoid is a scalar operation, it should work analogously to ReLU and Tanh, which can be used with LRP.

Simply adding sigmoid here yields the expected result. So why not just do so?

I would be willing to create the PR and add a test for this if there is no reason not to.
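For context, this is roughly what I mean (a minimal sketch; I'm assuming the issue refers to `captum.attr.LRP`, and the exact error text, if any, depends on the installed version):

```python
# Hypothetical reproduction: a tiny model whose only change vs. a working
# LRP setup is the nn.Sigmoid() activation.
import torch
import torch.nn as nn
from captum.attr import LRP

model = nn.Sequential(
    nn.Linear(2, 4),
    nn.Sigmoid(),   # swapping this for nn.ReLU() or nn.Tanh() works with LRP
    nn.Linear(4, 1),
)

inputs = torch.tensor([[2.0, 1.0]])
lrp = LRP(model)

try:
    attributions = lrp.attribute(inputs, target=0)
    print(attributions)
except Exception as err:
    # Sigmoid is not among the supported non-linear layers, so LRP rejects
    # the model instead of attributing through it.
    print(f"LRP rejected the sigmoid layer: {err}")
```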

@nicogross commented Feb 4, 2025

LRP was designed for ReLU networks and generalized to leaky ReLU.
My take is that the sigmoid function satisfies neither f(0) = 0 nor sign(f(-x)) = -1, which leads to unintuitive results, as in the following example:

f(x) = sigmoid(x1*w1 + x2*w2) = sigmoid(z1 + z2)
x1 = 2 and x2 = 1
w1 = -1 and w2 = 1
-> z1 = -2 and z2 = 1
f(x) = sigmoid(-2 + 1) = sigmoid(-1) = 0.2689

x1 (or z1) pushes toward a lower activation and x2 (or z2) pushes toward a higher activation.
x1 should therefore be assigned a small relevance and x2 a greater relevance, but with the basic LRP rule R_i = z_i / (z1 + z2) * f(x) we get:
R1 = 0.5379 and R2 = -0.2689
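The same unintuitive split can be reproduced in a few lines of plain Python (a sketch of the basic LRP-0 / z-rule applied to the single sigmoid unit above; the specific rule is my assumption, chosen because it reproduces the quoted numbers):

```python
import math

# Toy example from the comment above: f(x) = sigmoid(x1*w1 + x2*w2)
x1, x2 = 2.0, 1.0
w1, w2 = -1.0, 1.0

z1, z2 = x1 * w1, x2 * w2                  # z1 = -2, z2 = 1
out = 1.0 / (1.0 + math.exp(-(z1 + z2)))   # sigmoid(-1) = 0.2689

# Basic LRP-0 / z-rule: relevance proportional to each contribution z_i.
R1 = z1 / (z1 + z2) * out   #  0.5379 -> large positive, although z1 lowered the activation
R2 = z2 / (z1 + z2) * out   # -0.2689 -> negative, although z2 raised the activation
print(R1, R2)
```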
