
Why is sigmoid activation for LRP not allowed? #1361

Open

CloseChoice opened this issue Oct 1, 2024 · 1 comment

@CloseChoice
❓ Questions and Help

I tried to run a small model with a sigmoid activation, but it is actually tested here that this does not work. Is there a specific reason for that? IMO, since sigmoid is a scalar operation, it should work analogously to ReLU and Tanh, which can be used with LRP.

Simply adding sigmoid here yields the expected result. So why not just do so?

I would be willing to create the PR and add a test for this if there is no reason not to.
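For context, this is roughly what I mean (a minimal sketch; I'm assuming the issue refers to `captum.attr.LRP`, and the exact error text, if any, depends on the installed version):

```python
# Hypothetical reproduction: a tiny model whose only change vs. a working
# LRP setup is the nn.Sigmoid() activation.
import torch
import torch.nn as nn
from captum.attr import LRP

model = nn.Sequential(
    nn.Linear(2, 4),
    nn.Sigmoid(),   # swapping this for nn.ReLU() or nn.Tanh() works with LRP
    nn.Linear(4, 1),
)

inputs = torch.tensor([[2.0, 1.0]])
lrp = LRP(model)

try:
    attributions = lrp.attribute(inputs, target=0)
    print(attributions)
except Exception as err:
    # Sigmoid is not among the supported non-linear layers, so LRP rejects
    # the model instead of attributing through it.
    print(f"LRP rejected the sigmoid layer: {err}")
```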

@nicogross commented Feb 4, 2025

LRP was designed for ReLU networks and generalized to leaky ReLU.
My take is that the sigmoid function satisfies neither f(0) = 0 nor sign(f(-x)) = -1, which leads to unintuitive results, as in the following example:

f(x) = sigmoid(x1*w1 + x2*w2) = sigmoid(z1 + z2)
x1 = 2 and x2 = 1
w1 = -1 and w2 = 1
-> z1 = -2 and z2 = 1
f(x) = sigmoid(-2 + 1) = sigmoid(-1) = 0.2689

x1 (or z1) pushes toward a lower activation and x2 (or z2) pushes toward a higher activation.
x1 should therefore be assigned a small relevance and x2 a greater relevance, but with the basic LRP rule R_i = z_i / (z1 + z2) * f(x) we get:
R1 = 0.5379 and R2 = -0.2689
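The same unintuitive split can be reproduced in a few lines of plain Python (a sketch of the basic LRP-0 / z-rule applied to the single sigmoid unit above; the specific rule is my assumption, chosen because it reproduces the quoted numbers):

```python
import math

# Toy example from the comment above: f(x) = sigmoid(x1*w1 + x2*w2)
x1, x2 = 2.0, 1.0
w1, w2 = -1.0, 1.0

z1, z2 = x1 * w1, x2 * w2                  # z1 = -2, z2 = 1
out = 1.0 / (1.0 + math.exp(-(z1 + z2)))   # sigmoid(-1) = 0.2689

# Basic LRP-0 / z-rule: relevance proportional to each contribution z_i.
R1 = z1 / (z1 + z2) * out   #  0.5379 -> large positive, although z1 lowered the activation
R2 = z2 / (z1 + z2) * out   # -0.2689 -> negative, although z2 raised the activation
print(R1, R2)
```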
