❓ Questions and Help
I tried to use LRP on a small model with a sigmoid activation, but it is actually tested here that this does not work. Is there a specific reason for that? IMO, since sigmoid is an element-wise (scalar) operation, it should work analogously to ReLU and Tanh, which can be used with LRP.
Simply adding sigmoid here yields the expected result. So why not just do so?
I would be willing to create a PR and add a test for this if there is no reason not to.
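For context, here is a minimal sketch (not the library's actual implementation; `lrp_linear` and `lrp_activation` are made-up names for illustration) of what "treating sigmoid like ReLU and Tanh" would mean: relevance is redistributed at linear layers, e.g. with the epsilon rule, while element-wise activations are simply passed through.

```python
import numpy as np

def lrp_linear(x, w, b, relevance_out, eps=1e-6):
    """Epsilon rule for a linear layer: redistribute relevance_out
    onto the inputs x in proportion to their contributions z_ij = x_i * w_ij."""
    z = x[:, None] * w                  # contributions z_ij, shape (n_in, n_out)
    zj = z.sum(axis=0) + b              # pre-activations z_j, shape (n_out,)
    zj = zj + eps * np.sign(zj)         # stabilizer to avoid division by zero
    return (z / zj * relevance_out).sum(axis=1)

def lrp_activation(relevance_out):
    """Element-wise activations (ReLU, Tanh, and -- per this proposal --
    Sigmoid) are treated as pass-through: relevance flows back unchanged."""
    return relevance_out
```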
LRP was designed for ReLU networks and later generalized to leaky ReLU.
My guess is that it is because the sigmoid function satisfies neither f(0) = 0 nor sign(f(-x)) = -1, which leads to un-intuitive results, as in the following example:
x1 (with contribution z1) pushes towards a lower activation and x2 (with contribution z2) pushes towards a higher activation.
x1 should therefore be assigned a small relevance and x2 a greater relevance, but:
R1 = 0.5379 and R2 = -0.2689
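To make the example reproducible (the exact inputs are my assumption; z1 = -2, z2 = 1 reproduces the numbers above), the basic LRP rule R_i = z_i / (z1 + z2) * R_out with R_out = sigmoid(z1 + z2) gives:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Assumed contributions that reproduce the numbers above:
z1, z2 = -2.0, 1.0              # z1 pushes the activation down, z2 pushes it up
out = sigmoid(z1 + z2)          # sigmoid(-1) ≈ 0.2689, used as the output relevance

# Basic LRP rule: R_i = z_i / (z1 + z2) * R_out
R1 = z1 / (z1 + z2) * out       # ≈  0.5379  (large and positive)
R2 = z2 / (z1 + z2) * out       # ≈ -0.2689  (negative)
print(R1, R2)
```

Because sigmoid(-1) is positive while, e.g., tanh(-1) is negative, the signs of the redistributed relevances flip relative to the intuition above, which is presumably the un-intuitive behaviour being referred to.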