Extra loss terms before loss.backward() seem to have no effects #249

Comments
@kenziyuliu Thanks for reporting! Indeed, there is an issue with the GradSampleModule implementation that leads to incorrect gradients. In your example you use the squared sum of the parameters (with model = nn.Linear) as an extra loss term; a minimal sketch of this pattern follows below.
GradSampleModule collects grad_samples using Module backward_hooks. When the left part of the loss is computed, the Module backward_hook is called because that expression involves the module's forward pass; the parameter-only regularizer does not go through forward, so its gradient is not reflected correctly in the collected grad_samples. Thanks to this message for the tip. It also recommends using Tensor hooks instead of Module hooks. I think it's a good idea and we should redesign the grad sampler to use Tensor hooks.
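A minimal sketch of the failing pattern described above (the model, data, and loss here are illustrative stand-ins for the notebook's code):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
criterion = nn.MSELoss()
x, y = torch.randn(8, 4), torch.randn(8, 1)

# The left term flows through the module's forward/backward and is seen by
# the Module hooks; the right term is built directly from the parameters
# and never passes through them.
loss = criterion(model(x), y) + sum((p ** 2).sum() for p in model.parameters())
loss.backward()
```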
Hi @romovpa, thanks for the reply! Would there be a workaround for now with minor code changes?
@kenziyuliu From what comes to mind, as a workaround it may be possible to create a custom Module and implement a grad sampler for it. Something like:

```python
model = RegularizedLinear()
y_hat, regularizer = model(x)
loss = criterion(y_hat, y) + regularizer
```

@ffuuugor @karthikprasad Could you check if this is viable? Can we handle multiple outputs? Created a separate issue for the bigger grad sampler problem: #259
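For concreteness, here is a hypothetical sketch of such a module (RegularizedLinear is not an existing Opacus class, and it would still need a matching grad sampler registered with Opacus; dimensions and the penalty weight are illustrative):

```python
import torch.nn as nn

class RegularizedLinear(nn.Module):
    """Linear layer that also returns an L2-style penalty on its parameters."""

    def __init__(self, in_features=4, out_features=1, lam=0.01):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.lam = lam

    def forward(self, x):
        # The second output carries the regularizer, so a custom grad sampler
        # could in principle account for the term.
        reg = self.lam * sum((p ** 2).sum() for p in self.linear.parameters())
        return self.linear(x), reg
```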
@romovpa: I don't believe this will work because it will create "per-sample" gradients of the regularizer. I believe something along these lines should work:

```python
import types

def new_step(self, is_empty):
    # Run the original privacy-aware step (clipping + noising) first,
    # then add the regularizer's gradient to the aggregated grads.
    self.original_step(is_empty)
    for p, p_another in zip(model.parameters(), another_model.parameters()):
        p.grad += lambda_regularizer * (p - p_another)

engine.original_step = engine.step
engine.step = types.MethodType(new_step, engine)
```
Hi @alexandresablayrolles, thanks for the response! It seems that the API has changed for the …
Hi Opacus team, just bumping this issue -- would this by any chance be resolved / have a clean workaround in the new versions of Opacus? I know with `functorch` …
Hi @kenziyuliu, have you found a way to solve this problem? Is there a workaround to add an extra term to the loss before `loss.backward()`?
Hi @lucacorbucci, I haven't tried since then, though I believe the new `functorch`-based functionality may offer a workaround.
Thank you @kenziyuliu! I'll try with `functorch`.
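Sketched below is one shape such a functorch-style workaround might take, using the torch.func API (the functorch interface folded into PyTorch 2.x). The model, data, and the 0.01 penalty weight are illustrative, and clipping and noising of the per-sample gradients would still be needed afterwards:

```python
import torch
import torch.nn as nn
from torch.func import functional_call, grad, vmap

model = nn.Linear(10, 1)
params = dict(model.named_parameters())

def compute_loss(params, x, y):
    # Per-sample data loss plus a parameter-only penalty; because the loss
    # is a pure function of params, grad() differentiates both terms.
    y_hat = functional_call(model, params, (x.unsqueeze(0),)).squeeze(0)
    reg = 0.01 * sum((p ** 2).sum() for p in params.values())
    return nn.functional.mse_loss(y_hat, y) + reg

x, y = torch.randn(8, 10), torch.randn(8, 1)

# vmap over the batch dimension yields per-sample gradients of the full loss,
# extra term included.
per_sample_grads = vmap(grad(compute_loss), in_dims=(None, 0, 0))(params, x, y)
```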
Hi, has anyone tried this (adding an extra term to the loss before `loss.backward()`) since, and do you know if it works? Thank you very much
🐛 Bug
Extra loss terms before `loss.backward()` do not seem to have effects when `privacy_engine` is in use. One use case this would block is when we want to regularize model weights towards another set of weights (e.g., multi-task learning regularization), or other weight-based regularization techniques.

Please reproduce using our template Colab and post here the link
https://colab.research.google.com/drive/1TyZMh4IgkB8qTak1JqYpBFMrrE_x1Rbp?usp=sharing
The code cells run with and without `privacy_engine`, respectively.

To Reproduce
Run the linked notebook. With `privacy_engine` attached, I would expect the extra loss term (1st code cell) to have an effect on model learning; instead, when `privacy_engine` is enabled, the extra loss term is not taken into account.

Expected behavior
When we add loss terms before backprop, e.g. as in the sketch below, we would expect the extra loss to reflect into training.
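(A minimal sketch; `lam`, `another_model`, `criterion`, and `y_hat` are illustrative, with the penalty pulling the weights towards another model's weights as in the use case above.)

```python
# Weight-based regularization added to the loss before backprop.
reg = sum(
    ((p - q) ** 2).sum()
    for p, q in zip(model.parameters(), another_model.parameters())
)
loss = criterion(y_hat, y) + lam * reg
loss.backward()
```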
However, when we use `privacy_engine`, the extra loss terms seem to have no effect, which is unexpected since we only clip and noise the gradients corresponding to the training examples.

Environment
The issue should be reproducible in the provided Colab notebook.