support multiple outer optims for diloco #230
Conversation
Pull Request Overview
This PR adds support for specifying a different outer optimizer per fragment and introduces a `fragment_update_alpha` parameter to blend local and global updates. It also updates tests to cover the new alpha values and maintains backward compatibility with a single optimizer.
- Allow `outer_optimizer` to be a list, enforcing one optimizer per fragment (see the usage sketch after this list).
- Save, clear, and merge local parameters using `fragment_update_alpha`.
- Update integration tests to include `alpha` in parameterized configurations.
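Based on the change description above, a minimal sketch of the new list form; the constructor shape is not shown and all names here are illustrative assumptions, not the exact torchft signature:

```python
import torch.nn as nn
import torch.optim as optim

# Two model fragments, as in streaming DiLoCo.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
fragments = [model[0], model[1]]

# New list form: one outer optimizer per fragment. A mismatched
# length is expected to raise, per "enforcing one optimizer per fragment".
outer_optimizers = [
    optim.SGD(f.parameters(), lr=0.7, momentum=0.9, nesterov=True)
    for f in fragments
]

# Legacy single-optimizer form remains accepted for backward compatibility:
# outer_optimizer = optim.SGD(model.parameters(), lr=0.7)
```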
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| torchft/local_sgd.py | Added dictionaries and methods to save, merge, and clear local parameters; support a list of optimizers; removed old alpha guard |
| torchft/local_sgd_integ_test.py | Updated test configs and signatures to include the alpha parameter |
```diff
@@ -557,12 +598,6 @@ def __init__(
         if fragment_update_alpha < 0 or fragment_update_alpha > 1:
             raise ValueError("fragment_update_alpha must be between 0 and 1")
```
The `fragment_update_alpha` value is validated but never stored on the instance. Add `self._fragment_update_alpha = fragment_update_alpha` after the validation so `_merge_parameters` can access it.
```python
self._fragment_update_alpha = fragment_update_alpha
```
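For context on why the stored attribute matters, here is a sketch of a merge step that consumes it; apart from `_fragment_update_alpha` itself, the attribute and method names are assumptions for illustration, not the actual torchft code:

```python
import torch

def _merge_parameters(self) -> None:
    # Hypothetical merge: blend each saved local parameter into the
    # globally averaged one. alpha = 0 keeps only the global average;
    # alpha = 1 keeps only the local parameters.
    alpha = self._fragment_update_alpha
    for name, param in self._model_fragment.named_parameters():
        local = self._local_parameters[name]  # saved before the sync
        with torch.no_grad():
            # p = (1 - alpha) * p_global + alpha * p_local, in place
            param.mul_(1.0 - alpha).add_(local, alpha=alpha)
```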
LGTM
Summary:
- merge local and global parameters of the model after synchronization
- add the "alpha" parameter to integration tests

Test Plan:
```
pytest -vs ./torchft/local_sgd_integ_test.py
```
Summary:
- support passing in a different outer optimizer for each fragment
- currently accept both a list of optimizers and a single optimizer for backward compatibility
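A sketch of how such backward compatibility is typically handled, normalizing the input to a list up front; the helper name and shape are hypothetical, not the PR's actual code:

```python
from typing import List, Union
from torch.optim import Optimizer

def _normalize_outer_optimizers(
    outer_optimizer: Union[Optimizer, List[Optimizer]],
    num_fragments: int,
) -> List[Optimizer]:
    # Hypothetical helper: accept the legacy single-optimizer form by
    # reusing it for every fragment, and validate the list form so there
    # is exactly one optimizer per fragment.
    if isinstance(outer_optimizer, Optimizer):
        return [outer_optimizer] * num_fragments
    if len(outer_optimizer) != num_fragments:
        raise ValueError(
            f"expected {num_fragments} outer optimizers, got {len(outer_optimizer)}"
        )
    return list(outer_optimizer)
```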
Stack created with Sapling. Best reviewed with ReviewStack.