support multiple outer optims for diloco #230


Merged: 2 commits from pr230 into pytorch:main on Jul 12, 2025

Conversation

@tushar00jain (Contributor) commented on Jul 11, 2025:

Summary:

  • support passing a different outer optimizer for each fragment
  • currently accepts both a list of optimizers and a single optimizer for backward compatibility (sketched below)

Stack created with Sapling. Best reviewed with ReviewStack.
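A minimal sketch of the backward-compatible argument handling described in the summary; the helper name and shape are illustrative assumptions, not the actual torchft code:

```python
from typing import Union

import torch.optim as optim


def normalize_outer_optimizers(
    outer_optimizer: Union[optim.Optimizer, list[optim.Optimizer]],
    num_fragments: int,
) -> list[optim.Optimizer]:
    """Accept either a single optimizer (legacy) or one optimizer per fragment."""
    if isinstance(outer_optimizer, optim.Optimizer):
        # Backward-compatible path: the one optimizer is used for every fragment.
        return [outer_optimizer] * num_fragments
    if len(outer_optimizer) != num_fragments:
        raise ValueError(
            f"expected {num_fragments} outer optimizers, got {len(outer_optimizer)}"
        )
    return list(outer_optimizer)
```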

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Jul 11, 2025
@tushar00jain requested review from Copilot, d4l3k, H-Huang, and dzmitry-huba and removed the request for d4l3k on Jul 11, 2025 at 22:36
@Copilot (Copilot AI) left a comment:

Pull Request Overview

This PR adds support for specifying a different outer optimizer per fragment and introduces a fragment_update_alpha parameter to blend local and global updates. It also updates tests to cover the new alpha values and maintains backward compatibility with a single optimizer.

  • Allow outer_optimizer to be a list, enforcing one optimizer per fragment.
  • Save, clear, and merge local parameters using fragment_update_alpha (see the sketch after this list).
  • Update integration tests to include alpha in parameterized configurations.
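The alpha blend mentioned in the overview is a convex combination of the saved local parameters and the synchronized global ones. A hedged sketch of the idea (the function name and the convention of which side alpha weights are assumptions; torchft's exact formula may differ):

```python
import torch


@torch.no_grad()
def merge_parameter(
    local_param: torch.Tensor,
    global_param: torch.Tensor,
    fragment_update_alpha: float,
) -> torch.Tensor:
    # alpha = 0.0 keeps only the synchronized global average;
    # alpha = 1.0 keeps only the locally trained parameters.
    return torch.lerp(global_param, local_param, fragment_update_alpha)
```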

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| torchft/local_sgd.py | Added dictionaries and methods to save, merge, and clear local parameters; supports a list of optimizers; removed the old alpha guard |
| torchft/local_sgd_integ_test.py | Updated test configs and signatures to include the alpha parameter |

```diff
@@ -557,12 +598,6 @@ def __init__(
        if fragment_update_alpha < 0 or fragment_update_alpha > 1:
            raise ValueError("fragment_update_alpha must be between 0 and 1")
```

Copilot AI commented on Jul 11, 2025:

The fragment_update_alpha value is validated but never stored on the instance. Add self._fragment_update_alpha = fragment_update_alpha after the validation so _merge_parameters can access it.

Suggested change:

```python
self._fragment_update_alpha = fragment_update_alpha
```
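Applied in context, the fix stores the validated value immediately after the guard so that _merge_parameters can read it later (a sketch combining the two snippets above):

```python
if fragment_update_alpha < 0 or fragment_update_alpha > 1:
    raise ValueError("fragment_update_alpha must be between 0 and 1")

# Keep the validated value on the instance for use during merging.
self._fragment_update_alpha = fragment_update_alpha
```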


@d4l3k (Member) left a comment:

LGTM

First commit message:

Summary:
- merge local and global parameters of the model after synchronization
- add the "alpha" parameter to integration tests

Test Plan:
```
pytest -vs ./torchft/local_sgd_integ_test.py
```
Second commit message:

Summary:
- support passing a different outer optimizer for each fragment
- currently accepts both a list of optimizers and a single optimizer for backward compatibility
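A hypothetical call site under this change; the constructor call shown in the comment is an assumption for illustration, not the verified torchft signature:

```python
from torch import nn, optim

# Two model fragments, each getting its own outer optimizer (new behavior).
fragments = [nn.Linear(16, 16), nn.Linear(16, 16)]
outer_optimizers = [
    optim.SGD(f.parameters(), lr=0.7, momentum=0.9, nesterov=True)
    for f in fragments
]

# Passing a single optimizer instead of a list remains supported for
# backward compatibility, e.g. (hypothetical constructor call):
# diloco = DiLoCo(manager, fragments, inner_opt, outer_optimizers, sync_every=100)
```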
@tushar00jain merged commit 4493204 into pytorch:main on Jul 12, 2025 (13 of 15 checks passed)
@tushar00jain deleted the pr230 branch on Jul 12, 2025 at 01:28