Make Muon optimizer easier to enable #7555

delock · 2025-09-11T07:57:54Z

The original Muon optimizer PR (#7509) requires the user to explicitly set the use_muon flag on each parameter in model.parameters(), as shown in the test https://github.com/deepspeedai/DeepSpeed/blob/master/tests/unit/ops/muon/test_muon.py#L27 .

This PR moves the setting of use_muon into DeepSpeed, where it is applied before engine initialization. This makes the Muon optimizer easier to use: the user only needs to change the optimizer in config.json from AdamW to Muon, with no code changes. It resolves issue #7552.
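A minimal sketch of the idea, not the code from this PR: when the config selects Muon, tag each parameter with use_muon before the engine is built. The helper name apply_muon_flags, the FakeParam stand-in, and the rule "parameters with at least two dimensions use Muon" are assumptions made for illustration; the actual selection rule lives in the DeepSpeed source.

```python
from dataclasses import dataclass


@dataclass
class FakeParam:
    """Stand-in for torch.nn.Parameter, carrying only the attributes used here."""
    ndim: int
    use_muon: bool = False


def apply_muon_flags(named_params, ds_config):
    """Hypothetical helper: set use_muon on each parameter if the config asks for Muon.

    Returns True if flags were applied, False if the optimizer is not Muon.
    """
    opt_type = ds_config.get("optimizer", {}).get("type", "")
    if opt_type.lower() != "muon":
        return False
    for _name, param in named_params:
        # Assumed rule: matrix-shaped parameters use Muon, the rest do not.
        param.use_muon = param.ndim >= 2
    return True


# The only user-facing change: the optimizer type in the config.
ds_config = {"optimizer": {"type": "Muon"}}
params = [
    ("linear.weight", FakeParam(ndim=2)),
    ("linear.bias", FakeParam(ndim=1)),
]
applied = apply_muon_flags(params, ds_config)
flags = {name: p.use_muon for name, p in params}
```

With an AdamW config the helper returns False and leaves every parameter untouched, so existing setups are unaffected.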

Signed-off-by: Ma, Guokai <[email protected]>

PKUWZP · 2025-09-13T03:03:53Z

@delock Can you take care of the unit test and make sure it passes?

delock · 2025-09-13T04:18:05Z

> @delock Can you take care of the unit test and make sure it passes?

Yes, let me check the unit tests. Thanks for the reminder!

delock · 2025-09-13T06:54:57Z

It looks like nv-mii failed because of an internet connection issue, and modal-torch-latest has recently been failing on master as well. The modal-torch-latest failure is in DeepCompile. Is it a known failure? @tohtana

tohtana · 2025-09-13T08:02:17Z

Hi @delock
Sorry for the issue. #7558 fixed it.

delock · 2025-09-14T09:13:43Z

> Hi @delock Sorry for the issue. #7558 fixed it.

Thanks @tohtana

delock · 2025-09-14T09:16:46Z

Hi @loadams, the remaining failure in nv-mii might be a file system failure, probably a missing permission or a missing file:

OSError: [Errno 5] Input/output error: '/blob/hf_home/token'

delock added 2 commits September 11, 2025 12:53

auto apply muon flags in model.parameters() if optimizer is muon

4e8a475

Signed-off-by: Ma, Guokai <[email protected]>

Remove set_muon_flag in test code

6592071

Signed-off-by: Ma, Guokai <[email protected]>

delock requested review from tjruwase, loadams and tohtana as code owners September 11, 2025 07:57

delock force-pushed the gma/muon_improv branch from 3439b3a to 6592071 September 11, 2025 07:59

delock changed the title ~~Make Muon optimizer easier to use~~ Make Muon optimizer easier to enable Sep 11, 2025

sfc-gh-truwase approved these changes Sep 12, 2025

Merge branch 'master' into gma/muon_improv

ac8c476

Merge branch 'master' into gma/muon_improv

edad01d
