Use default value of initial_scale_power if FP16 scaling params not provided #4986
The `dynamic_loss_scale_args` is `None` if any scaling param is not specified in the config: https://github.com/microsoft/DeepSpeed/blob/9d2660d2a3fac767972f01ac96858b2605ffc0e4/deepspeed/runtime/config.py#L215
In that case, DeepSpeed seems to use 2**32 as the initial scale instead of the 2**16 specified in the docs here: https://www.deepspeed.ai/docs/config-json/#fp16-training-options
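For illustration, here is a minimal sketch of the fallback this PR is after. It is not DeepSpeed's actual code: `resolve_initial_scale` and `DEFAULT_INITIAL_SCALE_POWER` are hypothetical names, and the input is assumed to be a plain dict shaped like the `fp16` section of the config JSON (or `None` when no scaling params were given).

```python
# Hypothetical helper for illustration only; not part of DeepSpeed.
# Assumes the documented default exponent of 16 (initial scale 2**16).
DEFAULT_INITIAL_SCALE_POWER = 16


def resolve_initial_scale(fp16_scaling_params):
    """Return the initial loss scale, falling back to the documented default.

    fp16_scaling_params is assumed to be a dict of user-supplied scaling
    options, or None when the user did not specify any of them.
    """
    if fp16_scaling_params is None:
        # No scaling params at all: use the documented default, not 2**32.
        return 2 ** DEFAULT_INITIAL_SCALE_POWER
    # Some params given: still default the exponent if it is the one missing.
    power = fp16_scaling_params.get("initial_scale_power",
                                    DEFAULT_INITIAL_SCALE_POWER)
    return 2 ** power
```

With this shape, a config that sets only `loss_scale_window` would still start from 2**16 rather than silently picking up a different initial scale.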
@ShukantPal - I know this PR is old, but I see the following error now on this PR:
Hi @loadams, I no longer have the bandwidth to support this PR (switched jobs :)). Feel free to close if this change is no longer applicable.
No problem @ShukantPal - thanks for the update. I'll close this PR and open an issue to track the bug and make the needed fixes.
Signed-off-by: Olatunji Ruwase <[email protected]>