[train_custom_diffusion.py] Fix the LR schedulers when `num_train_epochs` is passed in a distributed training env #9308

AnandK27 · 2024-08-29T00:52:17Z

What does this PR do?

Part of #8384

Test Script

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export OUTPUT_DIR="cat_model"
export INSTANCE_DIR="./data/cat"

accelerate launch train_custom_diffusion.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --class_data_dir=./real_reg/cat/ \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --class_prompt="cat" --num_class_images=20 \
  --instance_prompt="photo of a <new1> cat"  \
  --resolution=512  \
  --train_batch_size=2  \
  --learning_rate=1e-5  \
  --lr_warmup_steps=0 \
  --num_train_epochs=5 \
  --enable_xformers_memory_efficient_attention \
  --scale_lr --hflip  \
  --modifier_token "<new1>" \
  --validation_prompt="<new1> cat sitting in a bucket" \
  --report_to="wandb"

Fixes the LR scheduler setup in train_custom_diffusion.py when `num_train_epochs` is passed, and includes a small fix to saving the text token embeddings during safe serialization.
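For readers who have not followed #8384, the scheduler arithmetic behind this kind of fix can be sketched roughly as below. This is a minimal illustration, not the diff from this PR: the function name `scheduler_step_counts` and its parameters are hypothetical, and it assumes that the scheduler prepared by Accelerate advances once per process for every optimizer step, so the step counts handed to the scheduler have to be scaled by the number of processes.

```python
import math


def scheduler_step_counts(
    len_train_dataloader,
    num_processes,
    gradient_accumulation_steps,
    num_train_epochs,
    lr_warmup_steps,
    max_train_steps=None,
):
    """Return (warmup_steps, training_steps) to pass to the LR scheduler.

    Hypothetical helper for illustration only; it is not part of diffusers.
    """
    # Assumption: the scheduler is advanced num_processes times per optimizer
    # step, so every count given to it is scaled by num_processes.
    warmup_steps = lr_warmup_steps * num_processes

    if max_train_steps is None:
        # The dataloader length is measured before sharding, so estimate the
        # per-process length first, then the optimizer updates per epoch.
        sharded_len = math.ceil(len_train_dataloader / num_processes)
        updates_per_epoch = math.ceil(sharded_len / gradient_accumulation_steps)
        training_steps = num_train_epochs * updates_per_epoch * num_processes
    else:
        training_steps = max_train_steps * num_processes

    return warmup_steps, training_steps


if __name__ == "__main__":
    # 100 batches, 4 processes, accumulation of 2, 5 epochs, 10 warmup steps.
    print(scheduler_step_counts(100, 4, 2, 5, 10))  # (40, 260)
```

In the example call, each process sees 25 batches per epoch, which gives 13 optimizer updates per epoch after gradient accumulation; deriving the total from the unsharded length of 100 would overestimate the schedule length.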

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@sayakpaul and @geniuspatrick
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

sayakpaul

Looks clean. Thank you!

HuggingFaceDocBuilderDev · 2024-08-29T02:02:38Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

AnandK27 · 2024-08-29T03:41:04Z

Hey @sayakpaul,
Fixed the formatting issue, the tests should pass now!

AnandK27 · 2024-08-29T06:00:02Z

@sayakpaul, it can be merged ig

sayakpaul · 2024-08-29T08:53:41Z

Thanks for your contributions!

…chs` is passed in a distributed training env (#9308)

* Update train_custom_diffusion.py to fix the LR schedulers for `num_train_epochs`
* Fix saving text embeddings during safe serialization
* Fixed formatting

AnandK27 added 2 commits August 28, 2024 16:48

Update train_custom_diffusion.py to fix the LR schedulers for `num_train_epochs`

f7ebb2a

Fix saving text embeddings during safe serialization

9e8d344

sayakpaul approved these changes Aug 29, 2024

AnandK27 added 2 commits August 28, 2024 19:26

Merge branch 'main' into AnandK27-custom-diffusion-LR-fix

691b693

Fixed formatting

94e04e1

sayakpaul merged commit 40c13fe into huggingface:main Aug 29, 2024
