Describe the bug
The learning rate scaling is applied twice in the DreamBooth LoRA training script, at both line 749 and line 759. I assume this is not intentional: the learning rate ends up multiplied by `gradient_accumulation_steps * train_batch_size * num_processes` twice (for example, with gradient_accumulation_steps=2, train_batch_size=4 and 2 processes, the learning rate is scaled by 256 instead of 16):
https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py#L749
if args.scale_lr:
    args.learning_rate = (
        args.learning_rate * args.gradient_accumulation_steps * args.train_batch_size * accelerator.num_processes
    )

# Enable TF32 for faster training on Ampere GPUs,
# cf https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices
if args.allow_tf32:
    torch.backends.cuda.matmul.allow_tf32 = True

if args.scale_lr:
    args.learning_rate = (
        args.learning_rate * args.gradient_accumulation_steps * args.train_batch_size * accelerator.num_processes
    )
Reproduction
No reproduction required; the issue can be verified by inspecting the code.
Logs
No response
System Info
latest commit 20e426c