
Fix the issue on sd3 dreambooth w./w.t. lora training #9419

Merged
9 commits merged into huggingface:main on Sep 14, 2024

Conversation

@leisuzz (Contributor) commented Sep 12, 2024

What does this PR do?

Fixes #9237 and a few more potential issues with the RuntimeError about mismatched input and bias dtypes in the log-validation part of the DreamBooth training scripts (LoRA, LoRA SDXL, LoRA SD3, Flux, and SD3).

I modified the code without changing or forcing the VAE dtype, so any weight dtype can be used without affecting the result.
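
For context, the change described here is a dtype cast at validation time rather than a change to the VAE weights themselves. Below is a minimal sketch of that idea only; the function and variable names (`decode_validation_latents`, `vae`, `latents`, `weight_dtype`) are illustrative assumptions, not the exact code from the diffusers scripts.

```python
import torch

# Hedged sketch of the idea, not the exact diff from the training scripts:
# cast the VAE input to the VAE's own parameter dtype during log validation,
# instead of changing the VAE dtype itself.
def decode_validation_latents(vae, latents, weight_dtype=torch.float16):
    vae_dtype = next(vae.parameters()).dtype
    # Validation latents may be in the mixed-precision dtype (fp16/bf16) while
    # the VAE was kept in another dtype; that mismatch is what triggers the
    # "Input type and bias type" RuntimeError.
    latents = latents.to(dtype=vae_dtype)
    image = vae.decode(latents / vae.config.scaling_factor, return_dict=False)[0]
    # Cast back so downstream validation code sees the expected training dtype.
    return image.to(dtype=weight_dtype)
```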

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@leisuzz closed this Sep 12, 2024
@leisuzz reopened this Sep 12, 2024
@leisuzz (Contributor, Author) commented Sep 14, 2024

@sayakpaul Please proceed with reviewing this pull request.

@sayakpaul (Member) commented
Thanks!

Could you run the pytest examples on your GPU and report back?

@leisuzz (Contributor, Author) commented Sep 14, 2024

> Thanks!
>
> Could you run the pytest examples on your GPU and report back?

I had the same issue in /examples/dreambooth/train_dreambooth_sd3.py; after this modification, the issue is fixed. I will also run the LoRA script to see whether it works.

@leisuzz (Contributor, Author) commented Sep 14, 2024

> Thanks!
>
> Could you run the pytest examples on your GPU and report back?

It works well both with and without LoRA.

```bash
export model_name="stable-diffusion-3-medium-diffusers"
export dataset_name="pokemon-blip-captions"
export OUTPUT_DIR="sd3-dreambooth-lora"

accelerate launch train_dreambooth_lora_sd3.py \
  --pretrained_model_name_or_path=$model_name \
  --dataset_name=$dataset_name \
  --instance_prompt="a photo of pokemon" \
  --resolution=$resolution \
  --train_batch_size=$batch_size \
  --gradient_checkpointing \
  --mixed_precision=$mixed_precision \
  --gradient_accumulation_steps=$gradient_accumulation_steps \
  --learning_rate=1e-06 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=$max_train_steps \
  --checkpointing_steps=5001 \
  --validation_prompt="A robotic pokemon with wings" \
  --validation_epochs=100 \
  --seed="0" \
  --output_dir=$OUTPUT_DIR > ${OUTPUT_DIR}/train_${mixed_precision}_lora_sd3_dreambooth.log 2>&1 &
wait
```

```bash
export model_name="stable-diffusion-3-medium-diffusers"
export dataset_name="pokemon-blip-captions"
export OUTPUT_DIR="sd3-dreambooth"

accelerate launch train_dreambooth_sd3.py \
  --pretrained_model_name_or_path=$model_name \
  --dataset_name=$dataset_name \
  --caption_column="text" \
  --instance_prompt="a photo of pokemon" \
  --resolution=$resolution \
  --train_batch_size=$batch_size \
  --gradient_checkpointing \
  --mixed_precision=$mixed_precision \
  --gradient_accumulation_steps=1 \
  --learning_rate=1e-06 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --validation_prompt="A photo of pokemon in a bucket" \
  --validation_epochs=5 \
  --max_train_steps=$max_train_steps \
  --checkpointing_steps=500 \
  --output_dir=$OUTPUT_DIR > ${OUTPUT_DIR}/train_${mixed_precision}_sd3_dreambooth.log 2>&1 &
wait
```

@sayakpaul (Member) commented
Just ran the tests myself from your fork and all good! Thanks!

@sayakpaul merged commit e2ead7c into huggingface:main on Sep 14, 2024
8 checks passed
HumitheGallin pushed a commit to HumitheGallin/diffusers that referenced this pull request Sep 15, 2024
* Fix dtype error

* [bugfix] Fixed the issue on sd3 dreambooth training

* [bugfix] Fixed the issue on sd3 dreambooth training

---------

Co-authored-by: 蒋硕 <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
linoytsaban added a commit to linoytsaban/diffusers that referenced this pull request Sep 30, 2024
linoytsaban added a commit that referenced this pull request Oct 17, 2024
…ence (#9434)

* add ostris trainer to README & add cache latents of vae

* add ostris trainer to README & add cache latents of vae

* style

* readme

* add test for latent caching

* add ostris noise scheduler
https://github.com/ostris/ai-toolkit/blob/9ee1ef2a0a2a9a02b92d114a95f21312e5906e54/toolkit/samplers/custom_flowmatch_sampler.py#L95

* style

* fix import

* style

* fix tests

* style

* --change upcasting of transformer?

* update readme according to main

* add pivotal tuning for CLIP

* fix imports, encode_prompt call,add TextualInversionLoaderMixin to FluxPipeline for inference

* TextualInversionLoaderMixin support for FluxPipeline for inference

* move changes to advanced flux script, revert canonical

* add latent caching to canonical script

* revert changes to canonical script to keep it separate from #9160

* revert changes to canonical script to keep it separate from #9160

* style

* remove redundant line and change code block placement to align with logic

* add initializer_token arg

* add transformer frac for range support from pure textual inversion to the orig pivotal tuning

* support pure textual inversion - wip

* adjustments to support pure textual inversion and transformer optimization in only part of the epochs

* fix logic when using initializer token

* fix pure_textual_inversion_condition

* fix ti/pivotal loading of last validation run

* remove embeddings loading for ti in final training run (to avoid adding huggingface hub dependency)

* support pivotal for t5

* adapt pivotal for T5 encoder

* adapt pivotal for T5 encoder and support in flux pipeline

* t5 pivotal support + support fo pivotal for clip only or both

* fix param chaining

* fix param chaining

* README first draft

* readme

* readme

* readme

* style

* fix import

* style

* add fix from #9419

* add to readme, change function names

* te lr changes

* readme

* change concept tokens logic

* fix indices

* change arg name

* style

* dummy test

* revert dummy test

* reorder pivoting

* add warning in case the token abstraction is not the instance prompt

* experimental - wip - specific block training

* fix documentation and token abstraction processing

* remove transformer block specification feature (for now)

* style

* fix copies

* fix indexing issue when --initializer_concept has different amounts

* add if TextualInversionLoaderMixin to all flux pipelines

* style

* fix import

* fix imports

* address review comments - remove necessary prints & comments, use pin_memory=True, use free_memory utils, unify warning and prints

* style

* logger info fix

* make lora target modules configurable and change the default

* make lora target modules configurable and change the default

* style

* make lora target modules configurable and change the default, add notes to readme

* style

* add tests

* style

* fix repo id

* add updated requirements for advanced flux

* fix indices of t5 pivotal tuning embeddings

* fix path in test

* remove `pin_memory`

* fix filename of embedding

* fix filename of embedding

---------

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>
sayakpaul added a commit that referenced this pull request Dec 23, 2024
sayakpaul added a commit that referenced this pull request Dec 23, 2024
Merging this pull request may close: Issue on flux dreambooth lora training