Add Diffusion Policy for Reinforcement Learning #9824

DorsaRoh · 2024-10-31T21:15:48Z

What does this PR do?

Adds Diffusion Policy, a diffusion model that predicts action sequences in reinforcement learning tasks, built with the Hugging Face diffusers library.
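To make "a diffusion model that predicts action sequences" concrete, here is a minimal pure-Python sketch of DDPM-style reverse sampling over an action plan. It is not the code from this PR: the diff quoted later imports `UNet1DModel` and `DDPMScheduler`, while this sketch swaps in a hand-written linear schedule, the variance choice sigma_t^2 = beta_t, and a hypothetical "oracle" noise predictor in place of a trained network. The names `linear_betas`, `sample_actions`, `make_oracle`, and `target_plan` are illustrative assumptions.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule with one beta per diffusion step (assumed values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def sample_actions(predict_noise, horizon, action_dim, num_steps=50, seed=0):
    """Denoise Gaussian noise into a [horizon][action_dim] action sequence."""
    rng = random.Random(seed)
    betas = linear_betas(num_steps)
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    # Start from pure noise: one value per (plan step, action dimension).
    x = [[rng.gauss(0.0, 1.0) for _ in range(action_dim)] for _ in range(horizon)]

    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t, alpha_bars[t])
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on the final step
        x = [
            [
                (x[i][j] - coef * eps[i][j]) / math.sqrt(alphas[t])
                + sigma * rng.gauss(0.0, 1.0)
                for j in range(action_dim)
            ]
            for i in range(horizon)
        ]
    return x


def make_oracle(target):
    """Hypothetical stand-in for a trained model: returns the exact noise in x."""
    def predict_noise(x, t, alpha_bar):
        scale, denom = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
        return [
            [(x[i][j] - scale * target[i][j]) / denom for j in range(len(row))]
            for i, row in enumerate(x)
        ]
    return predict_noise


target_plan = [[0.1 * i, -0.1 * i] for i in range(4)]  # made-up 4-step, 2-D plan
plan = sample_actions(make_oracle(target_plan), horizon=4, action_dim=2)
```

With the oracle predictor, the final noise-free step lands on `target_plan` up to floating-point error. A real policy would replace the oracle with a network conditioned on the current observation.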

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@sayakpaul @yiyixuxu @DN6 @a-r-r-o-w

HuggingFaceDocBuilderDev · 2024-11-01T01:05:12Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

sayakpaul · 2024-11-01T03:11:46Z

It seems like we're putting the content of some repository into diffusers. I think it might be better off as a standalone repository than as an example here, since we're not a training-focused library.

DorsaRoh · 2024-11-01T04:42:40Z

@sayakpaul Thank you for the feedback! I have made the changes. The PR now includes only an inference example that uses diffusers for Diffusion Policy.

sayakpaul · 2024-11-01T04:47:20Z

examples/reinforcement_learning/diffusion_policy.py

+        return action.transpose(1, 2)  # [batch_size, sequence_length, action_dim]
+
+if __name__ == "__main__":
+    policy = DiffusionPolicy()


sayakpaul: Should we load any pre-trained model here?

DorsaRoh: Thanks for the valuable thought! Diffusion policies are frequently tailored to specific use cases, and incorporating pretrained weights into the inference example could significantly limit its general applicability and confuse users working on different tasks. Although I have pretrained weights for a specific task that I can add here, I recommend initializing the model without loading them to keep the example general. This will allow users to train their own models or integrate relevant pretrained weights for their own applications!

sayakpaul: I beg to differ. I think if we can document it sufficiently, it would make more sense to showcase this with a pre-trained model.

DorsaRoh: Sounds good! I have made the changes. The example now loads a pretrained model and contains comprehensive documentation.

sayakpaul

Thanks! Left some further comments.

examples/reinforcement_learning/README.md

examples/reinforcement_learning/diffusion_policy.py

sayakpaul · 2024-11-02T02:02:33Z

examples/reinforcement_learning/diffusion_policy.py

+from diffusers import DDPMScheduler, UNet1DModel
+
+
+add_safe_globals(


sayakpaul: Why do we need it?

DorsaRoh: After changing weights_only from False to True, an error occurs for any pretrained model unless we use add_safe_globals to make custom or third-party methods available (since weights_only=True skips the configuration loading). I believe it is a preventative measure by HuggingFace for security reasons, because it explicitly states that if we use weights_only=False, we must trust the authors of the model.

sayakpaul: It is happening at torch.load(), so I don't think it has anything to do with Hugging Face. Which torch version are you using?

DorsaRoh: I see, thank you. I am using 2.5.1.
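As the reply above notes, the restriction comes from `torch.load()`: with `weights_only=True`, PyTorch unpickles the checkpoint through a restricted unpickler that only resolves allowlisted globals, and `torch.serialization.add_safe_globals` extends that allowlist. The sketch below illustrates the same idea with only the standard library's `pickle` module. It is an analogy rather than PyTorch's implementation; `AllowlistUnpickler`, `restricted_loads`, and the toy checkpoint are hypothetical names and data introduced for illustration.

```python
import io
import pickle
from collections import OrderedDict
from fractions import Fraction


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals present in an explicit allowlist."""

    def __init__(self, file, allowed):
        super().__init__(file)
        self._allowed = allowed

    def find_class(self, module, name):
        # Called whenever the pickle stream references a class or function.
        if (module, name) in self._allowed:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowlisted")


def restricted_loads(data, allowed):
    """Load pickled bytes, rejecting any global outside `allowed`."""
    return AllowlistUnpickler(io.BytesIO(data), allowed).load()


# Toy "checkpoint": a state dict that also contains a non-allowlisted type.
checkpoint = pickle.dumps(OrderedDict(weight=[0.1, 0.2], scale=Fraction(1, 2)))

safe_globals = {("collections", "OrderedDict")}

try:
    restricted_loads(checkpoint, safe_globals)
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # Fraction is not on the allowlist yet

# Analogous to calling add_safe_globals before torch.load(..., weights_only=True).
safe_globals.add(("fractions", "Fraction"))
state = restricted_loads(checkpoint, safe_globals)
```

The first load fails because `Fraction` is not allowlisted; after it is added, the same bytes load normally. Allowlisting keeps arbitrary callables in an untrusted checkpoint from being resolved during loading.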

DorsaRoh · 2024-11-02T02:33:05Z

It appears that the one failing check is unrelated to the changes in this PR and may be due to external factors. Does it need to be addressed?

sayakpaul · 2024-11-02T03:48:42Z

Indeed. That is not merge-blocking. Thanks for the PR!

* enable cpu ability
* model creation + comprehensive testing
* training + tests
* all tests working
* remove unneeded files + clarify docs
* update train tests
* update readme.md
* remove data from gitignore
* undo cpu enabled option
* Update README.md
* update readme
* code quality fixes
* diffusion policy example
* update readme
* add pretrained model weights + doc
* add comment
* add documentation
* add docstrings
* update comments
* update readme
* fix code quality
* Update examples/reinforcement_learning/README.md
  Co-authored-by: Sayak Paul <[email protected]>
* Update examples/reinforcement_learning/diffusion_policy.py
  Co-authored-by: Sayak Paul <[email protected]>
* suggestions + safe globals for weights_only=True
* suggestions + safe weights loading
* fix code quality
* reformat file

---------

Co-authored-by: Sayak Paul <[email protected]>

DorsaRoh added 14 commits October 31, 2024 10:29

enable cpu ability

654f6b4

model creation + comprehensive testing

96c62d0

training + tests

8759c12

all tests working

2cbc3bb

remove unneeded files + clarify docs

5feb7af

Merge branch 'diffusion-policy' of https://github.com/DorsaRoh/diffusers into diffusion-policy

80fcab8

update train tests

a4c7340

update readme.md

6457aec

remove data from gitignore

ddb7718

undo cpu enabled option

089c40a

Update README.md

7d96254

update readme

25e4638

Merge branch 'diffusion-policy' of https://github.com/DorsaRoh/diffusers into diffusion-policy

566f112

Merge branch 'main' into diffusion-policy

fbb442c

DorsaRoh added 2 commits October 31, 2024 21:38

code quality fixes

4c07cea

Merge branch 'diffusion-policy' of https://github.com/DorsaRoh/diffusers into diffusion-policy

c0e1af3

diffusion policy example

957de92

update readme

09922df

sayakpaul reviewed Nov 1, 2024

DorsaRoh added 8 commits November 1, 2024 09:19

add pretrained model weights + doc

186b8f0

add comment

d7ced53

Merge branch 'main' into diffusion-policy

333b5cb

add documentation

b37b0e2

Merge branch 'diffusion-policy' of https://github.com/DorsaRoh/diffusers into diffusion-policy

65dd0b7

add docstrings

06401cb

update comments

2bf4fa0

update readme

47f2ba5

DorsaRoh requested a review from sayakpaul November 1, 2024 14:56

sayakpaul reviewed Nov 1, 2024

examples/reinforcement_learning/README.md

examples/reinforcement_learning/diffusion_policy.py

examples/reinforcement_learning/diffusion_policy.py

DorsaRoh and others added 6 commits November 1, 2024 11:00

fix code quality

be98e76

Update examples/reinforcement_learning/README.md

5bff039

Co-authored-by: Sayak Paul <[email protected]>

Update examples/reinforcement_learning/diffusion_policy.py

2bad2fc

Co-authored-by: Sayak Paul <[email protected]>

suggestions + safe globals for weights_only=True

a7b8ef2

Merge branch 'diffusion-policy' of https://github.com/DorsaRoh/diffusers into diffusion-policy

9a5cfe7

suggestions + safe weights loading

19db70e

DorsaRoh requested a review from sayakpaul November 1, 2024 15:15

DorsaRoh added 2 commits November 1, 2024 11:18

fix code quality

c8fa61a

reformat file

b8fe110

sayakpaul reviewed Nov 2, 2024

sayakpaul approved these changes Nov 2, 2024

DorsaRoh requested a review from sayakpaul November 2, 2024 02:31

sayakpaul merged commit c10f875 into huggingface:main Nov 2, 2024
7 of 8 checks passed
