Add Diffusion Policy for Reinforcement Learning #9824
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
It seems like we're putting in the content of some repository within
@sayakpaul Thank you for the feedback! I have made the changes. Now it includes only an inference example of using diffusers for diffusion policy.
```python
return action.transpose(1, 2)  # [batch_size, sequence_length, action_dim]

if __name__ == "__main__":
    policy = DiffusionPolicy()
```
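For context, here is a minimal sketch of the kind of denoising inference loop such an example builds on, using the public DDPMScheduler and UNet1DModel APIs from diffusers. The UNet configuration, action dimension, and horizon below are illustrative assumptions, not the PR's actual settings:

```python
import torch
from diffusers import DDPMScheduler, UNet1DModel

action_dim, horizon = 2, 16  # assumed: 2-D actions, 16-step prediction horizon

# A small RL-style 1D UNet; this configuration mirrors the shape of the
# value-guided RL configs in diffusers, not the PR's exact model.
net = UNet1DModel(
    in_channels=action_dim,
    out_channels=action_dim,
    block_out_channels=(32, 64),
    down_block_types=("DownResnetBlock1D", "DownResnetBlock1D"),
    up_block_types=("UpResnetBlock1D",),
    mid_block_type="MidResTemporalBlock1D",
    out_block_type="OutConv1DBlock",
    act_fn="mish",
    use_timestep_embedding=True,
    time_embedding_type="positional",
)

scheduler = DDPMScheduler(num_train_timesteps=100)
scheduler.set_timesteps(num_inference_steps=100)

# Start from Gaussian noise and iteratively denoise into an action sequence.
sample = torch.randn(1, action_dim, horizon)  # [batch, action_dim, horizon]
for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = net(sample, t).sample
    sample = scheduler.step(noise_pred, t, sample).prev_sample

actions = sample.transpose(1, 2)  # [batch, horizon, action_dim], as in the excerpt above
```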
Should we load any pre-trained model here?
Thanks for the valuable thought!
Diffusion policies are frequently tailored to specific use cases, and incorporating pretrained weights into the inference example could significantly limit its general applicability and confuse users working on different tasks. Although I have pretrained weights available for a specific task that I could add here, to keep the example universal I recommend initializing the model without loading them. This lets users train their own models or integrate pretrained weights relevant to their own applications!
I beg to differ. I think if we can document it sufficiently, it would make more sense to showcase this with a pre-trained model.
Sounds good! I have made the changes. Now the example loads from a pretrained model and includes comprehensive documentation.
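For illustration, one way the example might load pretrained weights. The repo id and filename here are hypothetical placeholders, and DiffusionPolicy refers to the example's own class from the excerpt above:

```python
import torch
from huggingface_hub import hf_hub_download

# Hypothetical checkpoint location -- the PR's actual weights live elsewhere.
ckpt_path = hf_hub_download(repo_id="some-user/diffusion-policy-demo", filename="policy.pt")
state_dict = torch.load(ckpt_path, map_location="cpu", weights_only=True)

policy = DiffusionPolicy()  # the example's model class, from the excerpt above
policy.load_state_dict(state_dict)
policy.eval()
```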
Thanks! Left some further comments.
```python
from diffusers import DDPMScheduler, UNet1DModel
...
add_safe_globals(
```
Why do we need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After setting weights_only=True (from False), an error occurs for any pretrained model unless we use add_safe_globals to make custom or third-party classes available, since weights_only=True skips loading them by default. I believe it is a preventative measure by HuggingFace for security reasons, because the error explicitly states that if we use weights_only=False, we must trust the authors of the model.
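A minimal sketch of the mechanism being described, assuming the checkpoint pickles a custom class (the class name here is hypothetical):

```python
import torch
from torch.serialization import add_safe_globals  # available in torch >= 2.4

# Hypothetical custom class referenced inside the pickled checkpoint.
class NormalizationStats:
    pass

# With weights_only=True, torch.load refuses to unpickle arbitrary objects;
# custom or third-party classes must be allow-listed first.
add_safe_globals([NormalizationStats])
checkpoint = torch.load("policy.pt", weights_only=True)
```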
It is happening at torch.load(), so I don't think it has anything to do with Hugging Face. Which torch version are you using?
I see, thank you! I am using 2.5.1.
It appears the one failing check is unrelated to the changes in this PR and may be due to external factors. Does it need to be addressed?
Indeed. That is not merge-blocking. Thanks for the PR!
* enable cpu ability
* model creation + comprehensive testing
* training + tests
* all tests working
* remove unneeded files + clarify docs
* update train tests
* update readme.md
* remove data from gitignore
* undo cpu enabled option
* Update README.md
* update readme
* code quality fixes
* diffusion policy example
* update readme
* add pretrained model weights + doc
* add comment
* add documentation
* add docstrings
* update comments
* update readme
* fix code quality
* Update examples/reinforcement_learning/README.md
  Co-authored-by: Sayak Paul <[email protected]>
* Update examples/reinforcement_learning/diffusion_policy.py
  Co-authored-by: Sayak Paul <[email protected]>
* suggestions + safe globals for weights_only=True
* suggestions + safe weights loading
* fix code quality
* reformat file

Co-authored-by: Sayak Paul <[email protected]>
What does this PR do?
Adds Diffusion Policy, a diffusion model that predicts action sequences in reinforcement learning tasks, using the Hugging Face diffusers library.

Before submitting
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@sayakpaul @yiyixuxu @DN6 @a-r-r-o-w