Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4042 Hugging Face Hub integration #6454

Merged
merged 34 commits into from
Oct 19, 2023

Conversation

katielink
Copy link
Contributor

@katielink katielink commented May 1, 2023

Fixes #4042 .

Description

  • Add functionality to download a model bundle from the Hugging Face Hub.
  • Add functionality to upload a model bundle to the Hugging Face Hub.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

@wyli
Copy link
Contributor

wyli commented May 2, 2023

thanks for the PR! please signoff your commits...(follow the DCO checker's instructions https://github.com/Project-MONAI/MONAI/pull/6454/checks?check_run_id=13150779964)

@tangy5
Copy link
Contributor

tangy5 commented May 2, 2023

This is great, looking forward to seeing the bundles from Hugging Face hub use in MONAI/MONAILabel, etc..

@Nic-Ma Nic-Ma requested a review from yiheng-wang-nv May 4, 2023 08:31
@Nic-Ma
Copy link
Contributor

Nic-Ma commented May 4, 2023

Thanks for the contribution.
I have 1 open question here: Currently, MONAI model-zoo is our only development path for new bundles, when a new bundle is approved and merged, we can store it in NGC or Github storage. Now if we add a new storage source huggingface_hub, what will be the contribution or development path for these huggingface bundles? Do we still use model-zoo to develop and optionally store it in huggingface?
We have relatively strong review & CI check & testing for the bundles in the model-zoo repo to ensure the code quality, and also developing training and inference CI tests with fake input data.
CC @ericspod @wyli @yiheng-wang-nv for further discussion.

Thanks in advance.

@wyli
Copy link
Contributor

wyli commented May 4, 2023

Hi @Nic-Ma this is for community-supported bundles, so that the users have the freedom to create/share their models, it's not related to our model-zoo.

@ericspod
Copy link
Member

ericspod commented May 4, 2023

Thanks for the contribution. I have 1 open question here....

As @wyli says this is for community support so it's up to others to ensure the quality of their bundles and our model-zoo will continue to do its own thing.

@katielink
Copy link
Contributor Author

Thanks for the contribution. I have 1 open question here: Currently, MONAI model-zoo is our only development path for new bundles, when a new bundle is approved and merged, we can store it in NGC or Github storage. Now if we add a new storage source huggingface_hub, what will be the contribution or development path for these huggingface bundles? Do we still use model-zoo to develop and optionally store it in huggingface? We have relatively strong review & CI check & testing for the bundles in the model-zoo repo to ensure the code quality, and also developing training and inference CI tests with fake input data. CC @ericspod @wyli @yiheng-wang-nv for further discussion.

Thanks in advance.

Thanks for your question @Nic-Ma!
As @ericspod and @wyli have mentioned, this PR will support community users to upload or download MONAI model bundles to their personal HF accounts. Users, not MONAI, will be responsible for the quality of the bundles.

In the future, if there's interest, we can explore uploading approved bundles to the Hub under the official MONAI organization on HF, which I believe could be as simple as a few lines of code here. The submission/review process would remain the same; just a copy of the approved bundle would be pushed to HF at the same time as it's uploaded to Github. It would give users another point of accessibility for approved model-zoo models, similar to the way NVIDIA contributes models to both NGC and HF currently.

@Nic-Ma
Copy link
Contributor

Nic-Ma commented May 4, 2023

Hi @katielink , @ericspod , @wyli ,

Thanks for the discussion, looks good to me.

Signed-off-by: katielink <[email protected]>

DCO Remediation Commit for katielink <[email protected]>

I, katielink <[email protected]>, hereby add my Signed-off-by to this commit: b7d462d
I, katielink <[email protected]>, hereby add my Signed-off-by to this commit: be3e678

Signed-off-by: katielink <[email protected]>
DCO Remediation Commit for katielink <[email protected]>

I, katielink <[email protected]>, hereby add my Signed-off-by to this commit: 44b09ac

Signed-off-by: katielink <[email protected]>
@wyli wyli mentioned this pull request Jun 23, 2023
Copy link

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there, maintainer of huggingface_hub here 👋 I came across this PR (thanks for the ping @osanseviero :)) and made a quick technical review about the integration with the Hub. Looks good to me except a tiny attribute error (see below). Thanks @katielink !

monai/bundle/scripts.py Outdated Show resolved Hide resolved
monai/bundle/scripts.py Outdated Show resolved Hide resolved
monai/bundle/scripts.py Outdated Show resolved Hide resolved
monai/bundle/scripts.py Outdated Show resolved Hide resolved
@katielink
Copy link
Contributor Author

Hey @Wauplin thanks so much for the review, I'm still wrapping this WIP PR up on my end so I will incorporate these comments as I do so :)

@katielink
Copy link
Contributor Author

EDIT: Also to mention that for external libraries we have a framework to run end-to-end tests directly in huggingface_hub. Here is a README on how to integrate a new one. There are two advantages to this:

  • each time we release a new version of huggingface_hub, we make sure that we don't break an external integration
  • we take care of the staging/token stuff

The main drawback is that the tests are not run on each and every CI run on the library repo but only once a week. I can help setting up the tests if you are up for this solution! (we can even set mocked tests here and a real test in huggingface_hub/contrib - it just depends how much effort we want to put into these tests 😄

Super helpful, thanks @Wauplin! I'll look into it a bit more, but I like the idea of running mocks in MONAI and end-to-end tests in huggingface_hub/contrib. I'll DM you directly on slack if/when I have questions about this 🙂

@ericspod
Copy link
Member

we don't currently have any model uploading tests, I think using mocks is good, cc @ericspod @Nic-Ma thoughts?

(fyi the premerge tests are not triggered for this PR because there are merging conflicts)

Hi @katielink I think mocks are fine too. I looked at the test organisation as well and the bundles that are there look good to me. The file structure is correct except the README.md files appear duplicated in the root. This gets pulled out as the model card which is what we want, but could we point HF to use the one in docs instead somehow? If not it's a minor thing we can just live with by moving that file from docs rather than copy. Thanks!

@katielink
Copy link
Contributor Author

The file structure is correct except the README.md files appear duplicated in the root. This gets pulled out as the model card which is what we want, but could we point HF to use the one in docs instead somehow? If not it's a minor thing we can just live with by moving that file from docs rather than copy.

Hi @ericspod! Yes unfortunately to my knowledge, there's no way to specify the model card's location, so it must be in the root in order to be visible in the model card tab on HF. I'm currently copying this card so that I can add some HF-specific metadata to it automatically so it's easier for others to find (e.g. adding "monai" and "medical" as tags, adding the license if it's Apache 2.0 or MIT). I can delete this modified model card upon download to keep the original structure when the bundle is local, or if you'd prefer I can simply move and modify the original card. Let me know what you'd prefer!

@ericspod
Copy link
Member

Hi @ericspod! Yes unfortunately to my knowledge, there's no way to specify the model card's location, so it must be in the root in order to be visible in the model card tab on HF. I'm currently copying this card so that I can add some HF-specific metadata to it automatically so it's easier for others to find (e.g. adding "monai" and "medical" as tags, adding the license if it's Apache 2.0 or MIT). I can delete this modified model card upon download to keep the original structure when the bundle is local, or if you'd prefer I can simply move and modify the original card. Let me know what you'd prefer!

If you're changing the file to suit HF I would say we leave it and just accept it's duplicated, this seems to be the easiest approach. We can mention somewhere that this is done for HF compatibility. Thanks!

@wyli
Copy link
Contributor

wyli commented Oct 19, 2023

/build

Copy link
Contributor

@wyli wyli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it looks good to me. all premerge tests work fine and I think we can merge it soon for more integration tests.

requirements-dev.txt Outdated Show resolved Hide resolved
setup.cfg Outdated Show resolved Hide resolved
katielink and others added 2 commits October 19, 2023 09:22
Co-authored-by: Wenqi Li <[email protected]>
Signed-off-by: Katie Link <[email protected]>
Co-authored-by: Wenqi Li <[email protected]>
Signed-off-by: Katie Link <[email protected]>
@katielink
Copy link
Contributor Author

Thanks @wyli!

@wyli
Copy link
Contributor

wyli commented Oct 19, 2023

/build

@wyli wyli marked this pull request as ready for review October 19, 2023 14:27
@wyli wyli changed the title (WIP) 4042 Hugging Face Hub integration 4042 Hugging Face Hub integration Oct 19, 2023
@wyli wyli enabled auto-merge (squash) October 19, 2023 14:27
@YanxuanLiu
Copy link
Collaborator

/build

@wyli wyli disabled auto-merge October 19, 2023 14:44
@wyli
Copy link
Contributor

wyli commented Oct 19, 2023

/build

@wyli wyli enabled auto-merge (squash) October 19, 2023 15:16
@wyli
Copy link
Contributor

wyli commented Oct 19, 2023

/build

@wyli wyli disabled auto-merge October 19, 2023 16:46
@wyli wyli enabled auto-merge (squash) October 19, 2023 16:46
@wyli wyli merged commit 6e5fdc0 into Project-MONAI:dev Oct 19, 2023
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatically download / upload models with hugging face, torch hub
7 participants