Add ALBERT Presets #655
"Base size of ALBERT where all input is lowercased. "
"Trained on English Wikipedia + BooksCorpus."
),
"weights_url": "https://drive.google.com/uc?export=download&id=1RzTTa8nMcBc84nARvJmHal5SndpKbDUa",
Still need to review, but this is awesome! Super great to model how to do this for contributors.
@mattdangerw, there is a cap on the size of the file you can download from GDrive. For example, the extra_large and extra_extra_large tests fail because their size is > 200 MB. I have observed the same for FNet. Instead of downloading the actual file, an HTML file is downloaded.
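A minimal sketch of how a test could catch this failure mode early, assuming the weights have already been downloaded to a local path. The helper name `looks_like_html` and the 512-byte probe size are hypothetical and not part of this PR; the check only inspects the first bytes of the file for an HTML preamble.

```python
# Hypothetical helper (not part of this PR): detect when a download
# returned an HTML page instead of the expected binary weights file.

# Byte prefixes that typically open an HTML document (compared lowercased).
HTML_MARKERS = (b"<!doctype html", b"<html")


def looks_like_html(path, probe_bytes=512):
    """Return True if the file at `path` starts like an HTML document."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    # Ignore leading whitespace and letter case before comparing prefixes.
    head = head.lstrip().lower()
    return head.startswith(HTML_MARKERS)
```

A preset test could call this on the downloaded file and fail with a clear message before attempting to load the weights, rather than surfacing an opaque parsing error.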
Looking good!
Please don't merge it just yet, I have a few things to check.
@abheesht17 is this ready now?
@jbischof, it is! However, please check this comment: #655 (comment). Could you please upload the weights to GCP? I'll change the URLs and we can then merge this PR :)
Oh, are you saying we need to run the colabs ourselves @abheesht17?
Nope! I'm just saying that since Google Drive caps downloadable files at 200 MB, the preset unit tests are failing for extra large and extra extra large (since they are > 200 MB). Could you please upload the weights from the URLs in the presets file to the GCP bucket for models? I will then change the URLs to GCP URLs and run the preset tests.
Awesome!
Very nice! Love how simple our preset adding PRs are getting.
Should we track adding the conversion script somewhere?
albert_base_en_uncased: https://colab.research.google.com/drive/1JkGOSQh5cg7u7Y2K503Zmv2EjsYN9UF3?usp=sharing
albert_large_en_uncased: https://colab.research.google.com/drive/1dKH0xEYRnzs7W2zV2a4L9_jBIfNhzAXT?usp=sharing
albert_extra_large_en_uncased: https://colab.research.google.com/drive/1ZrG4tCRjG6MDIraPcX8JyRZZEKU6xtfG?usp=sharing
albert_extra_extra_large_en_uncased: https://colab.research.google.com/drive/1psMrqFXAVi19RGUoBCRhl8umOaQslwqb?usp=sharing