Add ALBERT Presets #655

Merged: 8 commits, Jan 18, 2023

"Base size of ALBERT where all input is lowercased. "
"Trained on English Wikipedia + BooksCorpus."
),
"weights_url": "https://drive.google.com/uc?export=download&id=1RzTTa8nMcBc84nARvJmHal5SndpKbDUa",
Member

Still need to review, but this is awesome! Super great to model how to do this for contributors.

Collaborator Author

@abheesht17 abheesht17 Jan 14, 2023

@mattdangerw, there is a cap on the size of files that can be downloaded from Google Drive this way. For example, the extra_large and extra_extra_large preset tests fail because their weights are larger than 200 MB. I have observed the same for FNet. Instead of the actual file, an HTML page is downloaded.
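
One way to catch this failure early is to peek at the first bytes of the downloaded file and check whether it looks like an HTML page rather than binary weights. The following is only a minimal sketch: `looks_like_html` is a hypothetical helper, not part of KerasNLP, and the 512-byte probe size is an arbitrary assumption.

```python
def looks_like_html(path, probe_bytes=512):
    """Return True if the file at `path` starts like an HTML document.

    Hypothetical helper for illustration; only the first `probe_bytes`
    bytes are read, so large weight files are not loaded into memory.
    """
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    # Ignore leading whitespace and letter case before matching.
    head = head.lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))
```

A preset test could call this on the downloaded path and fail with a clear message when it returns True, instead of failing later while loading the weights.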

Contributor

@jbischof jbischof left a comment

Looking good!

keras_nlp/models/albert/albert_presets.py
@abheesht17
Collaborator Author

Please don't merge this just yet; I have a few things to check.

@jbischof
Contributor

@abheesht17 is this ready now?

@abheesht17
Collaborator Author

> @abheesht17 is this ready now?

@jbischof, it is! However, please check this comment: #655 (comment). Could you please upload the weights to GCP? I'll change the URLs and we can then merge this PR :)

@jbischof
Contributor

Oh, are you saying we need to run the colabs ourselves, @abheesht17?

@abheesht17
Collaborator Author

abheesht17 commented Jan 15, 2023

Nope! I'm just saying that Google Drive only allows direct downloads of files up to 200 MB, so the preset unit tests are failing for extra large and extra extra large (their weights are > 200 MB).

Could you please upload the weights at the URLs in the presets file to the GCP bucket for models? I will then change the URLs to GCP URLs and run the preset tests.
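
As a rough illustration of the URL swap described above, the sketch below rewrites every `weights_url` in a presets dictionary to point at a bucket. Everything specific here is an assumption: the preset names, the `example-bucket` base URL, the `<preset>/model.h5` layout, and the `with_bucket_urls` helper are all hypothetical and not taken from this PR.

```python
# Illustrative preset entries; names and Drive IDs are placeholders.
example_presets = {
    "example_base": {
        "weights_url": "https://drive.google.com/uc?export=download&id=PLACEHOLDER_1",
    },
    "example_extra_large": {
        "weights_url": "https://drive.google.com/uc?export=download&id=PLACEHOLDER_2",
    },
}


def with_bucket_urls(presets, base_url, filename="model.h5"):
    """Return a copy of `presets` whose weights_url values point at `base_url`.

    Assumes a hypothetical `<base_url>/<preset_name>/<filename>` layout.
    The input dictionary is left unchanged.
    """
    base = base_url.rstrip("/")
    return {
        name: {**config, "weights_url": f"{base}/{name}/{filename}"}
        for name, config in presets.items()
    }


updated = with_bucket_urls(
    example_presets,
    "https://storage.googleapis.com/example-bucket/models",  # hypothetical bucket
)
```

Returning a new dictionary keeps the original Drive URLs available for comparison until the bucket copies are verified.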

Contributor

@jbischof jbischof left a comment

Awesome!

Member

@mattdangerw mattdangerw left a comment

Very nice! Love how simple our preset adding PRs are getting.

Should we track adding the conversion script somewhere?

@mattdangerw mattdangerw merged commit 2f6e398 into keras-team:master Jan 18, 2023
3 participants