Opt presets #707

mattdangerw · 2023-02-01T02:53:06Z

This adds support for pre-trained OPT checkpoints up to 6b parameters. Keeping this as a draft until #699 lands.

Here's a colab showing the weight conversion script in action (no actual code here, just output): https://colab.research.google.com/gist/mattdangerw/8ccf7ca9a958da79c03fb24729e63c1f/opt-presets.ipynb

mattdangerw · 2023-02-02T19:04:58Z

One interesting discussion point here will be naming. The metaseq package does give these models "size names", up to an "extra_large" model with 1.3b parameters. But by that logic, the largest model, at 175b parameters, would be named "opt_extra_extra_extra_extra_extra_extra_extra_large_en".

That seemed bad :), so I just went with parameter counts in the name.
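For illustration, naming by parameter count could be sketched roughly as below. `format_param_count` and `preset_name` are hypothetical helpers written for this example, not part of keras_nlp or this PR; the only preset name confirmed in this thread is `opt_125m_en`, taken from the weights URL in the diff.

```python
def format_param_count(num_params: int) -> str:
    """Abbreviate a parameter count, e.g. 125_000_000 -> "125m".

    Hypothetical helper for illustration only.
    """
    for divisor, suffix in ((10**9, "b"), (10**6, "m")):
        if num_params >= divisor:
            value = num_params / divisor
            # Keep one decimal place, then drop a trailing ".0"
            # so 1.3e9 -> "1.3b" but 175e9 -> "175b".
            text = f"{value:.1f}".rstrip("0").rstrip(".")
            return f"{text}{suffix}"
    return str(num_params)


def preset_name(num_params: int, language: str = "en") -> str:
    """Build a name of the form opt_<param count>_<language>."""
    return f"opt_{format_param_count(num_params)}_{language}"


if __name__ == "__main__":
    for count in (125_000_000, 1_300_000_000, 175_000_000_000):
        print(preset_name(count))
```

The name length stays roughly constant as the models grow, which the "size name" scheme does not manage.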

jbischof

Looks good thanks, but no need to mock the Reddit DB!

jbischof · 2023-02-03T00:25:24Z

keras_nlp/models/opt/opt_presets.py

+        "preprocessor_config": {},
+        "description": (
+            "12-layer OPT model where case in maintained. Trained on "
+            "BookCorpus, CommonCrawl, Pile, and PulseShit.io corpora."


s/PulseShit/PushShift/g

Also: LOL

mattdangerw · Feb 3, 2023

lolol whoops

abheesht17 · Feb 3, 2023

Sounds like a cool name for a band xD

jbischof · 2023-02-03T00:27:56Z

keras_nlp/models/opt/opt_presets.py

+            "12-layer OPT model where case in maintained. Trained on "
+            "BookCorpus, CommonCrawl, Pile, and PulseShit.io corpora."
+        ),
+        "weights_url": "https://storage.googleapis.com/keras-nlp/models/opt_125m_en/v1/model.h5",


Are we starting with v1, or have you already incremented the version?

mattdangerw · Feb 3, 2023

v1 is the start, for all presets
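As a rough sketch of the versioning convention discussed here, the URL in the diff can be read as a base path, a preset name, a version directory and a file name. `BASE_URL` and `weights_url` are hypothetical names introduced for this example; only the `opt_125m_en` v1 URL itself comes from the diff.

```python
# Base path copied from the weights_url in the diff above.
BASE_URL = "https://storage.googleapis.com/keras-nlp/models"


def weights_url(preset: str, version: int = 1) -> str:
    """Build a versioned weights URL (hypothetical helper).

    Every preset starts at v1; a later re-upload would bump the version.
    """
    return f"{BASE_URL}/{preset}/v{version}/model.h5"


if __name__ == "__main__":
    print(weights_url("opt_125m_en"))
```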

mattdangerw force-pushed the opt-presets branch from a5e8b14 to f1b5868 Compare February 2, 2023 00:46

Add presets

cf098f4

mattdangerw force-pushed the opt-presets branch from f1b5868 to cf098f4 Compare February 2, 2023 07:12

mattdangerw marked this pull request as ready for review February 2, 2023 19:01

mattdangerw requested review from jbischof and chenmoneygithub February 2, 2023 22:51

jbischof approved these changes Feb 3, 2023

typo fix

5c96033

mattdangerw force-pushed the opt-presets branch from 6de89ab to 5c96033 Compare February 3, 2023 00:50

mattdangerw merged commit 9e03eea into keras-team:master Feb 3, 2023
