
BertMaskedLM Task Model and Preprocessor #774

Merged
merged 22 commits on Mar 3, 2023

Conversation

Cyber-Machine
Contributor

@Cyber-Machine Cyber-Machine commented Feb 23, 2023

Fixes #719
I have made the following changes and am still working on the remaining items:

  • Update BertTokenizer to expect a mask token.
  • Add a BertMaskedLMPreprocessor preprocessor layer and tests.
  • Add a BertMaskedLM task model and tests.
  • Update keras_nlp/models/__init__.py to export BertMaskedLM and BertMaskedLMPreprocessor.
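For context on what the preprocessor in this checklist does: a masked LM preprocessor's core job is to replace a random subset of input tokens with a mask token and record the original tokens at those positions as labels. A minimal pure-Python sketch of that idea (this is illustrative only, not KerasNLP's actual implementation; the function name and signature are assumptions):

```python
import random

def mask_tokens(token_ids, mask_token_id, mask_rate=0.15, seed=None):
    """Randomly replace tokens with a mask token for masked LM training.

    Returns (masked_ids, mask_positions, labels), mirroring the
    (masked inputs, positions, original tokens) triple a masked LM
    preprocessor typically produces.
    """
    rng = random.Random(seed)
    masked = list(token_ids)
    positions, labels = [], []
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_rate:
            positions.append(i)   # where the mask was applied
            labels.append(tok)    # the original token, used as the label
            masked[i] = mask_token_id
    return masked, positions, labels
```

The real layer additionally handles batching, padding, and strategies like keeping some selected tokens unchanged or replacing them with random tokens, per the original BERT recipe.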

@Cyber-Machine Cyber-Machine marked this pull request as draft February 23, 2023 19:00
@Cyber-Machine
Contributor Author

@mattdangerw This PR is ready for review.
Here's the gist of the working model.

@Cyber-Machine Cyber-Machine marked this pull request as ready for review February 24, 2023 19:55
@Cyber-Machine Cyber-Machine changed the title WIP: BertMaskedLM Task Model and Preprocessor BertMaskedLM Task Model and Preprocessor Feb 24, 2023
@shivance
Collaborator

@Cyber-Machine please run black to format your code before committing.

keras_nlp/models/bert/bert_masked_lm_test.py (resolved)
keras_nlp/models/bert/bert_masked_lm.py (outdated, resolved)
keras_nlp/layers/masked_lm_mask_generator.py (outdated, resolved)
@Cyber-Machine
Contributor Author

@mattdangerw @abheesht17 This PR is ready for review.

Member

@mattdangerw mattdangerw left a comment


Thank you! This looks great to me. Just some minor comments.

@keras.utils.register_keras_serializable(package="keras_nlp")
class BertMaskedLMPreprocessor(BertPreprocessor):
"""BERT preprocessing for the masked language modeling task.
This preprocessing layer will prepare inputs for a masked language modeling
Member


Looks like the empty newlines from the version you copied from got removed (GitHub does this for some reason).

Can you add them back in throughout this docstring?

self.assertAllEqual(
x["padding_mask"],
[
True,
Member


If possible, shorten these examples so we don't take up so much vertical space here.

You should be able to pass 0s and 1s here, which should help; we do that for other tests.
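The suggestion works because Python (and NumPy) treat booleans as integers, so a compact 0/1 list compares equal to the verbose boolean form. A quick illustration (the variable names are hypothetical):

```python
# True == 1 and False == 0 in Python, so element-wise comparisons pass
# whether the expected padding mask is written as booleans or ints.
expected_verbose = [True, True, True, False]
expected_compact = [1, 1, 1, 0]
assert expected_verbose == expected_compact
```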

Collaborator

@abheesht17 abheesht17 left a comment


A couple of nits:

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""BERT masked lm model."""
Collaborator


"lm" --> "LM"

intermediate_dim=3072,
max_sequence_length=12
)
# Create a BERT masked_lm and fit the data.
Collaborator


"Crete a BERT masked LM model and fit the data."

@mattdangerw
Member

Thank you! This is great!

@mattdangerw mattdangerw merged commit 91fe6bd into keras-team:master Mar 3, 2023
Successfully merging this pull request may close these issues.

Add a BertMaskedLM task model
4 participants