Add support of `kwargs` to `Backbone.from_preset` and fix the `dtype` forwarding in `Task.from_preset` #1742
Conversation
Looks good, I think! Just one small comment.
keras_nlp/src/models/backbone.py (outdated)
```python
backbone = load_serialized_object(preset, CONFIG_FILE)
# Forward `config_overrides` and `dtype`.
config_overrides = {}
if "config_overrides" in kwargs:
```
I believe the `kwargs` were supposed to act as the config overrides directly (though it looks like this broke at some point). So if you wanted to set BERT dropout, for example, you could do:

```python
model = keras_nlp.models.BertBackbone.from_preset("bert_base_en_uncased", dropout=0.5)
```
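For illustration, the override mechanics boil down to merging the extra kwargs over the preset's serialized config before deserialization. A minimal runnable sketch (the helper name and config layout are illustrative assumptions, not the actual keras_nlp internals):

```python
# Sketch of kwargs-as-config-overrides: extra kwargs are merged over the
# serialized config before the object is deserialized. The dict below is a
# stand-in for what a preset's config.json might hold.
def apply_config_overrides(config, **kwargs):
    # Later keys win, so kwargs override the preset's serialized values.
    return {**config, "config": {**config["config"], **kwargs}}


preset_config = {
    "class_name": "BertBackbone",  # illustrative, not a real preset file
    "config": {"vocabulary_size": 30522, "dropout": 0.1},
}
overridden = apply_config_overrides(preset_config, dropout=0.5)
assert overridden["config"]["dropout"] == 0.5
```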
Yeah, it is a bit weird to see `config_overrides`, and I think there is a dangerous default value (`config_overrides={}`) for it in `load_serialized_object`. I can try to fix it.
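For reference, the pitfall being described is Python's shared mutable default argument, and the usual fix is defaulting to `None`. A self-contained sketch (not the actual `load_serialized_object` signature):

```python
# The pitfall: a dict default is created once at function definition time
# and shared across calls, so mutations leak between callers.
def load_bad(preset, config_overrides={}):
    config_overrides["seen"] = preset  # mutates the shared default dict
    return config_overrides


assert load_bad("a") is load_bad("b")  # same dict object both times

# The usual fix: default to None and build a fresh dict per call.
def load_good(preset, config_overrides=None):
    config_overrides = config_overrides or {}
    config_overrides["seen"] = preset
    return config_overrides


assert load_good("a") is not load_good("b")  # fresh dict each call
```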
Force-pushed from 23e761e to 3e56494
Changed the title from "`dtype` argument to `Backbone.from_preset` and fix `dtype` forwarding in `CausalLM.from_preset`" to "`kwargs` to `Backbone.from_preset` and fix the `dtype` forwarding in `Task.from_preset`"
…n Task.from_preset: force-pushed from 3e56494 to 562b9dd
I have updated:

```python
import keras_nlp

llama_lm = keras_nlp.models.CausalLM.from_preset(
    "llama2_instruct_7b_en_int8", load_weights=False, dtype="bfloat16"
)
assert llama_lm.backbone.token_embedding.compute_dtype == "bfloat16"

bert_backbone = keras_nlp.models.BertBackbone.from_preset(
    "bert_base_en_uncased", load_weights=False, dtype="bfloat16", dropout=0.5
)
assert bert_backbone.token_embedding.compute_dtype == "bfloat16"
assert bert_backbone.dropout == 0.5
```
Thanks! This LGTM
Merged: …n Task.from_preset (#1742)
Currently, `CausalLM.from_preset(..., dtype="bfloat16")` has no effect because it doesn't forward `dtype` to the backbone. This PR fixes that issue and also adds `dtype` support to `Backbone.from_preset`. Additionally, I have updated the logic in `load_serialized_object` to support `dtype` when using `DTypePolicyMap`, ensuring that a pre-quantized preset will obey `dtype`.
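To illustrate that last point, here is a minimal sketch of the idea (the helper name and dict shapes are assumptions, not the actual keras_nlp code): when the serialized config already carries a `DTypePolicyMap`, i.e. a pre-quantized preset, the requested `dtype` becomes the map's default policy rather than overwriting the per-layer quantized policies.

```python
# Hypothetical sketch: route a user-supplied dtype into a serialized config.
def set_dtype_in_config(config, dtype=None):
    if dtype is None:
        return config
    inner = config["config"]
    policy = inner.get("dtype")
    if isinstance(policy, dict) and policy.get("class_name") == "DTypePolicyMap":
        # Pre-quantized preset: keep the per-layer policies and only set
        # the map's default policy to the requested dtype.
        policy["config"]["default_policy"] = dtype
    else:
        # Plain preset: forward dtype directly.
        inner["dtype"] = dtype
    return config


plain = {"class_name": "BertBackbone", "config": {}}
set_dtype_in_config(plain, "bfloat16")
assert plain["config"]["dtype"] == "bfloat16"

quantized = {
    "class_name": "BertBackbone",
    "config": {"dtype": {"class_name": "DTypePolicyMap", "config": {"policy_map": {}}}},
}
set_dtype_in_config(quantized, "bfloat16")
assert quantized["config"]["dtype"]["config"]["default_policy"] == "bfloat16"
```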