Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ELECTRA backbone implementation in keras #1291

Merged
merged 5 commits into from
Dec 4, 2023

Conversation

pranavvp16
Copy link
Contributor

@pranavvp16 pranavvp16 commented Oct 31, 2023

I almost have successfully implemented the electra_backbone in keras, but there are 2 problems remaining to solve

1 - When test_backbone_basics is run it throws error ReversibleEmbbeding is not json serialisable but at the same time tests from BERT pass even though it also implements the ReversibleEmbedding class.

2- The second problem is although outputs of hf_model and keras_model match, I'm unable to save the keras model successfully.notebook

please can someone guide me to fix these above problems , #1281

@mattdangerw
Copy link
Member

/gcbrun

@mattdangerw
Copy link
Member

One minor point, make sure to format your code with the format script. ./shell/format.sh

@mattdangerw
Copy link
Member

Thanks for the pr! Not sure what is going on with serialization on tests either, but will try to find some time to debug soon.

@mattdangerw
Copy link
Member

Ah figured this out. You are seeing failures in the serialization tests, which basically check that cls.from_config(instance.get_config()) works.

You need to make sure all the parameters you return from get_config map 1-1 with arguments the backbone takes in during init. You have some name mismatches, and things like cls_token_index and token_embedding added to the config which should not be.

Once that is fixed up, you should be able to save and test saving. (though may be other issues on top)

@pranavvp16
Copy link
Contributor Author

thanks I made the changes accordingly and tests in electra_test_backbone pass now. The only thing that is not working is saving the model in h5 format in the collab notebook. I'm using the set_weight instead of assign while adding the weights in the layer, is it the reason for the error ?

@pranavvp16
Copy link
Contributor Author

I tried to work on saving of model in h5 format but it still fails with the error Unable to create dataset name already exists, but the model saves successfully in the .tf, any idea what I'm missing regarding saving the model in h5 format

@mattdangerw
Copy link
Member

I'm using the set_weight instead of assign while adding the weights in the layer, is it the reason for the error ?

I don't think that would be. I'm not exactly sure why we are getting the error. But we do want this is h5 format and not the tf format. I'll try to run the colab here soon and help debug!

@pranavvp16
Copy link
Contributor Author

Seems this implementation works fine as all the outputs in hidden state match in this notebook. The issue is with TransformersEncoder in layers in keras. as mentioned in this issue #1303. I have attached some notebooks supporting the error in the issue

@mattdangerw
Copy link
Member

Thanks! Very helpful and this is indeed a puzzling error. Not sure what exactly broke here.

@nkovela1 will take on that issue.

@pranavvp16
Copy link
Contributor Author

any updates on this, can we get this merged so I can add tokenizer in the next PR

@pranavvp16
Copy link
Contributor Author

as mentioned in #1307 the model saves successfully in the .weight.h5 format. Link to notebook demonstrating checkpoint conversion and saving of the model.

@mattdangerw
Copy link
Member

@pranavvp16 thanks! And good to know. Let's pull this in.

Very sorry about the delay here. It's been a hectic time for the team with the Keras 3 release last week. But it is out!

@nkovela1 would still be good to look at that Unable to create dataset name already exists whenever we have time.

@mattdangerw
Copy link
Member

/gcbrun

@mattdangerw
Copy link
Member

Thank you!!

@mattdangerw mattdangerw merged commit 24a70ba into keras-team:master Dec 4, 2023
7 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants