Assign requires shapes of both tensors to match while train_model ner_ontonotes_bert_mult #839

Closed
Melmarn opened this issue May 14, 2019 · 8 comments

Comments

@Melmarn

Melmarn commented May 14, 2019

Hello,
I am trying to run train_model(configs.ner.ner_ontonotes_bert_mult). The downloaded model works well and the test example passes:
ner_model = build_model(configs.ner.ner_ontonotes_bert_mult, download=True)
ner_model(['Чемпионат мира по кёрлингу пройдёт в Антананариву'])
I put my data in the folder ~/.deeppavlov/downloads/ontonotes/, which I found by running print(config_dict['dataset_reader']['data_path']). I installed the BERT requirements from bert_dp.txt.
I get an error when running train_model:

ERROR in 'deeppavlov.core.common.params'['params'] at line 110: Exception in <class 'deeppavlov.models.bert.bert_ner.BertNerModel'>
Traceback (most recent call last):
File "/Users/ds/ml_works/marina36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
return fn(*args)
File "/Users/ds/ml_works/marina36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/Users/ds/ml_works/marina36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [768,3] rhs shape= [768,37]
[[Node: save/Assign_360 = Assign[T=DT_FLOAT, _class=["loc:@ner/output_dense/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ner/output_dense/kernel, save/RestoreV2:360)]]
...

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [768,3] rhs shape= [768,37]
[[Node: save/Assign_360 = Assign[T=DT_FLOAT, _class=["loc:@ner/output_dense/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ner/output_dense/kernel, save/RestoreV2:360)]]

Thanks in advance.

@mu-arkhipov
Contributor

mu-arkhipov commented May 14, 2019

The model was loaded from the old path. The BERT model in the config has load_path and save_path. The old model has a different number of tags in the top layer. You should remove the old model from the load_path.
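A minimal sketch of one way to act on this advice, under stated assumptions: the config is a JSON document whose components sit in a chainer.pipe list, and some of them carry load_path and save_path keys. The helper name redirect_model_paths, the toy config, and the target directory below are hypothetical and only illustrate the idea; check your own config for the real layout and values.

```python
import copy
import json


def redirect_model_paths(config, old_prefix, new_prefix):
    """Return a copy of config in which every load_path/save_path that
    starts with old_prefix points under new_prefix instead.

    Assumes components are listed in config["chainer"]["pipe"].
    """
    patched = copy.deepcopy(config)
    for component in patched.get("chainer", {}).get("pipe", []):
        for key in ("load_path", "save_path"):
            value = component.get(key)
            if isinstance(value, str) and value.startswith(old_prefix):
                component[key] = new_prefix + value[len(old_prefix):]
    return patched


# Toy config standing in for the parsed JSON file (not the real one).
toy_config = {
    "chainer": {
        "pipe": [
            {"class_name": "some_preprocessor", "vocab_file": "vocab.txt"},
            {
                "class_name": "some_ner_model",
                "load_path": "{NER_PATH}/model",
                "save_path": "{NER_PATH}/model",
            },
        ]
    }
}

# Hypothetical new directory that holds no old checkpoint.
patched = redirect_model_paths(toy_config, "{NER_PATH}", "~/.deeppavlov/models/my_ner")
print(json.dumps(patched["chainer"]["pipe"][1], indent=2))
```

With the paths pointing at a directory that contains no old checkpoint, there is nothing with a 37-tag output layer to restore. Deleting the old checkpoint files from the directory that load_path resolves to should have the same effect.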

@Melmarn
Author

Melmarn commented May 14, 2019

Thanks! It started to train, but a new error has occurred:

input sequence after bert tokenization shouldn't exceed 512 tokens.
Should I set up something else in the config file?

@mu-arkhipov
Contributor

Sorry, but the BERT model has positional embeddings only for the first 512 subtokens, so the model can't work with longer sequences. It is a deliberate architecture restriction. Subtokens are produced by the WordPiece tokenizer (a BPE-like subword algorithm). For the multilingual model, 512 subtokens correspond to approximately 300-350 regular tokens. Make sure that you performed sentence tokenization before dumping the data. Every sentence in the dumped data should be separated by an empty line.
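A rough way to find the offending sentences before training is sketched below. It assumes one token and its tag per line with empty lines between sentences, and it uses the 300-token figure above as a conservative threshold. The function names are hypothetical, and the real limit applies to subtokens, which only the BERT tokenizer can count exactly.

```python
def read_sentences(lines):
    """Group 'token ... tag' lines into sentences separated by empty lines."""
    sentences, current = [], []
    for line in lines:
        if line.strip():
            # The first whitespace-separated field is taken as the token.
            current.append(line.split()[0])
        elif current:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


def find_long_sentences(lines, max_tokens=300):
    """Return (sentence_index, token_count) for sentences above max_tokens."""
    return [
        (index, len(sentence))
        for index, sentence in enumerate(read_sentences(lines))
        if len(sentence) > max_tokens
    ]


# Two short sentences followed by 301 tokens with no empty line between them.
sample = ["Hello O", "world O", "", "Paris B-GPE", ""] + ["word O"] * 301
print(find_long_sentences(sample))  # [(2, 301)]
```

A very long "sentence" in the output usually means that empty separator lines are missing in that part of the file.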

@Melmarn
Author

Melmarn commented May 15, 2019

Great! It works, thanks a lot.

@Melmarn Melmarn closed this as completed May 15, 2019
@ShaleenAg

ShaleenAg commented Feb 5, 2020

> The model was loaded from the old path. The BERT model in the config has load_path and save_path. The old model has a different number of tags in the top layer. You should remove the old model from the load_path.

How do I remove it? I'm experiencing the same error.

I'm new at this and I can't find how to open the config file in the DeepPavlov docs.

@IDrei

IDrei commented Mar 23, 2020

> > The model was loaded from the old path. The BERT model in the config has load_path and save_path. The old model has a different number of tags in the top layer. You should remove the old model from the load_path.
>
> How do I remove it? I'm experiencing the same error.
>
> I'm new at this and I can't find how to open the config file in the DeepPavlov docs.

Have you removed the old model? And how?

@SAYAKGHANTA

> The model was loaded from the old path. The BERT model in the config has load_path and save_path. The old model has a different number of tags in the top layer. You should remove the old model from the load_path.

Can you please tell me how to update this path? I tried but was not able to do it.

@SAYAKGHANTA

Traceback (most recent call last):
File "c:/Users/sghanta/Desktop/NER/train_model.py", line 12, in <module>
ner_model = train_model(configs.ner.ner_ontonotes_bert_mult)
File "C:\Users\sghanta\Desktop\NER\env\lib\site-packages\deeppavlov\__init__.py", line 29, in train_model
train_evaluate_model_from_config(config, download=download, recursive=recursive)
File "C:\Users\sghanta\Desktop\NER\env\lib\site-packages\deeppavlov\core\commands\train.py", line 92, in train_evaluate_model_from_config
data = read_data_by_config(config)
File "C:\Users\sghanta\Desktop\NER\env\lib\site-packages\deeppavlov\core\commands\train.py", line 58, in read_data_by_config
return reader.read(data_path, **reader_config)
File "C:\Users\sghanta\Desktop\NER\env\lib\site-packages\deeppavlov\dataset_readers\conll2003_reader.py", line 56, in read
dataset[name] = self.parse_ner_file(file_name)
File "C:\Users\sghanta\Desktop\NER\env\lib\site-packages\deeppavlov\dataset_readers\conll2003_reader.py", line 106, in parse_ner_file
raise Exception(f"Input is not valid {line}")
Exception: Input is not valid
O
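
The trailing O in the message suggests a data line that holds a tag without a token, for example where the token itself was a space or another whitespace character. This is a guess from the message, not a confirmed diagnosis. A small check along these lines could locate such lines before training; the function name is hypothetical, and it assumes one whitespace-separated token and tag per line, empty lines between sentences, and the CoNLL-2003 -DOCSTART- convention for document separators.

```python
def find_invalid_lines(lines):
    """Return (line_number, text) for non-empty lines with fewer than two
    whitespace-separated fields, i.e. lines missing a token or a tag."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Assumption: document separator lines follow the CoNLL-2003 convention.
        if line.startswith("-DOCSTART-"):
            continue
        fields = line.split()
        if fields and len(fields) < 2:
            bad.append((number, line.rstrip("\n")))
    return bad


# The third line has a tag but no token.
sample = ["Antananarivo B-GPE\n", "\n", "O\n", "is O\n"]
print(find_invalid_lines(sample))  # [(3, 'O')]
```

To check a real file, the open file object can be passed in place of the sample list, for instance with open(path, encoding="utf-8") as f: find_invalid_lines(f).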
