Bert not working properly #68

Rohitkr1997 · 2022-04-19T23:05:08Z

Could anyone upload an environment.yml, or list the versions of keras, tensorflow, nilmtk, and nilmtk-contrib they use? BERT requires keras.layers.multi_head_attention, which does not work with the Keras version that gets installed when nilmtk and nilmtk-contrib are installed through conda. Upgrading keras and tensorflow causes conflicts, after which nilmtk can no longer be used.

Rohitkr1997 · 2022-04-19T23:20:25Z

Alternatively, could anyone who has a working version of BERT please upload the output of conda list?

paulfrank1997 · 2022-04-25T02:27:42Z

You need TensorFlow 2.5.0 or higher. Keras is already bundled with TensorFlow 2.5.0, so you don't need to install Keras separately.
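A quick way to confirm that an environment meets this minimum is to read the installed version from package metadata without importing TensorFlow. The sketch below is only an illustration: the helper names (`version_tuple`, `meets_minimum`, `installed_version`) are made up for this example, it assumes Python 3.8+ for `importlib.metadata`, and the distribution names it queries are assumptions (a GPU build may be registered as `tensorflow-gpu`, for instance).

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '2.5.0' into (2, 5, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "2.5.0") -> bool:
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match `conda list`.
    for name in ("tensorflow", "keras", "h5py", "nilmtk", "nilmtk-contrib"):
        found = installed_version(name)
        print(f"{name}: {found if found is not None else 'not installed'}")
    tf_version = installed_version("tensorflow")
    if tf_version is not None:
        print("TensorFlow >= 2.5.0:", meets_minimum(tf_version))
```

If a standalone `keras` distribution still shows up in that list after the upgrade, it is worth checking whether it is the one being imported.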

Rohitkr1997 · 2022-04-25T05:54:09Z

I tried TensorFlow 2.6.0, but the resulting environment has conflicts that cause problems. Could you please upload your environment.yml file or share the output of conda list, so that I have an environment where everything works?

Rohitkr1997 · 2022-04-25T07:39:10Z

If you share the packages you're using, I could recreate your Anaconda environment and avoid the conflicts in mine.

paulfrank1997 · 2022-04-26T01:48:46Z

All you need to do is uninstall Keras from the original environment, install TensorFlow 2.5.0, and upgrade h5py to the latest version. Then you have to change how the modules are imported. For example, change "import keras.XXX" to "import tensorflow.keras.XXX", and change "import keras.layers.XXX" to "import tensorflow.keras.layers.XXX".
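The import changes can also be applied mechanically. The following is a minimal sketch, not a tested migration tool: `rewrite_keras_imports` is a hypothetical helper that rewrites `from keras... import ...`, bare `import keras`, and aliased `import keras.x as y` lines. It deliberately leaves unaliased dotted imports such as `import keras.layers` alone, because `import tensorflow.keras.layers` binds the name `tensorflow` instead of `keras`, so code that refers to `keras.layers` afterwards has to be adjusted by hand.

```python
import re

# "from keras... import ..." but not "from keras_tuner ..." or "from tensorflow.keras ..."
_FROM_KERAS = re.compile(r"^([ \t]*)from[ \t]+keras(?=[. \t])", re.MULTILINE)
# A bare "import keras" on its own line
_IMPORT_KERAS = re.compile(r"^([ \t]*)import[ \t]+keras[ \t]*$", re.MULTILINE)
# "import keras.something as alias"
_IMPORT_AS = re.compile(
    r"^([ \t]*)import[ \t]+keras\.([\w.]+)[ \t]+as[ \t]+", re.MULTILINE
)


def rewrite_keras_imports(source: str) -> str:
    """Point standalone Keras imports at tensorflow.keras instead."""
    source = _FROM_KERAS.sub(r"\1from tensorflow.keras", source)
    source = _IMPORT_KERAS.sub(r"\1from tensorflow import keras", source)
    source = _IMPORT_AS.sub(r"\1import tensorflow.keras.\2 as ", source)
    return source


if __name__ == "__main__":
    example = (
        "from keras.layers import Dense\n"
        "import keras.backend as K\n"
        "import keras\n"
    )
    print(rewrite_keras_imports(example))
```

Running it on a copy of the file first and reviewing the diff is safer than rewriting in place.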


xuuurq · 2022-08-09T08:47:41Z

@paulfrank1997 Hello, I installed tensorflow 2.5.0 and updated h5py to the latest version, 3.7.0, but running the code fails with ImportError: save_model requires h5py. Which version of h5py did you install? Thank you. Below is the output of the run.
`D:\anaconda3\envs\nilmxu\python.exe D:/mywork/nilmtkcontribxu/nilmtk_contrib/disaggregate/fuhefenjie.py
2022-08-09 15:23:30.591863: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-08-09 15:23:30.591978: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-08-09 15:23:32.978920: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
Started training for BERT
Joint training for BERT
............... Loading Data for training ...................
Loading data for redd dataset
2022-08-09 15:23:32.999158: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3050 Laptop GPU computeCapability: 8.6
coreClock: 1.5GHz coreCount: 16 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 178.84GiB/s
2022-08-09 15:23:33.000126: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-08-09 15:23:33.000876: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2022-08-09 15:23:33.001617: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2022-08-09 15:23:33.002352: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2022-08-09 15:23:33.003067: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2022-08-09 15:23:33.003796: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2022-08-09 15:23:33.004519: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2022-08-09 15:23:33.005261: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2022-08-09 15:23:33.005365: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Loading building ... 2
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Dropping missing values
...............BERT partial_fit running...............
First model training for fridge
2022-08-09 15:23:35.090127: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-09 15:23:35.090573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-08-09 15:23:35.090662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]
Model: "sequential"
Layer (type)                 Output Shape              Param #
conv1d (Conv1D)              (None, 99, 16)            80
l_ppool (LPpool)             (None, 50, 16)            0
token_and_position_embedding (None, 50, 16, 32)        643168
transformer_block (Transform (None, 50, 16, 32)        10656
flatten (Flatten)            (None, 25600)             0
dropout_2 (Dropout)          (None, 25600)             0
dense_2 (Dense)              (None, 99)                2534499
dropout_3 (Dropout)          (None, 99)                0
Total params: 3,188,403
Trainable params: 3,188,403
Non-trainable params: 0
Epoch 1/50
2022-08-09 15:23:46.743583: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
WARNING:tensorflow:Gradients do not exist for variables ['conv1d/kernel:0', 'conv1d/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['conv1d/kernel:0', 'conv1d/bias:0'] when minimizing the loss.
526/526 [==============================] - 259s 472ms/step - loss: 12.1465 - mse: 12.1465 - val_loss: 0.6670 - val_mse: 0.6670

Epoch 00001: val_loss improved from inf to 0.66698, saving model to BERT-temp-weights-74894.h5
Traceback (most recent call last):
File "D:/mywork/nilmtkcontribxu/nilmtk_contrib/disaggregate/fuhefenjie.py", line 54, in <module>
api_res = API(experiment)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\nilmtk\api.py", line 46, in __init__
self.experiment()
File "D:\anaconda3\envs\nilmxu\lib\site-packages\nilmtk\api.py", line 91, in experiment
self.train_jointly(clf,d)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\nilmtk\api.py", line 240, in train_jointly
clf.partial_fit(self.train_mains,self.train_submeters)
File "D:\mywork\nilmtkcontribxu\nilmtk_contrib\disaggregate\bert.py", line 161, in partial_fit
model.fit(train_x,train_y,validation_data=(v_x,v_y),epochs=self.n_epochs,callbacks=[checkpoint],batch_size=self.batch_size)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\keras\engine\training.py", line 1204, in fit
callbacks.on_epoch_end(epoch, epoch_logs)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\keras\callbacks.py", line 410, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\keras\callbacks.py", line 1376, in on_epoch_end
self._save_model(epoch=epoch, logs=logs)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\keras\callbacks.py", line 1428, in _save_model
self.model.save(filepath, overwrite=True, options=self._options)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\keras\engine\training.py", line 2087, in save
signatures, options, save_traces)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\keras\saving\save.py", line 147, in save_model
model, filepath, overwrite, include_optimizer)
File "D:\anaconda3\envs\nilmxu\lib\site-packages\keras\saving\hdf5_format.py", line 79, in save_model_to_hdf5
raise ImportError('save_model requires h5py.')
ImportError: save_model requires h5py.
Closing remaining open files:C:\Users\xrq\AppData\Local\Temp\nilmtk-meg927ux.h5...doneD:/works/nilmtkcontrib/nilmtk_contrib/redd_low.hdf5...done
`

paulfrank1997 · 2022-08-09T12:21:40Z

@xuuurq I ran into the same problem: "ImportError: save_model requires h5py". After I upgraded h5py to the latest version with "pip install --upgrade h5py", the problem was solved. The version of h5py I use is 3.6.0, and everything now works fine.
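Judging from the traceback, Keras raises this message inside save_model_to_hdf5 when h5py is not usable from the interpreter that runs the training script, so it can help to try the import directly in that same environment and look at the underlying error. The snippet below is a small diagnostic sketch; `probe_import` is a hypothetical helper name, not part of Keras or h5py.

```python
import importlib


def probe_import(module_name: str):
    """Try to import a module and return (ok, detail) describing the outcome."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Covers a missing package as well as DLL or shared-library load failures.
        return False, f"{type(exc).__name__}: {exc}"
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "built-in")
    return True, f"version {version} at {location}"


if __name__ == "__main__":
    ok, detail = probe_import("h5py")
    print("h5py import", "succeeded:" if ok else "failed:", detail)
```

Run it with the same python.exe that appears in the first line of the log (here the one in the nilmxu environment), since a package installed into a different environment will not be visible to it.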

xuuurq · 2022-08-11T13:15:20Z

@paulfrank1997 Sorry to bother you again. I have a few more questions:

1. Are you using tensorflow-gpu version 2.5.0?
2. The BERT model in nilmtk-contrib differs from the code in the BERT4NILM paper, specifically in the loss function and the mask processing. Does the BERT model in nilmtk-contrib omit mask processing?
3. What do you think of the performance of the BERT model in nilmtk-contrib?

Thank you very much for your answer.

paulfrank1997 · 2022-10-11T07:06:58Z

All you need to do is uninstall Keras from the original environment, install TensorFlow 2.5.0, and upgrade h5py to the latest version. Then you have to change how the modules are imported. For example, change "import keras.XXX" to "import tensorflow.keras.XXX", and change "import keras.layers.XXX" to "import tensorflow.keras.layers.XXX".

