
BERT model training procedure #73

Open
camilomarino opened this issue May 31, 2022 · 1 comment
@camilomarino

Hey there,

I wanted to confirm some doubts that came to mind while reviewing the source code (https://github.com/nilmtk/nilmtk-contrib/blob/master/nilmtk_contrib/disaggregate/bert.py) together with the paper (http://nilmworkshop.org/2020/proceedings/nilm20-final88.pdf):

  1. The loss function: as far as I can tell, the loss implemented in the code is plain MSE, whereas the paper proposes additional terms beyond MSE.
  2. The masking of the training data: I could not find where the input sequence is masked during training as proposed in the paper (see the sketch below for what I mean).

Could you confirm whether these two differences are real, or whether I have misinterpreted the source code and/or the paper?
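
To make point 2 concrete, here is a minimal sketch of the kind of BERT-style random input masking I would have expected to find in bert.py. This is only my own illustration, not code taken from the repository or the paper; the mask ratio, the sentinel value, and the function name are arbitrary assumptions on my part.

```python
# A minimal sketch (not nilmtk-contrib or BERT4NILM code) of BERT-style input
# masking for NILM training windows. The 25% mask ratio and the -1.0 sentinel
# are assumptions for illustration only.
import numpy as np

def mask_input_windows(windows, mask_ratio=0.25, mask_value=-1.0, rng=None):
    """Randomly hide a fraction of time steps in each mains window.

    windows: array of shape (n_windows, window_length) with normalized mains power.
    Returns the masked inputs and a boolean mask marking the hidden positions,
    so the training loss could be computed on (or weighted toward) those steps.
    """
    rng = rng or np.random.default_rng()
    masked = windows.copy()
    mask = rng.random(windows.shape) < mask_ratio   # True where the input is hidden
    masked[mask] = mask_value                       # replace hidden steps with a sentinel
    return masked, mask

# Example: mask a batch of 4 windows of length 99
x = np.random.rand(4, 99).astype("float32")
x_masked, mask = mask_input_windows(x)
print(x_masked.shape, mask.mean())  # roughly 25% of positions end up masked
```

I do not see anything equivalent to this in the current bert.py, which is why I ask.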

Best.

@xuuurq

xuuurq commented Aug 12, 2022

@camilomarino Hello, sorry to bother you. May I ask whether you have resolved this question? The BERT model in nilmtk-contrib differs from the code released with the BERT4NILM paper in both the loss function and the mask processing. Does the BERT model in nilmtk-contrib really omit the mask processing?
Thank you very much.
