Questions about the implementation of the loss function #7

Weiting-Gao · 2024-05-31T14:46:02Z

Thanks for sharing the code! I read the paper and went through the code. I’m currently trying to adapt DiffMask to another dataset and have some questions about the implementation:

What is alpha (defined in BertSentimentClassificationSSTDiffMask in sentiment_classification_sst_diffmask.py)? Is it the Lagrangian multiplier mentioned in Eq. (3) of the paper?
In SentimentClassificationSSTDiffMask, what is expected_L0 in loss_g, and why is it negative? A negative expected_L0 makes loss_g negative. Is that correct?
I also don’t understand the log_expected_L0() function in distributions.py. Is there an explanation for it in the paper?
During the training step, you also calculate l0 (l0 = (expected_L0.exp() * mask).sum(-1) / mask.sum(-1)). What is this for? Is it used for training?
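
To make the last question concrete, here is how I currently read that expression, as a minimal plain-Python sketch for a single sequence. It assumes that expected_L0 holds per-token log-probabilities of a gate being non-zero and that mask is a 0/1 padding mask. Both are my assumptions, not something I found stated in the code, and the names masked_mean_l0, log_p and fraction are hypothetical.

```python
import math

def masked_mean_l0(log_expected_l0, mask):
    """Plain-Python mirror of (expected_L0.exp() * mask).sum(-1) / mask.sum(-1).

    log_expected_l0: per-token log-probabilities (assumed meaning of expected_L0).
    mask: 1.0 for real tokens, 0.0 for padding (assumed meaning of mask).
    """
    # exp() maps the log-values back to probabilities in [0, 1].
    probs = [math.exp(v) for v in log_expected_l0]
    # Padding positions contribute nothing to the sum.
    kept = sum(p * m for p, m in zip(probs, mask))
    # Dividing by the number of real tokens gives a per-token average.
    return kept / sum(mask)

# Three real tokens followed by one padding position.
log_p = [math.log(0.9), math.log(0.5), math.log(0.1), math.log(0.7)]
mask = [1.0, 1.0, 1.0, 0.0]
fraction = masked_mean_l0(log_p, mask)
```

If this reading is right, every entry of expected_L0 is the log of a probability and therefore at most zero, and l0 is the average expected fraction of unmasked tokens per sequence. I would appreciate confirmation of whether that is the intended meaning.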

Again, thanks for the wonderful work. I look forward to your reply!
