What is disc_param_avg used for here? Why should its updates be included in the parameter updates? If the gradients are already obtained through disc_param_updates, why can't we apply them directly to the parameters in the layers?
Thank you very much for your answer!
It seems that disc_param_avg could be used to calculate the "historical averaging" regularization term in both the discriminator's and the generator's costs. See section 3.3 of https://arxiv.org/pdf/1606.03498.pdf
However, since disc_param_avg appears nowhere in the cost in this code, I don't think the authors have implemented historical averaging here.
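For reference, the historical averaging term in section 3.3 of the paper penalizes the squared distance between the current parameters and the mean of their past values. Here is a rough NumPy sketch of that idea. The class name `HistoricalAverage` and its methods are hypothetical and are not part of this repository.

```python
import numpy as np


class HistoricalAverage:
    """Tracks the running mean of a parameter vector (hypothetical helper)."""

    def __init__(self, theta):
        # Start the history with the initial parameter values.
        self.mean = np.array(theta, dtype=float)
        self.count = 1

    def update(self, theta):
        # Incremental mean: mean_t = mean_{t-1} + (theta_t - mean_{t-1}) / t
        self.count += 1
        self.mean += (np.asarray(theta, dtype=float) - self.mean) / self.count

    def penalty(self, theta):
        # Squared L2 distance between theta and the mean of recorded values.
        diff = np.asarray(theta, dtype=float) - self.mean
        return float(np.sum(diff ** 2))


history = HistoricalAverage([0.0, 0.0])
history.update([2.0, 2.0])              # recorded mean is now [1.0, 1.0]
extra_cost = history.penalty([2.0, 2.0])
```

If historical averaging were implemented, a term like `extra_cost` would be added to each player's cost, which is why disc_param_avg would have to appear in the cost expression.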
Instead, I think disc_param_avg is just a temporally smoothed copy of the parameters. It's used at test time to give better, more stable results.
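To make the smoothing idea concrete, here is a minimal NumPy sketch of keeping a running average of the parameters alongside the trained ones. The function name `update_param_avg`, the rate of 1e-4, and the random-noise stand-in for a training step are all assumptions for illustration and are not copied from the repository.

```python
import numpy as np


def update_param_avg(params, param_avg, rate=1e-4):
    """Move each averaged parameter a small step toward its current value."""
    return [a + rate * (p - a) for p, a in zip(params, param_avg)]


rng = np.random.default_rng(0)
params = [rng.normal(size=(3, 2)), np.zeros(2)]
param_avg = [p.copy() for p in params]  # the average starts at the initial values

for step in range(1000):
    # Stand-in for a gradient step: the trained parameters jump around.
    params = [p + 0.01 * rng.normal(size=p.shape) for p in params]
    # The averaged copy follows slowly and never feeds back into training.
    param_avg = update_param_avg(params, param_avg)
```

In this sketch the gradients are still applied directly to `params`. The averaged copy is a separate set of values that would only be swapped in for evaluation.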
@christiancosgrove Thanks very much for your explanation. Now I understand its function here. I had assumed it was also applied during training.