How to decide values for noise sd and number of samples per gradient estimation? #23

Open
ben-arnao opened this issue Feb 12, 2021 · 0 comments


Obviously this is very problem dependent, as usual, but the config sets the standard deviation of the perturbation noise to 0.02. The paper apparently also says we don't want to lower it, which seems a little odd: one would think we'd want to reduce it as we converge to a maximum, or at the very least lower the SD along with the L2 coefficient if we reduce the LR.
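To make it concrete where the noise SD enters, here is a minimal sketch of an ES gradient estimate with antithetic (mirrored) perturbations. It is not taken from this repo: `es_gradient`, `toy_reward`, and the quadratic objective are hypothetical stand-ins, and only the 0.02 default mirrors the config value mentioned above.

```python
import numpy as np

def es_gradient(f, theta, sigma=0.02, n_pairs=500, rng=None):
    """Antithetic ES estimate of the gradient of E[f(theta + sigma * eps)].

    Hypothetical helper for illustration; not the repo's implementation.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One row of Gaussian noise per perturbation pair.
    eps = rng.standard_normal((n_pairs, theta.size))
    # Evaluate each perturbation and its mirror image.
    f_plus = np.array([f(theta + sigma * e) for e in eps])
    f_minus = np.array([f(theta - sigma * e) for e in eps])
    # Weight each noise direction by its return difference, then rescale by sigma.
    return ((f_plus - f_minus)[:, None] * eps).sum(axis=0) / (2 * n_pairs * sigma)

def toy_reward(x):
    # Assumed toy objective with its maximum at the origin.
    return -np.sum(x ** 2)

theta = np.ones(5)
true_grad = -2.0 * theta
grad = es_gradient(toy_reward, theta, sigma=0.02, n_pairs=2000,
                   rng=np.random.default_rng(0))
cosine = grad @ true_grad / (np.linalg.norm(grad) * np.linalg.norm(true_grad))
print("cosine similarity to true gradient:", cosine)
```

On a noise-free quadratic like this one, sigma cancels out of the antithetic estimate, so the toy cannot say anything about whether 0.02 is a good value. The choice of SD would presumably matter more when returns are noisy or the reward landscape is rough.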

It also seems they run around 10000 episodes per gradient estimate (one optimization step). I was wondering how the researchers arrived at these values. Both seem critical to the performance and efficiency of this type of RL. Are there any rough guidelines or intuitions for setting them? Or any empirical evidence or studies to reference?
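One rough way to build intuition for the sample count is to measure how well the estimate aligns with a known gradient as the number of perturbations grows. The sketch below does this on an assumed toy quadratic, where the true gradient is available in closed form; `reward`, `es_gradient`, and `mean_cosine` are hypothetical names, and the dimensions and sample counts are arbitrary choices rather than values from the repo or paper.

```python
import numpy as np

def reward(batch):
    # Hypothetical stand-in for an episode return: one value per parameter row.
    return -np.sum(batch ** 2, axis=1)

def es_gradient(theta, sigma, n_pairs, rng):
    # Antithetic ES estimate using n_pairs mirrored Gaussian perturbations.
    eps = rng.standard_normal((n_pairs, theta.size))
    diff = reward(theta + sigma * eps) - reward(theta - sigma * eps)
    return diff @ eps / (2 * n_pairs * sigma)

def mean_cosine(n_pairs, sigma=0.02, dim=20, trials=20, seed=0):
    # Average alignment between the ES estimate and the exact gradient.
    rng = np.random.default_rng(seed)
    theta = np.ones(dim)
    true_grad = -2.0 * theta  # exact gradient of the toy reward at theta
    sims = []
    for _ in range(trials):
        g = es_gradient(theta, sigma, n_pairs, rng)
        sims.append(g @ true_grad / (np.linalg.norm(g) * np.linalg.norm(true_grad)))
    return float(np.mean(sims))

results = {n: mean_cosine(n) for n in (10, 100, 1000)}
for n, c in results.items():
    print(f"{n:>5} antithetic pairs -> mean cosine similarity {c:.3f}")
```

The alignment should improve as the number of pairs grows, and the number needed for a given alignment grows with the parameter dimension. The same kind of check could be run on a real task by comparing estimates from a small sample budget against one from a much larger budget, though that only gives a relative picture.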
