[bug-fix] When agent isn't training, don't clear update buffer #5205
Conversation
@@ -104,7 +104,9 @@ def save_replay_buffer(self) -> None:
         Save the training buffer's update buffer to a pickle file.
         """
         filename = os.path.join(self.artifact_path, "last_replay_buffer.hdf5")
-        logger.info(f"Saving Experience Replay Buffer to {filename}")
+        logger.info(
+            f"Saving Experience Replay Buffer to {filename} ({os.path.getsize(filename)} bytes)"
nit: Isn't this the old size? Maybe better to print after saving.
This shouldn't have been merged - let me revert. It will actually crash, since initially there is no last_replay_buffer.hdf5.
Updated to two messages:
2021-03-31 14:28:45 INFO [trainer.py:107] Saving Experience Replay Buffer to results/ppo/PushBlock/last_replay_buffer.hdf5...
2021-03-31 14:28:45 INFO [trainer.py:112] Saved Experience Replay Buffer (1591310 bytes).
This way, users won't wonder why the program is stuck when the replay buffer is gigabytes in size.
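For reference, a minimal sketch of the two-message pattern described above, assuming the `artifact_path` attribute and `update_buffer.save_to_file` helper from the surrounding trainer code:

```python
import logging
import os

logger = logging.getLogger(__name__)

def save_replay_buffer(self) -> None:
    """
    Save the training buffer's update buffer to an HDF5 file.
    """
    filename = os.path.join(self.artifact_path, "last_replay_buffer.hdf5")
    # Log before the (potentially slow) write so users see progress
    # instead of a seemingly hung program.
    logger.info(f"Saving Experience Replay Buffer to {filename}...")
    with open(filename, "wb") as file_object:
        self.update_buffer.save_to_file(file_object)
    # Query the size only after the file exists on disk: calling
    # os.path.getsize() before saving would raise FileNotFoundError
    # on the very first save, when no last_replay_buffer.hdf5 exists.
    logger.info(f"Saved Experience Replay Buffer ({os.path.getsize(filename)} bytes).")
```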
        trainer._append_to_update_buffer(agentbuffer_trajectory)
        assert trainer.update_buffer.num_experiences == (i + 1) * time_horizon

        # Check fhat if we append after stopping training, nothing happens.
Suggested change:
- # Check fhat if we append after stopping training, nothing happens.
+ # Check that if we append after stopping training, nothing happens.
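As context for the test excerpt above, a hedged sketch of how the no-op behavior might be asserted; the way the trainer is driven past its step budget (the `_step` attribute here) is a hypothetical stand-in, not the repo's actual API:

```python
# Hypothetical setup: push the trainer past max_steps so that
# should_still_train becomes False (attribute names are assumptions).
trainer._step = trainer.trainer_settings.max_steps + 1
assert not trainer.should_still_train

# Appending after training has stopped should neither grow the buffer
# nor (unlike before this PR) clear it.
size_before = trainer.update_buffer.num_experiences
trainer._append_to_update_buffer(agentbuffer_trajectory)
assert trainer.update_buffer.num_experiences == size_before
```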
* Don't clear update buffer, but don't append to it either
* Update changelog
* Address comments
* Make experience replay buffer saving more verbose

(cherry picked from commit 63e7ad4)
Proposed change(s)
Previously, we cleared the replay buffer at each timestep once a trainer was done training. This avoided a large memory leak when one trainer had finished but another had not. However, it also meant that SAC ended the run with no replay buffer to save.
We now leave the buffer intact and simply stop appending to it.
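A minimal sketch of the guarded append, assuming the `should_still_train` property and the `AgentBuffer.resequence_and_append` helper used elsewhere in the trainers; the simplified signature is an assumption:

```python
def _append_to_update_buffer(self, agentbuffer_trajectory) -> None:
    """
    Append a trajectory to the update buffer, but only while the
    trainer is still training. Once max_steps is reached, we stop
    appending (preventing unbounded growth) without clearing the
    buffer, so SAC can still save it at the end of the run.
    """
    if self.should_still_train:
        # Reorder into training sequences and append to the shared
        # update buffer (simplified call; real code also passes a
        # training_length for recurrent networks).
        agentbuffer_trajectory.resequence_and_append(self.update_buffer)
```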
Useful links (GitHub issues, JIRA tickets, ML-Agents forum threads etc.)
JIRA MLA-1251
Verified with PushBlock:
Types of change(s)
Checklist
Other comments