Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug-fix] When agent isn't training, don't clear update buffer #5205

Merged
merged 4 commits into from
Apr 1, 2021

Conversation

ervteng
Copy link
Contributor

@ervteng ervteng commented Mar 31, 2021

Proposed change(s)

Previously, we cleared the replay buffer at each timestep if the trainer was done training. This was to avoid a large memory leak if one trainer was done, but the other wasn't. However, this meant that SAC would just not have a replay buffer at the end of the run.

We now don't clear the buffer, but rather just don't append to it.

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

JIRA MLA-1251

Verified with pushblock:

2021-03-31 11:35:27 INFO [trainer.py:110] Saving Experience Replay Buffer to results/ppo/PushBlock/last_replay_buffer.hdf5 (611283 bytes)
2021-03-31 11:35:27 INFO [trainer.py:110] Saving Experience Replay Buffer to results/ppo/PushBlock/last_replay_buffer.hdf5 (611283 bytes)
2021-03-31 11:35:27 INFO [trainer.py:110] Saving Experience Replay Buffer to results/ppo/PushBlock/last_replay_buffer.hdf5 (611283 bytes)
2021-03-31 11:35:27 INFO [trainer.py:110] Saving Experience Replay Buffer to results/ppo/PushBlock/last_replay_buffer.hdf5 (611283 bytes)
2021-03-31 11:35:27 INFO [trainer_controller.py:81] Saved Model

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

@ervteng ervteng requested a review from chriselion March 31, 2021 15:43
@@ -104,7 +104,9 @@ def save_replay_buffer(self) -> None:
Save the training buffer's update buffer to a pickle file.
"""
filename = os.path.join(self.artifact_path, "last_replay_buffer.hdf5")
logger.info(f"Saving Experience Replay Buffer to {filename}")
logger.info(
f"Saving Experience Replay Buffer to {filename} ({os.path.getsize(filename)} bytes)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Isn't this the old size? Maybe better to print after saving.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't have been merged - let me revert. It will actually crash since initially there is no last_replay_buffer.hdf5

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to two messages:

2021-03-31 14:28:45 INFO [trainer.py:107] Saving Experience Replay Buffer to results/ppo/PushBlock/last_replay_buffer.hdf5...
2021-03-31 14:28:45 INFO [trainer.py:112] Saved Experience Replay Buffer (1591310 bytes).

So that users won't be wondering why the program is stuck (if the replay buffer is gigs in size).

trainer._append_to_update_buffer(agentbuffer_trajectory)
assert trainer.update_buffer.num_experiences == (i + 1) * time_horizon

# Check fhat if we append after stopping training, nothing happens.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Check fhat if we append after stopping training, nothing happens.
# Check that if we append after stopping training, nothing happens.

@ervteng ervteng merged commit 63e7ad4 into main Apr 1, 2021
@delete-merged-branch delete-merged-branch bot deleted the develop-fix-sac-save branch April 1, 2021 14:54
ervteng pushed a commit that referenced this pull request Apr 8, 2021
* Don't clear update buffer, but don't append to it either

* Update changelog

* Address comments

* Make experience replay buffer saving more verbose

(cherry picked from commit 63e7ad4)
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 1, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants