[bug-fix] When agent isn't training, don't clear update buffer #5205
Conversation
@@ -104,7 +104,9 @@ def save_replay_buffer(self) -> None:
         Save the training buffer's update buffer to a pickle file.
         """
         filename = os.path.join(self.artifact_path, "last_replay_buffer.hdf5")
-        logger.info(f"Saving Experience Replay Buffer to {filename}")
+        logger.info(
+            f"Saving Experience Replay Buffer to {filename} ({os.path.getsize(filename)} bytes)"
nit: Isn't this the old size? Maybe better to print after saving.
This shouldn't have been merged - let me revert. It will actually crash, since initially there is no last_replay_buffer.hdf5.
Updated to two messages:
2021-03-31 14:28:45 INFO [trainer.py:107] Saving Experience Replay Buffer to results/ppo/PushBlock/last_replay_buffer.hdf5...
2021-03-31 14:28:45 INFO [trainer.py:112] Saved Experience Replay Buffer (1591310 bytes).
This way, users won't wonder why the program is stuck when the replay buffer is gigabytes in size.
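For reference, a minimal sketch of the two-message pattern described above, assuming the `artifact_path` attribute and `update_buffer.save_to_file` helper from the surrounding trainer code:

```python
import logging
import os

logger = logging.getLogger(__name__)

def save_replay_buffer(self) -> None:
    """
    Save the training buffer's update buffer to an HDF5 file.
    """
    filename = os.path.join(self.artifact_path, "last_replay_buffer.hdf5")
    # Log before the (potentially slow) write so users see progress
    # instead of a seemingly hung program.
    logger.info(f"Saving Experience Replay Buffer to {filename}...")
    with open(filename, "wb") as file_object:
        self.update_buffer.save_to_file(file_object)
    # Query the size only after the file exists on disk: calling
    # os.path.getsize() before saving would raise FileNotFoundError
    # on the very first save, when no last_replay_buffer.hdf5 exists.
    logger.info(f"Saved Experience Replay Buffer ({os.path.getsize(filename)} bytes).")
```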
        trainer._append_to_update_buffer(agentbuffer_trajectory)
        assert trainer.update_buffer.num_experiences == (i + 1) * time_horizon

        # Check fhat if we append after stopping training, nothing happens.
Suggested change:
- # Check fhat if we append after stopping training, nothing happens.
+ # Check that if we append after stopping training, nothing happens.
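As context for the test excerpt above, a hedged sketch of how the no-op behavior might be asserted; the way the trainer is driven past its step budget (the `_step` attribute here) is a hypothetical stand-in, not the repo's actual API:

```python
# Hypothetical setup: push the trainer past max_steps so that
# should_still_train becomes False (attribute names are assumptions).
trainer._step = trainer.trainer_settings.max_steps + 1
assert not trainer.should_still_train

# Appending after training has stopped should neither grow the buffer
# nor (unlike before this PR) clear it.
size_before = trainer.update_buffer.num_experiences
trainer._append_to_update_buffer(agentbuffer_trajectory)
assert trainer.update_buffer.num_experiences == size_before
```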
* Don't clear update buffer, but don't append to it either
* Update changelog
* Address comments
* Make experience replay buffer saving more verbose

(cherry picked from commit 63e7ad4)
Proposed change(s)
Previously, we cleared the replay buffer at each timestep once a trainer was done training. This avoided a large memory leak when one trainer had finished but another had not. However, it also meant that SAC ended the run with no replay buffer to save.
We now leave the buffer intact and simply stop appending to it.
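A minimal sketch of the guarded append, assuming the `should_still_train` property and the `AgentBuffer.resequence_and_append` helper used elsewhere in the trainers; the simplified signature is an assumption:

```python
def _append_to_update_buffer(self, agentbuffer_trajectory) -> None:
    """
    Append a trajectory to the update buffer, but only while the
    trainer is still training. Once max_steps is reached, we stop
    appending (preventing unbounded growth) without clearing the
    buffer, so SAC can still save it at the end of the run.
    """
    if self.should_still_train:
        # Reorder into training sequences and append to the shared
        # update buffer (simplified call; real code also passes a
        # training_length for recurrent networks).
        agentbuffer_trajectory.resequence_and_append(self.update_buffer)
```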
Useful links (GitHub issues, JIRA tickets, ML-Agents forum threads etc.)
JIRA MLA-1251
Verified with PushBlock:
Types of change(s)
Checklist
Other comments