**Note:** This repository is in maintenance mode. No new features will be added, but bugfixes and contributions are welcome. Please create a pull request with any fixes you have!
# Dream to Control: Learning Behaviors by Latent Imagination
Paper: https://arxiv.org/abs/1912.01603
Project Website: https://danijar.com/project/dreamer/
TensorFlow 2 implementation: https://github.com/danijar/dreamer
TensorFlow 1 implementation: https://github.com/google-research/dreamer
Task | Average Return @ 1M | Dreamer Paper @ 1M |
---|---|---|
Acrobot Swingup | 69.54 | ~300 |
Cartpole Balance | 877.5 | ~990 |
Cartpole Balance Sparse | 814 | ~900 |
Cartpole Swingup | 633.6 | ~800 |
Cup Catch | 885.1 | ~990 |
Finger Turn Hard | 212.8 | ~550 |
Hopper Hop | 219 | ~250 |
Hopper Stand | 511.6 | ~990 |
Pendulum Swingup | 724.9 | ~760 |
Quadruped Run | 112.4 | ~450 |
Quadruped Walk | 52.82 | ~650 |
Reacher Easy | 962.8 | ~950 |
Walker Stand | 956.8 | ~990 |
Table 1. Dreamer PyTorch vs. Paper Implementation
- 1 random seed for PyTorch, 5 for the paper
- Code @ commit ccea6ae
- Training takes roughly 37 hours per 1M steps on a P100 GPU and 20 hours on a V100
- Install Python 3.11
- Install Python Poetry
```bash
# clone the repo with the rlpyt submodule
git clone --recurse-submodules https://github.com/juliusfrost/dreamer-pytorch.git
cd dreamer-pytorch

# enter the setup directory for your platform (pick one)
cd setup/windows_cu118  # Windows
cd setup/linux_cu118    # Linux

# install with poetry
poetry install

# or install with pip
pip install -r requirements.txt
```
To run experiments on Atari, run `python main.py` and add any extra arguments you would like. For example, to run with a single GPU, set `--cuda-idx 0`.

To run experiments on DeepMind Control, run `python main_dmc.py`. You can also set any extra arguments here.

Experiments are automatically stored in `data/local/yyyymmdd/run_#`. You can use TensorBoard to keep track of your experiment: run `tensorboard --logdir=data`.
If you have trouble reproducing any results, please raise a GitHub issue with your logs and results. Otherwise, if you have success, please share your trained model weights with us and with the broader community!
To run tests:

```bash
pytest tests
```

If you want additional code coverage information:

```bash
pytest tests --cov=dreamer
```
- `main.py`: run Atari experiment
- `main_dmc.py`: run DeepMind Control experiment
- `dreamer/`: Dreamer code
  - `agents/`: agent code used in sampling
    - `atari_dreamer_agent.py`: Atari agent
    - `dmc_dreamer_agent.py`: DeepMind Control agent
    - `dreamer_agent.py`: basic sampling agent, exploration, contains shared methods
  - `algos/`: algorithm-specific code
    - `dreamer_algo.py`: optimization algorithm, loss functions, hyperparameters
    - `replay.py`: replay buffer
  - `envs/`: environment-specific code
    - `action_repeat.py`: action repeat wrapper, ported from TF2 Dreamer
    - `atari.py`: Atari environments, ported from TF2 Dreamer
    - `dmc.py`: DeepMind Control Suite environment, ported from TF2 Dreamer
    - `env.py`: base classes for environments
    - `modified_atari.py`: unused Atari environment from rlpyt
    - `normalize_actions.py`: normalize-actions wrapper, ported from TF2 Dreamer
    - `one_hot.py`: one-hot action wrapper, ported from TF2 Dreamer
    - `time_limit.py`: time-limit wrapper, ported from TF2 Dreamer
    - `wrapper.py`: base environment wrapper class
  - `experiments/`: currently not used
  - `models/`: all models used in the agent
    - `action.py`: action model
    - `agent.py`: summarizes all models for the agent module
    - `dense.py`: dense fully connected models, used for the reward, value, and discount models
    - `distribution.py`: distributions, TanH bijector
    - `observation.py`: observation model
    - `rnns.py`: recurrent state-space model
  - `utils/`: utility functions
    - `logging.py`: logging videos
    - `module.py`: freezing parameters
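To illustrate what two of the smaller utilities above do, here is a minimal, self-contained sketch of an action-repeat wrapper (in the spirit of `dreamer/envs/action_repeat.py`) and a parameter-freezing context manager (in the spirit of `dreamer/utils/module.py`). The names `ActionRepeat` and `freeze_parameters`, and the gym-style `reset()`/`step()` interface, are assumptions for illustration, not the repo's exact API; the toy environment and stub `Parameter` stand in for a real environment and `torch.nn.Parameter` so the sketch runs without dependencies.

```python
from contextlib import contextmanager


class ActionRepeat:
    """Repeat each action a fixed number of times, summing rewards.

    Sketch of the idea behind dreamer/envs/action_repeat.py
    (names and interface are assumptions, not the repo's exact API).
    """

    def __init__(self, env, amount=2):
        self.env = env
        self.amount = amount

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        obs, done, info = None, False, {}
        for _ in range(self.amount):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:  # stop early if the episode ends mid-repeat
                break
        return obs, total_reward, done, info


@contextmanager
def freeze_parameters(params):
    """Temporarily disable gradients, then restore the original flags.

    Hypothetical name; works on anything with a requires_grad
    attribute, e.g. torch parameters.
    """
    original = [p.requires_grad for p in params]
    try:
        for p in params:
            p.requires_grad = False
        yield
    finally:
        for p, flag in zip(params, original):
            p.requires_grad = flag


# --- toy stand-ins so the sketch runs without gym or torch ---

class CountingEnv:
    """Returns reward 1.0 per step and terminates after 5 steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 5, {}


class Parameter:
    def __init__(self):
        self.requires_grad = True


env = ActionRepeat(CountingEnv(), amount=3)
env.reset()
obs, reward, done, info = env.step(0)
print(obs, reward, done)  # 3 3.0 False

params = [Parameter(), Parameter()]
with freeze_parameters(params):
    frozen = [p.requires_grad for p in params]
print(frozen, [p.requires_grad for p in params])  # [False, False] [True, True]
```

Both pieces mirror choices in the Dreamer paper: DeepMind Control experiments use a small action repeat, and the world-model parameters are frozen while computing the actor and value losses, which is the kind of job a freezing helper like `utils/module.py` handles.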