This suite implements several model-free off-policy deep reinforcement learning algorithms for discrete and continuous action spaces in PyTorch.
Name | Single-/Multi-Agent | Action Space | Source |
---|---|---|---|
DQN | Single | Discrete | Mnih et al. 2015 |
Double DQN | Single | Discrete | van Hasselt et al. 2016 |
Bootstrapped DQN | Single | Discrete | Osband et al. 2016 |
Ensemble DQN | Single | Discrete | Anschel et al. 2017 |
MaxMin DQN | Single | Discrete | Lan et al. 2020 |
SCDQN | Single | Discrete | Zhu et al. 2021 |
ACCDDQN | Single | Discrete | Jiang et al. 2021 |
KE-BootDQN | Single | Discrete | Waltz, Okhrin 2022 |
DDPG | Single | Continuous | Lillicrap et al. 2015 |
LSTM-DDPG | Single | Continuous | Meng et al. 2021 |
TD3 | Single | Continuous | Fujimoto et al. 2018 |
LSTM-TD3 | Single | Continuous | Meng et al. 2021 |
SAC | Single | Continuous | Haarnoja et al. 2019 |
LSTM-SAC | Single | Continuous | Own implementation following Meng et al. 2021 |
TQC | Single | Continuous | Kuznetsov et al. 2020 |
MADDPG | Multi | Continuous | Lowe et al. 2017 |
MATD3 | Multi | Continuous | Ackermann et al. 2019 |
DiscMADDPG | Multi | Discrete | Gumbel-Softmax discretization of MADDPG |
DiscMATD3 | Multi | Discrete | Gumbel-Softmax discretization of MATD3 |
To use the basic functions of this package, you need at least the following installed:

To use the package to its full capabilities, it is recommended to also install the following dependencies:
The package is set up as an editable install, which makes prototyping easy and does not require rebuilding the package after every change. Install it using pip:
```bash
$ git clone https://github.com/MarWaltz/TUD_RL.git
$ cd TUD_RL/
$ pip install -e .
```
Note that a normal package install via pip is not supported at the moment and will lead to import errors.
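As an optional sanity check, the editable install can be verified by importing the package from outside the repository folder (a quick sketch, not part of the official workflow):

```bash
$ python -c "import tud_rl; print(tud_rl.__file__)"
```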
To train in an environment using this package, you must specify a training configuration .yaml file and place it in one of the two folders in `/tud_rl/configs`, depending on the type of action space (discrete or continuous). You will also find a variety of example configuration files in these folders. For increased flexibility, please familiarize yourself with the parameters each algorithm offers (see the sketch below).
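For instance, hyperparameters can be overridden per agent in the `agent` section of a config file. The keys below are purely hypothetical and serve only as illustration; the example files shipped in the config folders show the parameter names each agent actually accepts:

```yaml
agent:
  DQN:
    lr: 0.0001      # hypothetical key: learning-rate override
    batch_size: 32  # hypothetical key: batch-size override
```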
The recommended way to train or visualize your environment is to run the `tud_rl` package as a module using the `python -m` flag.
To run the package, you have to supply the following flags to the module:

`-m`: Training mode, which can be either `train` or `visualize`. If you want to visualize your environment, you must ensure that trained weights are supplied in the config file.
For discrete training, the config entry looks like:
```yaml
---
dqn_weights: /path/to/weights.pth
```
For continuous training, you must supply both actor and critic weights:
```yaml
---
actor_weights: /path/to/actor_weights.pth
critic_weights: /path/to/critic_weights.pth
```
`-c`: Name of your configuration file, placed in either `/tud_rl/configs/discrete_actions` or `/tud_rl/configs/continuous_actions`.
`-a`: Name of the agent you want to use for training or visualization. The specified agent must be present in your configuration file.
```bash
$ python -m tud_rl -m train -c myconfig.yaml -a DDQN
```
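Analogously, a trained agent can be watched by switching the mode to `visualize` (assuming the weight entries described above are set in your config file):

```bash
$ python -m tud_rl -m visualize -c myconfig.yaml -a DDQN
```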
This package provides an interface to specify your own custom training environment based on the OpenAI Gym framework. Once this is done, no further adjustment is needed and you can start training as described in the section above.
To integrate your own environment, you have to create a new file in `/tud_rl/envs/_envs`. There you need to specify a class for your environment that implements at least three methods, as seen in the following blueprint:
```python
# This file is named Dummy.py
import gym

class MyEnv(gym.Env):
    def __init__(self):
        super().__init__()
        """Your code"""

    def reset(self):
        """Reset your environment and return the initial observation."""
        pass

    def step(self, action):
        """Perform one step in the environment."""
        pass

    def render(self):  # optional
        """Render your environment to an output."""
        pass
```
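As a concrete illustration, here is a minimal sketch of how such a class could be filled in. The task (keeping a point near the origin on a line) and all numbers are invented for demonstration; the point is the classic gym API contract: `reset` returns the initial observation, and `step` returns the tuple `(observation, reward, done, info)`:

```python
# Hypothetical example environment: keep a point close to the origin.
import gym
import numpy as np
from gym import spaces

class PointEnv(gym.Env):
    def __init__(self):
        super().__init__()
        # One-dimensional observation, two discrete actions (left/right)
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(1,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)
        self.pos = 0.0
        self.t = 0

    def reset(self):
        self.pos = float(np.random.uniform(-1.0, 1.0))
        self.t = 0
        return np.array([self.pos], dtype=np.float32)

    def step(self, action):
        self.pos = float(np.clip(self.pos + (0.1 if action == 1 else -0.1), -1.0, 1.0))
        self.t += 1
        reward = -abs(self.pos)  # the closer to the origin, the better
        done = self.t >= 100     # fixed episode length
        return np.array([self.pos], dtype=np.float32), reward, done, {}
```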
See this blog article for a detailed explanation of how to set up your own gym environment.
Once your environment is specified, you need to register it with gym in order to add it to the list of callable environments. The registration is done in the `/tud_rl/__init__.py` file by choosing the name your environment will be called with and the entry point that tells gym where your custom environment is located (`loc` is the fixed base location, while the rest is the class name of your environment):
```python
register(
    id="MyEnv-v0",
    entry_point=loc + "MyEnv",
)
```
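Once the registration has run (it is executed on `import tud_rl`, since it lives in the package's `__init__.py`), the environment should be constructible by its id via the standard gym call; a quick sanity-check sketch:

```python
import gym
import tud_rl  # triggers the register(...) call above

env = gym.make("MyEnv-v0")
obs = env.reset()
```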
You can now select your environment in your configuration file under the `env` category.
Example (incomplete):
```yaml
---
env:
  name: MyEnv-v0
  max_episode_steps: 100
  state_type: feature
  wrappers: []
  wrapper_kwargs: {}
  env_kwargs: {}
  info: ""
agent:
  DQN: {}
```
If you use this code in one of your projects or papers, please cite it as follows:
```bibtex
@misc{TUDRL,
  author       = {Waltz, Martin and Paulig, Niklas},
  title        = {RL Dresden Algorithm Suite},
  year         = {2022},
  publisher    = {GitHub},
  journal      = {GitHub Repository},
  howpublished = {\url{https://github.com/MarWaltz/TUD_RL}}
}
```