A collection of TensorFlow implementations of reinforcement learning models. Models are evaluated in OpenAI Gym environments. Contributions and feedback are more than welcome. Disclaimer: these implementations are intended for educational purposes only (i.e., to learn deep RL myself). There is no guarantee that the models will work on your particular RL problems without changes.
This codebase works in both Python 2.7 and 3.5. The models are implemented in TensorFlow 1.0.
Model | Code | References |
---|---|---|
Cross-Entropy Method | run_cem_cartpole | Cross-entropy method |
Tabular Q Learning | rl/tabular_q_learner | Sutton and Barto, Chapter 8 |
Deep Q Network | rl/neural_q_learner | Mnih et al. |
Double Deep Q Network | rl/neural_q_learner | van Hasselt et al. |
REINFORCE Policy Gradient | rl/pg_reinforce | Sutton et al. |
Actor-critic Policy Gradient | rl/pg_actor_critic | Mnih et al. |
Deep Deterministic Policy Gradient | rl/pg_ddpg | Lillicrap et al. |
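
To give a flavor of the simplest model in the table, here is a minimal sketch of the tabular Q-learning update. It is not the code in `rl/tabular_q_learner`: the function names (`q_learning_update`, `epsilon_greedy`, `chain_step`, `train`), the hyperparameters, and the toy chain environment are all assumptions made for illustration, and the sketch uses only the standard library rather than TensorFlow or Gym.

```python
import random


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Bootstrap from the best next-state value unless the episode has ended.
    target = reward if done else reward + gamma * max(q[next_state])
    td_error = target - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    best = max(q[state])
    # Break ties randomly so an all-zero table still explores.
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def chain_step(state, action, n_states):
    """Hypothetical toy environment: action 1 moves right, action 0 moves left.

    Reaching the right-most state ends the episode with reward 1.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def train(n_states=5, episodes=200, epsilon=0.2, max_steps=100, seed=0):
    """Run epsilon-greedy Q-learning on the toy chain and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < max_steps:
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = chain_step(state, action, n_states)
            q_learning_update(q, state, action, reward, next_state, done)
            state = next_state
            steps += 1
    return q
```

Calling `train()` returns a list of per-state action values. Against a Gym environment with discrete states, the same update would presumably be driven by the environment's own `reset` and `step` calls in place of `chain_step`.
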
License: MIT