hengyuan-hu / Rainbow

A PyTorch implementation of Rainbow DQN agent

PyTorch Implementation of Rainbow

This repo is a partial implementation of the Rainbow agent published by researchers at DeepMind. The implementation is efficient and of high quality: it trains at 350 frames/s on a PC with a 3.5 GHz CPU and a GTX 1080 GPU.

Rainbow is a deep Q-learning agent that combines several existing techniques, such as dueling DQN and distributional DQN. This repo currently implements the following DQN variants:

  • Double DQN
  • Dueling DQN
  • Distributional DQN
  • Noisy Net

It will need the following extensions to become a full "Rainbow":

  • Multi-step learning
  • Prioritized Replay
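
Multi-step learning replaces the one-step target with the discounted sum of the next n rewards plus a bootstrapped value n steps ahead. Since it is not implemented in this repo yet, the following is only a minimal sketch of that target in plain Python. The function name `n_step_return` and its arguments are hypothetical and do not come from this codebase.

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Return r_t + gamma * r_{t+1} + ... + gamma^n * bootstrap_value.

    rewards: the n rewards [r_t, ..., r_{t+n-1}] collected after state s_t.
    bootstrap_value: an estimate of max_a Q(s_{t+n}, a); pass 0.0 if the
        episode ended within these n steps.
    """
    n = len(rewards)
    discounted = sum((gamma ** k) * r for k, r in enumerate(rewards))
    return discounted + (gamma ** n) * bootstrap_value


# 1.0 + 0.5 * 0.0 + 0.25 * 2.0 + 0.125 * 5.0
print(n_step_return([1.0, 0.0, 2.0], bootstrap_value=5.0, gamma=0.5))  # 2.125
```

With a single reward this reduces to the usual one-step Q-learning target.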

Hyperparameters

The hyperparameters in this repo follow those described in the Rainbow paper as closely as possible. However, some differences may remain where we misunderstood the paper.

Performance

A DQN agent often takes days to train. As a sanity check, we can train an agent to play a simple game, "boxing". The following is the learning curve of a dueling double DQN trained on boxing.

The agent almost solves boxing after around 12M frames, which is a good sign that the implementation is working.
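
For readers unfamiliar with the two ideas behind "dueling double DQN", here is a minimal sketch in plain Python. It is not taken from this repo: the function names are hypothetical, and Q-values are plain lists rather than PyTorch tensors to keep the example self-contained.

```python
def dueling_q_values(state_value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN: choose the next action with the online network,
    then evaluate that action with the target network."""
    if done:
        return reward  # no bootstrapping past the end of an episode
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))  # [0.0, 1.0, 2.0]
print(double_dqn_target(1.0, False, [0.1, 0.9], [2.0, 4.0], gamma=0.5))  # 3.0
```

Separating action selection from action evaluation is what reduces the overestimation that a plain max over target Q-values tends to produce.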

To test distributional DQN and Noisy Net, the agent is trained on "breakout", since distributional DQN performs significantly better than other methods on this game, quickly reaching scores above 400 while other DQN methods struggle to do so.
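
In distributional DQN the network predicts, for each action, a probability distribution over a fixed set of return values ("atoms") instead of a single Q-value. Actions are still chosen by the expected return. The sketch below shows only that step, in plain Python. The helper names are hypothetical, and the support of 51 atoms on [-10, 10] is the setting commonly used for categorical DQN, assumed here rather than read from this repo.

```python
def make_support(v_min=-10.0, v_max=10.0, num_atoms=51):
    """Evenly spaced return values (atoms) between v_min and v_max."""
    step = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * step for i in range(num_atoms)]


def expected_q(probs, support):
    """Expected return of one action: sum over atoms of p_i * z_i."""
    return sum(p * z for p, z in zip(probs, support))


def greedy_action(action_probs, support):
    """Pick the action whose predicted distribution has the highest mean."""
    q_values = [expected_q(probs, support) for probs in action_probs]
    return max(range(len(q_values)), key=lambda a: q_values[a])


support = [0.0, 1.0, 2.0]  # tiny support, for illustration only
dists = [[0.8, 0.1, 0.1],  # action 0: mass mostly on low returns
         [0.1, 0.1, 0.8]]  # action 1: mass mostly on high returns
print(greedy_action(dists, support))  # 1
```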

From the figure, we see that the agent reaches scores above 400 quickly and steadily. Note that the numbers reported in published papers come from training the agent for 200M frames, whereas here it is trained for only 50M frames because of the computation cost.
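
To put the computation cost in perspective, the short calculation below converts frame budgets into wall-clock time at the 350 frames/s quoted at the top of this README. It assumes the throughput stays constant for the whole run, which is a simplification.

```python
FRAMES_PER_SECOND = 350  # training speed quoted in this README


def training_hours(num_frames, fps=FRAMES_PER_SECOND):
    """Wall-clock hours needed to process num_frames at a constant fps."""
    return num_frames / fps / 3600


print(f"50M frames:  {training_hours(50e6):.1f} hours")      # 39.7 hours
print(f"200M frames: {training_hours(200e6) / 24:.1f} days")  # 6.6 days
```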

Figures here are smoothed.
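
The exact smoothing method used for the figures is not stated here. As one plausible approach, assumed rather than confirmed, a trailing moving average over episode scores looks like this (the function name `moving_average` is hypothetical):

```python
def moving_average(values, window):
    """Trailing moving average; the first points average over what exists so far."""
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            running_sum -= values[i - window]  # drop the value leaving the window
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


print(moving_average([1.0, 2.0, 3.0, 4.0], window=2))  # [1.0, 1.5, 2.5, 3.5]
```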

Future Work

We plan to implement multi-step learning and prioritized replay. Also, the current implementation uses a simple wrapper around the Arcade Learning Environment. We may want to switch to OpenAI Gym for better visualization and video recording. Beyond Rainbow, it would also be interesting to include other newer techniques, such as Distributional RL with Quantile Regression.
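
As a rough illustration of what the planned replay extension involves, below is a minimal sketch of proportional prioritized sampling in plain Python. It is not part of this repo: the class `PrioritizedBuffer` and its methods are hypothetical, the defaults `alpha=0.6` and `beta=0.4` are assumed, and sampling is O(N) per batch, whereas practical implementations usually use a sum tree.

```python
import random


class PrioritizedBuffer:
    """Ring buffer that samples transitions in proportion to priority**alpha."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha
        self.items = []
        self.priorities = []
        self.next_index = 0

    def add(self, transition):
        # New transitions get the current max priority so they are replayed soon.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:  # buffer full: overwrite the oldest entry
            self.items[self.next_index] = transition
            self.priorities[self.next_index] = priority
        self.next_index = (self.next_index + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        indices = rng.choices(range(len(self.items)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        # Normalizing by the batch maximum is a simplification assumed here.
        n = len(self.items)
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_weight = max(weights)
        batch = [self.items[i] for i in indices]
        return indices, batch, [w / max_weight for w in weights]

    def update(self, indices, td_errors, eps=1e-6):
        # Priority is the absolute TD error; eps keeps every priority above zero.
        for i, error in zip(indices, td_errors):
            self.priorities[i] = abs(error) + eps
```

A training loop would call `sample`, scale each transition's loss by its importance weight, and then call `update` with the new TD errors.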

Contributions and bug reports are welcome!
