TianhongDai / Reinforcement Learning Algorithms

Licence: MIT
This repository contains PyTorch implementations of most classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, and TRPO. (More algorithms are in progress.)

Programming Languages

python

Projects that are alternatives to or similar to Reinforcement Learning Algorithms

Deep Reinforcement Learning With Pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
Stars: ✭ 1,345 (+215.73%)
Mutual labels:  algorithm, deep-reinforcement-learning, dqn, ppo, actor-critic, trpo
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (-47.89%)
Mutual labels:  deep-reinforcement-learning, dqn, ddpg, actor-critic, trpo, ppo
Machine Learning Is All You Need
🔥🌟 "Machine Learning 格物志": basic ML, DL, and RL code and notes using sklearn, PyTorch, TensorFlow, and Keras and, most importantly, written from scratch! 💪 This repository is ALL You Need!
Stars: ✭ 173 (-59.39%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, actor-critic, ddpg, trpo
Pytorch Drl
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.
Stars: ✭ 233 (-45.31%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, actor-critic, ddpg
Deeprl Tensorflow2
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
Stars: ✭ 319 (-25.12%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, ddpg, trpo
Torchrl
Pytorch Implementation of Reinforcement Learning Algorithms ( Soft Actor Critic(SAC)/ DDPG / TD3 /DQN / A2C/ PPO / TRPO)
Stars: ✭ 90 (-78.87%)
Mutual labels:  algorithm, dqn, ppo, ddpg, trpo
Deeprl algorithms
DeepRL algorithms implementation easy for understanding and reading with Pytorch and Tensorflow 2(DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)
Stars: ✭ 97 (-77.23%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, trpo
Easy Rl
A reinforcement learning tutorial in Chinese; read it online at https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+605.16%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, ddpg
Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
Stars: ✭ 54 (-87.32%)
Mutual labels:  deep-reinforcement-learning, dqn, actor-critic, ppo
Autonomous Learning Library
A PyTorch library for building deep reinforcement learning agents.
Stars: ✭ 425 (-0.23%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, ddpg
Deep Reinforcement Learning Algorithms
31 projects in the framework of Deep Reinforcement Learning algorithms: Q-learning, DQN, PPO, DDPG, TD3, SAC, A2C and others. Each project is provided with a detailed training log.
Stars: ✭ 167 (-60.8%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, ddpg
Deeprl
Modularized Implementation of Deep RL Algorithms in PyTorch
Stars: ✭ 2,640 (+519.72%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, ddpg
Pytorch Rl
Deep Reinforcement Learning with pytorch & visdom
Stars: ✭ 745 (+74.88%)
Mutual labels:  deep-reinforcement-learning, dqn, actor-critic, trpo
Elegantrl
Lightweight, efficient and stable implementations of deep reinforcement learning algorithms using PyTorch.
Stars: ✭ 575 (+34.98%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, ddpg
Minimalrl
Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)
Stars: ✭ 2,051 (+381.46%)
Mutual labels:  deep-reinforcement-learning, dqn, ppo, ddpg
Mushroom Rl
Python library for Reinforcement Learning.
Stars: ✭ 442 (+3.76%)
Mutual labels:  deep-reinforcement-learning, dqn, ddpg, trpo
rl implementations
No description or website provided.
Stars: ✭ 40 (-90.61%)
Mutual labels:  deep-reinforcement-learning, dqn, ddpg, actor-critic
Reinforcement Learning With Tensorflow
Simple reinforcement learning tutorials; 莫烦Python Chinese-language AI teaching
Stars: ✭ 6,948 (+1530.99%)
Mutual labels:  dqn, ppo, actor-critic, ddpg
Tianshou
An elegant PyTorch deep reinforcement learning library.
Stars: ✭ 4,109 (+864.55%)
Mutual labels:  dqn, ppo, ddpg, trpo
Rainy
☔ Deep RL agents with PyTorch☔
Stars: ✭ 39 (-90.85%)
Mutual labels:  deep-reinforcement-learning, dqn, ddpg, ppo

Deep Reinforcement Learning Algorithms

MIT License
This repository implements the classic deep reinforcement learning algorithms using PyTorch. Its aim is to provide clear code for people learning deep reinforcement learning. More algorithms will be added in the future, and the existing code will continue to be maintained.

Current Implementations

  • [x] Deep Q-Learning Network (DQN)
    • [x] Basic DQN
    • [x] Double Q network
    • [x] Dueling Network Architecture
  • [x] Deep Deterministic Policy Gradient (DDPG)
  • [x] Advantage Actor-Critic (A2C)
  • [x] Trust Region Policy Optimization (TRPO)
  • [x] Proximal Policy Optimization (PPO)
  • [ ] Actor Critic using Kronecker-Factored Trust Region (ACKTR)
  • [x] Soft Actor-Critic (SAC)
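
The three DQN variants in the list differ mainly in how the bootstrap target and the Q-values are formed. The following is a minimal, framework-free sketch of those differences, not the repository's actual code: the function names are hypothetical, and plain Python lists stand in for network outputs.

```python
from typing import List, Sequence


def dqn_target(reward: float, done: bool, gamma: float,
               next_q_target: Sequence[float]) -> float:
    """Basic DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, done: bool, gamma: float,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float]) -> float:
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]
```

Decoupling action selection from action evaluation is what lets Double DQN reduce the overestimation that the plain `max` in basic DQN tends to introduce.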

Update Info

🚩 2018-10-17 - In this update, most of the algorithms have been improved and more experiments with plots have been added (except for DDPG). PPO now supports Atari games and MuJoCo environments. TRPO is much more stable and gives better results!

🚩 2019-07-15 - In this update, installing OpenAI Baselines is no longer needed. I have integrated the useful functions into the rl_utils module. DDPG has also been re-implemented and supports more results. The README file has been modified, and the code structure has been slightly adjusted.

🚩 2019-07-26 - In this update, the revised repository is made public. To keep the repository small, I rebuilt it and deleted the previous version, but I will keep a backup on Google Drive.

🚩 2019-11-13 - Changed the code structure of the repo: all algorithms have been moved to the rl_algorithms/ folder. Added the Soft Actor-Critic method; the experiment plots will be added soon.

TODO List

  • [ ] add prioritized experience replay.
  • [x] in the future, we will not use OpenAI Baselines' pre-processing functions.
  • [x] improve DDPG - I have already implemented Hindsight Experience Replay (HER) with DDPG in PyTorch; you can check it out here.
  • [ ] upload pre-trained models to Google Drive (will update soon!).

Requirements

  • pytorch=1.0.1
  • gym=0.12.5
  • mpi4py
  • mujoco-py
  • opencv-python
  • cloudpickle

Installation

  1. Install our rl_utils module:
pip install -e .
  2. Install mujoco: please follow the instructions on the official website.
  3. Install Atari and Box2d:
sudo apt-get install swig  # on macOS, use: brew install swig
pip install gym[atari]
pip install gym[box2d]
pip install box2d box2d-kengz
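
After installing, it can help to confirm that the packages from the requirements list are importable. The snippet below is a small sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the module names are the usual import names for those packages (for example, opencv-python is imported as `cv2` and mujoco-py as `mujoco_py`).

```python
import importlib.util

# Import names for the packages in the requirements list
# (the import name can differ from the pip name).
REQUIRED_MODULES = ["torch", "gym", "mpi4py", "mujoco_py", "cv2", "cloudpickle"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec only locates a top-level module; it does not import it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", missing if missing else "none")
```

This only checks that the modules can be located; it does not verify the pinned versions.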

Instructions

  1. Train the agent (details can be found in each folder):
cd rl_algorithms/<target_algo_folder>/
python train.py --<arguments you need>
  2. Play the demo:
cd rl_algorithms/<target_algo_folder>/
python demo.py --<arguments you need>
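
The flags accepted by train.py and demo.py are defined per algorithm, so check each folder for the real ones. As a rough illustration of how such flags are commonly declared with argparse, here is a sketch in which every flag name and default is a hypothetical example, not taken from the repository:

```python
import argparse


def get_args(argv=None):
    """Parse training flags; every flag name and default here is illustrative."""
    parser = argparse.ArgumentParser(description="hypothetical training arguments")
    parser.add_argument("--env-name", type=str, default="BreakoutNoFrameskip-v4",
                        help="environment id to train on")
    parser.add_argument("--seed", type=int, default=123, help="random seed")
    parser.add_argument("--gamma", type=float, default=0.99, help="discount factor")
    parser.add_argument("--lr", type=float, default=1e-4, help="learning rate")
    # argv=None makes argparse read sys.argv, as a real script would.
    return parser.parse_args(argv)


# Passing an explicit list mimics: python train.py --env-name Hopper-v2 --gamma 0.95
args = get_args(["--env-name", "Hopper-v2", "--gamma", "0.95"])
print(args.env_name, args.gamma, args.seed)
```

Flags that are not passed fall back to their defaults, which is why a short command line is usually enough to start a run.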

Code Structure

  1. rl algorithms:
  • arguments.py: contains the parameters used in training.
  • <rl-name>_agent.py: contains the core of the reinforcement learning algorithm.
  • models.py: the network structures for the policy and value function.
  • utils.py: useful helper functions, such as action selection.
  • train.py: the script to train the agent.
  • demo.py: visualizes the trained agent.
  2. rl_utils module:
  • env_wrapper/: contains the pre-processing functions for the Atari games and wrappers to create environments.
  • experience_replay/: contains the experience replay for the off-policy RL algorithms.
  • logger/: contains functions to record log information during training.
  • mpi_utils/: contains the tools for MPI training.
  • running_filter/: contains the running mean filter functions to normalize observations in the MuJoCo environments.
  • seeds/: contains a function to set the random seeds for reproducible training.
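
To give a feel for what two of these utilities do, here is a minimal standard-library sketch of a uniform experience replay buffer and a running mean/variance normalizer. It assumes scalar observations for brevity; the class names `ReplayBuffer` and `RunningMeanStd` are hypothetical and the code is not the rl_utils implementation.

```python
import math
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO store of transitions with uniform random sampling."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)  # oldest transitions are dropped first

    def add(self, obs, action, reward, next_obs, done):
        self.storage.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Sample without replacement, then regroup into per-field tuples.
        batch = random.sample(list(self.storage), batch_size)
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.storage)


class RunningMeanStd:
    """Running mean and variance of a scalar stream (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x, eps=1e-8):
        # Population variance; eps avoids division by zero early in training.
        var = self.m2 / self.count if self.count > 0 else 1.0
        return (x - self.mean) / math.sqrt(var + eps)
```

Off-policy algorithms such as DQN, DDPG, and SAC sample past transitions from a buffer like this, while the running statistics keep observation scales comparable as training progresses.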

Example Results

1. DQN algorithms

[Figure: dqn_performance]

2. DDPG

[Figure: dueling_network]

3. A2C

[Figure: a2c]

4. TRPO

[Figure: trpo]

5. PPO

[Figure: ppo]

6. SAC

[Figure: sac]

Demos

  • Atari Env (BreakoutNoFrameskip-v4)
  • Box2d Env (BipedalWalker-v2)
  • Mujoco Env (Hopper-v2)

Acknowledgement

Related Papers

[1] A Brief Survey of Deep Reinforcement Learning
[2] The Beta Policy for Continuous Control Reinforcement Learning
[3] Playing Atari with Deep Reinforcement Learning
[4] Deep Reinforcement Learning with Double Q-learning
[5] Dueling Network Architectures for Deep Reinforcement Learning
[6] Continuous control with deep reinforcement learning
[7] Continuous Deep Q-Learning with Model-based Acceleration
[8] Asynchronous Methods for Deep Reinforcement Learning
[9] Trust Region Policy Optimization
[10] Proximal Policy Optimization Algorithms
[11] Soft Actor-Critic Algorithms and Applications
[12] Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
