dongminlee94 / Deep_rl

PyTorch implementations of Deep Reinforcement Learning algorithms (DQN, DDQN, A2C, VPG, TRPO, PPO, DDPG, TD3, SAC, SAC-AEA)

Deep Reinforcement Learning (DRL) Algorithms with PyTorch

This repository contains PyTorch implementations of deep reinforcement learning algorithms. It will soon be updated to include the PyBullet environments!

Algorithms Implemented

  1. Deep Q-Network (DQN) (V. Mnih et al. 2015)
  2. Double DQN (DDQN) (H. Van Hasselt et al. 2015)
  3. Advantage Actor Critic (A2C)
  4. Vanilla Policy Gradient (VPG)
  5. Natural Policy Gradient (NPG) (S. Kakade et al. 2002)
  6. Trust Region Policy Optimization (TRPO) (J. Schulman et al. 2015)
  7. Proximal Policy Optimization (PPO) (J. Schulman et al. 2017)
  8. Deep Deterministic Policy Gradient (DDPG) (T. Lillicrap et al. 2015)
  9. Twin Delayed DDPG (TD3) (S. Fujimoto et al. 2018)
  10. Soft Actor-Critic (SAC) (T. Haarnoja et al. 2018)
  11. SAC with automatic entropy adjustment (SAC-AEA) (T. Haarnoja et al. 2018)

Environments Implemented

  1. Classic control environments (CartPole-v1, Pendulum-v0, etc.) (as described here)
  2. MuJoCo environments (Hopper-v2, HalfCheetah-v2, Ant-v2, Humanoid-v2, etc.) (as described here)
  3. PyBullet environments (HopperBulletEnv-v0, HalfCheetahBulletEnv-v0, AntBulletEnv-v0, HumanoidDeepMimicWalkBulletEnv-v1, etc.) (as described here)

Results (MuJoCo, PyBullet)

MuJoCo environments

Hopper-v2

  • Observation space: 11
  • Action space: 3

HalfCheetah-v2

  • Observation space: 17
  • Action space: 6

Ant-v2

  • Observation space: 111
  • Action space: 8

Humanoid-v2

  • Observation space: 376
  • Action space: 17

PyBullet environments

HopperBulletEnv-v0

  • Observation space: 15
  • Action space: 3

HalfCheetahBulletEnv-v0

  • Observation space: 26
  • Action space: 6

AntBulletEnv-v0

  • Observation space: 28
  • Action space: 8

HumanoidDeepMimicWalkBulletEnv-v1

  • Observation space: 197
  • Action space: 36

Requirements

Usage

The repository's high-level structure is:

├── agents
│   └── common
├── results
│   ├── data
│   └── graphs
└── save_model

1) To train the agents on the environments

To train the agents on the PyBullet environments, follow these steps:

git clone https://github.com/dongminlee94/deep_rl.git
cd deep_rl
python run_bullet.py

For other environments, replace run_bullet.py in the last line with run_cartpole.py, run_pendulum.py, or run_mujoco.py.

To change the configuration of an agent, pass the options on the command line:

python run_bullet.py \
    --env=HumanoidDeepMimicWalkBulletEnv-v1 \
    --algo=sac-aea \
    --phase=train \
    --render=False \
    --load=None \
    --seed=0 \
    --iterations=200 \
    --steps_per_iter=5000 \
    --max_step=1000 \
    --tensorboard=True \
    --gpu_index=0
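
As an illustration of how flags like these could be read, here is a minimal sketch using Python's standard argparse module. It is not taken from the repository: the helper names str2bool, none_or_str, and build_parser are hypothetical, and the defaults simply mirror the values in the command above. The sketch handles one common pitfall: argparse's type=bool treats any non-empty string (including "False") as True, so values such as --render=False need an explicit converter.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False' style strings to bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def none_or_str(value):
    """Map the literal string 'None' to Python None, e.g. for --load=None."""
    return None if value == "None" else value


def build_parser():
    # Assumption: defaults copy the example command; the real scripts may differ.
    parser = argparse.ArgumentParser(description="Sketch of the run_bullet.py flags")
    parser.add_argument("--env", type=str, default="HumanoidDeepMimicWalkBulletEnv-v1")
    parser.add_argument("--algo", type=str, default="sac-aea")
    parser.add_argument("--phase", type=str, choices=["train", "test"], default="train")
    parser.add_argument("--render", type=str2bool, default=False)
    parser.add_argument("--load", type=none_or_str, default=None)
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--iterations", type=int, default=200)
    parser.add_argument("--steps_per_iter", type=int, default=5000)
    parser.add_argument("--max_step", type=int, default=1000)
    parser.add_argument("--tensorboard", type=str2bool, default=True)
    parser.add_argument("--gpu_index", type=int, default=0)
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--algo=td3", "--render=False", "--load=None"])
print(vars(args))
```

With converters like these, --render=False and --load=None arrive in the program as the Python values False and None rather than as non-empty strings.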

2) To watch the learned agents on the above environments

To watch a learned agent on a PyBullet environment, run the script in the test phase with rendering enabled:

python run_bullet.py \
    --env=HumanoidDeepMimicWalkBulletEnv-v1 \
    --algo=sac-aea \
    --phase=test \
    --render=True \
    --load=envname_algoname_... \
    --seed=0 \
    --iterations=200 \
    --steps_per_iter=5000 \
    --max_step=1000 \
    --tensorboard=False \
    --gpu_index=0

Copy the name of the saved model from save_model/envname_algoname_... and paste it in place of envname_algoname_... in the --load option, so that the saved model is loaded.
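
To avoid copying the name by hand, the candidates can be listed with a short helper. This is a sketch under stated assumptions, not part of the repository: find_saved_models is a hypothetical function, and it assumes that saved model names start with the environment name and the algorithm name joined by underscores, following the envname_algoname_... pattern above.

```python
from pathlib import Path


def find_saved_models(save_dir, env_name, algo_name):
    """Return sorted entry names in save_dir that start with '<env>_<algo>_'.

    Assumption: the prefix uses the same strings passed to --env and --algo.
    """
    prefix = f"{env_name}_{algo_name}_"
    root = Path(save_dir)
    if not root.is_dir():
        # No save directory yet, so there is nothing to load.
        return []
    return sorted(p.name for p in root.iterdir() if p.name.startswith(prefix))


# Print every candidate that could be passed to --load.
for name in find_saved_models("save_model", "HumanoidDeepMimicWalkBulletEnv-v1", "sac-aea"):
    print(name)
```

Any printed name can then be pasted after --load= in the test command.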
