dongminlee94 / Deep_rl
PyTorch implementations of Deep Reinforcement Learning algorithms (DQN, DDQN, A2C, VPG, TRPO, PPO, DDPG, TD3, SAC, SAC-AEA)
Stars: ✭ 291
Programming Languages
python
Projects that are alternatives of or similar to Deep rl
Rad
RAD: Reinforcement Learning with Augmented Data
Stars: ✭ 268 (-7.9%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Tensorforce
Tensorforce: a TensorFlow library for applied reinforcement learning
Stars: ✭ 3,062 (+952.23%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Applied Reinforcement Learning
Reinforcement Learning and Decision Making tutorials explained at an intuitive level and with Jupyter Notebooks
Stars: ✭ 229 (-21.31%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Drq
DrQ: Data regularized Q
Stars: ✭ 268 (-7.9%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Learningx
Deep & Classical Reinforcement Learning + Machine Learning Examples in Python
Stars: ✭ 241 (-17.18%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Pytorch A2c Ppo Acktr Gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Stars: ✭ 2,632 (+804.47%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Machine Learning Uiuc
🖥️ CS446: Machine Learning in Spring 2018, University of Illinois at Urbana-Champaign
Stars: ✭ 233 (-19.93%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Pytorch sac
PyTorch implementation of Soft Actor-Critic (SAC)
Stars: ✭ 174 (-40.21%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Roboleague
A car soccer environment inspired by Rocket League for deep reinforcement learning experiments in an adversarial self-play setting.
Stars: ✭ 236 (-18.9%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Learning To Communicate Pytorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
Stars: ✭ 236 (-18.9%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Drl4recsys
Courses on Deep Reinforcement Learning (DRL) and DRL papers for recommender systems
Stars: ✭ 196 (-32.65%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Rlgraph
RLgraph: Modular computation graphs for deep reinforcement learning
Stars: ✭ 272 (-6.53%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Reinforcementlearning.jl
A reinforcement learning package for Julia
Stars: ✭ 192 (-34.02%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Gam
A PyTorch implementation of "Graph Classification Using Structural Attention" (KDD 2018).
Stars: ✭ 227 (-21.99%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Naf Tensorflow
"Continuous Deep Q-Learning with Model-based Acceleration" in TensorFlow
Stars: ✭ 192 (-34.02%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Deep Rl Trading
playing idealized trading games with deep reinforcement learning
Stars: ✭ 228 (-21.65%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Accel Brain Code
The purpose of this repository is to build prototypes as case studies in the context of proof of concept (PoC) and research and development (R&D) that I have written about on my website. The main research topics are auto-encoders in relation to representation learning, statistical machine learning for energy-based models, generative adversarial networks (GANs), deep reinforcement learning such as Deep Q-Networks, semi-supervised learning, and neural network language models for natural language processing.
Stars: ✭ 166 (-42.96%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
2048 Deep Reinforcement Learning
Trained A Convolutional Neural Network To Play 2048 using Deep-Reinforcement Learning
Stars: ✭ 169 (-41.92%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Pytorch Drl
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.
Stars: ✭ 233 (-19.93%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Reinforcement Learning
Minimal and Clean Reinforcement Learning Examples
Stars: ✭ 2,863 (+883.85%)
Mutual labels: reinforcement-learning, deep-reinforcement-learning
Deep Reinforcement Learning (DRL) Algorithms with PyTorch
This repository contains PyTorch implementations of deep reinforcement learning algorithms. The repository will soon be updated to include the PyBullet environments!
Algorithms Implemented
- Deep Q-Network (DQN) (V. Mnih et al. 2015)
- Double DQN (DDQN) (H. Van Hasselt et al. 2015)
- Advantage Actor Critic (A2C)
- Vanilla Policy Gradient (VPG)
- Natural Policy Gradient (NPG) (S. Kakade et al. 2002)
- Trust Region Policy Optimization (TRPO) (J. Schulman et al. 2015)
- Proximal Policy Optimization (PPO) (J. Schulman et al. 2017)
- Deep Deterministic Policy Gradient (DDPG) (T. Lillicrap et al. 2015)
- Twin Delayed DDPG (TD3) (S. Fujimoto et al. 2018)
- Soft Actor-Critic (SAC) (T. Haarnoja et al. 2018)
- SAC with automatic entropy adjustment (SAC-AEA) (T. Haarnoja et al. 2018)
Environments Implemented
- Classic control environments (CartPole-v1, Pendulum-v0, etc.) (as described here)
- MuJoCo environments (Hopper-v2, HalfCheetah-v2, Ant-v2, Humanoid-v2, etc.) (as described here)
- PyBullet environments (HopperBulletEnv-v0, HalfCheetahBulletEnv-v0, AntBulletEnv-v0, HumanoidDeepMimicWalkBulletEnv-v1, etc.) (as described here)
Results (MuJoCo, PyBullet)
MuJoCo environments
Hopper-v2
- Observation space: 11
- Action space: 3
HalfCheetah-v2
- Observation space: 17
- Action space: 6
Ant-v2
- Observation space: 111
- Action space: 8
Humanoid-v2
- Observation space: 376
- Action space: 17
PyBullet environments
HopperBulletEnv-v0
- Observation space: 15
- Action space: 3
HalfCheetahBulletEnv-v0
- Observation space: 26
- Action space: 6
AntBulletEnv-v0
- Observation space: 28
- Action space: 8
HumanoidDeepMimicWalkBulletEnv-v1
- Observation space: 197
- Action space: 36
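The observation and action space sizes listed above are the lengths of the flat observation and action vectors. As a minimal sketch, assuming a Gym-style environment whose `observation_space` and `action_space` expose a `shape` tuple, the sizes could be read as follows. The helper `space_dims` and the stand-in object are hypothetical and not part of this repository; the stand-in only mimics the HalfCheetah-v2 sizes so the snippet runs without Gym or MuJoCo installed.

```python
from types import SimpleNamespace


def space_dims(env):
    """Return (observation_dim, action_dim) for an env with flat, Box-like spaces."""
    obs_dim = env.observation_space.shape[0]
    act_dim = env.action_space.shape[0]
    return obs_dim, act_dim


# Stand-in for a real environment (assumption: a real one would come from gym.make).
fake_env = SimpleNamespace(
    observation_space=SimpleNamespace(shape=(17,)),
    action_space=SimpleNamespace(shape=(6,)),
)

print(space_dims(fake_env))  # (17, 6)
```

With Gym installed, passing the object returned by `gym.make(...)` in place of `fake_env` should work the same way for environments with continuous, one-dimensional spaces.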
Requirements
Usage
The repository's high-level structure is:
├── agents
│   └── common
├── results
│   ├── data
│   └── graphs
└── save_model
1) To train the agents on the environments
To train all the different agents on PyBullet environments, follow these steps:
git clone https://github.com/dongminlee94/deep_rl.git
cd deep_rl
python run_bullet.py
For other environments, change the last line to run_cartpole.py, run_pendulum.py, or run_mujoco.py.
If you want to change the configuration of the agents, pass the options on the command line, for example:
python run_bullet.py \
--env=HumanoidDeepMimicWalkBulletEnv-v1 \
--algo=sac-aea \
--phase=train \
--render=False \
--load=None \
--seed=0 \
--iterations=200 \
--steps_per_iter=5000 \
--max_step=1000 \
--tensorboard=True \
--gpu_index=0
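The flags in the command above mix strings, integers, booleans, and a literal None. The following is a rough sketch of how such flags could be parsed with `argparse`; it is not the repository's actual parser. The helpers `str2bool` and `none_or_str`, the function `build_parser`, and the defaults (copied from the example command) are assumptions made for illustration. The boolean helper is there because `type=bool` in `argparse` treats any non-empty string, including "False", as true.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False' style strings to bool."""
    text = str(value).strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def none_or_str(value):
    """Map the literal string 'None' to None; keep any other string."""
    return None if value == "None" else value


def build_parser():
    # Hypothetical parser mirroring the flags shown in the command above.
    parser = argparse.ArgumentParser(description="Illustrative flag parser")
    parser.add_argument("--env", type=str, default="HumanoidDeepMimicWalkBulletEnv-v1")
    parser.add_argument("--algo", type=str, default="sac-aea")
    parser.add_argument("--phase", type=str, default="train", choices=["train", "test"])
    parser.add_argument("--render", type=str2bool, default=False)
    parser.add_argument("--load", type=none_or_str, default=None)
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--iterations", type=int, default=200)
    parser.add_argument("--steps_per_iter", type=int, default=5000)
    parser.add_argument("--max_step", type=int, default=1000)
    parser.add_argument("--tensorboard", type=str2bool, default=True)
    parser.add_argument("--gpu_index", type=int, default=0)
    return parser


args = build_parser().parse_args(["--phase=test", "--render=True", "--load=None"])
print(args.phase, args.render, args.load)  # test True None
```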
2) To watch the learned agents on the above environments
To watch all the learned agents on PyBullet environments, follow these steps:
python run_bullet.py \
--env=HumanoidDeepMimicWalkBulletEnv-v1 \
--algo=sac-aea \
--phase=test \
--render=True \
--load=envname_algoname_... \
--seed=0 \
--iterations=200 \
--steps_per_iter=5000 \
--max_step=1000 \
--tensorboard=False \
--gpu_index=0
Copy the name of the saved model from save_model/envname_algoname_... and paste it in place of envname_algoname_... in the --load option, so that the saved model is loaded.
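To avoid copying the saved model name by hand, the candidates under save_model could be listed programmatically. This is a minimal sketch that assumes saved names begin with the environment name and the algorithm name joined by underscores, as the envname_algoname_... placeholder suggests; the helper `find_saved_models` is hypothetical and not part of the repository.

```python
from pathlib import Path


def find_saved_models(save_dir, env, algo):
    """Return entry names in save_dir that start with '<env>_<algo>_', sorted by name."""
    prefix = f"{env}_{algo}_"
    root = Path(save_dir)
    if not root.is_dir():
        # Nothing has been saved yet, or the path is wrong.
        return []
    return sorted(p.name for p in root.iterdir() if p.name.startswith(prefix))


# Print the names that could be passed to --load (assumed naming scheme).
for name in find_saved_models("save_model", "HumanoidDeepMimicWalkBulletEnv-v1", "sac-aea"):
    print(name)
```

Any printed name can then be pasted into the --load option of the test command.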