miyosuda / Async_deep_reinforce

Licence: apache-2.0
Asynchronous Methods for Deep Reinforcement Learning

Programming Languages

python

Projects that are alternatives of or similar to Async deep reinforce

Rl a3c pytorch
A3C LSTM Atari with Pytorch plus A3G design
Stars: ✭ 482 (-14.69%)
Mutual labels:  reinforcement-learning, a3c
Rlcycle
A library for ready-made reinforcement learning agents and reusable components for neat prototyping
Stars: ✭ 184 (-67.43%)
Mutual labels:  reinforcement-learning, a3c
Machin
Reinforcement learning library(framework) designed for PyTorch, implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...
Stars: ✭ 145 (-74.34%)
Mutual labels:  reinforcement-learning, a3c
Reinforcement learning
Reinforcement learning tutorials
Stars: ✭ 82 (-85.49%)
Mutual labels:  reinforcement-learning, a3c
Rl4j
Deep Reinforcement Learning for the JVM (Deep-Q, A3C)
Stars: ✭ 330 (-41.59%)
Mutual labels:  reinforcement-learning, a3c
Easy Rl
Reinforcement learning tutorial in Chinese; read online at https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+431.68%)
Mutual labels:  reinforcement-learning, a3c
Tensorflow Rl
Implementations of deep RL papers and random experimentation
Stars: ✭ 176 (-68.85%)
Mutual labels:  reinforcement-learning, a3c
A3c
MXNET + OpenAI Gym implementation of A3C from "Asynchronous Methods for Deep Reinforcement Learning"
Stars: ✭ 9 (-98.41%)
Mutual labels:  reinforcement-learning, a3c
Deeprl Tensorflow2
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
Stars: ✭ 319 (-43.54%)
Mutual labels:  reinforcement-learning, a3c
Pysc2 Agents
This is a simple implementation of DeepMind's PySC2 RL agents.
Stars: ✭ 262 (-53.63%)
Mutual labels:  reinforcement-learning, a3c
Torch Ac
Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO
Stars: ✭ 70 (-87.61%)
Mutual labels:  reinforcement-learning, a3c
Deep Rl Keras
Keras Implementation of popular Deep RL Algorithms (A3C, DDQN, DDPG, Dueling DDQN)
Stars: ✭ 395 (-30.09%)
Mutual labels:  reinforcement-learning, a3c
Policy Gradient Methods
Implementation of Algorithms from the Policy Gradient Family. Currently includes: A2C, A3C, DDPG, TD3, SAC
Stars: ✭ 54 (-90.44%)
Mutual labels:  reinforcement-learning, a3c
Reinforcementlearning Atarigame
Pytorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google Deep Mind's Asynchronous Advantage Actor-Critic (A3C) Algorithm. This is much superior and efficient than DQN and obsoletes it. Can play on many games
Stars: ✭ 118 (-79.12%)
Mutual labels:  reinforcement-learning, a3c
Pytorch A3c
PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
Stars: ✭ 879 (+55.58%)
Mutual labels:  reinforcement-learning, a3c
Minimalrl
Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)
Stars: ✭ 2,051 (+263.01%)
Mutual labels:  reinforcement-learning, a3c
Bombora
My experimentations with Reinforcement Learning in Pytorch
Stars: ✭ 18 (-96.81%)
Mutual labels:  reinforcement-learning, a3c
Slm Lab
Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
Stars: ✭ 904 (+60%)
Mutual labels:  reinforcement-learning, a3c
Reinforcement Learning
Minimal and Clean Reinforcement Learning Examples
Stars: ✭ 2,863 (+406.73%)
Mutual labels:  reinforcement-learning, a3c
Ai Blog
Accompanying repository for Let's make a DQN / A3C series.
Stars: ✭ 351 (-37.88%)
Mutual labels:  reinforcement-learning, a3c

async_deep_reinforce

Asynchronous deep reinforcement learning

About

An attempt to reproduce Google DeepMind's paper "Asynchronous Methods for Deep Reinforcement Learning."

http://arxiv.org/abs/1602.01783

The Asynchronous Advantage Actor-Critic (A3C) method for playing Atari Pong is implemented with TensorFlow. Both the A3C-FF and A3C-LSTM variants are implemented.
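
As a rough illustration of the quantity A3C is named after, the sketch below computes n-step discounted returns and advantages for one short rollout. It is not taken from this repository: the function names, the discount factor, and the sample rewards and values are all assumptions chosen for the example.

```python
import numpy as np


def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1}.

    The recursion is seeded with the critic's value estimate for the
    state that follows the last step of the rollout (0.0 if terminal).
    """
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns, values):
    """Advantage A_t = R_t - V(s_t), used to weight the policy gradient."""
    return returns - np.asarray(values, dtype=np.float64)


# Hypothetical 3-step rollout that ends in a terminal state.
rewards = [0.0, 0.0, 1.0]
values = [0.5, 0.6, 0.7]          # made-up critic outputs V(s_t)
returns = n_step_returns(rewards, bootstrap_value=0.0, gamma=0.9)
adv = advantages(returns, values)
print(returns, adv)
```

Each worker thread would compute something like this over at most LOCAL_T_MAX steps before sending gradients to the shared network.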

After 26 hours of training (A3C-FF), the agent plays like this.

Learning result after 26 hours

Any advice or suggestions are very welcome in the issues thread.

https://github.com/miyosuda/async_deep_reinforce/issues/1

How to build

First, build the multi-thread-ready version of the Arcade Learning Environment. I modified it so that it can run in a multi-threaded environment.

$ git clone https://github.com/miyosuda/Arcade-Learning-Environment.git
$ cd Arcade-Learning-Environment
$ cmake -DUSE_SDL=ON -DUSE_RLGLUE=OFF -DBUILD_EXAMPLES=OFF .
$ make -j 4

$ pip install .

I recommend installing it in a virtualenv environment.

How to run

To train,

$ python a3c.py

To display the result with game play,

$ python a3c_disp.py

Using GPU

To enable the GPU, change the "USE_GPU" flag in "constants.py".
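
A minimal sketch of how a boolean flag like this is commonly turned into a TensorFlow device string. How the training script actually consumes USE_GPU is not shown here, so the helper name select_device and the flag value below are assumptions for illustration only.

```python
# Illustrative value; in the project this flag lives in constants.py.
USE_GPU = True


def select_device(use_gpu):
    """Return a TensorFlow-style device string for the chosen backend."""
    return "/gpu:0" if use_gpu else "/cpu:0"


device = select_device(USE_GPU)
print(device)
# The string could then be passed to a `with tf.device(device):` block
# around graph construction (assumed usage, not copied from the project).
```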

When running with 8 parallel game environments, the speeds of the GPU (GTX980Ti) and CPU (Core i7 6700) were as follows (recorded with the LOCAL_T_MAX=20 setting).

type   A3C-FF               A3C-LSTM
GPU    1722 steps per sec   864 steps per sec
CPU    1077 steps per sec   540 steps per sec
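
To make the comparison concrete, the short calculation below derives the GPU-over-CPU speedup from the figures in the table. Only the four numbers come from the measurements above; the dictionary layout and the helper name gpu_speedup are assumptions for the example.

```python
# Throughput figures copied from the table above (steps per second).
steps_per_sec = {
    "A3C-FF": {"GPU": 1722, "CPU": 1077},
    "A3C-LSTM": {"GPU": 864, "CPU": 540},
}


def gpu_speedup(model):
    """Ratio of GPU throughput to CPU throughput for one model type."""
    rates = steps_per_sec[model]
    return rates["GPU"] / rates["CPU"]


for model in steps_per_sec:
    print(f"{model}: {gpu_speedup(model):.2f}x")
```

On these numbers the GPU is roughly 1.6 times faster than the CPU for both model types.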

Result

Score plots of the local threads for Pong were as follows (with a GTX980Ti).

A3C-LSTM LOCAL_T_MAX = 5

A3C-LSTM T=5

A3C-LSTM LOCAL_T_MAX = 20

A3C-LSTM T=20

Unlike in the original paper, the scores are not averaged using the global network.

Requirements

  • TensorFlow r1.0
  • numpy
  • cv2
  • matplotlib
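
Before training, it can help to confirm that the packages above are importable. The sketch below is one way to do that with the standard library; the module names are assumed to match the list (tensorflow, numpy, cv2, matplotlib), and the check does not verify that the installed TensorFlow is r1.0.

```python
import importlib.util

# Import names assumed to correspond to the requirements listed above.
REQUIRED_MODULES = ["tensorflow", "numpy", "cv2", "matplotlib"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```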

References

This project uses the settings described in muupan's wiki [muupan/async-rl] (https://github.com/muupan/async-rl/wiki)
