miyosuda / Async_deep_reinforce

Licence: apache-2.0

Asynchronous Methods for Deep Reinforcement Learning

Labels

deep-learning tensorflow reinforcement-learning a3c

async_deep_reinforce

Asynchronous deep reinforcement learning

About

An attempt to reproduce Google DeepMind's paper "Asynchronous Methods for Deep Reinforcement Learning."

http://arxiv.org/abs/1602.01783

The Asynchronous Advantage Actor-Critic (A3C) method is implemented with TensorFlow and trained to play Atari Pong. Both the feed-forward (A3C-FF) and LSTM (A3C-LSTM) variants are implemented.
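As a rough illustration of the quantity A3C trains on, the sketch below computes discounted n-step returns and advantages for one short rollout, following the description in the paper. It is not taken from this repository: the function name, argument names, and the default discount factor are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def n_step_returns_and_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    bootstrap_value: float,
    gamma: float = 0.99,  # assumed discount factor, not read from this project
) -> Tuple[List[float], List[float]]:
    """Return (returns, advantages) for one rollout.

    rewards[t] is the reward received at step t, values[t] is the critic's
    estimate V(s_t), and bootstrap_value is V of the state after the last
    step (use 0.0 if the episode ended there).
    """
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk backwards so each return reuses the one that follows it:
    # R_t = r_t + gamma * R_{t+1}
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    # Advantage: how much better the observed return was than the estimate.
    advantages = [r - v for r, v in zip(returns, values)]
    return returns, advantages


returns, advantages = n_step_returns_and_advantages(
    rewards=[0.0, 0.0, 1.0],
    values=[0.5, 0.5, 0.5],
    bootstrap_value=0.0,
    gamma=0.5,
)
print(returns, advantages)
```

In the paper, the advantages weight the policy-gradient term and the returns serve as regression targets for the value estimate.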

[Animation: gameplay learned after 26 hours of training (A3C-FF)]

Any advice or suggestions are very welcome in the issues thread.

https://github.com/miyosuda/async_deep_reinforce/issues/1

How to build

First, build the multi-thread-ready version of the Arcade Learning Environment. I modified it so that it can run in a multi-threaded environment.

$ git clone https://github.com/miyosuda/Arcade-Learning-Environment.git
$ cd Arcade-Learning-Environment
$ cmake -DUSE_SDL=ON -DUSE_RLGLUE=OFF -DBUILD_EXAMPLES=OFF .
$ make -j 4

$ pip install .

I recommend installing it in a virtualenv environment.

How to run

To train,

$ python a3c.py

To display the result with game play,

$ python a3c_disp.py

Using GPU

To enable the GPU, set the "USE_GPU" flag in "constants.py".
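A boolean flag like this is commonly mapped to a TensorFlow device string. The sketch below shows one way that could look. How this project actually consumes "USE_GPU" is not shown here, so the helper name select_device and the variable below are hypothetical stand-ins.

```python
# Stand-in for the flag defined in constants.py (value chosen for the example).
USE_GPU = True


def select_device(use_gpu: bool) -> str:
    """Map a boolean flag to a TensorFlow device string (hypothetical helper)."""
    return "/gpu:0" if use_gpu else "/cpu:0"


device = select_device(USE_GPU)
print(device)

# With TensorFlow 1.x, graph construction could then be wrapped as:
#     with tf.device(device):
#         ...build the network...
```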

When running with 8 parallel game environments, the speeds on the GPU (GTX980Ti) and the CPU (Core i7 6700) were as follows (recorded with LOCAL_T_MAX=20):

type	A3C-FF	A3C-LSTM
GPU	1722 steps per sec	864 steps per sec
CPU	1077 steps per sec	540 steps per sec
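To put the table in relative terms, the short calculation below derives the GPU-over-CPU speedup for each model from the figures above. Only the four throughput numbers come from the table; the dictionary layout and the function name are illustrative.

```python
# Steps per second, copied from the table above.
steps_per_sec = {
    "A3C-FF": {"GPU": 1722, "CPU": 1077},
    "A3C-LSTM": {"GPU": 864, "CPU": 540},
}


def gpu_speedup(model: str) -> float:
    """Ratio of GPU throughput to CPU throughput for one model."""
    row = steps_per_sec[model]
    return row["GPU"] / row["CPU"]


for model in steps_per_sec:
    print(f"{model}: {gpu_speedup(model):.2f}x")
```

Both variants come out at roughly 1.6x, and on either device A3C-LSTM runs at about half the step rate of A3C-FF.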

Result

Score plots of the local threads for Pong (with GTX980Ti):

[Plot: A3C-LSTM, LOCAL_T_MAX = 5]

[Plot: A3C-LSTM, LOCAL_T_MAX = 20]

Unlike in the original paper, the scores are not averaged using the global network.

Requirements

TensorFlow r1.0
numpy
cv2
matplotlib

References

This project uses the settings described in muupan's wiki [muupan/async-rl] (https://github.com/muupan/async-rl/wiki)

Acknowledgements

Thanks to @aravindsrinivas for providing information on some of the hyperparameters.
