Simple implementation of Reinforcement Learning (A3C) using PyTorch

This is a toy example of using multiprocessing in Python to asynchronously train a neural network to play the discrete-action CartPole and continuous-action Pendulum games. The asynchronous algorithm used is Asynchronous Advantage Actor-Critic (A3C).

I believe it is the simplest toy implementation you can find at the moment (2018-01).
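To make the "advantage" part of A3C concrete, here is a minimal pure-Python sketch of how a worker could turn a short rollout into discounted value targets and advantages. The function names, the `gamma` values, and the toy numbers are assumptions for illustration only and are not taken from the repository's code.

```python
def n_step_targets(rewards, bootstrap_value, gamma=0.9):
    """Return discounted value targets for one rollout segment.

    rewards: rewards collected by a worker, in time order.
    bootstrap_value: critic estimate for the state after the last reward,
        or 0.0 if the episode ended there.
    """
    targets = []
    running = bootstrap_value
    # Walk backwards so each target reuses the one computed after it.
    for r in reversed(rewards):
        running = r + gamma * running
        targets.append(running)
    targets.reverse()
    return targets


def advantages(targets, values):
    """Advantage = discounted target minus the critic's value estimate."""
    return [g - v for g, v in zip(targets, values)]


rewards = [1.0, 1.0, 1.0]
targets = n_step_targets(rewards, bootstrap_value=0.0, gamma=0.5)
print(targets)  # [1.75, 1.5, 1.0]
print(advantages(targets, values=[1.0, 1.0, 1.0]))  # [0.75, 0.5, 0.0]
```

In an actor-critic setup, the critic is typically regressed toward the targets, while the advantages weight the actor's log-probability gradient.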

What are the main focuses of this implementation?

PyTorch + multiprocessing (NOT threading) for parallel training
Both discrete and continuous action environments
Simple code that is easy to dig into (less than 200 lines)
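The multiprocessing pattern behind the first point can be sketched with the standard library alone. This is a stand-in, not the repository's code: a shared `multiprocessing.Array` plays the role of the global network parameters, and `worker`, `train`, and all of their arguments are hypothetical names. The lock is used only to keep the toy example deterministic.

```python
import multiprocessing as mp


def worker(shared_params, n_updates, step_size):
    """Stand-in for an A3C worker: push n_updates updates into shared memory."""
    for _ in range(n_updates):
        # A real worker would run its own environment and compute gradients here.
        with shared_params.get_lock():
            for i in range(len(shared_params)):
                shared_params[i] += step_size


def train(n_workers=2, n_updates=100, step_size=0.5, n_params=3):
    """Start n_workers processes that all update the same parameter array."""
    shared_params = mp.Array("d", [0.0] * n_params)  # lives in shared memory
    procs = [
        mp.Process(target=worker, args=(shared_params, n_updates, step_size))
        for _ in range(n_workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return shared_params[:]
```

Call `train()` from a script under an `if __name__ == "__main__":` guard, which the spawn start method requires. Because every worker is a separate process, the workers do not compete for one interpreter lock the way Python threads do.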

Reason for using PyTorch instead of TensorFlow

Both are great for building a customized neural network. But TensorFlow does not work well with multiprocessing. I have a TensorFlow implementation of A3C built on threading. I even tried distributed TensorFlow. However, the distributed version is meant for cluster computing, which I don't have; on a single machine, it is slower than the threading version I wrote.

Fortunately, PyTorch is compatible with multiprocessing. I went through many PyTorch A3C examples (there, there and there). They are great, but their code is too complicated to dig into. That motivated me to write this simple example code.

BTW, if you are interested in learning PyTorch, there is my simple tutorial code with many visualizations. I also made the TensorFlow tutorial (same as the PyTorch one) available here.

Code & Results

shared_adam.py: optimizer that shares its parameters across the parallel processes
utils.py: helper functions that are used more than once
discrete_A3C.py: CartPole, neural net and training for discrete action space
continuous_A3C.py: Pendulum, neural net and training for continuous action space
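The difference between the two action spaces shows up mainly in how an action is sampled from the policy output. The following is a rough pure-Python sketch under stated assumptions: the function names are hypothetical, the policy outputs are hand-written numbers rather than network outputs, and the `[-2, 2]` bounds are Gym's Pendulum torque limits. The actual scripts may do this differently.

```python
import math
import random


def sample_discrete(logits, rng=random):
    """Sample an action index from softmax(logits), e.g. CartPole's 2 actions."""
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_continuous(mu, sigma, low=-2.0, high=2.0, rng=random):
    """Sample from a Gaussian policy and clip to the action bounds."""
    action = rng.gauss(mu, sigma)
    return max(low, min(high, action))


rng = random.Random(0)  # seeded generator so runs are repeatable
print(sample_discrete([0.2, 1.5], rng))
print(sample_continuous(0.3, 1.0, rng=rng))
```

In the discrete case the network only needs one output per action, while in the continuous case it needs a mean and a spread for each action dimension.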

CartPole result

Pendulum result

Dependencies

pytorch >= 0.4.0
numpy
gym
matplotlib
