
MorvanZhou / Pytorch A3c

Licence: MIT
Simple A3C implementation with pytorch + multiprocessing

Programming Languages

python

Projects that are alternatives of or similar to Pytorch A3c

Deep Rl Keras
Keras Implementation of popular Deep RL Algorithms (A3C, DDQN, DDPG, Dueling DDQN)
Stars: ✭ 395 (+8.52%)
Mutual labels:  gym, a3c
A2c
A Clearer and Simpler Synchronous Advantage Actor Critic (A2C) Implementation in TensorFlow
Stars: ✭ 169 (-53.57%)
Mutual labels:  gym, actor-critic
Super Mario Bros A3c Pytorch
Asynchronous Advantage Actor-Critic (A3C) algorithm for Super Mario Bros
Stars: ✭ 775 (+112.91%)
Mutual labels:  gym, a3c
Reinforcementlearning Atarigame
Pytorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google Deep Mind's Asynchronous Advantage Actor-Critic (A3C) Algorithm. This is much superior and efficient than DQN and obsoletes it. Can play on many games
Stars: ✭ 118 (-67.58%)
Mutual labels:  actor-critic, a3c
a3c-super-mario-pytorch
Reinforcement Learning for Super Mario Bros using A3C on GPU
Stars: ✭ 35 (-90.38%)
Mutual labels:  multiprocessing, a3c
Baby A3c
A high-performance Atari A3C agent in 180 lines of PyTorch
Stars: ✭ 144 (-60.44%)
Mutual labels:  actor-critic, a3c
Pytorch sac ae
PyTorch implementation of Soft Actor-Critic + Autoencoder(SAC+AE)
Stars: ✭ 94 (-74.18%)
Mutual labels:  gym, actor-critic
Reinforcement Learning With Tensorflow
Simple reinforcement learning tutorials, 莫烦Python (Morvan Python) Chinese-language AI tutorials
Stars: ✭ 6,948 (+1808.79%)
Mutual labels:  actor-critic, a3c
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (-39.01%)
Mutual labels:  a3c, actor-critic
reinforcement learning with Tensorflow
Minimal implementations of reinforcement learning algorithms by Tensorflow
Stars: ✭ 28 (-92.31%)
Mutual labels:  a3c, actor-critic
Deep Reinforcement Learning With Pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
Stars: ✭ 1,345 (+269.51%)
Mutual labels:  actor-critic, a3c
Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
Stars: ✭ 54 (-85.16%)
Mutual labels:  gym, actor-critic
Torch Ac
Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO
Stars: ✭ 70 (-80.77%)
Mutual labels:  actor-critic, a3c
Reinforcement Learning
Minimal and Clean Reinforcement Learning Examples
Stars: ✭ 2,863 (+686.54%)
Mutual labels:  actor-critic, a3c
Pytorch A3c
PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
Stars: ✭ 879 (+141.48%)
Mutual labels:  actor-critic, a3c
Rl algos
Reinforcement Learning Algorithms
Stars: ✭ 14 (-96.15%)
Mutual labels:  gym, actor-critic
Rl a3c pytorch
A3C LSTM Atari with Pytorch plus A3G design
Stars: ✭ 482 (+32.42%)
Mutual labels:  actor-critic, a3c
Pytorch Rl
Deep Reinforcement Learning with pytorch & visdom
Stars: ✭ 745 (+104.67%)
Mutual labels:  actor-critic, a3c
Pytorch sac
PyTorch implementation of Soft Actor-Critic (SAC)
Stars: ✭ 174 (-52.2%)
Mutual labels:  gym, actor-critic
Master-Thesis
Deep Reinforcement Learning in Autonomous Driving: the A3C algorithm used to make a car learn to drive in TORCS; Python 3.5, Tensorflow, tensorboard, numpy, gym-torcs, ubuntu, latex
Stars: ✭ 33 (-90.93%)
Mutual labels:  a3c, actor-critic

Simple implementation of Reinforcement Learning (A3C) using Pytorch

This is a toy example of using Python multiprocessing to asynchronously train a neural network to play the discrete-action CartPole and continuous-action Pendulum games. The asynchronous algorithm I used is Asynchronous Advantage Actor-Critic (A3C).
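To make the "advantage" part of A3C concrete, here is a minimal sketch of how discounted returns and advantages are commonly computed from a short rollout. The function names `discounted_returns` and `advantages`, the `bootstrap_value` argument and the default `gamma` are assumptions for illustration, not necessarily what this repository's code uses.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.9):
    """Walk the rollout backwards: G_t = r_t + gamma * G_{t+1}.

    bootstrap_value is the critic's estimate for the state after the
    last step (use 0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()  # restore chronological order
    return returns


def advantages(returns, values):
    """Advantage = observed return minus the critic's value estimate."""
    return [g - v for g, v in zip(returns, values)]
```

In a typical actor-critic update, the critic is trained to shrink these advantages toward zero, while the actor increases the log-probability of actions with positive advantage.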

I believe it is the simplest toy implementation you can find at the moment (2018-01).

What are the main focuses of this implementation?

  • Pytorch + multiprocessing (NOT threading) for parallel training
  • Both discrete and continuous action environments
  • Simple code that is easy to dig into (fewer than 200 lines)
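The first point, process-based rather than thread-based parallel training, can be illustrated with a toy sketch that uses only the standard library. Several worker processes repeatedly update one parameter held in shared memory. Everything here (`worker`, `train`, the quadratic objective, the learning rate) is hypothetical and stands in for a real network. PyTorch offers `torch.multiprocessing` and `nn.Module.share_memory()` for the same purpose. A lock guards each update to keep the result deterministic, which is a simplification; A3C-style training usually applies updates without locking. The `fork` start method is assumed (Unix only); with `spawn`, the call to `train` would need to sit under `if __name__ == "__main__":`.

```python
import multiprocessing as mp


def worker(shared_w, target, lr, steps):
    """Each process reads the shared weight, computes a gradient, and writes back."""
    for _ in range(steps):
        with shared_w.get_lock():  # guard the read-modify-write
            grad = 2.0 * (shared_w.value - target)  # d/dw of (w - target)^2
            shared_w.value -= lr * grad


def train(n_workers=4, target=3.0, lr=0.05, steps=200, start_method="fork"):
    """Run several workers against one shared parameter and return its final value."""
    ctx = mp.get_context(start_method)
    shared_w = ctx.Value("d", 0.0)  # a double stored in shared memory
    procs = [
        ctx.Process(target=worker, args=(shared_w, target, lr, steps))
        for _ in range(n_workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return shared_w.value
```

Calling `train()` should return a value very close to `target`, because every worker's update pulls the same shared weight toward it.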

Reason for using PyTorch instead of TensorFlow

Both are great for building a customized neural network. But TensorFlow does not work well with multiprocessing because of its poor compatibility with it. I have a TensorFlow A3C implementation built on threading. I even tried to implement a distributed TensorFlow version. However, the distributed version is meant for cluster computing, which I don't have; on a single machine it is slower than the threading version I wrote.

Fortunately, PyTorch is compatible with multiprocessing. I went through many PyTorch A3C examples (there, there and there). They are great, but their code is too complicated to dig into. That motivated me to write this simple example code.

BTW, if you are interested in learning PyTorch, there is my simple tutorial code with many visualizations. I also made a TensorFlow tutorial (in the same style as the PyTorch one), available here.

Code & Results

  • shared_adam.py: optimizer that shares its parameters across the parallel processes
  • utils.py: helper functions that are used in more than one place
  • discrete_A3C.py: CartPole, neural net and training for discrete action space
  • continuous_A3C.py: Pendulum, neural net and training for continuous action space
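The difference between the two action spaces can be sketched without any deep learning library. In the discrete case (CartPole has two actions) the policy outputs one score per action and an action is sampled from the softmax of those scores. In the continuous case (Pendulum's torque lies in [-2, 2]) the policy outputs a mean and a standard deviation and an action is sampled from that Gaussian, then clipped. The helper names below are hypothetical and this is not the repository's code.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_discrete(logits, rng):
    """Pick an action index with probability given by softmax(logits)."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_continuous(mu, sigma, low, high, rng):
    """Draw from a Gaussian policy and clip to the environment's action bounds."""
    action = rng.gauss(mu, sigma)
    return min(max(action, low), high)


rng = random.Random(0)                            # seeded for repeatability
cartpole_action = sample_discrete([0.1, 2.0], rng)              # 0 or 1
pendulum_action = sample_continuous(0.5, 1.0, -2.0, 2.0, rng)   # within [-2, 2]
```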

CartPole result: [figure: cartpole]

Pendulum result: [figure: pendulum]

Dependencies

  • pytorch >= 0.4.0
  • numpy
  • gym
  • matplotlib
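
A quick way to see whether these packages are available is sketched below. It assumes the usual import names (`torch` for pytorch) and only checks that each package can be found; it does not verify the `>= 0.4.0` version requirement. The helper name `missing_dependencies` is hypothetical.

```python
import importlib.util

# Import names assumed for the packages listed above.
REQUIRED = ("torch", "numpy", "gym", "matplotlib")


def missing_dependencies(names=REQUIRED):
    """Return the names that cannot be found on the current Python path."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("missing:", ", ".join(missing) if missing else "none")
```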