ikostrikov / Pytorch A3c

License: MIT

PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".

Programming Languages

python


Labels

deep-learning pytorch reinforcement-learning deep-reinforcement-learning actor-critic a3c

Projects that are alternatives of or similar to Pytorch A3c

Reinforcement Learning

Minimal and Clean Reinforcement Learning Examples

Stars: ✭ 2,863 (+225.71%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Pytorch Rl

Deep Reinforcement Learning with pytorch & visdom

Stars: ✭ 745 (-15.24%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Reinforcement learning tutorial with demo

Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, QLearning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Papers, Courses, etc.

Stars: ✭ 442 (-49.72%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Torch Ac

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO

Stars: ✭ 70 (-92.04%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Rl a3c pytorch

A3C LSTM Atari with Pytorch plus A3G design

Stars: ✭ 482 (-45.16%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Reinforcementlearning Atarigame

PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which the authors describe as superior to and more efficient than DQN. It can play many games.

Stars: ✭ 118 (-86.58%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Pytorch Drl

PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.

Stars: ✭ 233 (-73.49%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

Deep-Reinforcement-Learning-With-Python

Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math

Stars: ✭ 222 (-74.74%)

Mutual labels: deep-reinforcement-learning, a3c, actor-critic

Reinforcement Learning With Tensorflow

Simple Reinforcement learning tutorials, 莫烦Python (Morvan Python) Chinese-language AI tutorials

Stars: ✭ 6,948 (+690.44%)

Mutual labels: reinforcement-learning, actor-critic, a3c

Btgym

Scalable, event-driven, deep-learning-friendly backtesting library

Stars: ✭ 765 (-12.97%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, a3c

Pytorch sac

PyTorch implementation of Soft Actor-Critic (SAC)

Stars: ✭ 174 (-80.2%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

Master-Thesis

Deep Reinforcement Learning in Autonomous Driving: the A3C algorithm used to make a car learn to drive in TORCS; Python 3.5, Tensorflow, tensorboard, numpy, gym-torcs, ubuntu, latex

Stars: ✭ 33 (-96.25%)

Mutual labels: deep-reinforcement-learning, a3c, actor-critic

Openai lab

An experimentation framework for Reinforcement Learning using OpenAI Gym, Tensorflow, and Keras.

Stars: ✭ 313 (-64.39%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

Tensorflow Reinforce

Implementations of Reinforcement Learning Models in Tensorflow

Stars: ✭ 480 (-45.39%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

Deeprl Tutorials

Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch

Stars: ✭ 748 (-14.9%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

Pytorch A2c Ppo Acktr Gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

Stars: ✭ 2,632 (+199.43%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

Drq

DrQ: Data regularized Q

Stars: ✭ 268 (-69.51%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

Slm Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

Stars: ✭ 904 (+2.84%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, a3c

Minimalrl

Implementations of basic RL algorithms with minimal lines of code! (PyTorch based)

Stars: ✭ 2,051 (+133.33%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, a3c

Deeprl Tensorflow2

🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2

Stars: ✭ 319 (-63.71%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, a3c


pytorch-a3c

This is a PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
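The "advantage" in the algorithm's name is commonly computed as the discounted n-step return minus the critic's value estimate for the same state. The following is a minimal plain-Python sketch of that arithmetic, not the repository's code; the function names, the example rollout, and the discount factor are illustrative assumptions.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute R_t = r_t + gamma * R_{t+1}, seeded with a bootstrap value.

    bootstrap_value is the critic's estimate for the state after the last
    step (use 0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage A_t = R_t - V(s_t) for every step of the rollout."""
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]


if __name__ == "__main__":
    # Hypothetical three-step rollout with a single reward at the end.
    rewards = [0.0, 0.0, 1.0]
    values = [0.5, 0.6, 0.9]
    print(discounted_returns(rewards, bootstrap_value=0.0, gamma=0.5))
    print(advantages(rewards, values, bootstrap_value=0.0, gamma=0.5))
```

A positive advantage means the action led to a better outcome than the critic expected, so the actor's update makes that action more likely; a negative advantage does the opposite.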

This implementation is inspired by Universe Starter Agent. In contrast to the starter agent, it uses an optimizer with shared statistics as in the original paper.
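One way to read "shared statistics" is that the optimizer's running averages live in memory visible to every worker, so all workers update a single set of statistics instead of keeping private copies. Below is a minimal standard-library sketch under that assumption. It is not the repository's code (which is written in PyTorch); `SharedRMSProp`, `worker`, the quadratic toy objective, and all hyperparameter values are hypothetical choices made for illustration.

```python
import multiprocessing as mp


class SharedRMSProp:
    """RMSProp whose parameters and squared-gradient averages are shared arrays."""

    def __init__(self, num_params, lr=0.01, alpha=0.99, eps=1e-8):
        # lock=False yields raw shared arrays, so updates are lock-free.
        self.params = mp.Array("d", num_params, lock=False)
        self.square_avg = mp.Array("d", num_params, lock=False)
        self.lr, self.alpha, self.eps = lr, alpha, eps

    def step(self, grads):
        for i, grad in enumerate(grads):
            # Every worker reads and writes the same running average.
            self.square_avg[i] = (
                self.alpha * self.square_avg[i] + (1 - self.alpha) * grad * grad
            )
            self.params[i] -= self.lr * grad / (self.square_avg[i] ** 0.5 + self.eps)


def worker(optimizer, target, num_steps):
    """Minimise sum((p - target)^2) over the shared parameters (toy objective)."""
    for _ in range(num_steps):
        grads = [2.0 * (p - target) for p in optimizer.params]
        optimizer.step(grads)


if __name__ == "__main__":
    optimizer = SharedRMSProp(num_params=2)
    workers = [
        mp.Process(target=worker, args=(optimizer, 1.0, 200)) for _ in range(4)
    ]
    for process in workers:
        process.start()
    for process in workers:
        process.join()
    print("shared parameters:", list(optimizer.params))
```

Because both arrays are allocated once and passed to every process, a step taken by any worker immediately affects the statistics seen by the others. The sketch deliberately omits locking, so concurrent updates may interleave.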

Please use this bibtex if you want to cite this repository in your publications:

@misc{pytorchaaac,
  author = {Kostrikov, Ilya},
  title = {PyTorch Implementations of Asynchronous Advantage Actor Critic},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/ikostrikov/pytorch-a3c}},
}

A2C

I highly recommend checking out the synchronous version and other algorithms: pytorch-a2c-ppo-acktr.

In my experience, A2C works better than A3C, and ACKTR is better than both. Moreover, PPO is a great algorithm for continuous control. I therefore recommend trying A2C/PPO/ACKTR first and using A3C only if you specifically need it.

Also read the OpenAI blog for more information.

Contributions

Contributions are very welcome. If you know how to make this code better, don't hesitate to send a pull request.

Usage

# Works only with Python 3.
python3 main.py --env-name "PongDeterministic-v4" --num-processes 16

This code runs evaluation in a separate thread in addition to 16 processes.
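As a rough illustration of how the two flags in the command above might be read, here is a small `argparse` sketch. The flag names come from the usage line; the default values, `build_parser`, and `worker_roles` are assumptions for illustration and are not taken from the repository's `main.py`.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="A3C launcher sketch")
    # Flag names match the usage line; the defaults here are assumptions.
    parser.add_argument("--env-name", default="PongDeterministic-v4")
    parser.add_argument("--num-processes", type=int, default=4)
    return parser


def worker_roles(num_processes):
    """One evaluation worker in addition to num_processes training workers."""
    return ["evaluate"] + ["train"] * num_processes


if __name__ == "__main__":
    # Parse an explicit argument list so the sketch runs without a shell.
    args = build_parser().parse_args(
        ["--env-name", "PongDeterministic-v4", "--num-processes", "16"]
    )
    roles = worker_roles(args.num_processes)
    print(args.env_name, roles.count("train"), roles.count("evaluate"))
```

With `--num-processes 16`, this layout yields 17 workers in total: 16 that train and one that only evaluates.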

Results

With 16 processes it converges for PongDeterministic-v4 in 15 minutes.

For BreakoutDeterministic-v4, it takes much longer: upwards of several hours.


Stars: ✭ 879
