
PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. It uses Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which is considerably more efficient than DQN and supersedes it. Works on many games.

Stars: ✭ 118 (-32.95%)

Mutual labels: reinforcement-learning, openai-gym, a3c

Hands On Reinforcement Learning With Python

Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow

Stars: ✭ 640 (+263.64%)

Mutual labels: reinforcement-learning, openai-gym, trpo

Gym Anytrading

The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)

Stars: ✭ 627 (+256.25%)

Mutual labels: reinforcement-learning, dqn, openai-gym

Minimalrl

Implementations of basic RL algorithms in minimal lines of code! (PyTorch-based)

Stars: ✭ 2,051 (+1065.34%)

Mutual labels: reinforcement-learning, dqn, a3c

Slm Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

Stars: ✭ 904 (+413.64%)

Mutual labels: reinforcement-learning, dqn, a3c

Deep Reinforcement Learning With Pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

Stars: ✭ 1,345 (+664.2%)

Mutual labels: dqn, a3c, trpo

Openaigym

Solving OpenAI Gym problems.

Stars: ✭ 98 (-44.32%)

Mutual labels: reinforcement-learning, dqn, openai-gym

Deep Rl Keras

Keras Implementation of popular Deep RL Algorithms (A3C, DDQN, DDPG, Dueling DDQN)

Stars: ✭ 395 (+124.43%)

Mutual labels: reinforcement-learning, dqn, a3c

Cartpole

OpenAI's cartpole env solver.

Stars: ✭ 107 (-39.2%)

Mutual labels: reinforcement-learning, dqn, openai-gym


Tensorflow-RL

TensorFlow-based implementations of A3C, PGQ, TRPO, DQN+CTS, and CEM, originally based on the A3C implementation from https://github.com/traai/async-deep-rl. I extensively refactored most of the code and, beyond the new algorithms, added several options, including the a3c-lstm architecture, a fully-connected architecture to allow training on non-image-based gym environments, and support for continuous action spaces.
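For readers new to A3C, here is a minimal sketch of the n-step return and advantage calculation that actor-critic methods of this kind typically rely on. It is not taken from this repository: the function names, the `gamma` default, and the plain-list data layout are assumptions chosen for illustration.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1}.

    `bootstrap_value` is the critic's value estimate for the state that
    follows the last reward (use 0.0 if the episode terminated there).
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each return reuses the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage estimates A_t = R_t - V(s_t) for one rollout."""
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]


if __name__ == "__main__":
    print(n_step_returns([1.0, 0.0, 1.0], bootstrap_value=0.0, gamma=0.5))
    print(advantages([1.0, 0.0, 1.0], [1.0, 1.0, 1.0], 0.0, gamma=0.5))
```

In an actor-learner thread, the advantages would weight the policy-gradient term while the returns serve as regression targets for the value head.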

The code also includes some experimental ideas I'm toying with and I'm planning on adding the following implementations in the near future:

*currently in progress

Notes

You can find a number of my evaluations for the A3C, TRPO, and DQN+CTS algorithms at https://gym.openai.com/users/steveKapturowski. As I'm doing a lot of refactoring at the moment, it's possible I could break things. Please open an issue if you discover any bugs.
I'm in the process of swapping out most of the multiprocessing code in favour of distributed TensorFlow, which should simplify a lot of the training code and make it possible to distribute actor-learner processes across multiple machines.
There's also an implementation of the A3C+ model from Unifying Count-Based Exploration and Intrinsic Motivation, but I've been focusing on improvements to the DQN variant, so this hasn't gotten much love.
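As a rough illustration of the count-based exploration idea behind A3C+ and DQN+CTS, the sketch below derives a pseudo-count from a density model's probability of an observation before and after updating on it, then turns it into a reward bonus. This follows the formulas in the Unifying Count-Based Exploration and Intrinsic Motivation paper as I understand them; it is not this repository's code, and the function names and the `beta` default are assumptions.

```python
import math


def pseudo_count(rho, rho_prime):
    """Pseudo-count N = rho * (1 - rho') / (rho' - rho).

    rho       : density model probability of the observation before updating on it
    rho_prime : probability of the same observation after the update
    """
    gain = rho_prime - rho
    if gain <= 0.0:
        # The model did not become more confident; treat the state as well visited.
        return float("inf")
    return rho * (1.0 - rho_prime) / gain


def exploration_bonus(rho, rho_prime, beta=0.05):
    """Intrinsic reward beta / sqrt(N + 0.01); beta is an assumed scale."""
    count = pseudo_count(rho, rho_prime)
    if math.isinf(count):
        return 0.0
    return beta / math.sqrt(count + 0.01)


if __name__ == "__main__":
    # With an empirical-count model, a state seen 2 times in 10 steps has
    # rho = 2/10 and rho' = 3/11, which recovers a pseudo-count of about 2.
    print(pseudo_count(2 / 10, 3 / 11))
    print(exploration_bonus(2 / 10, 3 / 11))
```

The bonus is added to the environment reward, so rarely seen frames (low pseudo-count) are worth more to the agent than familiar ones.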

Running the code

First you'll need to build the Cython extensions needed for the hog updates and the CTS density model:

./setup.py install build_ext --inplace

To train an a3c agent on Pong run:

python main.py Pong-v0 --alg_type a3c -n 8

To evaluate a trained agent simply add the --test flag:

python main.py Pong-v0 --alg_type a3c -n 1 --test --restore_checkpoint

DQN+CTS after 80M agent steps using 16 actor-learner threads

A3C run on Pong-v0 with default parameters and frameskip sampled uniformly over 3-4
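The caption above mentions a frameskip sampled uniformly over 3-4. A minimal sketch of what that could look like is shown below: each agent action is repeated for a randomly chosen number of frames and the rewards are summed. The `CountingEnv` class is a hypothetical stand-in for a gym-style environment using the older 4-tuple `step` return, and `step_with_random_frameskip` is an illustrative name, not a function from this repository.

```python
import random


class CountingEnv:
    """Hypothetical stand-in for a gym-style env: reward 1.0 per frame."""

    def __init__(self, episode_length=10):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return self.t, 1.0, done, {}


def step_with_random_frameskip(env, action, rng, skip_choices=(3, 4)):
    """Repeat `action` for k frames, with k drawn uniformly from skip_choices."""
    k = rng.choice(skip_choices)
    total_reward, frames = 0.0, 0
    obs, done, info = None, False, {}
    for _ in range(k):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        frames += 1
        if done:
            # Stop early so we never step a finished episode.
            break
    return obs, total_reward, done, dict(info, frames=frames)


if __name__ == "__main__":
    env, rng = CountingEnv(), random.Random(0)
    env.reset()
    done = False
    while not done:
        obs, reward, done, info = step_with_random_frameskip(env, 0, rng)
        print(obs, reward, done, info["frames"])
```

Randomizing the skip length adds stochasticity to otherwise deterministic Atari dynamics, which is one common reason to sample it rather than fix it.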

Requirements

python 2.7
tensorflow 1.2
scikit-image
Cython
pyaml
gym


Stars: ✭ 176
