
dgriff777 / A3c_continuous

Licence: apache-2.0
A continuous action space version of A3C LSTM in pytorch plus A3G design

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to A3c continuous

a3c-super-mario-pytorch
Reinforcement Learning for Super Mario Bros using A3C on GPU
Stars: ✭ 35 (-84.3%)
Mutual labels:  openai-gym, a3c
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (-0.45%)
Mutual labels:  openai-gym, a3c
Reinforcementlearning Atarigame
PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which is far more efficient than DQN and obsoletes it. Can play many games
Stars: ✭ 118 (-47.09%)
Mutual labels:  openai-gym, a3c
Rl a3c pytorch
A3C LSTM Atari with Pytorch plus A3G design
Stars: ✭ 482 (+116.14%)
Mutual labels:  openai-gym, a3c
Tensorflow Rl
Implementations of deep RL papers and random experimentation
Stars: ✭ 176 (-21.08%)
Mutual labels:  openai-gym, a3c
deep rl acrobot
TensorFlow A2C to solve Acrobot, with synchronized parallel environments
Stars: ✭ 32 (-85.65%)
Mutual labels:  openai-gym, a3c
yarll
Combining deep learning and reinforcement learning.
Stars: ✭ 84 (-62.33%)
Mutual labels:  openai-gym, a3c
tf-a3c-gpu
Tensorflow implementation of A3C algorithm
Stars: ✭ 49 (-78.03%)
Mutual labels:  openai-gym, a3c
a3c
PyTorch implementation of "Asynchronous advantage actor-critic"
Stars: ✭ 21 (-90.58%)
Mutual labels:  openai-gym, a3c
Btgym
Scalable, event-driven, deep-learning-friendly backtesting library
Stars: ✭ 765 (+243.05%)
Mutual labels:  openai-gym, a3c
Drqn Tensorflow
Deep recurrent Q Learning using Tensorflow, openai/gym and openai/retro
Stars: ✭ 127 (-43.05%)
Mutual labels:  openai-gym
Gym Fx
Forex trading simulator environment for OpenAI Gym, observations contain the order status, performance and timeseries loaded from a CSV file containing rates and indicators. Work In Progress
Stars: ✭ 151 (-32.29%)
Mutual labels:  openai-gym
Minimalrl
Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)
Stars: ✭ 2,051 (+819.73%)
Mutual labels:  a3c
Stable Baselines
Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
Stars: ✭ 115 (-48.43%)
Mutual labels:  openai-gym
Hierarchical Actor Critic Hac Pytorch
PyTorch implementation of Hierarchical Actor Critic (HAC) for OpenAI gym environments
Stars: ✭ 116 (-47.98%)
Mutual labels:  openai-gym
Hands On Intelligent Agents With Openai Gym
Code for Hands On Intelligent Agents with OpenAI Gym book to get started and learn to build deep reinforcement learning agents using PyTorch
Stars: ✭ 189 (-15.25%)
Mutual labels:  openai-gym
Ctc Executioner
Master Thesis: Limit order placement with Reinforcement Learning
Stars: ✭ 112 (-49.78%)
Mutual labels:  openai-gym
A3c Pytorch
PyTorch implementation of Advantage Async Actor-Critic (A3C) algorithms
Stars: ✭ 108 (-51.57%)
Mutual labels:  a3c
Gymfc
A universal flight control tuning framework
Stars: ✭ 210 (-5.83%)
Mutual labels:  openai-gym
Rlcycle
A library for ready-made reinforcement learning agents and reusable components for neat prototyping
Stars: ✭ 184 (-17.49%)
Mutual labels:  a3c

NEWLY ADDED A3G: A NEW GPU/CPU ARCHITECTURE OF A3C FOR SUBSTANTIALLY ACCELERATED TRAINING!!

Training with A3G benefits training speed most when using larger models, i.e. when using raw pixels for observations, such as when training in Atari environments where raw pixels serve as the state representation.

RL A3C Pytorch Continuous

A3C LSTM playing BipedalWalkerHardcore-v2

This repository includes my implementation of reinforcement learning with Asynchronous Advantage Actor-Critic (A3C) in PyTorch, an algorithm from Google DeepMind's paper "Asynchronous Methods for Deep Reinforcement Learning."

NEWLY ADDED A3G!!

A new implementation of A3C that utilizes the GPU to speed up training, which we can call A3G. Unlike other versions that try to use the GPU with the A3C algorithm, in A3G each agent maintains its own network on the GPU while the shared model stays on the CPU; agent models are quickly converted to CPU to update the shared model, which keeps updates frequent and fast by utilizing Hogwild training to update the shared model asynchronously and without locks. This new method greatly increases training speed: as can be seen in my rl_a3c_pytorch repo, models that used to take days to train can now be trained in as little as 10 minutes for some Atari games!
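
As a rough illustration of the update pattern described above, here is a minimal sketch (with stand-in names like ensure_shared_grads and a toy linear model in place of the actual A3C LSTM, so not the repo's real code): the shared model lives on the CPU in shared memory, each worker keeps its own copy on a GPU, and gradients are moved back to the CPU for lock-free, Hogwild-style updates of the shared model.

import torch
import torch.nn as nn

def ensure_shared_grads(local_model, shared_model):
    # Move each GPU gradient to the CPU and attach it to the shared parameter.
    for local_p, shared_p in zip(local_model.parameters(), shared_model.parameters()):
        if local_p.grad is not None:
            shared_p.grad = local_p.grad.cpu()

def worker(shared_model, shared_optimizer, gpu_id=0):
    device = torch.device("cuda:%d" % gpu_id if torch.cuda.is_available() else "cpu")
    local_model = nn.Linear(24, 4).to(device)    # stand-in for the repo's A3C LSTM model
    for _ in range(10):                          # stand-in for the rollout loop
        local_model.load_state_dict(shared_model.state_dict())  # pull latest shared weights to the GPU
        obs = torch.randn(1, 24, device=device)                 # stand-in for an environment observation
        loss = local_model(obs).pow(2).mean()                   # stand-in for the A3C policy/value loss
        local_model.zero_grad()
        loss.backward()
        ensure_shared_grads(local_model, shared_model)  # gradients back to the CPU model
        shared_optimizer.step()                         # lock-free (Hogwild-style) update

if __name__ == "__main__":
    shared_model = nn.Linear(24, 4)
    shared_model.share_memory()  # one CPU copy in shared memory, visible to all workers
    optimizer = torch.optim.Adam(shared_model.parameters(), lr=1e-4)
    worker(shared_model, optimizer)  # in the real setup each worker runs in its own process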

A3C LSTM

This is the continuous-domain version of my other A3C repo. Here I show that A3C can solve not only BipedalWalker-v2 but also the much harder BipedalWalkerHardcore-v2. "Solved" means training a model capable of averaging a reward over 300 for 100 consecutive episodes.
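
For concreteness, a minimal sketch of that solved check (a hypothetical helper, not part of this repo): keep the last 100 episode rewards and test that their mean is at least 300.

from collections import deque

SOLVED_THRESHOLD = 300.0
recent_rewards = deque(maxlen=100)  # holds only the last 100 episode rewards

def is_solved(episode_reward):
    # True once the average reward over 100 consecutive episodes reaches the threshold.
    recent_rewards.append(episode_reward)
    return len(recent_rewards) == 100 and sum(recent_rewards) / 100 >= SOLVED_THRESHOLD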

Added trained model for BipedalWalkerHardcore-v2

Requirements

  • Python 2.7+
  • OpenAI Gym
  • PyTorch
  • setproctitle

Training

When training a model it is important to limit the number of worker processes to the number of CPU cores available, as too many processes (e.g. more than one process per available CPU core) will actually be detrimental to training speed and effectiveness.
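
For example, a quick way to cap a requested worker count at the number of available cores (illustrative only; the repo simply takes the value through the --workers flag shown below):

import multiprocessing

requested_workers = 64
workers = min(requested_workers, multiprocessing.cpu_count())  # at most one worker per CPU core
print("launch with: python main.py --workers %d ..." % workers)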

To train the agent in the BipedalWalker-v2 environment with 6 different worker processes: on a MacPro 2014 laptop, training typically takes 15-20 minutes to reach a winning solution.

python main.py --workers 6 --env BipedalWalker-v2 --save-max True --model MLP --stack-frames 1

To train the agent in the BipedalWalkerHardcore-v2 environment with 64 different worker processes: BipedalWalkerHardcore-v2 is a much harder environment than the normal BipedalWalker. On a 72-CPU AWS EC2 c5.18xlarge instance, training with 64 worker processes takes up to 24 hours to reach a model that can solve the environment. Using the enhanced A3G design, training a model takes only 4-6 hours.

python main.py --workers 64 --env BipedalWalkerHardcore-v2 --save-max True --model CONV --stack-frames 4

A3C-GPU

To train agent in BipedalWalkerHardcore-v2 environment with 32 different worker processes with new A3C-GPU:

python main.py --env BipedalWalkerHardcore-v2 --workers 32 --gpu-ids 0 1 2 3 --amsgrad True --model CONV --stack-frames 4
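
With several --gpu-ids, each worker process has to be pinned to one of the listed devices. A minimal sketch of one common way to do this, assuming a round-robin assignment by worker rank (illustrative; check main.py for the repo's actual mapping):

gpu_ids = [0, 1, 2, 3]  # value of --gpu-ids 0 1 2 3
num_workers = 32        # value of --workers 32

for rank in range(num_workers):
    gpu_id = gpu_ids[rank % len(gpu_ids)]  # hypothetical: worker `rank` trains on this GPU
    print("worker %d -> cuda:%d" % (rank, gpu_id))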

Hit Ctrl+C to end the training session properly.

A3C LSTM playing BipedalWalkerHardcore-v2

Evaluation

To run a 100-episode gym evaluation with the trained model:

python gym_eval.py --env BipedalWalkerHardcore-v2 --num-episodes 100 --stack-frames 4 --model CONV --new-gym-eval True

Project Reference

README STILL UNDER CONSTRUCTION
