
RLOpensource / Impala Distributed Tensorflow



Implementation of IMPALA with Distributed Tensorflow

Information

These results were obtained with only 32 threads.
A total of 32 CPUs were used: 8 game types were learned, with 4 environments configured per game type.
TensorFlow implementation
Uses a DQN model to infer actions
Uses distributed TensorFlow to implement the actors
Trained for 1 day
Same parameters as the paper:

start learning rate = 0.0006
end learning rate = 0
learning frame = 1e6
gradient clip norm = 40
trajectory = 20
batch size = 32
reward clipping = -1 ~ 1
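
As a rough illustration of how the learning-rate and gradient-clipping values above might be applied, here is a minimal sketch in plain Python. It assumes that the learning rate is annealed linearly over the `learning frame` count and that gradients are clipped by their global norm. Neither detail is confirmed by this README, and the function names are hypothetical.

```python
import math

START_LR = 0.0006    # "start learning rate" from the list above
END_LR = 0.0         # "end learning rate"
DECAY_FRAMES = 1e6   # "learning frame"; assumed here to be the decay horizon
CLIP_NORM = 40.0     # "gradient clip norm"


def learning_rate(frame):
    """Linearly anneal from START_LR to END_LR over DECAY_FRAMES (assumed schedule)."""
    progress = min(max(frame / DECAY_FRAMES, 0.0), 1.0)
    return START_LR + (END_LR - START_LR) * progress


def clip_by_global_norm(grads, clip_norm=CLIP_NORM):
    """Rescale a list of gradient vectors so their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if global_norm <= clip_norm:
        return [list(vec) for vec in grads]
    scale = clip_norm / global_norm
    return [[g * scale for g in vec] for vec in grads]
```

For example, `learning_rate(5e5)` gives half of the starting rate, and a single gradient vector `[30.0, 40.0]` (norm 50) is scaled down to norm 40.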

Dependency

tensorflow==1.14.0
gym[atari]
numpy
tensorboardX
opencv-python

Overall Schema

Model Architecture

How to Run

See start.sh.
It trains on 8 game types at once, each using 4 environments.
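
The layout of 8 games with 4 environments each (32 actors in total) can be sketched as a simple index mapping. This is only an illustration: the game labels are taken from the Result section, while the ordering, the function name, and the idea of deriving the game from the actor index are assumptions, not a description of what start.sh does.

```python
# Game labels as they appear in the Result section; the order is an assumption.
GAMES = ["Breakout", "Pong", "Seaquest", "Space-Invader",
         "Boxing", "Star-Gunner", "Kung-Fu", "Demon"]
ENVS_PER_GAME = 4


def actor_assignments(games=GAMES, envs_per_game=ENVS_PER_GAME):
    """Return one (actor_index, game, env_slot) tuple per actor."""
    assignments = []
    for game_id, game in enumerate(games):
        for slot in range(envs_per_game):
            actor_index = game_id * envs_per_game + slot
            assignments.append((actor_index, game, slot))
    return assignments


if __name__ == "__main__":
    for actor_index, game, slot in actor_assignments():
        print(f"actor {actor_index:2d} -> {game} (environment {slot})")
```

With these values the mapping yields 32 actors, which matches the 32 CPUs mentioned under Information.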

Result

Video

Breakout, Pong, Seaquest, Space-Invader, Boxing, Star-Gunner, Kung-Fu, Demon

Plotting

Compare reward clipping method

Video

abs_one, soft_asymmetric

Plotting

abs_one, soft_asymmetric
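
For readers unfamiliar with the two names, the sketch below shows the reward transforms as they are defined in DeepMind's reference IMPALA code (scalable_agent), written in plain Python for a single scalar reward. Whether this repository uses exactly the same constants is an assumption.

```python
import math


def abs_one(reward):
    """Hard-clip the reward to the range [-1, 1]."""
    return max(-1.0, min(1.0, reward))


def soft_asymmetric(reward):
    """Squash the reward with tanh and give negative rewards less weight.

    Constants (5.0 and 0.3) follow the reference IMPALA code; assumed here.
    """
    squeezed = math.tanh(reward / 5.0)
    if reward < 0:
        squeezed *= 0.3
    return squeezed * 5.0
```

Under these definitions, `abs_one` saturates at ±1, while `soft_asymmetric` stays close to the raw reward for small values, approaches 5 for large positive rewards, and approaches -1.5 for large negative ones.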

Is Attention Really Working?

The blocks at the top are ignored.
The ball and the bar receive attention.
Empty space also receives attention because the model is undertrained.

Todo

[x] CPU-only training method
[x] Distributed TensorFlow
[x] Model fix to prevent collapse
[x] Reward clipping experiment
[x] Parameter copying from global learner
[x] Add Relational Reinforcement Learning
[x] Add action information to model
[x] Multi-task learning
[x] Add recurrent model
[x] Training on GPU, inference on CPU
