
RLOpensource / Impala Distributed Tensorflow


Projects that are alternatives of or similar to Impala Distributed Tensorflow

Rainbow Is All You Need
Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow
Stars: ✭ 938 (+3250%)
Mutual labels:  reinforcement-learning
Rl algos
Reinforcement Learning Algorithms
Stars: ✭ 14 (-50%)
Mutual labels:  reinforcement-learning
Doyouevenlearn
Essential Guide to keep up with AI/ML/DL/CV
Stars: ✭ 913 (+3160.71%)
Mutual labels:  reinforcement-learning
Rl Baselines Zoo
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
Stars: ✭ 839 (+2896.43%)
Mutual labels:  reinforcement-learning
Flappybirdrl
Flappy Bird hack using Reinforcement Learning
Stars: ✭ 876 (+3028.57%)
Mutual labels:  reinforcement-learning
Evolutionary Algorithm
Evolutionary Algorithm using Python, 莫烦Python Chinese-language AI tutorials
Stars: ✭ 881 (+3046.43%)
Mutual labels:  reinforcement-learning
Toybox
The Machine Learning Toybox for testing the behavior of autonomous agents.
Stars: ✭ 25 (-10.71%)
Mutual labels:  reinforcement-learning
Gym
Seoul AI Gym is a toolkit for developing AI algorithms.
Stars: ✭ 27 (-3.57%)
Mutual labels:  reinforcement-learning
Gym Alttp Gridworld
A gym environment for Stuart Armstrong's model of a treacherous turn.
Stars: ✭ 14 (-50%)
Mutual labels:  reinforcement-learning
Gym Dart
OpenAI Gym environments using DART
Stars: ✭ 20 (-28.57%)
Mutual labels:  reinforcement-learning
A3c
MXNET + OpenAI Gym implementation of A3C from "Asynchronous Methods for Deep Reinforcement Learning"
Stars: ✭ 9 (-67.86%)
Mutual labels:  reinforcement-learning
Easy21
Reinforcement Learning Assignment: Easy21
Stars: ✭ 11 (-60.71%)
Mutual labels:  reinforcement-learning
Acis
Actor-Critic Instance Segmentation (CVPR 2019)
Stars: ✭ 15 (-46.43%)
Mutual labels:  reinforcement-learning
Bindsnet
Simulation of spiking neural networks (SNNs) using PyTorch.
Stars: ✭ 837 (+2889.29%)
Mutual labels:  reinforcement-learning
Awesome Ai In Finance
🔬 A curated list of awesome machine learning strategies & tools in financial market.
Stars: ✭ 910 (+3150%)
Mutual labels:  reinforcement-learning
Summary loop
Codebase for the Summary Loop paper at ACL2020
Stars: ✭ 26 (-7.14%)
Mutual labels:  reinforcement-learning
Pytorch A3c
PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
Stars: ✭ 879 (+3039.29%)
Mutual labels:  reinforcement-learning
Batch Ppo
Efficient Batched Reinforcement Learning in TensorFlow
Stars: ✭ 945 (+3275%)
Mutual labels:  reinforcement-learning
World Models Sonic Pytorch
Attempt at reinforcement learning with curiosity for Sonic the Hedgehog games. Number 149 on OpenAI retro contest leaderboard, but more work needed
Stars: ✭ 27 (-3.57%)
Mutual labels:  reinforcement-learning
Udacity Deep Learning Nanodegree
This is just a collection of projects that made during my DEEPLEARNING NANODEGREE by UDACITY
Stars: ✭ 15 (-46.43%)
Mutual labels:  reinforcement-learning

Implementation of IMPALA with Distributed Tensorflow

Information

  • These results were obtained with only 32 threads.
  • A total of 32 CPUs were used: 8 game types were learned, with 4 environments configured for each game type.
  • TensorFlow implementation
  • Uses a DQN model to infer actions
  • Uses distributed TensorFlow to implement the actors
  • Trained for 1 day
  • Same hyperparameters as the paper:
start learning rate = 0.0006
end learning rate = 0
learning frame = 1e6
gradient clip norm = 40
trajectory = 20
batch size = 32
reward clipping = -1 ~ 1
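
The learning-rate and gradient-clipping values above could be applied roughly as in the following sketch. It assumes a linear decay from the start to the end learning rate over the `learning frame` count, which the list does not state explicitly, and it clips a flat list of gradient values by their global L2 norm. The function names are hypothetical and are not taken from this repository.

```python
import math

START_LR = 0.0006       # start learning rate from the list above
END_LR = 0.0            # end learning rate from the list above
LEARNING_FRAMES = 1e6   # "learning frame" from the list above
GRAD_CLIP_NORM = 40.0   # gradient clip norm from the list above


def linear_lr(frame):
    """Anneal linearly from START_LR to END_LR (assumed schedule shape)."""
    progress = min(max(frame / LEARNING_FRAMES, 0.0), 1.0)
    return START_LR + (END_LR - START_LR) * progress


def clip_by_global_norm(grads, clip_norm=GRAD_CLIP_NORM):
    """Rescale gradient values so that their L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= clip_norm:
        return list(grads)
    scale = clip_norm / norm
    return [g * scale for g in grads]
```

Under these assumptions the learning rate reaches zero once 1e6 frames have been processed and stays there afterwards.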

Dependency

tensorflow==1.14.0
gym[atari]
numpy
tensorboardX
opencv-python

Overall Schema

Model Architecture

How to Run

  • See start.sh.
  • Trains on 8 types of games at a time, each of which uses 4 environments.
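
As a rough illustration of that layout, the sketch below maps 32 actor indices onto 8 games with 4 environments each. The game labels are the display names from the Result section, not necessarily the Gym environment IDs, and both the ordering and the `actor_assignment` helper are hypothetical; the actual mapping is defined by start.sh.

```python
# Display names from the Result section; the real Gym environment IDs
# used by start.sh may differ.
GAMES = ["Breakout", "Pong", "Seaquest", "Space-Invader",
         "Boxing", "Star-Gunner", "Kung-Fu", "Demon"]
ENVS_PER_GAME = 4


def actor_assignment(task_index):
    """Map a hypothetical actor task index to (game, environment slot)."""
    if not 0 <= task_index < len(GAMES) * ENVS_PER_GAME:
        raise ValueError("task_index out of range")
    game_index, env_slot = divmod(task_index, ENVS_PER_GAME)
    return GAMES[game_index], env_slot


# One entry per actor: 8 games x 4 environments = 32 actors.
assignments = [actor_assignment(i) for i in range(len(GAMES) * ENVS_PER_GAME)]
```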

Result

Video

[Videos: Breakout, Pong, Seaquest, Space-Invader, Boxing, Star-Gunner, Kung-Fu, Demon]

Plotting

[Plots: abs_one]

Compare reward clipping method

Video

[Videos: Pong, abs_one and soft_asymmetric]

Plotting

[Plots: abs_one (3 panels), soft_asymmetric (3 panels)]
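
For context, the two clipping modes compared here can be sketched as follows. The formulas follow the reward clipping in the referenced deepmind/scalable_agent code as I understand it; this repository's own implementation may differ, and the function names are hypothetical.

```python
import math


def clip_abs_one(reward):
    """abs_one: hard-clip the reward to the range [-1, 1]."""
    return max(-1.0, min(1.0, reward))


def clip_soft_asymmetric(reward):
    """soft_asymmetric: squash with tanh and down-weight negative rewards.

    Assumed form: 5 * tanh(r / 5) for r >= 0, and 0.3 times that for r < 0.
    """
    squeezed = math.tanh(reward / 5.0)
    if reward < 0:
        squeezed *= 0.3
    return squeezed * 5.0
```

With abs_one, every reward larger than 1 in magnitude looks the same to the learner. The soft variant keeps some information about magnitude and penalizes losses less strongly than it rewards gains.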

Is Attention Really Working?

[Figure: attention map, abs_one]
  • The blocks at the top are ignored.
  • The ball and the bar receive attention.
  • Empty space also receives attention because the model is not yet fully trained.

Todo

  • [x] CPU-only training method
  • [x] Distributed TensorFlow
  • [x] Model fix to prevent collapse
  • [x] Reward Clipping Experiment
  • [x] Parameter copying from global learner
  • [x] Add Relational Reinforcement Learning
  • [x] Add Action information to Model
  • [x] Multi Task Learning
  • [x] Add Recurrent Model
  • [x] Training on GPU, Inference on CPU

Reference

  1. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
  2. deepmind/scalable_agent
  3. Asynchronous_Advatnage_Actor_Critic