RLOpensource / Impala Distributed Tensorflow
Implementation of IMPALA with Distributed Tensorflow
Information
- These results are from only 32 threads.
- A total of 32 CPUs were used, 4 environments were configured for each game type, and a total of 8 games were learned.
- TensorFlow implementation
- Uses a DQN model for action inference
- Uses distributed TensorFlow to implement the actors
- Trained for 1 day
- Same hyperparameters as the paper:
  - start learning rate = 0.0006
  - end learning rate = 0
  - learning frames = 1e6
  - gradient clip norm = 40
  - trajectory length = 20
  - batch size = 32
  - reward clipping = [-1, 1]
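As a rough illustration of how the settings above could fit together, the sketch below linearly anneals the learning rate from 0.0006 to 0 over 1e6 frames and rescales gradients so that their global norm does not exceed 40. The linear shape of the decay and the helper names (`learning_rate_at`, `clip_by_global_norm`) are assumptions for illustration, not code from this repository. In TensorFlow 1.x the corresponding built-ins would be `tf.train.polynomial_decay` and `tf.clip_by_global_norm`.

```python
import math

START_LR = 0.0006       # start learning rate (from the list above)
END_LR = 0.0            # end learning rate
TOTAL_FRAMES = 1e6      # learning frames
CLIP_NORM = 40.0        # gradient clip norm


def learning_rate_at(frame):
    """Linearly interpolate from START_LR to END_LR (assumed schedule shape)."""
    progress = min(max(frame / TOTAL_FRAMES, 0.0), 1.0)
    return START_LR + (END_LR - START_LR) * progress


def clip_by_global_norm(grads, clip_norm=CLIP_NORM):
    """Rescale a list of gradient vectors so their joint L2 norm is <= clip_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if global_norm <= clip_norm:
        return [list(vec) for vec in grads], global_norm
    scale = clip_norm / global_norm
    return [[g * scale for g in vec] for vec in grads], global_norm


if __name__ == "__main__":
    print(learning_rate_at(500000))
    clipped, norm = clip_by_global_norm([[30.0, 40.0], [0.0]])
    print(norm, clipped)
```

Clipping by the global norm, rather than per tensor, keeps the direction of the overall update unchanged and only shrinks its magnitude.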
Dependencies
tensorflow==1.14.0
gym[atari]
numpy
tensorboardX
opencv-python
Overall Schema
Model Architecture
How to Run
- See start.sh.
- Trains on 8 games at a time, each of which uses 4 environments.
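The layout described above (8 games, 4 environments each, 32 actors in total) can be sketched as a simple task table. This is only an illustration: the game labels are copied from the Result section of this README rather than from `start.sh`, and `build_actor_assignments` is a hypothetical helper, not a function in this repository.

```python
# Game labels as they appear in the Result section (not actual Gym ids).
GAMES = ["Breakout", "Pong", "Seaquest", "Space-Invader",
         "Boxing", "Star-Gunner", "Kung-Fu", "Demon"]
ENVS_PER_GAME = 4


def build_actor_assignments(games=GAMES, envs_per_game=ENVS_PER_GAME):
    """Return one (task_index, game) pair per actor process."""
    assignments = []
    for game_index, game in enumerate(games):
        for env_index in range(envs_per_game):
            # Consecutive task indices are grouped by game.
            task_index = game_index * envs_per_game + env_index
            assignments.append((task_index, game))
    return assignments


if __name__ == "__main__":
    for task_index, game in build_actor_assignments():
        print(f"actor task {task_index:2d} -> {game}")
```

With these assumed defaults the table has 8 x 4 = 32 entries, matching the 32 CPUs mentioned in the Information section.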
Result
Video
Breakout | Pong | Seaquest | Space-Invader |
Boxing | Star-Gunner | Kung-Fu | Demon |
Plotting
Comparison of reward clipping methods
Video
abs_one | soft_asymmetric |
Plotting
abs_one |
soft_asymmetric |
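For readers unfamiliar with the two names, the sketch below follows the definitions of `abs_one` and `soft_asymmetric` used in DeepMind's reference IMPALA code (scalable_agent); it is an assumption that this repository uses the same definitions. `abs_one` hard-clips each reward to [-1, 1], while `soft_asymmetric` squashes rewards with tanh and gives negative rewards less weight. The function names are illustrative.

```python
import math


def clip_abs_one(reward):
    """Hard-clip the reward to the range [-1, 1]."""
    return max(-1.0, min(1.0, reward))


def clip_soft_asymmetric(reward):
    """Squash with tanh(reward / 5) * 5; negative rewards are scaled by 0.3."""
    squeezed = math.tanh(reward / 5.0)
    if reward < 0:
        squeezed *= 0.3
    return squeezed * 5.0


if __name__ == "__main__":
    for r in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(r, clip_abs_one(r), round(clip_soft_asymmetric(r), 3))
```

Under these definitions `soft_asymmetric` keeps some information about reward magnitude (its output lies between -1.5 and 5), whereas `abs_one` treats every reward above 1 identically.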
Is Attention Really Working?
- The blocks at the top are ignored.
- The ball and the bar are attended to.
- Empty space is also attended to because the model is not yet sufficiently trained.
Todo
- [x] CPU-only training method
- [x] Distributed TensorFlow
- [x] Model fix to prevent collapse
- [x] Reward Clipping Experiment
- [x] Parameter copying from global learner
- [x] Add Relational Reinforcement Learning
- [x] Add Action information to Model
- [x] Multi Task Learning
- [x] Add Recurrent Model
- [x] Training on GPU, Inference on CPU