All Projects → koulanurag → Muzero Pytorch

koulanurag / Muzero Pytorch

Licence: mit
Pytorch Implementation of MuZero

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Muzero Pytorch

Torchrl
Highly Modular and Scalable Reinforcement Learning
Stars: ✭ 102 (-20.93%)
Mutual labels:  deep-reinforcement-learning
Tetris Ai
A deep reinforcement learning bot that plays tetris
Stars: ✭ 109 (-15.5%)
Mutual labels:  deep-reinforcement-learning
Advanced Deep Learning And Reinforcement Learning Deepmind
🎮 Advanced Deep Learning and Reinforcement Learning at UCL & DeepMind | YouTube videos 👉
Stars: ✭ 121 (-6.2%)
Mutual labels:  deep-reinforcement-learning
Reinforcement Learning
🤖 Implements of Reinforcement Learning algorithms.
Stars: ✭ 104 (-19.38%)
Mutual labels:  deep-reinforcement-learning
Deeptraffic
DeepTraffic is a deep reinforcement learning competition, part of the MIT Deep Learning series.
Stars: ✭ 1,528 (+1084.5%)
Mutual labels:  deep-reinforcement-learning
Reinforcementlearning Atarigame
Pytorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google Deep Mind's Asynchronous Advantage Actor-Critic (A3C) Algorithm. This is much superior and efficient than DQN and obsoletes it. Can play on many games
Stars: ✭ 118 (-8.53%)
Mutual labels:  deep-reinforcement-learning
Top Deep Learning
Top 200 deep learning Github repositories sorted by the number of stars.
Stars: ✭ 1,365 (+958.14%)
Mutual labels:  deep-reinforcement-learning
Pytorch Trpo
PyTorch Implementation of Trust Region Policy Optimization (TRPO)
Stars: ✭ 123 (-4.65%)
Mutual labels:  deep-reinforcement-learning
A3c Pytorch
PyTorch implementation of Advantage async actor-critic Algorithms (A3C) in PyTorch
Stars: ✭ 108 (-16.28%)
Mutual labels:  deep-reinforcement-learning
Deep Rl Tensorflow
TensorFlow implementation of Deep Reinforcement Learning papers
Stars: ✭ 1,552 (+1103.1%)
Mutual labels:  deep-reinforcement-learning
Macad Gym
Multi-Agent Connected Autonomous Driving (MACAD) Gym environments for Deep RL. Code for the paper presented in the Machine Learning for Autonomous Driving Workshop at NeurIPS 2019:
Stars: ✭ 106 (-17.83%)
Mutual labels:  deep-reinforcement-learning
Easy Rl
强化学习中文教程,在线阅读地址:https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+2228.68%)
Mutual labels:  deep-reinforcement-learning
Deep reinforcement learning
Resources, papers, tutorials
Stars: ✭ 119 (-7.75%)
Mutual labels:  deep-reinforcement-learning
Intro To Deep Learning
A collection of materials to help you learn about deep learning
Stars: ✭ 103 (-20.16%)
Mutual labels:  deep-reinforcement-learning
Rl Medical
Deep Reinforcement Learning (DRL) agents applied to medical images
Stars: ✭ 123 (-4.65%)
Mutual labels:  deep-reinforcement-learning
Deep Reinforcement Learning Notes
Deep Reinforcement Learning Notes
Stars: ✭ 101 (-21.71%)
Mutual labels:  deep-reinforcement-learning
Hierarchical Actor Critic Hac Pytorch
PyTorch implementation of Hierarchical Actor Critic (HAC) for OpenAI gym environments
Stars: ✭ 116 (-10.08%)
Mutual labels:  deep-reinforcement-learning
A Deep Rl Approach For Sdn Routing Optimization
A Deep-Reinforcement Learning Approach for Software-Defined Networking Routing Optimization
Stars: ✭ 125 (-3.1%)
Mutual labels:  deep-reinforcement-learning
Rl Quadcopter
Teach a Quadcopter How to Fly!
Stars: ✭ 124 (-3.88%)
Mutual labels:  deep-reinforcement-learning
Drl Portfolio Management
CSCI 599 deep learning and its applications final project
Stars: ✭ 121 (-6.2%)
Mutual labels:  deep-reinforcement-learning

muzero-pytorch

Pytorch Implementation of MuZero : "Mastering Atari , Go, Chess and Shogi by Planning with a Learned Model" based on pseudo-code provided by the authors

Note: This implementation has just been tested on CartPole-v1 and would required modifications(in config folder) for other environments

Installation

  • Python 3.6, 3.7
  •   cd muzero-pytorch
      pip install -r requirements.txt
    

Usage:

  • Train: python main.py --env CartPole-v1 --case classic_control --opr train --force
  • Test: python main.py --env CartPole-v1 --case classic_control --opr test
  • Visualize results : tensorboard --logdir=<result_dir_path>
Required Arguments Description
--env Name of the environment
--case {atari,classic_control,box2d} It's used for switching between different domains(default: None)
--opr {train,test} select the operation to be performed
Optional Arguments Description
--value_loss_coeff Scale for value loss (default: None)
--revisit_policy_search_rate Rate at which target policy is re-estimated (default:None)( only valid if --use_target_model is enabled)
--use_priority Uses priority for data sampling in replay buffer. Also, priority for new data is calculated based on loss (default: False)
--use_max_priority Forces max priority assignment for new incoming data in replay buffer (only valid if --use_priority is enabled) (default: False)
--use_target_model Use target model for bootstrap value estimation (default: False)
--result_dir Directory Path to store results (defaut: current working directory)
--no_cuda no cuda usage (default: False)
--debug If enables, logs additional values (default:False)
--render Renders the environment (default: False)
--force Overrides past results (default: False)
--seed seed (default: 0)
--test_episodes Evaluation episode count (default: 10)

Note: default: None => Values are loaded from the corresponding config

Training

CartPole-v1

  • Curves represents model evaluation for 5 episodes at 100 step training interval.
  • Also, each curve is a mean scores over 5 runs (seeds : [0,100,200,300,400])
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].