
suragnair / Alpha Zero General

License: MIT
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Programming Languages

Jupyter Notebook
Python
Shell

Projects that are alternatives of or similar to Alpha Zero General

Alphazero gomoku
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
Stars: ✭ 2,570 (-1.8%)
Mutual labels:  reinforcement-learning, mcts, gomoku, monte-carlo-tree-search, gobang, alphago, alphago-zero, alphazero
alpha-zero
AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.
Stars: ✭ 68 (-97.4%)
Mutual labels:  mcts, othello, alphago-zero, alpha-zero, alphazero, self-play
alphaFive
AlphaGo-style Gomoku (gobang, gomoku)
Stars: ✭ 51 (-98.05%)
Mutual labels:  gomoku, gobang, alphago, alphago-zero, alphazero
AlphaZero Gobang
Deep Learning big homework of UCAS
Stars: ✭ 29 (-98.89%)
Mutual labels:  mcts, gomoku, gobang, alphazero
alphazero
Board Game Reinforcement Learning using AlphaZero method. including Makhos (Thai Checkers), Reversi, Connect Four, Tic-tac-toe game rules
Stars: ✭ 24 (-99.08%)
Mutual labels:  othello, alphago-zero, alphazero
alphastone
Using self-play, MCTS, and a deep neural network to create a hearthstone ai player
Stars: ✭ 24 (-99.08%)
Mutual labels:  monte-carlo-tree-search, alpha-zero, self-play
Chess Alpha Zero
Chess reinforcement learning by AlphaGo Zero methods.
Stars: ✭ 1,868 (-28.62%)
Mutual labels:  jupyter-notebook, reinforcement-learning, alphago-zero
AlphaZero-Renju
No description or website provided.
Stars: ✭ 17 (-99.35%)
Mutual labels:  alphago, alpha-zero, alphazero
UCThello
UCThello - a board game demonstrator (Othello variant) with computer AI using Monte Carlo Tree Search (MCTS) with UCB (Upper Confidence Bounds) applied to trees (UCT in short)
Stars: ✭ 26 (-99.01%)
Mutual labels:  mcts, othello, monte-carlo-tree-search
alpha sigma
A pytorch based Gomoku game model. Alpha Zero algorithm based reinforcement Learning and Monte Carlo Tree Search model.
Stars: ✭ 134 (-94.88%)
Mutual labels:  gomoku, monte-carlo-tree-search, alphazero
Elf
ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation
Stars: ✭ 3,240 (+23.81%)
Mutual labels:  reinforcement-learning, alphago-zero, alpha-zero
Tensorflow2.0 Examples
🙄 Difficult algorithm, Simple code.
Stars: ✭ 1,397 (-46.62%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Ctc Executioner
Master Thesis: Limit order placement with Reinforcement Learning
Stars: ✭ 112 (-95.72%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Coursera reinforcement learning
Coursera Reinforcement Learning Specialization by University of Alberta & Alberta Machine Intelligence Institute
Stars: ✭ 114 (-95.64%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Pytorch Rl
Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]
Stars: ✭ 121 (-95.38%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Rlai Exercises
Exercise Solutions for Reinforcement Learning: An Introduction [2nd Edition]
Stars: ✭ 97 (-96.29%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Reinforcementlearning Atarigame
PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which is more efficient than DQN and supersedes it. Can play many games.
Stars: ✭ 118 (-95.49%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Advanced Deep Learning And Reinforcement Learning Deepmind
🎮 Advanced Deep Learning and Reinforcement Learning at UCL & DeepMind | YouTube videos 👉
Stars: ✭ 121 (-95.38%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Ngsim env
Learning human driver models from NGSIM data with imitation learning.
Stars: ✭ 96 (-96.33%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Rl Quadcopter
Teach a Quadcopter How to Fly!
Stars: ✭ 124 (-95.26%)
Mutual labels:  jupyter-notebook, reinforcement-learning

Alpha Zero General (any game, any framework!)

A simplified, highly flexible, commented and (hopefully) easy-to-understand implementation of self-play reinforcement learning based on the AlphaGo Zero paper (Silver et al.). It is designed to be easy to adapt to any two-player, turn-based adversarial game and any deep learning framework of your choice. A sample implementation is provided for the game of Othello in PyTorch, Keras, TensorFlow and Chainer. An accompanying tutorial can be found here. We also have implementations for GoBang and TicTacToe.

To use a game of your choice, subclass the classes in Game.py and NeuralNet.py and implement their functions. Example implementations for Othello can be found in othello/OthelloGame.py and othello/{pytorch,keras,tensorflow,chainer}/NNet.py.
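As a rough illustration of what a game implementation has to supply, here is a minimal standalone sketch for TicTacToe. It does not import or subclass the real Game.py, and the class name and snake_case method names are assumptions made for this example. Check Game.py for the actual method names and signatures before subclassing.

```python
from typing import List, Tuple

# 9 cells in row-major order: 1 = player one, -1 = player two, 0 = empty.
Board = Tuple[int, ...]

WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


class TicTacToeSketch:
    """Standalone illustration; method names are assumed, not copied from Game.py."""

    def get_init_board(self) -> Board:
        return (0,) * 9

    def get_action_size(self) -> int:
        return 9

    def get_valid_moves(self, board: Board) -> List[int]:
        # Binary mask over actions: 1 where the cell is empty.
        return [1 if cell == 0 else 0 for cell in board]

    def get_next_state(self, board: Board, player: int, action: int) -> Tuple[Board, int]:
        if board[action] != 0:
            raise ValueError(f"cell {action} is already occupied")
        cells = list(board)
        cells[action] = player
        return tuple(cells), -player  # the turn passes to the opponent

    def get_game_ended(self, board: Board, player: int) -> float:
        # 0 = still running, 1 = `player` won, -1 = `player` lost, small value = draw.
        for a, b, c in WIN_LINES:
            total = board[a] + board[b] + board[c]
            if total == 3 * player:
                return 1.0
            if total == -3 * player:
                return -1.0
        return 0.0 if 0 in board else 1e-4

    def get_canonical_form(self, board: Board, player: int) -> Board:
        # Flip signs so the player to move always sees its own pieces as +1.
        return tuple(cell * player for cell in board)
```

The canonical form lets one network evaluate positions for both players, which is why a two-player, turn-based game fits this setup naturally.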

Coach.py contains the core training loop and MCTS.py performs the Monte Carlo Tree Search. The parameters for the self-play can be specified in main.py. Additional neural network parameters are in othello/{pytorch,keras,tensorflow,chainer}/NNet.py (cuda flag, batch size, epochs, learning rate etc.).
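To give a feel for the selection step in an AlphaZero-style tree search, the sketch below picks the action with the highest upper confidence bound, combining the mean value Q, the network prior P and the visit counts N. The function name, argument layout and the default c_puct are assumptions for illustration. This is not the code in MCTS.py, which may differ in detail.

```python
import math
from typing import Dict, List


def select_action(
    q_values: Dict[int, float],
    visit_counts: Dict[int, int],
    priors: List[float],
    valid_moves: List[int],
    c_puct: float = 1.0,
) -> int:
    """Pick the valid action with the highest upper confidence bound."""
    total_visits = sum(visit_counts.values())
    best_score, best_action = -math.inf, -1
    for action, is_valid in enumerate(valid_moves):
        if not is_valid:
            continue
        n = visit_counts.get(action, 0)
        q = q_values.get(action, 0.0)  # unvisited actions start from a neutral value
        # The small epsilon keeps the prior relevant before any visit has been recorded.
        exploration = c_puct * priors[action] * math.sqrt(total_visits + 1e-8) / (1 + n)
        score = q + exploration
        if score > best_score:
            best_score, best_action = score, action
    return best_action
```

A larger c_puct weights the prior and low visit counts more heavily, so the search explores more. A smaller value makes it follow the current value estimates.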

To start training a model for Othello:

python main.py

Choose your framework and game in main.py.

Docker Installation

For easy environment setup, you can use nvidia-docker. Once nvidia-docker is set up, simply run:

./setup_env.sh

to set up a Jupyter Docker container (PyTorch by default). You can then open a new terminal and enter:

docker exec -ti pytorch_notebook python main.py

Experiments

We trained a PyTorch model for 6x6 Othello (~80 iterations, 100 episodes per iteration and 25 MCTS simulations per turn). This took about 3 days on an NVIDIA Tesla K80. The pretrained PyTorch model can be found in pretrained_models/othello/pytorch/. You can play a game against it using pit.py. The figure below shows the model's performance against a random and a greedy baseline as the number of iterations increases.

[Figure: performance against random and greedy baselines vs. number of training iterations]
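To put the reported run in perspective, the sketch below writes the three figures from the text as a small configuration object and derives the total number of self-play games. The class and field names are hypothetical and are not the parameter names used in main.py. Only the numeric values come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfPlayConfig:
    # Field names are hypothetical; the values are the ones reported in the text.
    num_iterations: int = 80          # approximate ("~80" in the text)
    episodes_per_iteration: int = 100
    mcts_sims_per_turn: int = 25

    def total_episodes(self) -> int:
        # Self-play games generated over the whole run.
        return self.num_iterations * self.episodes_per_iteration

    def sims_for_game(self, num_turns: int) -> int:
        # MCTS simulations spent on one self-play game of `num_turns` moves.
        return num_turns * self.mcts_sims_per_turn


config = SelfPlayConfig()
print(config.total_episodes())  # 8000
```

At roughly 8,000 self-play games, each needing 25 simulations per move, most of the three-day budget plausibly goes to tree search and network evaluation rather than to gradient updates, though the text does not break the time down.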

A concise description of our algorithm can be found here.

Contributing

While the current code is fairly functional, we could benefit from the following contributions:

  • Game logic files for more games that follow the specifications in Game.py, along with their neural networks
  • Neural networks in other frameworks
  • Pre-trained models for different game configurations
  • An asynchronous version of the code: parallel processes for self-play, neural net training and model comparison
  • Asynchronous MCTS as described in the paper

Contributors and Credits
