
junxiaosong / AlphaZero_Gomoku

License: MIT
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

Programming Languages

Python
139,335 projects - #7 most used programming language

Projects that are alternatives to or similar to AlphaZero-Gomoku

Alpha Zero General
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Stars: ✭ 2,617 (+1.83%)
Mutual labels:  reinforcement-learning, mcts, gomoku, monte-carlo-tree-search, gobang, alphago, alphago-zero, alphazero
alphaFive
An AlphaGo-style version of Gomoku (gobang, five in a row)
Stars: ✭ 51 (-98.02%)
Mutual labels:  gomoku, gobang, alphago, alphago-zero, alphazero
AlphaZero Gobang
A deep learning course project from UCAS
Stars: ✭ 29 (-98.87%)
Mutual labels:  mcts, gomoku, gobang, alphazero
AnimalChess
An Animal Fight Chess (斗兽棋, Dou Shou Qi) game written in Rust.
Stars: ✭ 76 (-97.04%)
Mutual labels:  board-game, monte-carlo-tree-search, alphazero
alpha sigma
A PyTorch-based Gomoku model: AlphaZero-style reinforcement learning with Monte Carlo Tree Search.
Stars: ✭ 134 (-94.79%)
Mutual labels:  gomoku, monte-carlo-tree-search, alphazero
UCThello
UCThello - a board game demonstrator (Othello variant) with computer AI using Monte Carlo Tree Search (MCTS) with UCB (Upper Confidence Bounds) applied to trees (UCT for short)
Stars: ✭ 26 (-98.99%)
Mutual labels:  board-game, mcts, monte-carlo-tree-search
godpaper
🐵 An AI board-game framework with implementations in many programming languages.
Stars: ✭ 40 (-98.44%)
Mutual labels:  board-game, mcts, alphago
Agentnet
Deep Reinforcement Learning library for humans
Stars: ✭ 298 (-88.4%)
Mutual labels:  reinforcement-learning, lasagne, theano
alpha-zero
AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.
Stars: ✭ 68 (-97.35%)
Mutual labels:  mcts, alphago-zero, alphazero
Elf
ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation
Stars: ✭ 3,240 (+26.07%)
Mutual labels:  reinforcement-learning, rl, alphago-zero
Practical rl
A course in reinforcement learning in the wild
Stars: ✭ 4,741 (+84.47%)
Mutual labels:  reinforcement-learning, lasagne, theano
Deep Learning Python
Intro to Deep Learning, including recurrent, convolutional, and feed-forward neural networks.
Stars: ✭ 94 (-96.34%)
Mutual labels:  lasagne, theano
Repo 2016
R, Python and Mathematica code for machine learning, deep learning, artificial intelligence, NLP and geolocation
Stars: ✭ 103 (-95.99%)
Mutual labels:  lasagne, theano
Aws Robomaker Sample Application Deepracer
Use AWS RoboMaker and demonstrate running a simulation which trains a reinforcement learning (RL) model to drive a car around a track
Stars: ✭ 105 (-95.91%)
Mutual labels:  reinforcement-learning, rl
Stable Baselines
Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
Stars: ✭ 115 (-95.53%)
Mutual labels:  reinforcement-learning, rl
Rlenv.directory
Explore and find reinforcement learning environments in a list of 150+ open source environments.
Stars: ✭ 79 (-96.93%)
Mutual labels:  reinforcement-learning, rl
Psgan
Periodic Spatial Generative Adversarial Networks
Stars: ✭ 108 (-95.8%)
Mutual labels:  lasagne, theano
Rl trading
An environment for training high-frequency trading agents with reinforcement learning
Stars: ✭ 205 (-92.02%)
Mutual labels:  reinforcement-learning, rl
Pytorch Rl
Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]
Stars: ✭ 121 (-95.29%)
Mutual labels:  reinforcement-learning, rl
Reinforcement learning
Implementation of selected reinforcement learning algorithms in Tensorflow. A3C, DDPG, REINFORCE, DQN, etc.
Stars: ✭ 132 (-94.86%)
Mutual labels:  reinforcement-learning, rl

AlphaZero-Gomoku

This is an implementation of the AlphaZero algorithm for playing the simple board game Gomoku (also called Gobang or Five in a Row), trained purely through self-play. Gomoku is much simpler than Go or chess, so we can focus on the AlphaZero training scheme and still obtain a reasonably good AI model on a single PC in a few hours.

References:

  1. AlphaZero: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
  2. AlphaGo Zero: Mastering the game of Go without human knowledge

Update 2018.2.24: supports training with TensorFlow!

Update 2018.1.17: supports training with PyTorch!

Example Games Between Trained Models

  • Each move with 400 MCTS playouts:
    [animated GIF: playout400 — a sample game between trained models]

Requirements

To play with the trained AI models, you only need:

  • Python >= 2.7
  • Numpy >= 1.11

To train the AI model from scratch, you additionally need one of the following:

  • Theano >= 0.7 and Lasagne >= 0.1
    or
  • PyTorch >= 0.2.0
    or
  • TensorFlow

PS: if your Theano version is > 0.7, please follow this issue to install Lasagne;
otherwise, force pip to downgrade Theano to 0.7: pip install --upgrade theano==0.7.0

If you would like to train the model using other DL frameworks, you only need to rewrite policy_value_net.py.
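
For orientation, the wrapper that train.py and the players rely on looks roughly like the skeleton below. The method names are taken from the Theano/Lasagne policy_value_net.py in this repo, so treat it as a sketch of the expected interface rather than an exact contract:

class PolicyValueNet:
    """Skeleton of the network wrapper expected by train.py
    (method names assumed from the Theano/Lasagne version)."""

    def __init__(self, board_width, board_height, model_file=None):
        self.board_width = board_width
        self.board_height = board_height
        # build the framework-specific network and optimizer here

    def policy_value(self, state_batch):
        # return (move probabilities, state values) for a batch of states
        raise NotImplementedError

    def policy_value_fn(self, board):
        # return (move, probability) pairs for the legal moves on `board`
        # plus a scalar value estimate in [-1, 1]; used by the MCTS player
        raise NotImplementedError

    def train_step(self, state_batch, mcts_probs, winner_batch, lr):
        # one gradient step on the combined policy-and-value loss;
        # return (loss, entropy) as Python floats
        raise NotImplementedError

    def save_model(self, model_file):
        # serialize the network parameters to model_file
        raise NotImplementedError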

Getting Started

To play with the provided models, run the following script from the project directory:

python human_play.py  

You may modify human_play.py to try different provided models or the pure MCTS player.
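
The relevant lines look roughly like this (class and parameter names such as MCTSPlayer, MCTS_Pure, c_puct and n_playout are assumed from this repo; adjust to your copy):

# sketch of the player setup in human_play.py (names assumed from this repo)
from mcts_alphaZero import MCTSPlayer            # AlphaZero-style MCTS player
# from mcts_pure import MCTSPlayer as MCTS_Pure  # pure-MCTS baseline

model_file = 'best_policy_8_8_5.model'           # one of the provided models
best_policy = PolicyValueNet(8, 8, model_file)   # network wrapper (see above)

# AlphaZero player guided by the trained policy-value net, 400 playouts/move
mcts_player = MCTSPlayer(best_policy.policy_value_fn, c_puct=5, n_playout=400)

# pure MCTS baseline instead; it needs far more playouts to play comparably
# mcts_player = MCTS_Pure(c_puct=5, n_playout=5000)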

To train the AI model from scratch, with Theano and Lasagne, directly run:

python train.py

With PyTorch or TensorFlow, first modify the file train.py, i.e., comment the line

from policy_value_net import PolicyValueNet  # Theano and Lasagne

and uncomment the line

# from policy_value_net_pytorch import PolicyValueNet  # Pytorch
or
# from policy_value_net_tensorflow import PolicyValueNet # Tensorflow

and then execute: python train.py. (To use the GPU in PyTorch, set use_gpu=True; if your PyTorch version is greater than 0.5, return loss.item(), entropy.item() in the train_step function of policy_value_net_pytorch.py, as sketched below.)
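
A sketch of that change at the end of train_step in policy_value_net_pytorch.py (surrounding code elided; variable names assumed):

def train_step(self, state_batch, mcts_probs, winner_batch, lr):
    ...  # forward pass, loss computation, backward pass, optimizer step
    # PyTorch <= 0.4: scalars were read out of 0-dim tensors as loss.data[0]
    # return loss.data[0], entropy.data[0]
    # PyTorch >= 0.5: use .item() to get plain Python floats
    return loss.item(), entropy.item()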

The models (best_policy.model and current_policy.model) are saved every few updates (every 50 by default).
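
The checkpoint logic in train.py is roughly the following (attribute names such as check_freq, best_win_ratio and policy_evaluate are assumed from this repo):

# inside the training loop of train.py (sketch; names assumed)
if (i + 1) % self.check_freq == 0:        # check_freq defaults to 50
    win_ratio = self.policy_evaluate()    # evaluate vs. a pure-MCTS baseline
    self.policy_value_net.save_model('./current_policy.model')
    if win_ratio > self.best_win_ratio:
        self.best_win_ratio = win_ratio
        # best model so far against the baseline
        self.policy_value_net.save_model('./best_policy.model')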

Note: the 4 provided models were trained with Theano/Lasagne; to use them with PyTorch, please refer to issue 5.

Tips for training:

  1. It is good to start with a 6×6 board and 4 in a row; the relevant settings in train.py are sketched after this list. For this case, we may obtain a reasonably good model within 500~1000 self-play games in about 2 hours.
  2. For the case of an 8×8 board and 5 in a row, it may take 2000~3000 self-play games to get a good model, and this may take about 2 days on a single PC.
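
The board size and win condition are set near the top of train.py; a sketch of the relevant settings (attribute names assumed from this repo's TrainPipeline):

# in train.py (sketch; attribute names assumed from this repo)
self.board_width = 6    # tip 1: start small with a 6x6 board
self.board_height = 6
self.n_in_row = 4       # 4 in a row wins; switch to 8, 8, 5 for tip 2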

Further reading

My article (in Chinese) describing some details of the implementation: https://zhuanlan.zhihu.com/p/32089487
