
Akababa / Chess-Zero

License: MIT
Chess reinforcement learning by AlphaZero methods.

Programming Languages

Python
Batchfile

Projects that are alternatives of or similar to Chess-Zero

Zerofish
An implementation of the AlphaZero algorithm for chess
Stars: ✭ 34 (-5.56%)
Mutual labels:  chess, alphazero
AnimalChess
Animal Fight Chess Game (斗兽棋) written in Rust.
Stars: ✭ 76 (+111.11%)
Mutual labels:  chess, alphazero
computer-go-dataset
Datasets for computer Go
Stars: ✭ 133 (+269.44%)
Mutual labels:  alphazero
machine-learning-course
Machine Learning Course @ Santa Clara University
Stars: ✭ 17 (-52.78%)
Mutual labels:  supervised-learning
liground
A free, open-source and modern Chess Variant Analysis GUI for the 21st century
Stars: ✭ 41 (+13.89%)
Mutual labels:  chess
blitz-tactics
Fast-paced chess tactics trainer
Stars: ✭ 137 (+280.56%)
Mutual labels:  chess
robo-vln
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Stars: ✭ 34 (-5.56%)
Mutual labels:  supervised-learning
dl-relu
Deep Learning using Rectified Linear Units (ReLU)
Stars: ✭ 20 (-44.44%)
Mutual labels:  supervised-learning
Hand-Gesture-Recognition-Using-Background-Elllimination-and-Convolution-Neural-Network
Hand Gesture Recognition using a Convolutional Neural Network built with TensorFlow, OpenCV and Python
Stars: ✭ 120 (+233.33%)
Mutual labels:  supervised-learning
Play-online-chess-with-real-chess-board
Program that enables you to play online chess using a real chess board.
Stars: ✭ 288 (+700%)
Mutual labels:  chess
textlytics
Text processing library for sentiment analysis and related tasks
Stars: ✭ 25 (-30.56%)
Mutual labels:  supervised-learning
should-i-play-f6
Chess project to analyze the statistical effect of playing f3 (as white) or f6 (as black) on the outcome of the game.
Stars: ✭ 15 (-58.33%)
Mutual labels:  chess
Minic
A simple chess engine to learn and play with
Stars: ✭ 65 (+80.56%)
Mutual labels:  chess
chess
Chess (game)(♟) built in C# and ASCII art.
Stars: ✭ 20 (-44.44%)
Mutual labels:  chess
zoofs
zoofs is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms, ranging from swarm intelligence to physics-based to evolutionary. It's an easy to use, flexible and powerful tool for reducing your feature set size.
Stars: ✭ 142 (+294.44%)
Mutual labels:  supervised-learning
rust-pgn-reader
Fast non-allocating and streaming reader for chess games in PGN notation
Stars: ✭ 46 (+27.78%)
Mutual labels:  chess
lila-gif
Web service to render GIFs of chess positions and games, and stream them frame by frame
Stars: ✭ 63 (+75%)
Mutual labels:  chess
gochess
Online real-time chess web server using WebSockets
Stars: ✭ 32 (-11.11%)
Mutual labels:  chess
first-neural-network
Simple neural network implemented from scratch in C++.
Stars: ✭ 17 (-52.78%)
Mutual labels:  supervised-learning
chessground
Chessground React Wrapper
Stars: ✭ 15 (-58.33%)
Mutual labels:  chess

About

Chess reinforcement learning by AlphaGo Zero methods.

This project is based on these main resources:

  1. DeepMind's Oct 19th publication: Mastering the Game of Go without Human Knowledge.
  2. The great Reversi implementation of DeepMind's ideas that @mokemokechicken built in his repo: https://github.com/mokemokechicken/reversi-alpha-zero
  3. DeepMind's new version of AlphaGo Zero (now named AlphaZero), in which they master chess from scratch: https://arxiv.org/pdf/1712.01815.pdf. In chess, AlphaZero outperformed Stockfish after just 4 hours (300k steps). Wow!

See the wiki for more details.

Note: This project is still under construction!!

Environment

  • Python 3.6.3
  • tensorflow-gpu: 1.3.0
  • Keras: 2.0.8

Results so far

Using supervised learning on about 10k games, I trained a model (7 residual blocks of 256 filters) to an estimated 1200 Elo with 1200 simulations per move. One of the strengths of MCTS is that it scales quite well with computing power.

Here you can see an example of a game I (white, ~2000 Elo) played against the model in this repo (black):

[game screenshot]

Modules

Supervised Learning

I've added a new supervised learning step to the pipeline, which uses the human game files ("PGN") that can be found on the internet as a play-data generator. This SL step was also used in the first, original version of AlphaGo, and chess may be complex enough that the policy model has to be pre-trained before starting the self-play process (i.e., chess may be too complicated for self-training alone).

Using the new SL process is as simple as running the new "sl" worker at the beginning instead of the "self" worker. Once the model has converged enough on the SL play-data, we just stop the "sl" worker and start the "self" worker, so the model keeps improving from self-play data.

python src/chess_zero/run.py sl

If you want to use this new SL step, you will have to download large PGN files (chess game files) and place them in the data/play_data folder (FICS is a good source of data). You can also use the SCID program to filter by headers such as player Elo, game result, and more.

To avoid overfitting, I recommend using data sets of at least 3000 games and running at most 3-4 epochs.
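
For reference, here is a minimal sketch of how PGN games can be turned into (position, move, result) tuples for supervised training. It uses the python-chess package; the function name, record format, and file path are illustrative assumptions, not the repo's actual "sl" pipeline.

# Hypothetical sketch: turn PGN games into (FEN, move, result) tuples for SL.
# Assumes the python-chess package is installed; the repo's sl worker may differ.
import chess.pgn

def pgn_to_examples(pgn_path, max_games=1000):
    examples = []
    with open(pgn_path, encoding="utf-8", errors="ignore") as f:
        for _ in range(max_games):
            game = chess.pgn.read_game(f)
            if game is None:
                break  # end of file
            result = game.headers.get("Result", "*")
            board = game.board()
            for move in game.mainline_moves():
                # Record the position before the move, the move played, and the game result.
                examples.append((board.fen(), move.uci(), result))
                board.push(move)
    return examples

if __name__ == "__main__":
    data = pgn_to_examples("data/play_data/example.pgn")  # illustrative path
    print(len(data), "positions extracted")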

Reinforcement Learning

This AlphaGo Zero implementation consists of three workers: self, opt and eval.

  • self is Self-Play to generate training data by self-play using BestModel.
  • opt is Trainer to train model, and generate next-generation models.
  • eval is Evaluator to evaluate whether the next-generation model is better than BestModel. If better, replace BestModel.

Distributed Training

Now it's possible to train the model in a distributed way. The only thing needed is to use the new parameter:

  • --type distributed: use the distributed config (see src/chess_zero/configs/distributed.py)

So, in order to contribute to the distributed team, you just need to run the three workers locally like this:

python src/chess_zero/run.py self --type distributed (or python src/chess_zero/run.py sl --type distributed)
python src/chess_zero/run.py opt --type distributed
python src/chess_zero/run.py eval --type distributed

GUI

  • uci launches the Universal Chess Interface, for use in a GUI.

To set up ChessZero with a GUI, point the GUI to C0uci.bat (or rename it to .sh). For example, here is a screenshot of the random model playing via Arena's self-play feature: [screenshot]
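
To give an idea of what the uci worker has to speak, here is a minimal sketch of the UCI protocol loop. It is not the repo's implementation: it just answers the handshake and plays a random legal move (using python-chess), which is enough for a GUI like Arena to talk to it over stdin/stdout.

# Minimal UCI loop sketch (illustrative only, not the repo's uci worker).
import random
import sys
import chess

def uci_loop():
    board = chess.Board()
    for line in sys.stdin:
        cmd = line.strip()
        if cmd == "uci":
            print("id name ChessZero-sketch")
            print("uciok")
        elif cmd == "isready":
            print("readyok")
        elif cmd == "ucinewgame":
            board = chess.Board()
        elif cmd.startswith("position startpos"):
            board = chess.Board()
            if "moves" in cmd:
                for mv in cmd.split("moves", 1)[1].split():
                    board.push_uci(mv)
        elif cmd.startswith("go"):
            # A real engine would run MCTS here; we just pick a random legal move.
            move = random.choice(list(board.legal_moves))
            print("bestmove", move.uci())
        elif cmd == "quit":
            break
        sys.stdout.flush()

if __name__ == "__main__":
    uci_loop()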

Data

  • data/model/model_best_*: BestModel.
  • data/model/next_generation/*: next-generation models.
  • data/play_data/play_*.json: generated training data.
  • logs/main.log: log file.

If you want to train the model from the beginning, delete the above files and directories.
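
If you prefer to script the reset, something like the following removes those files (the glob patterns mirror the paths above; double-check them first, since this permanently deletes your models and play data):

# Reset training from scratch by deleting the generated files listed above.
import glob
import os
import shutil

for pattern in ["data/model/model_best_*", "data/play_data/play_*.json", "logs/main.log"]:
    for path in glob.glob(pattern):
        os.remove(path)
shutil.rmtree("data/model/next_generation", ignore_errors=True)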

How to use

Setup

install libraries

pip install -r requirements.txt

If you want to use GPU,

pip install tensorflow-gpu

Make sure Keras is using TensorFlow and that you have Python 3.6.3+.
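
A quick way to check the environment (versions and backend) is a few lines of Python; the expected values are the ones listed under Environment above:

# Environment sanity check for the versions listed above.
import sys
import keras
import keras.backend as K
import tensorflow as tf

print("Python:", sys.version.split()[0])                      # expect 3.6.3+
print("Keras:", keras.__version__, "backend:", K.backend())   # expect backend "tensorflow"
print("TensorFlow:", tf.__version__)

# With tensorflow-gpu 1.x, this lists the GPUs TensorFlow can see:
from tensorflow.python.client import device_lib
print([d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"])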

Basic Usage

To train the model, execute Self-Play, Trainer and Evaluator.

Self-Play

python src/chess_zero/run.py self

When executed, Self-Play will start using BestModel. If BestModel does not exist, a new random model will be created and become BestModel.

options

  • --new: create new BestModel
  • --type mini: use mini config for testing, (see src/chess_zero/configs/mini.py)

Trainer

python src/chess_zero/run.py opt

When executed, training will start. The base model is loaded from the latest saved next-generation model. If none exists, BestModel is used. The trained model is saved every epoch.

options

  • --type mini: use mini config for testing, (see src/chess_zero/configs/mini.py)
  • --total-step: specify the total number of training steps (mini-batches). The total step count affects the learning rate schedule (see the sketch after this list).
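
As a rough illustration of why the step count matters, AlphaZero-style training typically uses a staircase schedule that lowers the learning rate as more mini-batch steps accumulate. The boundaries and rates below are assumptions for illustration, not the values used by this repo's opt worker:

# Hypothetical staircase learning-rate schedule keyed on the total step count.
def learning_rate_for(total_steps):
    if total_steps < 100000:
        return 1e-2
    elif total_steps < 300000:
        return 1e-3
    else:
        return 1e-4

print(learning_rate_for(150000))  # 0.001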

Evaluator

python src/chess_zero/run.py eval

When executed, evaluation will start. It evaluates BestModel and the latest next-generation model by playing about 200 games. If the next-generation model wins, it becomes BestModel.
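
The gating logic amounts to something like the sketch below. The play_game callable, the 0.55 win-rate threshold, and the color alternation are illustrative assumptions, not necessarily what this repo does:

# Hypothetical evaluation gate: promote the candidate if it scores well enough
# against BestModel over n_games, alternating colors.
def candidate_is_better(candidate, best, play_game, n_games=200, threshold=0.55):
    # play_game(white, black) should return 1.0 for a white win, 0.5 for a draw,
    # and 0.0 for a black win.
    score = 0.0
    for i in range(n_games):
        candidate_is_white = (i % 2 == 0)
        white, black = (candidate, best) if candidate_is_white else (best, candidate)
        result = play_game(white, black)
        score += result if candidate_is_white else 1.0 - result
    return score / n_games >= threshold  # True -> candidate becomes BestModel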

options

  • --type mini: use mini config for testing, (see src/chess_zero/configs/mini.py)

Tips and Memory

GPU Memory

Usually a lack of memory causes warnings, not errors. If an error occurs, try changing vram_frac in src/chess_zero/configs/mini.py:

self.vram_frac = 1.0

A smaller batch_size will reduce the memory usage of opt. Try changing TrainerConfig#batch_size in MiniConfig.
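
For reference, with TensorFlow 1.x and Keras this kind of setting is usually applied through the session's GPU options, along the lines of the sketch below (the exact wiring in this repo may differ):

# Sketch: cap the fraction of GPU memory TensorFlow grabs (TF 1.x + Keras).
import tensorflow as tf
import keras.backend as K

def apply_vram_fraction(vram_frac):
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = vram_frac
    K.set_session(tf.Session(config=config))

apply_vram_fraction(0.5)  # use at most ~50% of GPU memory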
