
Zeta36 / connect4-alpha-zero

License: MIT
Connect4 reinforcement learning by AlphaGo Zero methods.

Programming Languages

Python

Projects that are alternatives of or similar to connect4-alpha-zero

alpha-zero
AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.
Stars: ✭ 68 (-33.33%)
Mutual labels:  connect4, alphago-zero
sai
SAI: a fork of Leela Zero with variable komi.
Stars: ✭ 92 (-9.8%)
Mutual labels:  alphago-zero
saltzero
Machine learning bot for ultimate tic-tac-toe based on DeepMind's AlphaGo Zero paper. C++ and Python.
Stars: ✭ 27 (-73.53%)
Mutual labels:  alphago-zero
Deep-Reinforcement-Learning-for-Boardgames
Master Thesis project that provides a training framework for two player games. TicTacToe and Othello have already been implemented.
Stars: ✭ 17 (-83.33%)
Mutual labels:  connect4
connect4
Solving board games like Connect4 using Deep Reinforcement Learning
Stars: ✭ 33 (-67.65%)
Mutual labels:  alphago-zero
alphaFive
An AlphaGo-style implementation of Gomoku (also called gobang or five-in-a-row).
Stars: ✭ 51 (-50%)
Mutual labels:  alphago-zero
alphazero
Board Game Reinforcement Learning using AlphaZero method. including Makhos (Thai Checkers), Reversi, Connect Four, Tic-tac-toe game rules
Stars: ✭ 24 (-76.47%)
Mutual labels:  alphago-zero
MyAlphaGoZeroOnConnect4
My Simple Implementation of AlphaGo Zero on Connect4
Stars: ✭ 16 (-84.31%)
Mutual labels:  alphago-zero
terminally bored terminal board games
board games for your terminal!
Stars: ✭ 53 (-48.04%)
Mutual labels:  connect4
KKAlphaGoZero
An implementation of the AlphaGo Zero paper.
Stars: ✭ 35 (-65.69%)
Mutual labels:  alphago-zero
Alphazero gomoku
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
Stars: ✭ 2,570 (+2419.61%)
Mutual labels:  alphago-zero
Alpha Zero General
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Stars: ✭ 2,617 (+2465.69%)
Mutual labels:  alphago-zero
Chess Alpha Zero
Chess reinforcement learning by AlphaGo Zero methods.
Stars: ✭ 1,868 (+1731.37%)
Mutual labels:  alphago-zero
Elf
ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation
Stars: ✭ 3,240 (+3076.47%)
Mutual labels:  alphago-zero

About

Connect4 reinforcement learning by AlphaGo Zero methods.

This project is based on two main resources:

  1. DeepMind's Oct 19th publication: Mastering the Game of Go without Human Knowledge.
  2. The great Reversi implementation of the DeepMind ideas by @mokemokechicken in his repo: https://github.com/mokemokechicken/reversi-alpha-zero

Environment

  • Python 3.6.3
  • tensorflow-gpu: 1.3.0
  • Keras: 2.0.8

Modules

Reinforcement Learning

This AlphaGo Zero implementation consists of three workers: self, opt and eval. A conceptual sketch of how they fit together follows the list below.

  • self is Self-Play: generates training data through self-play games using the BestModel.
  • opt is Trainer: trains the model and produces next-generation models.
  • eval is Evaluator: evaluates whether the next-generation model is better than the BestModel and, if so, replaces it.
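
Conceptually, the three workers form a loop: self generates games with the BestModel, opt fits a next-generation model on that data, and eval decides whether it replaces the BestModel. The sketch below only illustrates that cycle with placeholder functions (it is not the project's API; in the real project the workers run as separate long-lived processes via run.py):

import random

def generate_self_play_data(model):              # stand-in for the "self" worker
    return ["self-play game records from generation %d" % model]

def train_next_generation(model, games):         # stand-in for the "opt" worker
    return model + 1

def next_generation_is_better(candidate, best):  # stand-in for the "eval" worker
    return random.random() < 0.5                 # the real worker plays about 200 games

best_model = 0                                   # generation number of the BestModel
for _ in range(3):
    games = generate_self_play_data(best_model)
    candidate = train_next_generation(best_model, games)
    if next_generation_is_better(candidate, best_model):
        best_model = candidate                   # promote to BestModel
print("BestModel is generation", best_model)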

Evaluation

For evaluation, you can play Connect Four against the BestModel.

  • play_gui is Play Game: play against the BestModel on a board drawn with ASCII characters.

Data

  • data/model/model_best_*: BestModel.
  • data/model/next_generation/*: next-generation models.
  • data/play_data/play_*.json: generated training data.
  • logs/main.log: log file.

If you want to train the model from scratch, delete the files and directories listed above, for example as shown below.
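
For example, from the project root (these globs mirror the paths listed above; double-check them before deleting anything):

rm -f data/model/model_best_*
rm -rf data/model/next_generation
rm -f data/play_data/play_*.json logs/main.log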

How to use

Setup

install libraries

pip install -r requirements.txt

If you want to use the GPU,

pip install tensorflow-gpu
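
This project was developed against the versions listed under Environment, so if the newest packages cause problems, pinning those versions may help (pins taken from the Environment section above):

pip install tensorflow-gpu==1.3.0 Keras==2.0.8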

set environment variables

Create a .env file containing the following line:

KERAS_BACKEND=tensorflow
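
For example, from the project root:

echo "KERAS_BACKEND=tensorflow" > .env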

Basic Usages

To train a model, run the Self-Play, Trainer and Evaluator workers.

Self-Play

python src/connect4_zero/run.py self

When executed, Self-Play starts using the BestModel. If the BestModel does not exist, a new random model is created and becomes the BestModel.

options

  • --new: create a new BestModel
  • --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py); the options can be combined, as shown below
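
For example, to create a new BestModel and run a quick test with the mini config:

python src/connect4_zero/run.py self --new --type mini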

Trainer

python src/connect4_zero/run.py opt

When executed, training starts. The base model is loaded from the latest saved next-generation model; if none exists, the BestModel is used. The trained model is saved every 2000 steps (mini-batches) after each epoch.

options

  • --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py)
  • --total-step: specify the total number of training steps (mini-batches); the total step count affects the learning rate of training (see the example below)
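
For example (the step count here is only an illustration, not a recommended value):

python src/connect4_zero/run.py opt --type mini --total-step 10000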

Evaluator

python src/connect4_zero/run.py eval

When executed, evaluation starts: the BestModel and the latest next-generation model play about 200 games against each other. If the next-generation model wins, it becomes the new BestModel; a sketch of this promotion rule follows.
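
As a rough sketch of that promotion rule (not the project's code; the 55% win-rate threshold is an assumption for illustration, the real criterion lives in the project's config):

def should_replace_best_model(next_gen_wins, games_played=200, win_rate_threshold=0.55):
    # Promote the next-generation model when it wins a large enough share
    # of the evaluation games; the threshold here is illustrative only.
    return next_gen_wins / games_played >= win_rate_threshold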

options

  • --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py)

Play Game

python src/connect4_zero/run.py play_gui

When executed, a Connect Four board is displayed in ASCII characters and you can play against the BestModel.

Tips and Memo

GPU Memory

Usually a lack of GPU memory causes warnings, not errors. If an error occurs, try changing per_process_gpu_memory_fraction in src/worker/{evaluate.py,optimize.py,self_play.py}:

tf_util.set_session_config(per_process_gpu_memory_fraction=0.2)
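
tf_util.set_session_config is the project's helper; in TensorFlow 1.x, capping the GPU memory fraction generally comes down to a session config like this generic sketch (not the project's exact code):

import tensorflow as tf
from keras import backend as K

def set_session_config(per_process_gpu_memory_fraction=0.2):
    # Limit how much GPU memory TensorFlow pre-allocates (TF 1.x API).
    gpu_options = tf.GPUOptions(
        per_process_gpu_memory_fraction=per_process_gpu_memory_fraction)
    K.set_session(tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)))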

A smaller batch_size reduces the memory usage of opt. Try changing TrainerConfig#batch_size in NormalConfig; an illustrative excerpt follows.
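
An illustrative excerpt of such a change (the class layout and the numbers are assumptions; check the real src/connect4_zero/configs/normal.py):

# Illustrative only, not the actual contents of configs/normal.py
class TrainerConfig:
    def __init__(self):
        self.batch_size = 384  # lower this (e.g. 256 or 128) if opt runs out of GPU memory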

Model Performance

The following table records the successive best models.

| Best model generation | Winning percentage vs. BestModel | Time spent (hours) | Note |
|---|---|---|---|
| 1 | - | - | |
| 2 | 100% | 1 | |
| 3 | 84.6% | 1 | |
| 4 | 78.6% | 2 | Good enough to avoid naive losing moves |
| 5 | 100% | 1 | The NN learns to always play in the center when it moves first |
| 6 | 100% | 4 | The model is now able to beat every online Connect4 game with a classic AI that I've found |