witchu / alphazero

License: Apache-2.0
Board game reinforcement learning using the AlphaZero method, including game rules for Makhos (Thai Checkers), Reversi, Connect Four, and Tic-tac-toe.

Programming Languages

python

Projects that are alternatives of or similar to alphazero

alpha-zero
AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.
Stars: ✭ 68 (+183.33%)
Mutual labels:  tic-tac-toe, connect-four, reversi, othello, alphago-zero, alphazero
Alpha Zero General
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Stars: ✭ 2,617 (+10804.17%)
Mutual labels:  othello, alphago-zero, alphazero
alphaFive
An AlphaGo-style version of Gomoku (also called Gobang)
Stars: ✭ 51 (+112.5%)
Mutual labels:  alphago-zero, alphazero
pedax
Reversi Board with edax, which is the strongest reversi engine.
Stars: ✭ 18 (-25%)
Mutual labels:  reversi, othello
reversi
Multiplayer Reversi Game on Internet Computer
Stars: ✭ 62 (+158.33%)
Mutual labels:  reversi, othello
saltzero
Machine learning bot for ultimate tic-tac-toe based on DeepMind's AlphaGo Zero paper. C++ and Python.
Stars: ✭ 27 (+12.5%)
Mutual labels:  alphago-zero, alphazero
Alphazero gomoku
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
Stars: ✭ 2,570 (+10608.33%)
Mutual labels:  alphago-zero, alphazero
checkers
A game of checkers written using minmax algorithm and alpha-beta pruning.
Stars: ✭ 19 (-20.83%)
Mutual labels:  checkers, draughts
boardgame
An online board game playground. Including lobby, chat
Stars: ✭ 11 (-54.17%)
Mutual labels:  tic-tac-toe
tictactoe-ai-tfjs
Train your own TensorFlow.js Tic Tac Toe
Stars: ✭ 45 (+87.5%)
Mutual labels:  tic-tac-toe
TicTacToe-SwiftUI
Unidirectional data flow tic-tac-toe sample with SwiftUI.
Stars: ✭ 22 (-8.33%)
Mutual labels:  tic-tac-toe
tic-tac-toe
Tic Tac Toe game implementation in Elm
Stars: ✭ 21 (-12.5%)
Mutual labels:  tic-tac-toe
KKAlphaGoZero
An implementation of the AlphaGo Zero paper
Stars: ✭ 35 (+45.83%)
Mutual labels:  alphago-zero
ymir-js
This toolkit is created to make it easier for you to develop games like chess, checkers, go, match 3 puzzle and more. It is still under development.
Stars: ✭ 30 (+25%)
Mutual labels:  checkers
tic-tac-toe
Play tic-tac-toe in your Terminal
Stars: ✭ 42 (+75%)
Mutual labels:  tic-tac-toe
ultimate-tictactoe
An implementation of ultimate tictactoe in Elm
Stars: ✭ 15 (-37.5%)
Mutual labels:  tic-tac-toe
AndTTT
🎲 Simple tic tac toe game for Android
Stars: ✭ 15 (-37.5%)
Mutual labels:  tic-tac-toe
TicTacToeUI-Android
Check out the new style for App Design aims for Tic Tac Toe Game...😉😀😁😎
Stars: ✭ 40 (+66.67%)
Mutual labels:  tic-tac-toe
discord-tictactoe
Highly customizable innovative Discord Bot for playing Tic-Tac-Toe 🎮🏅
Stars: ✭ 84 (+250%)
Mutual labels:  tic-tac-toe
ultimate-ttt
Play "Ultimate Tic-Tac-Toe" in the browser 🚀
Stars: ✭ 20 (-16.67%)
Mutual labels:  tic-tac-toe

Board game reinforcement learning based on DeepMind's AlphaZero

Games

  • Makhos (Thai Checkers)
  • Othello (Reversi)
  • Connect Four
  • Tic-tac-toe

Usage

Requirements

  • Python 2.7+ or Python 3.6+
  • Keras 2.1+
  • TensorFlow

Setup

pip install -r requirements.txt

Play against the pretrained model

python run.py arena makhos human mcts,data/makhos/model-45k.h5,1000

Create a model

python run.py newmodel <game> model.h5

Self-play

Play 5,000 games of self-play with 100 simulations per move, and save the game data to selfplay.txt.

python run.py generate <game> --model model.h5 --simulation 100 -n 5000 --file selfplay.txt --progress
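
Each self-play move is chosen after a fixed number of tree-search simulations. As a rough illustration of what one simulation step does in AlphaZero-style search, the sketch below picks which child move to explore next using a PUCT-style score (mean value plus a prior-weighted exploration bonus). This is not taken from this repository's code: the function name `puct_select`, the constant `c_puct=1.5`, and the `+ 1` under the square root (a common variant that lets priors break ties before any child has been visited) are all assumptions made for the example.

```python
import math


def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the child with the highest Q + U score.

    priors       -- policy-network probability for each child move
    visit_counts -- how many simulations have passed through each child
    value_sums   -- sum of backed-up values for each child
    c_puct       -- exploration constant (hypothetical default)
    """
    total_visits = sum(visit_counts)
    best_index, best_score = None, -float("inf")
    for i, (p, n, w) in enumerate(zip(priors, visit_counts, value_sums)):
        # Q: mean value of the child so far (0 if never visited).
        q = w / n if n > 0 else 0.0
        # U: exploration bonus, large for high-prior, rarely visited moves.
        u = c_puct * p * math.sqrt(total_visits + 1) / (1 + n)
        score = q + u
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With more simulations per move (the `--simulation` value), visit counts grow and the exploration bonus shrinks, so the search relies increasingly on the backed-up values rather than on the raw policy prior.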

Training

Train model.h5 on the data file selfplay.txt for 3 epochs and save the result to newmodel.h5.

python run.py train <game> selfplay.txt model.h5 newmodel.h5 --epoch 3 --progress

Testing

python run.py arena <game> <player1> <player2>

where

  • game is one of makhos, othello, c4, ox
  • player1 and player2 are each one of
    • human: choose moves from the keyboard
    • mcts,model.h5,1000: choose the next move using the policy network and value network from model.h5, with MCTS running 1,000 simulations
    • policy,model.h5: choose moves using only the policy network of model.h5
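
The player arguments follow a small comma-separated format. As a minimal sketch of how such a string could be split into its parts (this is not the project's own parser; `PlayerSpec` and `parse_player` are hypothetical names introduced for the example):

```python
from collections import namedtuple

# kind is "human", "policy", or "mcts"; unused fields are None.
PlayerSpec = namedtuple("PlayerSpec", ["kind", "model", "simulations"])


def parse_player(spec):
    """Parse 'human', 'policy,<model>' or 'mcts,<model>,<simulations>'."""
    parts = spec.split(",")
    kind = parts[0]
    if kind == "human" and len(parts) == 1:
        return PlayerSpec("human", None, None)
    if kind == "policy" and len(parts) == 2:
        return PlayerSpec("policy", parts[1], None)
    if kind == "mcts" and len(parts) == 3:
        # The third field is the number of MCTS simulations per move.
        return PlayerSpec("mcts", parts[1], int(parts[2]))
    raise ValueError("unrecognized player spec: %r" % spec)
```

For example, `parse_player("mcts,data/makhos/model-45k.h5,1000")` yields the model path and a simulation count of 1000. Note that this simple split would misread a model path that itself contains a comma.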