xuetf / AlphaZero_Gobang

Licence: other
Deep Learning big homework of UCAS

Programming Languages

python

Projects that are alternatives to or similar to AlphaZero Gobang

Alphazero gomoku
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
Stars: ✭ 2,570 (+8762.07%)
Mutual labels:  mcts, gomoku, gobang, alphazero
Alpha Zero General
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Stars: ✭ 2,617 (+8924.14%)
Mutual labels:  mcts, gomoku, gobang, alphazero
alphaFive
An AlphaGo-style Gomoku (gobang, gomoku)
Stars: ✭ 51 (+75.86%)
Mutual labels:  gomoku, gobang, alphazero
alpha-zero
AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.
Stars: ✭ 68 (+134.48%)
Mutual labels:  mcts, alphazero
gobang
A Gomoku AI written in vanilla JavaScript
Stars: ✭ 22 (-24.14%)
Mutual labels:  gomoku, gobang
gomoku-battle
Gomoku Battle is a cross-language cross-system battle platform.
Stars: ✭ 18 (-37.93%)
Mutual labels:  gomoku, gobang
alpha sigma
A pytorch based Gomoku game model. Alpha Zero algorithm based reinforcement Learning and Monte Carlo Tree Search model.
Stars: ✭ 134 (+362.07%)
Mutual labels:  gomoku, alphazero
muzero
A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and Pit both algorithms against each other, and investigate reliability of learned MuZero MDP models.
Stars: ✭ 126 (+334.48%)
Mutual labels:  mcts, alphazero
AU Recognition
AU_Recognition based on CKPlus/CK database
Stars: ✭ 21 (-27.59%)
Mutual labels:  residual-networks
gomoku-wasm
A Gomoku game implemented with WebAssembly
Stars: ✭ 30 (+3.45%)
Mutual labels:  gomoku
connect4
Solving board games like Connect4 using Deep Reinforcement Learning
Stars: ✭ 33 (+13.79%)
Mutual labels:  residual-networks
saltzero
Machine learning bot for ultimate tic-tac-toe based on DeepMind's AlphaGo Zero paper. C++ and Python.
Stars: ✭ 27 (-6.9%)
Mutual labels:  alphazero
Learning-Lab-C-Library
This library provides a set of basic functions for different types of deep learning (and other) algorithms in C. This deep learning library will be constantly updated.
Stars: ✭ 20 (-31.03%)
Mutual labels:  residual-networks
ReZero-ResNet
Unofficial pytorch implementation of ReZero in ResNet
Stars: ✭ 23 (-20.69%)
Mutual labels:  residual-networks
blackstone
Gomoku (Five in a Row) game manager with a powerful built-in AI, written in Java with a clean, minimal interface.
Stars: ✭ 33 (+13.79%)
Mutual labels:  gomoku
godpaper
🐵 An AI chess-board-game framework, implemented in many programming languages.
Stars: ✭ 40 (+37.93%)
Mutual labels:  mcts
AnimalChess
Animal Fight Chess Game (斗兽棋) written in Rust.
Stars: ✭ 76 (+162.07%)
Mutual labels:  alphazero
python-gobang
Gobang (five in a row) developed with Python
Stars: ✭ 35 (+20.69%)
Mutual labels:  gobang
resnet-cifar10
ResNet for Cifar10
Stars: ✭ 21 (-27.59%)
Mutual labels:  residual-networks
computer-go-dataset
datasets for computer go
Stars: ✭ 133 (+358.62%)
Mutual labels:  alphazero

Overview

This is an AlphaZero implementation of Gobang based on PyTorch.

PyTorch 0.3.1 Installation

https://ptorch.com/news/145.html

Github

The code is available on my GitHub: https://github.com/xuetf/AlphaZero_Gobang

Design

RL framework

framework

Network Structure

structure

Class Diagram

diagram

Illustrations can be viewed on my blog: http://xtf615.com/2018/02/24/AlphaZeroDesign/

Final Report (CVPR format)

Final Report: Analysis and Implementation of Deep Reinforcement Learning Based Gobang

Code

  • Train.py: Runs the training process
  • Run.py: Plays against a human using the trained model
  • Player.py: Base class for the different players
  • RolloutPlayer.py: Player with MCTS using a random rollout policy
  • AlphaZeroPlayer.py: AlphaZero player with MCTS guided by the residual network
  • HumanPlayer.py: Human player
  • MCTS.py: Base class for the different MCTS variants
  • AlphaZeroMCTS.py: MCTS guided by the residual network
  • RolloutMCTS.py: MCTS using a random rollout policy
  • TreeNode.py: MCTS tree node
  • PolicyValueNet.py: Residual network implementation based on PyTorch
  • Board.py: Board class for Gobang
  • Game.py: Game class for Gobang
  • VisualTool.py: Tk tool for visualizing the board
  • Config.py: Stores the configuration and serves as a snapshot when resuming training
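To make the roles of TreeNode.py and the MCTS classes more concrete, here is a minimal sketch of a tree node that uses the PUCT selection rule described in the AlphaZero papers. It is not the repository's code: the class layout, the method names (`expand`, `select`, `backup`) and the default `c_puct` value are assumptions chosen for illustration.

```python
import math


class TreeNode:
    """Hypothetical MCTS node; the repository's TreeNode.py may differ."""

    def __init__(self, parent=None, prior=1.0):
        self.parent = parent
        self.children = {}   # action -> TreeNode
        self.visits = 0      # N: number of times this node was visited
        self.q = 0.0         # Q: running mean of the values backed up through this node
        self.prior = prior   # P: prior probability assigned to the move leading here

    def expand(self, action_priors):
        """Create a child for every (action, prior) pair not yet present."""
        for action, prior in action_priors:
            if action not in self.children:
                self.children[action] = TreeNode(parent=self, prior=prior)

    def score(self, c_puct):
        """PUCT score: Q + c_puct * P * sqrt(N_parent) / (1 + N)."""
        u = c_puct * self.prior * math.sqrt(self.parent.visits) / (1 + self.visits)
        return self.q + u

    def select(self, c_puct=5.0):
        """Return the (action, child) pair with the highest PUCT score."""
        return max(self.children.items(), key=lambda item: item[1].score(c_puct))

    def backup(self, value):
        """Update this node and its ancestors, flipping the sign at each level.

        `value` is taken from the point of view of the player who made the
        move leading into this node (an assumption of this sketch).
        """
        if self.parent is not None:
            self.parent.backup(-value)
        self.visits += 1
        self.q += (value - self.q) / self.visits
```

In a network-guided search, the priors passed to `expand` and the value passed to `backup` would typically come from the policy and value outputs of the network; a rollout-based search would typically use uniform priors and the outcome of a random playout instead.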

Running Code

Training

  • Train from scratch:

python3 Train.py

  • Train as a background job:

nohup python3 -u Train.py > train.log 2>&1 &

  • Train from a checkpoint:

python3 Train.py --config data/model_name.pkl
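The snapshot-and-resume idea behind the `--config` option can be sketched as follows. This is only an illustration under stated assumptions: the config fields, the helper names (`default_config`, `save_config`, `load_config`, `resolve_config`) and the use of a plain dictionary are hypothetical, and the real Config.py may store different data in a different way.

```python
import argparse
import pickle


def default_config():
    """Hypothetical defaults; the real Config.py fields are not shown in this README."""
    return {"board_size": 8, "n_in_row": 5, "epochs_done": 0}


def save_config(config, path):
    """Write the current config to disk so training can be resumed later."""
    with open(path, "wb") as f:
        pickle.dump(config, f)


def load_config(path):
    """Load a pickled snapshot. Only unpickle files you created yourself."""
    with open(path, "rb") as f:
        return pickle.load(f)


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Training entry point (sketch)")
    parser.add_argument("--config", default=None,
                        help="path to a pickled config snapshot to resume from")
    return parser.parse_args(argv)


def resolve_config(argv=None):
    """Resume from a snapshot if --config is given, otherwise start from scratch."""
    args = parse_args(argv)
    return load_config(args.config) if args.config else default_config()
```

With this layout, a training loop would call `save_config` periodically and `resolve_config()` once at start-up.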

Play game

python3 Run.py
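A game ends as soon as one player has five stones in an unbroken horizontal, vertical or diagonal line. The check can be sketched as below; the function name `has_five` and the representation of a player's stones as a set of (row, col) tuples are assumptions for illustration and are not taken from Board.py.

```python
def has_five(stones, move, n=5):
    """Return True if `move` is part of a line of at least `n` stones.

    stones: set of (row, col) cells occupied by one player, including `move`.
    """
    row, col = move
    # Four line directions: horizontal, vertical, and the two diagonals.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1  # the stone at `move` itself
        # Walk outward from `move` in both directions along the line.
        for sign in (1, -1):
            r, c = row + sign * dr, col + sign * dc
            while (r, c) in stones:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= n:
            return True
    return False
```

Only the lines through the most recent move need to be examined, so the check does not have to scan the whole board after every turn.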

Result

game

Download from or upload to your own remote server

Download the trained model from the remote server

scp root@ip:/usr/local/workspace/AlphaZero_Gobang/data/current_policy_resnet_epochs_1500.model /Users/xuetf/Downloads

Upload a local file to the remote server (-P sets the SSH port)

scp -P 8381 local_file_path root@ip:/root/
