pandezhao / alpha_sigma

Licence: other
A PyTorch-based Gomoku game model that combines AlphaZero-style reinforcement learning with Monte Carlo Tree Search.

Programming Languages

python

Projects that are alternatives of or similar to alpha sigma

Alphazero gomoku
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
Stars: ✭ 2,570 (+1817.91%)
Mutual labels:  gomoku, monte-carlo-tree-search, alphazero
Alpha Zero General
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Stars: ✭ 2,617 (+1852.99%)
Mutual labels:  gomoku, monte-carlo-tree-search, alphazero
Deep Reinforcement Learning
Repo for the Deep Reinforcement Learning Nanodegree program
Stars: ✭ 4,012 (+2894.03%)
Mutual labels:  deep-reinforcement-learning, pytorch-rl
AnimalChess
Animal Fight Chess Game (斗兽棋) written in Rust.
Stars: ✭ 76 (-43.28%)
Mutual labels:  monte-carlo-tree-search, alphazero
alphaFive
An AlphaGo-style version of Gomoku (gobang, five in a row)
Stars: ✭ 51 (-61.94%)
Mutual labels:  gomoku, alphazero
AlphaZero Gobang
Deep Learning big homework of UCAS
Stars: ✭ 29 (-78.36%)
Mutual labels:  gomoku, alphazero
gobang
A Gomoku AI written in vanilla JavaScript
Stars: ✭ 22 (-83.58%)
Mutual labels:  gomoku, gomoku-game
muzero
A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and Pit both algorithms against each other, and investigate reliability of learned MuZero MDP models.
Stars: ✭ 126 (-5.97%)
Mutual labels:  deep-reinforcement-learning, alphazero
alphastone
Using self-play, MCTS, and a deep neural network to create a Hearthstone AI player
Stars: ✭ 24 (-82.09%)
Mutual labels:  deep-reinforcement-learning, monte-carlo-tree-search
pomdp-baselines
Simple (but often Strong) Baselines for POMDPs in PyTorch - ICML 2022
Stars: ✭ 162 (+20.9%)
Mutual labels:  deep-reinforcement-learning
FinRL Podracer
Cloud-native Financial Reinforcement Learning
Stars: ✭ 179 (+33.58%)
Mutual labels:  deep-reinforcement-learning
Master-Thesis
Deep Reinforcement Learning in Autonomous Driving: the A3C algorithm used to make a car learn to drive in TORCS; Python 3.5, Tensorflow, tensorboard, numpy, gym-torcs, ubuntu, latex
Stars: ✭ 33 (-75.37%)
Mutual labels:  deep-reinforcement-learning
decentralized-rl
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)
Stars: ✭ 40 (-70.15%)
Mutual labels:  deep-reinforcement-learning
LWDRLC
Lightweight deep RL library for continuous control.
Stars: ✭ 14 (-89.55%)
Mutual labels:  deep-reinforcement-learning
Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020
Live Trading. Please star.
Stars: ✭ 1,251 (+833.58%)
Mutual labels:  deep-reinforcement-learning
Meta-Learning-for-StarCraft-II-Minigames
We reproduced DeepMind's results and implemented a meta-learning (MLSH) agent that can generalize across minigames.
Stars: ✭ 26 (-80.6%)
Mutual labels:  deep-reinforcement-learning
motion-planner-reinforcement-learning
End to end motion planner using Deep Deterministic Policy Gradient (DDPG) in gazebo
Stars: ✭ 99 (-26.12%)
Mutual labels:  deep-reinforcement-learning
pytorch-noreward-rl
pytorch implementation of Curiosity-driven Exploration by Self-supervised Prediction
Stars: ✭ 79 (-41.04%)
Mutual labels:  deep-reinforcement-learning
minerva
An out-of-the-box GUI tool for offline deep reinforcement learning
Stars: ✭ 80 (-40.3%)
Mutual labels:  deep-reinforcement-learning
imitation learning
PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
Stars: ✭ 93 (-30.6%)
Mutual labels:  deep-reinforcement-learning

alpha_sigma

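The visualization code currently lives mainly in GUI.py. If you only want to try the game model or watch the model's self-play games, running GUI.py is all you need.

main.py: the main program and entry point

GUI.py: the interactive visual interface

Two modes are provided:

  Game mode: from a terminal, run: python GUI.py --mode game --game_model model_5400.pkl  Here model_5400.pkl is an already-trained neural network; --mode selects game mode and --game_model loads the trained model. One model file is provided: model_5400.pkl. (PS: computation is slow on a home machine, so expect to wait about 7 seconds for each move.)
  
  Display mode: shows the results of the machine's self-play games recorded during neural network training. From a terminal, run: python GUI.py --mode display --display_file test_5200.pkl  A game record is provided.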
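As an illustration of how the two modes might be wired up, here is a minimal sketch of a command-line parser. Only the flag names (--mode, --game_model, --display_file) and the mode values (game, display) come from the commands above; the function names, help texts, and validation rules are assumptions and may differ from what GUI.py actually does.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the flag names are taken from the README.
    parser = argparse.ArgumentParser(description="Gomoku GUI (illustrative sketch)")
    parser.add_argument("--mode", choices=["game", "display"], required=True,
                        help="game: play against the model; display: replay self-play records")
    parser.add_argument("--game_model", default=None,
                        help="trained model file, used in game mode")
    parser.add_argument("--display_file", default=None,
                        help="self-play record file, used in display mode")
    return parser


def parse_cli(argv):
    # Assumed rule: each mode needs its own file argument.
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode == "game" and args.game_model is None:
        parser.error("--game_model is required when --mode is game")
    if args.mode == "display" and args.display_file is None:
        parser.error("--display_file is required when --mode is display")
    return args


args = parse_cli(["--mode", "game", "--game_model", "model_5400.pkl"])
print(args.mode, args.game_model)  # game model_5400.pkl
```

Passing the argument list explicitly keeps the sketch easy to test; a real entry point would normally let argparse read sys.argv.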

new_MCTS.py: the Monte Carlo tree search code

network.py: the neural network code

five_stone_game.py: the Gomoku game code

utils.py: miscellaneous utilities
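To give a sense of the kind of logic a Gomoku game module needs, here is a minimal sketch of a five-in-a-row check. It is not taken from five_stone_game.py: the function name has_five, the board encoding (a list of lists with 0 for empty and 1 or 2 for the players), and the rule that five or more stones in a line wins are all assumptions made for illustration.

```python
def has_five(board, row, col):
    """Return True if the stone at (row, col) lies on a line of five or more.

    Assumed encoding: board[r][c] is 0 for empty, 1 or 2 for a player's stone.
    """
    player = board[row][col]
    if player == 0:
        return False
    rows, cols = len(board), len(board[0])
    # Four line directions: horizontal, vertical, and the two diagonals.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1  # the stone at (row, col) itself
        for sign in (1, -1):  # walk both ways along the line
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < rows and 0 <= c < cols and board[r][c] == player:
                count += 1
                r += sign * dr
                c += sign * dc
        if count >= 5:
            return True
    return False


board = [[0] * 9 for _ in range(9)]
for c in range(2, 7):
    board[4][c] = 1
print(has_five(board, 4, 4))  # True
```

Checking only the lines through the most recent move keeps the test cheap, which matters when it is called on every simulated move of a tree search.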

This project was first published on Zhihu (https://zhuanlan.zhihu.com/p/59567014); likes are welcome there.
