pandezhao / alpha_sigma

Licence: other

A PyTorch-based Gomoku game model that combines AlphaZero-style reinforcement learning with Monte Carlo Tree Search.

Labels

reinforcement-learning deep-learning deep-reinforcement-learning pytorch gomoku monte-carlo-tree-search gomoku-game pytorch-rl alphazero

Projects that are alternatives of or similar to alpha sigma

Alphazero gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

Stars: ✭ 2,570 (+1817.91%)

Mutual labels: gomoku, monte-carlo-tree-search, alphazero

Alpha Zero General

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Stars: ✭ 2,617 (+1852.99%)

Mutual labels: gomoku, monte-carlo-tree-search, alphazero

Deep Reinforcement Learning

Repo for the Deep Reinforcement Learning Nanodegree program

Stars: ✭ 4,012 (+2894.03%)

Mutual labels: deep-reinforcement-learning, pytorch-rl

Animal Fight Chess Game (Dou Shou Qi, 斗兽棋) written in Rust.

Stars: ✭ 76 (-43.28%)

Mutual labels: monte-carlo-tree-search, alphazero

An AlphaGo-style version of Gomoku (Gobang)

Stars: ✭ 51 (-61.94%)

Mutual labels: gomoku, alphazero

AlphaZero Gobang

Deep Learning big homework of UCAS

Stars: ✭ 29 (-78.36%)

Mutual labels: gomoku, alphazero

A Gomoku AI developed in vanilla JavaScript

Stars: ✭ 22 (-83.58%)

Mutual labels: gomoku, gomoku-game

A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and Pit both algorithms against each other, and investigate reliability of learned MuZero MDP models.

Stars: ✭ 126 (-5.97%)

Mutual labels: deep-reinforcement-learning, alphazero

Using self-play, MCTS, and a deep neural network to create a hearthstone ai player

Stars: ✭ 24 (-82.09%)

Mutual labels: deep-reinforcement-learning, monte-carlo-tree-search

pomdp-baselines

Simple (but often Strong) Baselines for POMDPs in PyTorch - ICML 2022

Stars: ✭ 162 (+20.9%)

Mutual labels: deep-reinforcement-learning

Cloud-native Financial Reinforcement Learning

Stars: ✭ 179 (+33.58%)

Mutual labels: deep-reinforcement-learning

Deep Reinforcement Learning in Autonomous Driving: the A3C algorithm used to make a car learn to drive in TORCS; Python 3.5, Tensorflow, tensorboard, numpy, gym-torcs, ubuntu, latex

Stars: ✭ 33 (-75.37%)

Mutual labels: deep-reinforcement-learning

decentralized-rl

Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

Stars: ✭ 40 (-70.15%)

Mutual labels: deep-reinforcement-learning

Lightweight deep RL library for continuous control.

Stars: ✭ 14 (-89.55%)

Mutual labels: deep-reinforcement-learning

Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020

Live Trading. Please star.

Stars: ✭ 1,251 (+833.58%)

Mutual labels: deep-reinforcement-learning

Meta-Learning-for-StarCraft-II-Minigames

We reproduced DeepMind's results and implemented a meta-learning (MLSH) agent that can generalize across minigames.

Stars: ✭ 26 (-80.6%)

Mutual labels: deep-reinforcement-learning

motion-planner-reinforcement-learning

End to end motion planner using Deep Deterministic Policy Gradient (DDPG) in gazebo

Stars: ✭ 99 (-26.12%)

Mutual labels: deep-reinforcement-learning

pytorch-noreward-rl

pytorch implementation of Curiosity-driven Exploration by Self-supervised Prediction

Stars: ✭ 79 (-41.04%)

Mutual labels: deep-reinforcement-learning

An out-of-the-box GUI tool for offline deep reinforcement learning

Stars: ✭ 80 (-40.3%)

Mutual labels: deep-reinforcement-learning

imitation learning

PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.

Stars: ✭ 93 (-30.6%)

Mutual labels: deep-reinforcement-learning

alpha_sigma

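The visualization code currently lives mainly in GUI.py. If you only want to try playing against the model, or watch the games the model played against itself, running GUI.py is all you need.

main.py: the main program and entry point.

GUI.py: the interactive visual interface.

Two modes are provided:

  Game mode: run from a terminal: python GUI.py --mode game --game_model model_5400.pkl  Here model_5400.pkl is an already trained neural network; --mode selects game mode and --game_model loads the trained model. One model file, model_5400.pkl, is provided. (Note: computation is slow on a home machine, so each move takes about 7 seconds.)

  Display mode: shows the machine's self-play games recorded during neural network training. Run from a terminal: python GUI.py --mode display --display_file test_5200.pkl  A game record is provided.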
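The two invocations above could be handled by a small argument parser. The following is only a sketch of how such flags might be parsed, not the code in GUI.py: the flag names --mode, --game_model and --display_file come from the commands above, while the function names build_parser and parse_args and the validation rules are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags quoted in the README."""
    parser = argparse.ArgumentParser(description="Gomoku GUI (illustrative sketch)")
    parser.add_argument(
        "--mode",
        choices=["game", "display"],
        required=True,
        help="game: play against the model; display: replay self-play games",
    )
    parser.add_argument("--game_model", default=None,
                        help="trained model file, used in game mode")
    parser.add_argument("--display_file", default=None,
                        help="self-play record file, used in display mode")
    return parser


def parse_args(argv):
    """Parse argv and check that the file matching the chosen mode was given."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed rule: each mode needs its own file argument.
    if args.mode == "game" and args.game_model is None:
        parser.error("--game_model is required in game mode")
    if args.mode == "display" and args.display_file is None:
        parser.error("--display_file is required in display mode")
    return args
```

Calling parse_args(["--mode", "game", "--game_model", "model_5400.pkl"]) yields a namespace whose mode is "game"; omitting the model file in game mode exits with a usage error.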

new_MCTS.py: the Monte Carlo tree search program.

network.py: the neural network program.

five_stone_game.py: the Gomoku game program.

utils.py: holds miscellaneous helper code.
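For readers unfamiliar with how AlphaZero-style tree search picks a move to explore, the sketch below shows the PUCT selection rule commonly used in such systems. It is not taken from new_MCTS.py: the Node class, its field names, the default c_puct value and the omission of any sign flip between players are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """Hypothetical search-tree node; not the class used in new_MCTS.py."""
    prior: float                      # P(s, a): move probability from the network
    visit_count: int = 0              # N(s, a)
    value_sum: float = 0.0            # W(s, a): sum of backed-up values
    children: Dict[int, "Node"] = field(default_factory=dict)

    def q_value(self) -> float:
        # Mean value Q(s, a); zero for unvisited nodes.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Q plus an exploration bonus that shrinks as the child is visited."""
    bonus = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.q_value() + bonus


def select_child(parent: Node, c_puct: float = 1.5) -> int:
    """Return the action whose child has the highest PUCT score."""
    return max(parent.children,
               key=lambda a: puct_score(parent, parent.children[a], c_puct))
```

With a large c_puct the search favours moves the network rates highly but has rarely visited; with c_puct set to 0 it simply follows the best mean value found so far.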

This project was first published on Zhihu: https://zhuanlan.zhihu.com/p/59567014 Likes there are welcome.

Stars: ✭ 134
