
sirmammingtonham / alphastone

License: Unlicense
Using self-play, MCTS, and a deep neural network to create a Hearthstone AI player

Programming Languages

python

Projects that are alternatives of or similar to alphastone

Alpha Zero General
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Stars: ✭ 2,617 (+10804.17%)
Mutual labels:  monte-carlo-tree-search, alpha-zero, self-play
DI-star
An artificial intelligence platform for the StarCraft II with large-scale distributed training and grand-master agents.
Stars: ✭ 1,335 (+5462.5%)
Mutual labels:  deep-reinforcement-learning, self-play
alpha-zero
AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.
Stars: ✭ 68 (+183.33%)
Mutual labels:  alpha-zero, self-play
alpha sigma
A pytorch based Gomoku game model. Alpha Zero algorithm based reinforcement Learning and Monte Carlo Tree Search model.
Stars: ✭ 134 (+458.33%)
Mutual labels:  deep-reinforcement-learning, monte-carlo-tree-search
deep rl acrobot
TensorFlow A2C to solve Acrobot, with synchronized parallel environments
Stars: ✭ 32 (+33.33%)
Mutual labels:  deep-reinforcement-learning
Fruit-API
A Universal Deep Reinforcement Learning Framework
Stars: ✭ 61 (+154.17%)
Mutual labels:  deep-reinforcement-learning
jax-rl
JAX implementations of core Deep RL algorithms
Stars: ✭ 61 (+154.17%)
Mutual labels:  deep-reinforcement-learning
tpprl
Code and data for "Deep Reinforcement Learning of Marked Temporal Point Processes", NeurIPS 2018
Stars: ✭ 68 (+183.33%)
Mutual labels:  deep-reinforcement-learning
awesome-machine-learning-robotics
A curated list of resources about Machine Learning for Robotics
Stars: ✭ 52 (+116.67%)
Mutual labels:  deep-reinforcement-learning
DRL in CV
A course on Deep Reinforcement Learning in Computer Vision.
Stars: ✭ 59 (+145.83%)
Mutual labels:  deep-reinforcement-learning
ml course
"Learning Machine Learning" Course, Bogotá, Colombia 2019 #LML2019
Stars: ✭ 22 (-8.33%)
Mutual labels:  deep-reinforcement-learning
MyAlphaGoZeroOnConnect4
My Simple Implementation of AlphaGo Zero on Connect4
Stars: ✭ 16 (-33.33%)
Mutual labels:  monte-carlo-tree-search
DeepCubeA
Code for DeepCubeA, a Deep Reinforcement Learning algorithm that can learn to solve the Rubik's cube.
Stars: ✭ 92 (+283.33%)
Mutual labels:  deep-reinforcement-learning
a3c-super-mario-pytorch
Reinforcement Learning for Super Mario Bros using A3C on GPU
Stars: ✭ 35 (+45.83%)
Mutual labels:  deep-reinforcement-learning
reinforcement learning ppo rnd
Deep Reinforcement Learning by using Proximal Policy Optimization and Random Network Distillation in Tensorflow 2 and Pytorch with some explanation
Stars: ✭ 33 (+37.5%)
Mutual labels:  deep-reinforcement-learning
ActiveRagdollControllers
Research into controllers for 2d and 3d Active Ragdolls (using MujocoUnity+ml_agents)
Stars: ✭ 30 (+25%)
Mutual labels:  deep-reinforcement-learning
Hearthstone-Hearthbuddy
Hearthstone 炉石传说 Hearthbuddy 炉石兄弟
Stars: ✭ 474 (+1875%)
Mutual labels:  hearthstone
AlphaGo.jl
AlphaGo Zero implementation using Flux.jl
Stars: ✭ 73 (+204.17%)
Mutual labels:  alpha-zero
deep-blueberry
If you've always wanted to learn about deep-learning but don't know where to start, then you might have stumbled upon the right place!
Stars: ✭ 17 (-29.17%)
Mutual labels:  deep-reinforcement-learning
revisiting rainbow
Revisiting Rainbow
Stars: ✭ 71 (+195.83%)
Mutual labels:  deep-reinforcement-learning

Alphastone - Hearthstone Reinforcement Learning AI!

A Hearthstone AI built for the fireplace simulator. It trains through self-play and makes decisions with Monte Carlo Tree Search (MCTS) guided by a neural network. It is based on the AlphaZero algorithm and on @suragnair's implementation of it, alpha-zero-general.

Hearthstone is an imperfect-information game: each player always has some information hidden from them. This differs from Go or chess, where your opponent's pieces are visible at all times. As a result, we have to randomize all hidden information (the opponent's hand and deck) before every search. MCTS is then performed over the set of information available to the current player (Information Set MCTS).
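
The randomization step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `determinize`, its parameters, and the card names are all hypothetical, and it assumes the hidden state can be reduced to a pool of unseen cards plus the (visible) size of the opponent's hand.

```python
import random


def determinize(unseen_cards, opponent_hand_size, rng=None):
    """Sample one concrete guess for the opponent's hidden hand and deck.

    unseen_cards:       cards the opponent could still hold or draw
                        (hypothetical input, not read from fireplace here).
    opponent_hand_size: number of cards in the opponent's hand, which is
                        visible even though the cards themselves are not.
    """
    if rng is None:
        rng = random.Random()
    pool = list(unseen_cards)   # copy so the caller's list is untouched
    rng.shuffle(pool)           # randomize the hidden information
    hand = pool[:opponent_hand_size]
    deck = pool[opponent_hand_size:]
    return hand, deck


# Each search starts from a fresh sample of the hidden state.
unseen = ["card_a", "card_b", "card_c", "card_d", "card_e"]
rng = random.Random(0)
for _ in range(3):
    hand, deck = determinize(unseen, opponent_hand_size=2, rng=rng)
    # A real agent would run one MCTS search on this sampled state here.
    print(hand, deck)
```

Because every sample is consistent with what the current player can see, statistics gathered across samples describe the information set rather than any single guessed state.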

The neural network is a small-ish ResNet written in PyTorch and defined in alphanet.py. It is called to evaluate leaf nodes in the search tree and returns a matrix of action probabilities along with the predicted outcome.
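
In AlphaZero-style search, the action probabilities returned at a leaf are usually restricted to legal moves and renormalized before they are stored as priors. The sketch below shows that step on plain nested lists. It is an assumption about how the output could be consumed, not a copy of this project's code: `mask_policy` and the 2x2 matrix shape are hypothetical.

```python
def mask_policy(policy, valid):
    """Zero out illegal actions and renormalize the rest to sum to 1.

    policy: matrix (list of rows) of action probabilities from the network.
    valid:  matrix of booleans with the same shape, True for legal actions.
    """
    masked = [
        [p if ok else 0.0 for p, ok in zip(p_row, v_row)]
        for p_row, v_row in zip(policy, valid)
    ]
    total = sum(sum(row) for row in masked)
    if total > 0.0:
        return [[p / total for p in row] for row in masked]

    # The network put no mass on any legal action:
    # fall back to a uniform distribution over the legal ones.
    n_valid = sum(1 for row in valid for ok in row if ok)
    if n_valid == 0:
        raise ValueError("no legal actions in this state")
    return [[1.0 / n_valid if ok else 0.0 for ok in row] for row in valid]


# Hypothetical 2x2 output: rows and columns index two parts of an action.
policy = [[0.5, 0.25], [0.125, 0.125]]
valid = [[True, False], [True, False]]
priors = mask_policy(policy, valid)
```

The predicted outcome needs no such treatment: it is a single value for the leaf that is propagated back up the search path.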

References:

  1. AlphaZero: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
  2. Information Set Monte Carlo Tree Search
  3. https://hearthstone.gamepedia.com/Hearthstone_Wiki

Experiments

I trained a few different models over the course of 2 weeks, using only the basic card set (~150 cards) and a single matchup, priest vs. rogue. The best model was trained for around 3 days. It shows some decision making and beats a random agent ~80% of the time, but much more training is needed. (It did almost beat me once, when I had terrible luck.)

This is my first large Python project, and I wrote it as a high school student. I don't have formal coding experience, so all help and critique is appreciated!

TO-DO

  • Clean up the code! There are a lot of comments and unnecessary bits left over from debugging
  • Change pit.py to allow for play against a trained model
  • Implement ideas in ideas#1