yilundu / Generals_a3c

Online repo for deep reinforcement learning (A3C) on generals.io

generals_a3c

This repository contains code to simulate generals.io games and to train both policy networks and actor-critic networks (asynchronously) to play generals.io.

It provides:

  1. Simulation of gioreplay files.
  2. Client to play online generals.io games.
  3. Virtual simulation of generals.io games with autogenerated boards.
  4. Dataset generation.
  5. Code to train a convolutional policy network on generated dataset.
  6. OpenAI Gym-like environment to interact with the generals game (generalsenv.py)
  7. Code for A3C convolutional network training on the generals.io game.
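
The Gym-like environment can be driven with the usual reset/step loop. The sketch below uses a hypothetical stub in place of the real generalsenv.py environment — the actual observation and action formats in the repo may differ:

```python
import random

class StubGeneralsEnv:
    """Toy stand-in for the real environment (hypothetical interface)."""
    def __init__(self, size=5, max_steps=20):
        self.size = size
        self.max_steps = max_steps

    def reset(self):
        self.t = 0
        # Observation: flat board of army counts (stub values).
        return [0] * (self.size * self.size)

    def step(self, action):
        self.t += 1
        obs = [0] * (self.size * self.size)
        reward = 1.0 if action % 7 == 0 else 0.0  # stub reward
        done = self.t >= self.max_steps
        return obs, reward, done, {}

env = StubGeneralsEnv()
obs = env.reset()
done, total = False, 0.0
while not done:
    # One action per (tile, direction) pair; 4 directions is an assumption.
    action = random.randrange(len(obs) * 4)
    obs, reward, done, _ = env.step(action)
    total += reward
```

The same loop structure works against the real environment once the stub is swapped out.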

Link to the convolutional policy network playing generals.io

Generals Dataset Generation

To generate a labeled supervised move dataset for training policy networks, run the following commands:

  • First, download and unzip the online database of replay files found here.
  • After downloading the database, run generate_data.py, which generates the datasets data_x.npy, data_y.npy, and data_z.npy. data_x contains an expanded feature map of the generals board, while data_y and data_z contain the start and end tiles of each move.
usage: generate_data.py [-h] [--processes PROCESSES] [--data DATA]
                        [--stars STARS] [--players PLAYERS]

optional arguments:
  -h, --help            show this help message and exit
  --processes PROCESSES
  --data DATA           directory where the gioreplay files are stored
  --stars STARS         threshold for stars to parse games from
  --players PLAYERS     number of players a game must have for it to be parsed
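
Once generated, the .npy files can be loaded with NumPy. The exact array shapes depend on the feature map, so the shapes below are illustrative assumptions only:

```python
import numpy as np

# Illustrative stand-ins for real generate_data.py output; the true
# shapes depend on the feature map and are assumptions here.
n, c, h, w = 4, 11, 18, 18            # hypothetical: 4 moves, 11 channels
np.save("data_x.npy", np.zeros((n, c, h, w), dtype=np.float32))
np.save("data_y.npy", np.zeros(n, dtype=np.int64))  # start-tile indices
np.save("data_z.npy", np.zeros(n, dtype=np.int64))  # end-tile indices

data_x = np.load("data_x.npy")  # expanded board feature maps
data_y = np.load("data_y.npy")  # move start tiles
data_z = np.load("data_z.npy")  # move end tiles
assert len(data_x) == len(data_y) == len(data_z)
```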

Policy Bot Training

We can train a policy network to play generals by training a bot to predict both the start and end locations of each move.

To train the policy network, use the following:

usage: policy_trainer.py [-h] [--on-gpu ON_GPU] [--num-epochs NUM_EPOCHS]
                         [--data DATA] [--lr LR]

optional arguments:
  -h, --help            show this help message and exit
  --on-gpu ON_GPU
  --num-epochs NUM_EPOCHS
                        number of epochs to train network
  --data DATA           directory containing data directory
  --lr LR               learning rate

The data should be generated using generate_data.py. The script saves the trained model to 'policy.mdl'.
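
The two-head prediction described above can be sketched as a small convolutional network with separate start-tile and end-tile outputs. This assumes a PyTorch implementation; the layer sizes and channel counts are hypothetical, not the ones in policy_trainer.py:

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Hypothetical sketch: shared conv trunk, separate start/end heads."""
    def __init__(self, in_channels=11, board=18):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.start_head = nn.Conv2d(32, 1, 1)  # logits over start tiles
        self.end_head = nn.Conv2d(32, 1, 1)    # logits over end tiles

    def forward(self, x):
        h = self.trunk(x)
        n = x.size(0)
        # Flatten each 1-channel map into per-tile logits.
        return self.start_head(h).view(n, -1), self.end_head(h).view(n, -1)

net = PolicyNet()
x = torch.zeros(2, 11, 18, 18)          # batch of 2 board feature maps
start_logits, end_logits = net(x)
print(start_logits.shape)               # torch.Size([2, 324])
```

Training would then apply a cross-entropy loss to each head against the data_y and data_z labels.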

Running Policy Bot on Online Server

To test the policy network on a private online bot server on generals.io, use:

usage: policy_online_client.py [-h] [--user_id USER_ID] [--username USERNAME]
                               [--game_id GAME_ID] [--model_path MODEL_PATH]

Policy Bot Player

optional arguments:
  -h, --help            show this help message and exit
  --user_id USER_ID     user_id for bot
  --username USERNAME   username for bot
  --game_id GAME_ID     id for the game
  --model_path MODEL_PATH
                        path of policy model

To quickly demo the policy bot, clone the repo and run

python policy_online_client.py

and then go to the URL here

A3C Bot Training

We can also train a generals.io bot using reinforcement learning. Specifically, we create a generals.io environment with a bundled policy bot. Our bot then interacts with this environment and receives a reward each time it captures a tile, city, or general.
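
The reward signal described above can be sketched as a simple shaped reward. The coefficients here are hypothetical placeholders (the penalty term echoes the --off-tile-coef flag below); the real values used in the repo may differ:

```python
def shaped_reward(tiles_taken, cities_taken, generals_taken, bad_moves=0,
                  tile_r=1.0, city_r=5.0, general_r=100.0, off_tile_coef=0.5):
    """Hypothetical shaped reward: bonus per captured tile/city/general,
    penalty for invalid (off-tile) moves."""
    return (tile_r * tiles_taken + city_r * cities_taken
            + general_r * generals_taken - off_tile_coef * bad_moves)

# Example: 3 tiles and 1 city captured, 2 invalid moves.
print(shaped_reward(3, 1, 0, bad_moves=2))  # 7.0
```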

To train the A3C network, use the following:

usage: main.py [-h] [--lr LR] [--gamma GAMMA] [--tau TAU]
               [--entropy-coef ENTROPY_COEF]
               [--value-loss-coef VALUE_LOSS_COEF]
               [--max-grad-norm MAX_GRAD_NORM] [--seed SEED]
               [--num-processes NUM_PROCESSES] [--num-steps NUM_STEPS]
               [--max-episode-length MAX_EPISODE_LENGTH]
               [--no-shared NO_SHARED] [--off-tile-coef OFF_TILE_COEF]
               [--checkpoint-interval CHECKPOINT_INTERVAL]

A3C

optional arguments:
  -h, --help            show this help message and exit
  --lr LR               learning rate (default: 0.00001)
  --gamma GAMMA         discount factor for rewards (default: 0.99)
  --tau TAU             parameter for GAE (default: 1.00)
  --entropy-coef ENTROPY_COEF
                        entropy term coefficient (default: 0.01)
  --value-loss-coef VALUE_LOSS_COEF
                        value loss coefficient (default: 0.5)
  --max-grad-norm MAX_GRAD_NORM
                        maximum gradient norm for clipping (default: 25)
  --seed SEED           random seed (default: 1)
  --num-processes NUM_PROCESSES
                        how many training processes to use (default: 16)
  --num-steps NUM_STEPS
                        number of forward steps in A3C (default: 30)
  --max-episode-length MAX_EPISODE_LENGTH
                        maximum length of an episode (default: 500)
  --no-shared NO_SHARED
                        use an optimizer without shared momentum.
  --off-tile-coef OFF_TILE_COEF
                        weight to penalize bad movement
  --checkpoint-interval CHECKPOINT_INTERVAL
                        interval to save model

NUM_PROCESSES processes are used for training and one additional process is used for evaluation. The model is saved to reinforce_trained.mdl.
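
The --gamma and --tau flags parameterize generalized advantage estimation (GAE), which each A3C worker uses to weight its policy gradient. A minimal reference computation (not the repo's exact code):

```python
def gae(rewards, values, gamma=0.99, tau=1.00):
    """Generalized advantage estimation.
    `values` has len(rewards) + 1 entries (bootstrap value last)."""
    advantages = [0.0] * len(rewards)
    gae_t = 0.0
    for t in reversed(range(len(rewards))):
        # TD error at step t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of TD errors.
        gae_t = delta + gamma * tau * gae_t
        advantages[t] = gae_t
    return advantages

# With gamma=1 and tau=1, the advantage reduces to return-minus-value.
print(gae([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, tau=1.0))  # [2.0, 1.0]
```

Setting tau below 1 trades variance for bias by discounting later TD errors more heavily.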

Running Actor Critic Bot on Online Server

To test the actor-critic network on a private online bot server on generals.io, use:

usage: reinforce_online_client.py [-h] [--user_id USER_ID]
                                  [--username USERNAME] [--game_id GAME_ID]
                                  [--model_path MODEL_PATH]

Reinforcement Trained Bot Player

optional arguments:
  -h, --help            show this help message and exit
  --user_id USER_ID     user_id for bot
  --username USERNAME   username for bot
  --game_id GAME_ID     id for the game
  --model_path MODEL_PATH
                        path of a3c trained model

To quickly demo the actor-critic bot, clone the repo and run

python reinforce_online_client.py

and then go to the URL here
