simonmeister / pysc2-rl-agents

Licence: MIT License
StarCraft II / PySC2 Deep Reinforcement Learning Agents (A2C)

Programming Languages

python

Projects that are alternatives of or similar to pysc2-rl-agents

Deep-Reinforcement-Learning-Notebooks
This repository contains a series of Google Colab notebooks which I created to help people dive into deep reinforcement learning. These notebooks contain both theory and implementation of different algorithms.
Stars: ✭ 15 (-87.9%)
Mutual labels:  deep-reinforcement-learning, a3c, a2c
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+79.03%)
Mutual labels:  deep-reinforcement-learning, a3c, a2c
Minimalrl
Implementations of basic RL algorithms with minimal lines of code! (PyTorch based)
Stars: ✭ 2,051 (+1554.03%)
Mutual labels:  deep-reinforcement-learning, a3c, a2c
yarll
Combining deep learning and reinforcement learning.
Stars: ✭ 84 (-32.26%)
Mutual labels:  deep-reinforcement-learning, a3c
Reinforcement Learning
Minimal and Clean Reinforcement Learning Examples
Stars: ✭ 2,863 (+2208.87%)
Mutual labels:  deep-reinforcement-learning, a3c
pySC2 minigames
Curated list of pysc2 mini-games. Singleton environments. Debugged by @SoyGema and mini-game authors
Stars: ✭ 43 (-65.32%)
Mutual labels:  pysc2, pysc2-mini-games
Baby A3c
A high-performance Atari A3C agent in 180 lines of PyTorch
Stars: ✭ 144 (+16.13%)
Mutual labels:  deep-reinforcement-learning, a3c
deep rl acrobot
TensorFlow A2C to solve Acrobot, with synchronized parallel environments
Stars: ✭ 32 (-74.19%)
Mutual labels:  deep-reinforcement-learning, a3c
Rainy
☔ Deep RL agents with PyTorch☔
Stars: ✭ 39 (-68.55%)
Mutual labels:  deep-reinforcement-learning, a2c
rl implementations
No description or website provided.
Stars: ✭ 40 (-67.74%)
Mutual labels:  deep-reinforcement-learning, a2c
Master-Thesis
Deep Reinforcement Learning in Autonomous Driving: the A3C algorithm used to make a car learn to drive in TORCS; Python 3.5, Tensorflow, tensorboard, numpy, gym-torcs, ubuntu, latex
Stars: ✭ 33 (-73.39%)
Mutual labels:  deep-reinforcement-learning, a3c
sc2gym
PySC2 OpenAI Gym Environments
Stars: ✭ 50 (-59.68%)
Mutual labels:  starcraft-ii, pysc2
Deeprl
Modularized Implementation of Deep RL Algorithms in PyTorch
Stars: ✭ 2,640 (+2029.03%)
Mutual labels:  deep-reinforcement-learning, a2c
Pytorch A2c Ppo Acktr Gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Stars: ✭ 2,632 (+2022.58%)
Mutual labels:  deep-reinforcement-learning, a2c
Deep-rl-mxnet
Mxnet implementation of Deep Reinforcement Learning papers, such as DQN, PG, DDPG, PPO
Stars: ✭ 26 (-79.03%)
Mutual labels:  deep-reinforcement-learning, a2c
a3c-super-mario-pytorch
Reinforcement Learning for Super Mario Bros using A3C on GPU
Stars: ✭ 35 (-71.77%)
Mutual labels:  deep-reinforcement-learning, a3c
A3c Pytorch
PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C)
Stars: ✭ 108 (-12.9%)
Mutual labels:  deep-reinforcement-learning, a3c
Reinforcementlearning Atarigame
PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm. This is far superior to and more efficient than DQN and obsoletes it. Can play many games
Stars: ✭ 118 (-4.84%)
Mutual labels:  deep-reinforcement-learning, a3c
pytorch-noreward-rl
pytorch implementation of Curiosity-driven Exploration by Self-supervised Prediction
Stars: ✭ 79 (-36.29%)
Mutual labels:  deep-reinforcement-learning, a3c
imitation learning
PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
Stars: ✭ 93 (-25%)
Mutual labels:  deep-reinforcement-learning, a2c

PySC2 Deep RL Agents

This repository implements an Advantage Actor-Critic agent baseline for the pysc2 environment, as described in the DeepMind paper StarCraft II: A New Challenge for Reinforcement Learning. We use a synchronous variant of A3C (A2C) to train effectively on GPUs and otherwise stay as close as possible to the agent described in the paper.
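To make the A2C idea concrete, here is a minimal NumPy sketch of the two quantities a synchronous actor-critic update is typically built on: bootstrapped n-step returns over a batch of parallel environments, and the combined policy, value, and entropy loss. This is not the repository's code. The function names, the coefficient defaults, and the (T, N) rollout layout are assumptions chosen for illustration.

```python
import numpy as np


def n_step_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Discounted returns for a rollout of shape (T, N): T steps, N parallel envs.

    bootstrap_value has shape (N,) and is the critic's estimate for the state
    that follows the last step of the rollout.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)
    returns = np.zeros_like(rewards)
    running = np.asarray(bootstrap_value, dtype=np.float64)
    for t in reversed(range(rewards.shape[0])):
        # If the episode ended at step t, nothing after it contributes.
        running = rewards[t] + gamma * running * (1.0 - dones[t])
        returns[t] = running
    return returns


def a2c_loss(log_probs, values, entropies, returns,
             value_coef=0.5, entropy_coef=0.01):
    """Scalar A2C objective (coefficients are illustrative assumptions)."""
    log_probs = np.asarray(log_probs, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    entropies = np.asarray(entropies, dtype=np.float64)
    returns = np.asarray(returns, dtype=np.float64)
    # In an autodiff framework the advantage would be wrapped in a
    # stop-gradient so that the policy term does not update the critic.
    advantages = returns - values
    policy_loss = -np.mean(log_probs * advantages)
    value_loss = np.mean((returns - values) ** 2)
    entropy_bonus = np.mean(entropies)
    return policy_loss + value_coef * value_loss - entropy_coef * entropy_bonus
```

Because all environments step in lockstep, the whole (T, N) batch can be pushed through the network in one pass, which is what makes the synchronous variant a good fit for a GPU.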

This repository is part of a research project at the Autonomous Systems Labs, TU Darmstadt by Daniel Palenicek, Marcel Hussing, and Simon Meister.

Progress

  • A2C agent
  • FullyConv architecture
  • support all spatial screen and minimap observations as well as non-spatial player observations
  • support the full action space as described in the DeepMind paper (predicting all arguments independently)
  • support training on all mini games
  • report results for all mini games
  • LSTM architecture
  • Multi-GPU training
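The "predicting all arguments independently" item can be illustrated with a small sketch: one categorical head picks the function id, each argument type has its own categorical head, and the log-probability of the composite action is the sum over the function head and only those argument heads the chosen function actually uses. The argument names, the `required_args` mapping, and the masking of unused heads are assumptions made for this example, not the repository's actual data structures.

```python
import numpy as np


def sample_action(fn_probs, arg_probs, required_args, rng):
    """Sample a function id, then each required argument from its own head.

    fn_probs:      1-D probabilities over function ids
    arg_probs:     dict mapping argument name -> 1-D probabilities
    required_args: dict mapping function id -> list of argument names it takes
    rng:           a numpy.random.Generator
    """
    fn_id = int(rng.choice(len(fn_probs), p=fn_probs))
    log_prob = float(np.log(fn_probs[fn_id]))
    args = {}
    for name in required_args[fn_id]:
        probs = arg_probs[name]
        idx = int(rng.choice(len(probs), p=probs))
        args[name] = idx
        # Independence assumption: joint log-prob is a plain sum of heads.
        log_prob += float(np.log(probs[idx]))
    return fn_id, args, log_prob
```

Arguments that the sampled function does not take are simply left out of the sum, so they contribute neither to the log-probability nor, in a training setup, to the policy gradient.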

License

This project is licensed under the MIT License (refer to the LICENSE file for details).

Results

On the mini games, we get the following results:

Map                          best mean score (ours)  best mean score (DeepMind)  episodes (ours)
MoveToBeacon                 26                      26                          8K
CollectMineralShards         97                      103                         300K
FindAndDefeatZerglings       45                      45                          450K
DefeatRoaches                65                      100                         -
DefeatZerglingsAndBanelings  68                      62                          -
CollectMineralsAndGas        -                       3978                        -
BuildMarines                 -                       3                           -

Plots of the score over episodes are provided for MoveToBeacon, CollectMineralShards, and FindAndDefeatZerglings.

Note that the DeepMind mean scores are the best individual scores from 100 runs for each game, where the initial learning rate was randomly sampled for each run. We use a constant initial learning rate and a much smaller number of runs due to limited hardware. All reported scores are for the FullyConv agent.
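The difference between the two protocols can be sketched as follows: one helper draws an initial learning rate per run, the other picks the run with the highest mean episode score. The log-uniform distribution and its bounds are placeholders for illustration only and are not taken from this README, and reading "best mean score" as "the highest per-run mean" is our interpretation.

```python
import numpy as np


def sample_learning_rate(rng, low=1e-5, high=1e-3):
    """Draw a log-uniform learning rate; low/high are illustrative assumptions."""
    return float(10 ** rng.uniform(np.log10(low), np.log10(high)))


def best_mean_score(scores_per_run):
    """Return (run index, mean score) of the run with the highest mean score.

    scores_per_run is a list with one sequence of episode scores per run.
    """
    means = [float(np.mean(scores)) for scores in scores_per_run]
    best = int(np.argmax(means))
    return best, means[best]
```

With many runs and a sampled learning rate, the maximum over runs tends to be higher than what a handful of runs with one fixed learning rate would produce, which is worth keeping in mind when comparing the two score columns.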

With default settings (32 environments), learning MoveToBeacon well takes between 3K and 8K total episodes. This varies each run depending on random initialization and action sampling.

Usage

Hardware Requirements

  • For fast training, a GPU is recommended. We ran each experiment on a single Titan X Pascal (12GB).

Software Requirements

  • Python 3
  • pysc2 (tested with v1.2)
  • TensorFlow (tested with 1.4.0)
  • StarCraft II and mini games (see below or pysc2)

Quick Install Guide

  • pip install numpy tensorflow-gpu pysc2==1.2
  • Install StarCraft II. On Linux, use 3.16.1.
  • Download the mini games and extract them to your StarcraftII/Maps/ directory.
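After extracting the mini games, a quick check that the map files are where you expect them can save a failed first run. The sketch below searches a Maps directory recursively for `.SC2Map` files. The helper name is hypothetical, and the exact directory layout that pysc2 expects is not verified here, so treat a hit only as a sign that the files were extracted.

```python
from pathlib import Path


def find_maps(maps_dir, names):
    """Return {map name: Path or None} for .SC2Map files under maps_dir."""
    maps_dir = Path(maps_dir).expanduser()
    found = {}
    if maps_dir.is_dir():
        # Search recursively, since the archive may unpack into a subfolder.
        found = {path.stem: path for path in maps_dir.rglob("*.SC2Map")}
    return {name: found.get(name) for name in names}


# Example call (the path is an assumption; adjust it to your installation):
# print(find_maps("~/StarCraftII/Maps", ["MoveToBeacon", "CollectMineralShards"]))
```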

Train & run

  • Run and train: python run.py my_experiment --map MoveToBeacon
  • Run and evaluate without training: python run.py my_experiment --map MoveToBeacon --eval

You can visualize the agents during training or evaluation with the --vis flag. See run.py for all arguments.

Summaries are written to out/summary/<experiment_name> and model checkpoints are written to out/models/<experiment_name>.
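For scripted experiments, the command line and output locations described above can be assembled programmatically. This is a small sketch using only the flags and paths named in this section (`--map`, `--eval`, `--vis`, `out/summary/`, `out/models/`). The helper names are hypothetical, and the output root is assumed to be relative to the directory from which run.py is started.

```python
import sys
from pathlib import Path


def build_command(experiment, map_name, evaluate=False, visualize=False):
    """Build the argv list for run.py, suitable for subprocess.run."""
    cmd = [sys.executable, "run.py", experiment, "--map", map_name]
    if evaluate:
        cmd.append("--eval")  # evaluate without training
    if visualize:
        cmd.append("--vis")   # show the agents while running
    return cmd


def output_dirs(experiment, root="out"):
    """Where summaries and checkpoints for an experiment are expected."""
    root = Path(root)
    return {
        "summary": root / "summary" / experiment,
        "models": root / "models" / experiment,
    }
```

Passing the list returned by `build_command` to `subprocess.run` avoids shell quoting issues with experiment names.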

Acknowledgments

The code in rl/environment.py is based on OpenAI baselines, with adaptations from sc2aibot. Some of the code in rl/agents/a2c/runner.py is loosely based on sc2aibot.
