facebookresearch / Elf

Licence: other
An End-To-End, Lightweight and Flexible Platform for Game Research

Programming Languages

python
139335 projects - #7 most used programming language
C++
36643 projects - #6 most used programming language
javascript
184084 projects - #8 most used programming language
CMake
9771 projects
shell
77523 projects
c
50402 projects - #5 most used programming language

Projects that are alternatives of or similar to Elf

Chemgan Challenge
Code for the paper: Benhenda, M. 2017. ChemGAN challenge for drug discovery: can AI reproduce natural chemical diversity? arXiv preprint arXiv:1708.08227.
Stars: ✭ 98 (-95.24%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Modelchimp
Experiment tracking for machine and deep learning projects
Stars: ✭ 121 (-94.12%)
Mutual labels:  artificial-intelligence, platform
Reinforcement Learning Cheat Sheet
Reinforcement Learning Cheat Sheet
Stars: ✭ 104 (-94.94%)
Mutual labels:  artificial-intelligence, reinforcement-learning
60 days rl challenge
60_Days_RL_Challenge中文版
Stars: ✭ 92 (-95.53%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Marvin Python Toolbox
Marvin AI has been accepted into the Apache Foundation and is now available at https://github.com/apache/incubator-marvin
Stars: ✭ 142 (-93.1%)
Mutual labels:  artificial-intelligence, platform
Papers Literature Ml Dl Rl Ai
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
Stars: ✭ 1,341 (-34.81%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Awesome Ml Courses
Awesome free machine learning and AI courses with video lectures.
Stars: ✭ 2,145 (+4.28%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Ai Reading Materials
Some of the ML and DL related reading materials, research papers that I've read
Stars: ✭ 79 (-96.16%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Flappy Es
Flappy Bird AI using Evolution Strategies
Stars: ✭ 140 (-93.19%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Ravens
Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.
Stars: ✭ 133 (-93.53%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Mapleai
AI各领域学习资料整理。(A collection of all skills and knowledges should be got command of to obtain an AI relevant job offer. There are online blogs, my personal blogs, electronic books copy.)
Stars: ✭ 89 (-95.67%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Awesome Ai
A curated list of artificial intelligence resources (Courses, Tools, App, Open Source Project)
Stars: ✭ 161 (-92.17%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Simulator
A ROS/ROS2 Multi-robot Simulator for Autonomous Vehicles
Stars: ✭ 1,260 (-38.75%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Rlai Exercises
Exercise Solutions for Reinforcement Learning: An Introduction [2nd Edition]
Stars: ✭ 97 (-95.28%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Snake
Artificial intelligence for the Snake game.
Stars: ✭ 1,241 (-39.67%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Reinforcement Learning An Introduction
Python Implementation of Reinforcement Learning: An Introduction
Stars: ✭ 11,042 (+436.8%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Data Science Best Resources
Carefully curated resource links for data science in one place
Stars: ✭ 1,104 (-46.33%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Awesome Decision Making Reinforcement Learning
A selection of state-of-the-art research materials on decision making and motion planning.
Stars: ✭ 68 (-96.69%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Toycarirl
Implementation of Inverse Reinforcement Learning Algorithm on a toy car in a 2D world problem, (Apprenticeship Learning via Inverse Reinforcement Learning Abbeel & Ng, 2004)
Stars: ✭ 128 (-93.78%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Java Deep Learning Cookbook
Code for Java Deep Learning Cookbook
Stars: ✭ 156 (-92.42%)
Mutual labels:  artificial-intelligence, reinforcement-learning

ELF: An Extensive, Lightweight and Flexible Platform for Game Research

Overview

ELF is an Extensive, Lightweight and Flexible platform for game research, in particular for real-time strategy (RTS) games. On the C++-side, ELF hosts multiple games in parallel with C++ threading. On the Python side, ELF returns one batch of game state at a time, making it very friendly for modern RL. In comparison, other platforms (e.g., OpenAI Gym) wraps one single game instance with one Python interface. This makes concurrent game execution a bit complicated, which is a requirement of many modern reinforcement learning algorithms.

Besides, ELF now also provides a Python version for running concurrent game environments, by Python multiprocessing with ZeroMQ inter-process communication. See ./ex_elfpy.py for a simple example.

For research on RTS games, ELF comes with an fast RTS engine, and three concrete environments: MiniRTS, Capture the Flag and Tower Defense. MiniRTS has all the key dynamics of a real-time strategy game, including gathering resources, building facilities and troops, scouting the unknown territories outside the perceivable regions, and defend/attack the enemy. User can access its internal representation and can freely change the game setting.

Overview

ELF has the following characteristics:

  • End-to-End: ELF offers an end-to-end solution to game research. It provides miniature real-time strategy game environments, concurrent simulation, intuitive APIs, web-based visualzation, and also comes with a reinforcement learning backend empowered by Pytorch with minimal resource requirement.

  • Extensive: Any game with C/C++ interface can be plugged into this framework by writing a simple wrapper. As an example, we already incorporate Atari games into our framework and show that the simulation speed per core is comparable with single-core version, and is thus much faster than implementation using either multiprocessing or Python multithreading. In the future, we plan to incorporate more environments, e.g., DarkForest Go engine.

  • Lightweight: ELF runs very fast with minimal overhead. ELF with a simple game (MiniRTS) built on RTS engine runs 40K frame per second per core on a MacBook Pro. Training a model from scratch to play MiniRTS takes a day on 6 CPU + 1 GPU.

  • Flexible: Pairing between environments and actors is very flexible, e.g., one environment with one agent (e.g., Vanilla A3C), one environment with multiple agents (e.g., Self-play/MCTS), or multiple environment with one actor (e.g., BatchA3C, GA3C). Also, any game built on top of the RTS engine offers full access to its internal representation and dynamics. Besides efficient simulators, we also provide a lightweight yet powerful Reinforcement Learning framework. This framework can host most existing RL algorithms. In this open source release, we have provided state-of-the-art actor-critic algorithms, written in PyTorch.

Tutorials

See here.

Install scripts

You need to have cmake >= 3.8, gcc >= 4.9 and tbb (linux libtbb-dev) in order to install this script successfully.

# Download miniconda and install. 
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O $HOME/miniconda.sh
/bin/bash $HOME/miniconda.sh -b
$HOME/miniconda3/bin/conda update -y --all python=3

# Add the following to ~/.bash_profile (if you haven't already) and source it:
export PATH=$HOME/miniconda3/bin:$PATH

# Create a new conda environment and install the necessary packages:
conda create -n elf python=3
source activate elf
# If you use cuda 8.0
# conda install pytorch cuda80 -c soumith
conda install pytorch -c soumith 

pip install --upgrade pip
pip install msgpack_numpy
conda install tqdm
conda install libgcc

# Install cmake >= 3.8, gcc >= 4.9 and libtbb-dev
# This is platform-dependent.

# Clone and build the repository:
cd ~
git clone https://github.com/facebookresearch/ELF
cd ELF/rts/
mkdir build && cd build
cmake .. -DPYTHON_EXECUTABLE=$HOME/miniconda3/bin/python
make

# Train the model
cd ../..
sh ./train_minirts.sh --gpu 0

Supported Environments

Any game with C/C++ interface can be plugged into this framework by writing a simple wrapper. Currently we have the following environment:

  1. MiniRTS and its extensions (./rts)
    A miniature real-time strategy game that captures the key dynamics of its genre, including building workers, collecting resources, exploring unseen territories, defend the enemy and attack them back. The game runs extremely fast (40K FPS per core on a laptop) to faciliate the usage of many existing on-policy reinforcement learning approaches.

  2. Atari games (./atari)
    We incorporate Arcade Learning Environment (ALE) into ELF so that you can load any rom and run 1000 concurrent game instances easily.

  3. Go engine (./go)
    We reimplement our DarkForest Go engine in ELF platform. Now you can easily load a bunch of .sgf files and train your own Go AI with minimal resource requirements (i.e., a single GPU plus a week).

Reference

When you use ELF, please reference the paper with the following BibTex entry:

ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
Yuandong Tian, Qucheng Gong, Wenling Shang, Yuxin Wu, C. Lawrence Zitnick
NIPS 2017

@article{tian2017elf, 
  title={ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games},
  author={Yuandong Tian and Qucheng Gong and Wenling Shang and Yuxin Wu and C. Lawrence Zitnick},
  journal={Advances in Neural Information Processing Systems (NIPS)},
  year={2017}
}

Relevant Materials

Slides in ICML Video Games and Machine Learning (VGML) workshop.

Demo. Top-left is trained bot while bottom-right is rule-based bot.

Documentation

Check here for detailed documentation. You can also compile your version in ./doc using sphinx.

Basic Usage

ELF is very easy to use. The initialization looks like the following:

# We run 1024 games concurrently.
num_games = 1024

# Wait for a batch of 256 games.
batchsize = 256  

# The return states contain key 's', 'r' and 'terminal'
# The reply contains key 'a' to be filled from the Python side.
# The definitions of the keys are in the wrapper of the game.  
input_spec = dict(s='', r='', terminal='')
reply_spec = dict(a='')

context = Init(num_games, batchsize, input_spec, reply_spec)

The main loop is also very simple:

# Start all game threads and enter main loop.
context.Start()  
while True:
    # Wait for a batch of game states to be ready
    # These games will be blocked, waiting for replies.
    batch = context.Wait()

    # Apply a model to the game state. The output has key 'pi'
    # You can do whatever you want here. E.g., applying your favorite RL algorithms.
    output = model(batch)

    # Sample from the output to get the actions of this batch.
    reply['a'][:] = SampleFromDistribution(output)

    # Resume games.
    context.Steps()   

# Stop all game threads.
context.Stop()  

Please check train.py and eval.py for actual runnable codes.

Dependency

C++ compiler with C++11 support (e.g., gcc >= 4.9) is required. The following libraries are required tbb. CMake >=3.8 is also required.

Python 3.x is required. In addition, you need to install following package: PyTorch version 0.2.0+, tqdm, zmq, msgpack, msgpack_numpy

How to train

To train a model for MiniRTS, please first compile ./rts/game_MC (See the instruction in ./rts/ using cmake). Note that a compilation of ./rts/backend is not necessary for training, unless you want to see visualization.

Then please run the following commands in the current directory (you can also reference train_minirts.sh):

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model \ 
python3 train.py 
    --num_games 1024 --batchsize 128                                                                  # Set number of games to be 1024 and batchsize to be 128.  
    --freq_update 50                                                                                  # Update behavior policy after 50 updates of the model.
    --players "fs=50,type=AI_NN,args=backup/AI_SIMPLE|delay/0.99|start/500;fs=20,type=AI_SIMPLE"      # Specify AI and its opponent, separated by semicolon. `fs` is frameskip that specifies How often your opponent makes a decision (e.g., fs=20 means it acts every 20 ticks)
                                                                                                      # If `backup` is specified in `args`, then we use rule-based AI for the first `start` ticks, then trained AI takes over. `start` decays with rate `decay`. 
    --tqdm                                                                  # Show progress bar.
    --gpu 0                                                                 # Use first gpu. If you don't specify gpu, it will run on CPUs. 
    --T 20                                                                  # 20 step actor-critic
    --additional_labels id,last_terminal         
    --trainer_stats winrate                                                 # If you want to see the winrate over iterations. 
                                                                            # Note that the winrate is computed when the action is sampled from the multinomial distribution (not greedy policy). 
                                                                            # To evaluate your model more accurately, please use eval.py.

Note that long horizon (e.g., --T 20) could make the training much faster and (at the same time) stable. With long horizon, you should be able to train it to 70% winrate within 12 hours with 16CPU and 1GPU. You can control the number of CPUs used in the training using taskset -c.

Here is one trained model with 80% winrate against AI_SIMPLE for frameskip=50. Here is one game replay.

The following is a sample output during training:

Version:  bf1304010f9609b2114a1adff4aa2eb338695b9d_staged
Num Actions:  9
Num unittype:  6
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5000/5000 [01:35<00:00, 52.37it/s]
[2017-07-12 09:04:13.212017][128] Iter[0]:
Train count: 820/5000, actor count: 4180/5000
Save to ./
Filename = ./save-820.bin
Command arguments run.py --batchsize 128 --freq_update 50 --fs_opponent 20 --latest_start 500 --latest_start_decay 0.99 --num_games 1024 --opponent_type AI_SIMPLE --tqdm
0:acc_reward[4100]: avg: -0.34079, min: -0.58232[1580], max: 0.25949[185]
0:cost[4100]: avg: 2.15912, min: 1.97886[2140], max: 2.31487[1173]
0:entropy_err[4100]: avg: -2.13493, min: -2.17945[438], max: -2.04809[1467]
0:init_reward[820]: avg: -0.34093, min: -0.56980[315], max: 0.26211[37]
0:policy_err[4100]: avg: 2.16714, min: 1.98384[1520], max: 2.31068[1176]
0:predict_reward[4100]: avg: -0.33676, min: -1.36083[1588], max: 0.39551[195]
0:reward[4100]: avg: -0.01153, min: -0.13281[1109], max: 0.04688[124]
0:rms_advantage[4100]: avg: 0.15646, min: 0.02189[800], max: 0.79827[564]
0:value_err[4100]: avg: 0.01333, min: 0.00024[800], max: 0.06569[1549]

 86%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                    | 4287/5000 [01:23<00:15, 46.97it/s]

To evaluate a model for MiniRTS, try the following command (you can also reference eval_minirts.sh):

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model \ 
python3 eval.py 
    --load [your model]
    --batchsize 128 
    --players "fs=50,type=AI_NN;fs=20,type=AI_SIMPLE"  
    --num_games 1024 
    --num_eval 10000
    --tqdm                          # Nice progress bar
    --gpu 0                         # Use GPU 0 as the evaluation gpu.
    --additional_labels id          # Tell the game environment to output additional dict entries.
    --greedy                        # Use greedy policy to evaluate your model. If not specified, then it will sample from the action distributions. 

Here is an example output (it takes 1 min 40 seconds to evaluate 10k games with 12 CPUs):

Version:  dc895b8ea7df8ef7f98a1a031c3224ce878d52f0_
Num Actions:  9
Num unittype:  6
Load from ./save-212808.bin
Version:  dc895b8ea7df8ef7f98a1a031c3224ce878d52f0_
Num Actions:  9
Num unittype:  6
100%|████████████████████████████████████████████████████████████████████████████████████████████| 10000/10000 [01:40<00:00, 99.94it/s]
str_acc_win_rate: Accumulated win rate: 0.735 [7295/2628/9923]
best_win_rate: 0.7351607376801297
new_record: True
count: 0
str_win_rate: [0] Win rate: 0.735 [7295/2628/9923], Best win rate: 0.735 [0]
Stop all game threads ...

SelfPlay

Try the following script if you want to do self-play in Minirts. It will start with two bots, both starting with the pre-trained model. One bot will be trained over time, while the other is held fixed. If you just want to check their winrate without training, try --actor_only.

sh ./selfplay_minirts.sh [your pre-trained model] 

Visualization

To visualize a trained bot, you can specify --save_replay_prefix [replay_file_prefix] when running eval.py to save (lots of) replays. Note that the same flag can also be applied to training/selfplay.

All replay files contain action sequences, are in .rep and should reproduce the exact same game when loaded. To load the replay in the command line, using the following:

./minirts-backend replay --load_replay [your replay] --vis_after 0

and open the webpage ./rts/frontend/minirts.html to check the game. To load and run the replay in the command line only (e.g, if you just want to quickly see who win the game), try:

./minirts-backend replay_cmd --load_replay [your replay]
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].