salesforce / warp-drive

License: BSD-3-Clause
Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning Framework on a GPU (JMLR 2022)

Programming Languages

Python
Jupyter Notebook
CUDA


WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

WarpDrive is a flexible, lightweight, and easy-to-use open-source reinforcement learning (RL) framework that implements end-to-end multi-agent RL on one or more GPUs (Graphics Processing Units).

Using the extreme parallelization capability of GPUs, WarpDrive enables orders-of-magnitude faster RL compared to CPU simulation + GPU model implementations. It is extremely efficient as it avoids back-and-forth data copying between the CPU and the GPU, and runs simulations across multiple agents and multiple environment replicas in parallel.

Major updates since the initial open-source release:

  • version 1.3: provides auto-scaling tools to achieve optimal throughput per device.
  • version 1.4: supports distributed asynchronous training across multiple GPU devices.
  • version 1.6: supports aggregating multiple GPU blocks for one environment replica.
  • version 2.0: supports dual backends: CUDA C and JIT-compiled Numba. (Our Blog article)

Together, these allow the user to run thousands of concurrent multi-agent simulations and train on extremely large batches of experience, achieving over 100x the throughput of CPU-based counterparts.
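To make the scale concrete, here is a minimal NumPy sketch of the general idea: keep the state of all environment replicas and agents in one batched array and advance everything with a single update. It runs on the CPU and is only an analogy. The `step_all` helper, the array shapes, and the toy dynamics are illustrative assumptions, not WarpDrive's API.

```python
import numpy as np


def step_all(positions, velocities, dt=0.1):
    """Advance every agent in every replica with one vectorized update.

    positions, velocities: arrays of shape (num_envs, num_agents, 2).
    """
    return positions + dt * velocities


num_envs, num_agents = 1000, 105  # hypothetical sizes
rng = np.random.default_rng(0)
positions = rng.uniform(0.0, 1.0, size=(num_envs, num_agents, 2))
velocities = rng.normal(0.0, 1.0, size=(num_envs, num_agents, 2))

new_positions = step_all(positions, velocities)

# One call yields one transition per agent per replica.
transitions_per_step = num_envs * num_agents
print(new_positions.shape, transitions_per_step)
```

Because the whole batch lives in one array, the number of transitions collected per step grows linearly with the number of replicas without any per-replica Python loop.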

We include several default multi-agent environments based on the game of "Tag" for benchmarking and testing. In the "Tag" games, taggers try to chase and tag the runners. These are fairly complicated games that involve thread synchronization, shared memory, and high-dimensional indexing for thousands of interacting agents. Several much more complex environments, such as a Covid-19 environment and a climate change environment, have been developed based on WarpDrive; you can see examples in Real-World Problems and Collaborations.
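As a rough illustration of why Tag involves interactions between many agents, the sketch below checks which runners are within reach of any tagger using pairwise distances. The distance-based tagging rule, the `find_tagged` helper, and the `tag_radius` parameter are assumptions made for this example; the actual environments may define tagging differently.

```python
import numpy as np


def find_tagged(tagger_xy, runner_xy, tag_radius):
    """Return a boolean array marking runners within tag_radius of any tagger."""
    # Pairwise differences, shape (num_taggers, num_runners, 2).
    diff = tagger_xy[:, None, :] - runner_xy[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # A runner counts as tagged if at least one tagger is close enough.
    return (dist <= tag_radius).any(axis=0)


taggers = np.array([[0.0, 0.0], [1.0, 1.0]])
runners = np.array([[0.05, 0.0], [0.5, 0.5], [1.0, 0.95]])
tagged = find_tagged(taggers, runners, tag_radius=0.1)
print(tagged)
```

Every tagger is compared with every runner, so the amount of work per step scales with the product of the two group sizes, which is the kind of computation that benefits from running in parallel.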

Below, we show multi-agent RL policies trained for different tagger:runner speed ratios using WarpDrive. These environments can run at millions of steps per second, and train in just a few hours, all on a single GPU!

WarpDrive also provides tools to build and train multi-agent RL systems quickly with just a few lines of code. Here is a short example to train tagger and runner agents:

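# Import paths are assumed from the repository layout; adjust them if your installed version differs.
from example_envs.tag_continuous.tag_continuous import TagContinuous
from warp_drive.env_wrapper import EnvWrapper
from warp_drive.training.trainer import Trainer

# run_config is the run configuration dictionary (with "env" and "trainer" sections), loaded beforehand.

# Create a wrapped environment object via the EnvWrapper
# Ensure that env_backend is set to 'pycuda' or 'numba' (in order to run on the GPU)
env_wrapper = EnvWrapper(
    TagContinuous(**run_config["env"]),
    num_envs=run_config["trainer"]["num_envs"],
    env_backend="pycuda"
)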

# Agents can share policy models: this dictionary maps policy model names to agent ids.
policy_tag_to_agent_id_map = {
    "tagger": list(env_wrapper.env.taggers),
    "runner": list(env_wrapper.env.runners),
}

# Create the trainer object
trainer = Trainer(
    env_wrapper=env_wrapper,
    config=run_config,
    policy_tag_to_agent_id_map=policy_tag_to_agent_id_map,
)

# Perform training!
trainer.train()
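The snippet above reads `run_config["env"]` and `run_config["trainer"]["num_envs"]`. The following sketch shows a dictionary with that shape and how a policy-to-agent map can be built from it. Only those two lookups come from the snippet; the keyword names `num_taggers` and `num_runners`, the values, and the agent id layout are hypothetical stand-ins for what the real environment object provides.

```python
# Hypothetical configuration: only the "env" section and
# trainer["num_envs"] are referenced by the training snippet.
run_config = {
    "env": {"num_taggers": 5, "num_runners": 100},  # assumed keyword names
    "trainer": {"num_envs": 60},
}

# Stand-in for env_wrapper.env: taggers get the first ids, runners the rest.
num_taggers = run_config["env"]["num_taggers"]
num_runners = run_config["env"]["num_runners"]
taggers = range(num_taggers)
runners = range(num_taggers, num_taggers + num_runners)

# Agents that share a policy model are grouped under the same policy tag.
policy_tag_to_agent_id_map = {
    "tagger": list(taggers),
    "runner": list(runners),
}

# Sanity check: every agent id belongs to exactly one policy.
all_ids = sorted(
    agent_id
    for ids in policy_tag_to_agent_id_map.values()
    for agent_id in ids
)
assert all_ids == list(range(num_taggers + num_runners))
```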

Below, we compare the training speed on an N1 16-CPU node versus a single A100 GPU (using WarpDrive), for the Tag environment with 100 runners and 5 taggers. With the same environment configuration and training parameters, WarpDrive on a GPU is about 10× faster. Both scenarios are with 60 environment replicas running in parallel. Using more environments on the CPU node is infeasible as data copying gets too expensive. With WarpDrive, it is possible to scale up the number of environment replicas at least 10-fold, for even faster training.
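For a sense of what the 10-fold scale-up means in terms of collected experience, this small calculation counts the agent transitions produced by one synchronous step of all replicas, using the numbers quoted above. The helper name is made up for the example.

```python
def agent_steps_per_env_step(num_envs, num_runners, num_taggers):
    """Agent transitions produced by one synchronous step of all replicas."""
    return num_envs * (num_runners + num_taggers)


# 60 replicas, as in the comparison above.
baseline = agent_steps_per_env_step(60, 100, 5)
# 10-fold more replicas.
scaled = agent_steps_per_env_step(600, 100, 5)
print(baseline, scaled)
```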

Code Structure

WarpDrive provides a CUDA (or Numba) + Python framework and quality-of-life tools, so you can quickly build fast, flexible and massively distributed multi-agent RL systems. The following figure illustrates a bottom-up overview of the design and components of WarpDrive. The user only needs to write a CUDA or Numba step function at the CUDA environment layer, while the rest is a pure Python interface. We have step-by-step tutorials for you to master the workflow.
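One way to picture a GPU step function is as a small body of code that runs once per (environment, agent) pair. The pure-Python emulation below assumes one GPU block per environment replica and one thread per agent, and replaces the kernel launch with two nested loops. The function names, arguments, and toy dynamics are hypothetical and do not reflect WarpDrive's actual step function signature.

```python
import numpy as np


def step_one_agent(env_id, agent_id, positions, actions, step_size):
    """Body of a hypothetical step function for a single (env, agent) pair."""
    positions[env_id, agent_id] += step_size * actions[env_id, agent_id]


def launch(positions, actions, step_size=1.0):
    """Emulate a kernel launch by looping over blocks (envs) and threads (agents)."""
    num_envs, num_agents = positions.shape[:2]
    for env_id in range(num_envs):          # plays the role of the block index
        for agent_id in range(num_agents):  # plays the role of the thread index
            step_one_agent(env_id, agent_id, positions, actions, step_size)


positions = np.zeros((2, 3, 2))  # (num_envs, num_agents, xy)
actions = np.ones((2, 3, 2))
launch(positions, actions, step_size=0.5)
print(positions[0, 0])
```

On a GPU the two loops disappear: each (block, thread) pair executes the body concurrently, which is why the step function only has to describe what a single agent does.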

Papers and Citing WarpDrive

Our paper is published in the Journal of Machine Learning Research (JMLR): https://jmlr.org/papers/v23/22-0185.html. You can also find more details in our white paper: https://arxiv.org/abs/2108.13976.

If you're using WarpDrive in your research or applications, please cite using this BibTeX:

@article{JMLR:v23:22-0185,
  author  = {Tian Lan and Sunil Srinivasa and Huan Wang and Stephan Zheng},
  title   = {WarpDrive: Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU},
  journal = {Journal of Machine Learning Research},
  year    = {2022},
  volume  = {23},
  number  = {316},
  pages   = {1--6},
  url     = {http://jmlr.org/papers/v23/22-0185.html}
}

@misc{lan2021warpdrive,
  title={WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU}, 
  author={Tian Lan and Sunil Srinivasa and Huan Wang and Caiming Xiong and Silvio Savarese and Stephan Zheng},
  year={2021},
  eprint={2108.13976},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Tutorials and Quick Start

Tutorials

Familiarize yourself with WarpDrive by running these tutorials on Colab or in an NGC container!

You may also run these tutorials locally, but you will need a GPU machine with the nvcc compiler installed and a compatible Nvidia GPU driver. You will also need Jupyter. See https://jupyter.readthedocs.io/en/latest/install.html for installation instructions.

Example Training Script

We provide some example scripts to help you quickly start end-to-end training. For example, to train the tag_continuous environment (10 taggers and 100 runners) with 2 GPUs and the CUDA C backend, run:

python example_training_script_pycuda.py -e tag_continuous -n 2

or switch to the JIT-compiled Numba backend with 1 GPU:

python example_training_script_numba.py -e tag_continuous
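The two commands above use an `-e` flag for the environment name and an `-n` flag for the number of GPUs. As a sketch of how such flags can be parsed, here is a hypothetical `argparse` setup. The long option names, the default of one GPU, and the help texts are assumptions for illustration and may not match the actual example scripts.

```python
import argparse


def build_parser():
    """Build a parser that accepts the -e and -n flags used above."""
    parser = argparse.ArgumentParser(description="Hypothetical training launcher")
    # Long option names and defaults are assumptions for illustration.
    parser.add_argument("-e", "--env", required=True, help="environment name")
    parser.add_argument("-n", "--num_gpus", type=int, default=1, help="number of GPUs")
    return parser


args = build_parser().parse_args(["-e", "tag_continuous", "-n", "2"])
print(args.env, args.num_gpus)
```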

You can find full reference documentation here.

Real World Problems and Collaborations

Installation Instructions

To get started, you'll need to have Python 3.7+ and the nvcc compiler installed with a compatible Nvidia GPU CUDA driver.

CUDA (which includes nvcc) can be installed by following Nvidia's instructions here: https://developer.nvidia.com/cuda-downloads.
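Before installing, it can help to confirm the two prerequisites programmatically. The sketch below checks the Python version and looks for `nvcc` on the `PATH` using only the standard library. The helper name and the returned keys are made up for this example, and finding `nvcc` does not by itself guarantee that the GPU driver is compatible.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 7)):
    """Return a dict describing whether each prerequisite looks satisfied."""
    return {
        "python_ok": sys.version_info[:2] >= min_python,
        "nvcc_path": shutil.which("nvcc"),  # None if nvcc is not on PATH
    }


status = check_prerequisites()
print(status)
```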

Docker Image

V100 GPU: You can refer to the example Dockerfile to configure your system.

A100 GPU: Our latest image is published and maintained by NVIDIA NGC. We recommend you download the latest image from NGC catalog.

If you want to build your customized environment, we suggest you visit Nvidia Docker Hub to download the CUDA and cuDNN images compatible with your system. You should be able to use the command line utility to monitor the NVIDIA GPU devices in your system:

nvidia-smi

and see something like this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   37C    P0    32W / 300W |      0MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

In this snapshot, you can see we are using a Tesla V100 GPU and CUDA version 11.0.
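If you want to read these versions in a script rather than by eye, a regular expression over the header line is enough. The sketch below parses the header line from the snapshot above. The `parse_versions` helper is hypothetical, and the pattern assumes the header layout shown here, which may differ between driver releases.

```python
import re

# Header line copied from the snapshot above.
HEADER = "| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |"


def parse_versions(header_line):
    """Extract (driver_version, cuda_version) from the nvidia-smi header line."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", header_line
    )
    if match is None:
        raise ValueError("unrecognized nvidia-smi header")
    return match.group(1), match.group(2)


driver, cuda = parse_versions(HEADER)
print(driver, cuda)
```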

Installing using Pip

You can install WarpDrive using the Python package manager:

pip install rl_warp_drive

Installing from Source

  1. Clone this repository to your machine:

    git clone https://github.com/salesforce/warp-drive.git
    
  2. Optional, but recommended for first tries: Create a new conda environment (named "warp_drive" below) and activate it:

    conda create --name warp_drive python=3.7 --yes
    conda activate warp_drive
    
  3. Install as an editable Python package:

    cd warp-drive
    pip install -e .
    

Testing your Installation

You can run the following scripts directly with Python to test all modules and the end-to-end training workflow:

python warp_drive/utils/unittests/run_unittests_pycuda.py
python warp_drive/utils/unittests/run_unittests_numba.py
python warp_drive/utils/unittests/run_trainer_tests.py
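To run the three scripts in sequence and stop at the first failure, a small wrapper like the one below can be used. It is a sketch that assumes it is run from the repository root and that each script exits with a non-zero status on failure; the `run_all` helper and its `runner` parameter are hypothetical conveniences, not part of WarpDrive.

```python
import subprocess
import sys

TEST_SCRIPTS = [
    "warp_drive/utils/unittests/run_unittests_pycuda.py",
    "warp_drive/utils/unittests/run_unittests_numba.py",
    "warp_drive/utils/unittests/run_trainer_tests.py",
]


def run_all(scripts, runner=subprocess.run):
    """Run each script with the current interpreter.

    Returns the path of the first script that fails, or None if all pass.
    The runner argument exists so the function can be exercised without a GPU.
    """
    for script in scripts:
        result = runner([sys.executable, script])
        if result.returncode != 0:
            return script
    return None
```

Calling `run_all(TEST_SCRIPTS)` from the repository root then returns `None` when everything passes, or the path of the first failing script otherwise.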

Learn More

For more information, please check out our blog, white paper, and code documentation.

If you're interested in extending this framework, or have questions, join the AI Economist Slack channel using this invite link.
