caslab-vt / SARNet

License: MIT
Code repository for SARNet: Learning Multi-Agent Communication through Structured Attentive Reasoning (NeurIPS 2020)

Programming Languages

Python

Projects that are alternatives of or similar to SARNet

Deterministic Gail Pytorch
PyTorch implementation of Deterministic Generative Adversarial Imitation Learning (GAIL) for Off Policy learning
Stars: ✭ 44 (+214.29%)
Mutual labels:  deep-reinforcement-learning, gym
Pytorch sac
PyTorch implementation of Soft Actor-Critic (SAC)
Stars: ✭ 174 (+1142.86%)
Mutual labels:  deep-reinforcement-learning, gym
Muzero General
MuZero
Stars: ✭ 1,187 (+8378.57%)
Mutual labels:  deep-reinforcement-learning, gym
Paac.pytorch
Pytorch implementation of the PAAC algorithm presented in Efficient Parallel Methods for Deep Reinforcement Learning https://arxiv.org/abs/1705.04862
Stars: ✭ 22 (+57.14%)
Mutual labels:  deep-reinforcement-learning, gym
Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
Stars: ✭ 54 (+285.71%)
Mutual labels:  deep-reinforcement-learning, gym
Rl algos
Reinforcement Learning Algorithms
Stars: ✭ 14 (+0%)
Mutual labels:  deep-reinforcement-learning, gym
Pytorch sac ae
PyTorch implementation of Soft Actor-Critic + Autoencoder(SAC+AE)
Stars: ✭ 94 (+571.43%)
Mutual labels:  deep-reinforcement-learning, gym
Drq
DrQ: Data regularized Q
Stars: ✭ 268 (+1814.29%)
Mutual labels:  deep-reinforcement-learning, gym
reinforcement learning ppo rnd
Deep Reinforcement Learning by using Proximal Policy Optimization and Random Network Distillation in Tensorflow 2 and Pytorch with some explanation
Stars: ✭ 33 (+135.71%)
Mutual labels:  deep-reinforcement-learning, gym
omd
JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation"
Stars: ✭ 43 (+207.14%)
Mutual labels:  deep-reinforcement-learning, gym
Deepdrive
Deepdrive is a simulator that allows anyone with a PC to push the state-of-the-art in self-driving
Stars: ✭ 628 (+4385.71%)
Mutual labels:  deep-reinforcement-learning, gym
wolpertinger ddpg
Wolpertinger Training with DDPG (Pytorch), Deep Reinforcement Learning in Large Discrete Action Spaces. Multi-GPU/Single-GPU/CPU compatible.
Stars: ✭ 44 (+214.29%)
Mutual labels:  deep-reinforcement-learning, gym
Rl Book
Source codes for the book "Reinforcement Learning: Theory and Python Implementation"
Stars: ✭ 464 (+3214.29%)
Mutual labels:  deep-reinforcement-learning, gym
Drlkit
A High Level Python Deep Reinforcement Learning library. Great for beginners, prototyping and quickly comparing algorithms
Stars: ✭ 29 (+107.14%)
Mutual labels:  deep-reinforcement-learning, gym
Pytorch Rl
This repository contains model-free deep reinforcement learning algorithms implemented in Pytorch
Stars: ✭ 394 (+2714.29%)
Mutual labels:  deep-reinforcement-learning, gym
Rlenv.directory
Explore and find reinforcement learning environments in a list of 150+ open source environments.
Stars: ✭ 79 (+464.29%)
Mutual labels:  deep-reinforcement-learning, gym
Gym Gazebo2
gym-gazebo2 is a toolkit for developing and comparing reinforcement learning algorithms using ROS 2 and Gazebo
Stars: ✭ 257 (+1735.71%)
Mutual labels:  deep-reinforcement-learning, gym
Naf Tensorflow
"Continuous Deep Q-Learning with Model-based Acceleration" in TensorFlow
Stars: ✭ 192 (+1271.43%)
Mutual labels:  deep-reinforcement-learning, gym
rl-medical
Communicative Multiagent Deep Reinforcement Learning for Anatomical Landmark Detection using PyTorch.
Stars: ✭ 36 (+157.14%)
Mutual labels:  deep-reinforcement-learning, multiagent-reinforcement-learning
mgym
A collection of multi-agent reinforcement learning OpenAI gym environments
Stars: ✭ 41 (+192.86%)
Mutual labels:  gym, multiagent-reinforcement-learning

Structured Attentive Reasoning Network (SARNet)

Code repository for Learning Multi-Agent Communication through Structured Attentive Reasoning (NeurIPS 2020)
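
At its core, SARNet has each agent compose queries, keys, and values to attend over other agents' messages. The following is a minimal NumPy sketch of generic scaled dot-product message aggregation, not the authors' exact architecture; the function and dimension names (aggregate_messages, d_k, d_v) are illustrative only, and the 32-unit sizes mirror the --key-units/--query-units/--value-units values used in the example scripts below.

  import numpy as np

  def softmax(x, axis=-1):
      e = np.exp(x - x.max(axis=axis, keepdims=True))
      return e / e.sum(axis=axis, keepdims=True)

  def aggregate_messages(queries, keys, values):
      # queries, keys: (n_agents, d_k); values: (n_agents, d_v)
      scores = queries @ keys.T / np.sqrt(keys.shape[-1])  # pairwise attention scores
      weights = softmax(scores, axis=-1)                   # each agent's weights over all messages
      return weights @ values                              # (n_agents, d_v) aggregated messages

  # Example: 3 agents, 32-dimensional queries/keys/values.
  rng = np.random.default_rng(0)
  q, k, v = (rng.normal(size=(3, 32)) for _ in range(3))
  out = aggregate_messages(q, k, v)  # shape (3, 32)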

Cite

If you use this code, please consider citing SARNet:

@inproceedings{rangwala2020learning,
 author = {Rangwala, Murtaza and Williams, Ryan},
 booktitle = {Advances in Neural Information Processing Systems},
 pages = {10088--10098},
 title = {Learning Multi-Agent Communication through Structured Attentive Reasoning},
 url = {https://proceedings.neurips.cc/paper/2020/file/72ab54f9b8c11fae5b923d7f854ef06a-Paper.pdf},
 volume = {33},
 year = {2020}
}

Installation

  • To install, cd into the root directory and type pip install -e .

  • Known dependencies: Python (3.5.4+), OpenAI Gym (0.10.5), TensorFlow (1.14.0)

Install my implementation of Multi-Agent Particle Environments (MPE) (https://github.com/openai/multiagent-particle-envs), included in this repository:

  • cd into multiagent-particle-envs and type pip install -e .

Install my implementation of Traffic Junction (https://github.com/IC3Net/IC3Net/tree/master/ic3net-envs), included in this repository:

  • cd into ic3net-envs and type python setup.py develop
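
Putting the steps together, a typical install from the repository root (assuming the directory layout described above) is:

  pip install -e .
  cd multiagent-particle-envs && pip install -e . && cd ..
  cd ic3net-envs && python setup.py develop && cd ..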

Architectures Implemented

Use the following architecture names with --adv-test and --good-test to define the agents' communication architecture. Adversarial agents are the default agents for fully cooperative environments, i.e. good agents are only used in competitive environments. An example combining these flags follows the MAAC option below.

  • SARNet: --adv-test SARNET or --good-test SARNET

  • TarMAC: --adv-test TARMAC or --good-test TARMAC

  • CommNet: --adv-test COMMNET or --good-test COMMNET

  • IC3Net: --adv-test IC3NET or --good-test IC3NET

  • MADDPG: --adv-test DDPG or --good-test DDPG

To use a MAAC-type critic:

  • MAAC: --adv-critic-model MAAC or --gd-critic-model MAAC
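
For illustration, a competitive run might pair different architectures per team; this flag combination is an assumption based on the options above, with all other required arguments omitted:

  python train.py --adv-test SARNET --good-test TARMAC --adv-critic-model MAAC ...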

Environments

For the multi-agent particle environment, pass --env-type with one of the following values:

  • Multi-Agent Particle Environment: mpe

--scenario selects the scenario. For the multi-agent particle environment, use one of the following (an example invocation follows the list):

  • Predator-Prey with 3 vs 1: simple_tag_3
  • Predator-Prey with 6 vs 2: simple_tag_6
  • Predator-Prey with 12 vs 4: simple_tag_12
  • Predator-Prey with 15 vs 5: simple_tag_15
  • Cooperative Navigation with 3 agents: simple_spread_3
  • Cooperative Navigation with 6 agents: simple_spread_6
  • Cooperative Navigation with 10 agents: simple_spread_10
  • Cooperative Navigation with 20 agents: simple_spread_20
  • Physical Deception with 3 vs 1: simple_adversary_3
  • Physical Deception with 4 vs 2: simple_adversary_6
  • Physical Deception with 12 vs 4 agents: simple_adversary_12
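
For example, to select Cooperative Navigation with 6 agents (environment flags only; remaining arguments omitted):

  python train.py --env-type mpe --scenario simple_spread_6 ...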

For Traffic Junction:

  • Traffic Junction: --env-type ic3net --scenario traffic-junction

Specifying Number of Agents

The number of cooperating agents is specified with --num-adversaries. For environments with competing agents, the code automatically accounts for the remaining "good" agents.
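
For example, simple_tag_3 (Predator-Prey, 3 vs 1) would presumably be run with --num-adversaries 3, with the single remaining "good" agent accounted for automatically:

  python train.py --env-type mpe --scenario simple_tag_3 --num-adversaries 3 ...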

Training Policies

We support training through DDPG for continuous action spaces and REINFORCE for discrete action spaces. Pass one of the following arguments:

  • --policy-grad maddpg for continuous action spaces
  • --policy-grad reinforce for discrete action spaces

Additionally, to enable TD3 and recurrent trajectory updates, use --td3 and specify the trajectory length to update over with --len-traj-update 10.

Recurrent importance sampling is enabled with --PER-sampling.

Example Scripts

  • Cooperative Navigation with 6 SARNet Agents: python train.py --policy-grad maddpg --env-type mpe --scenario simple_spread_6 --num-adversaries 6 --key-units 32 --value-units 32 --query-units 32 --len-traj-update 10 --td3 --PER-sampling --encoder-model LSTM --max-episode-len 100

  • Traffic Junction with 6 SARNet Agents: python train.py --env-type ic3net --scenario traffic_junction --policy-grad reinforce --num-adversaries 6 --adv-test SARNET --gpu-device 0 --exp-name SAR-TJ6-NoCurrLr --max-episode-len 20 --num-env 50 --dim 6 --add_rate_min 0.3 --add_rate_max 0.3 --curr_start 250 --curr_end 1250 --num-episodes 500000 --batch-size 500 --difficulty easy --vision 0

References

  • Theano-based abstractions from Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments (MADDPG).

  • Segment tree for PER from OpenAI Baselines.

  • Attention-based abstractions/operations from the MAC Network.
