Learning to Communicate with Deep Multi-Agent Reinforcement Learning

This is a PyTorch implementation of the original Lua code release.

Overview

This codebase implements two approaches to learning discrete communication protocols for playing collaborative games. In Reinforced Inter-Agent Learning (RIAL), agents learn a factorized deep Q-learning policy across game actions and messages. In Differentiable Inter-Agent Learning (DIAL), message vectors are learned directly by backpropagating errors through a noisy communication channel during training, and are discretized to binary vectors at test time. RIAL and DIAL share the same per-agent network architecture, but one would expect learning to be more efficient under DIAL, which backpropagates downstream errors directly during training; comparing the performance of the two approaches bears this out.
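
As a rough illustration (a sketch, not the repository's code), the following PyTorch snippet shows a DIAL-style discretize/regularize unit (DRU) under the narrow (sigmoid) setting: during training, Gaussian noise is added to the message logits before a sigmoid so gradients flow through the channel; at test time, the message is hard-thresholded to a binary vector. The function name and the default sigma are assumptions for this sketch (sigma corresponds to game_comm_sigma in the configs below).

```python
import torch

def dru(message_logits, sigma=2.0, training=True):
    """Illustrative DIAL-style discretize/regularize unit (DRU).

    During training, Gaussian channel noise is added and the result is
    squashed with a sigmoid, keeping the message differentiable so
    downstream errors can be backpropagated through it. At test time,
    the message is discretized to a hard binary vector.
    """
    if training:
        noise = sigma * torch.randn_like(message_logits)
        return torch.sigmoid(message_logits + noise)
    return (message_logits > 0).float()
```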

Execution

$ virtualenv .venv
$ source .venv/bin/activate
$ pip install -r requirements.txt
$ python main.py -c config/switch_3_dial.json

Results for switch game

DIAL vs. RIAL reward curves

This chart was generated by plotting an exponentially-weighted average across 20 trials for each curve.
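
For reference, an exponentially-weighted moving average of a reward series can be computed as in the generic sketch below (not the exact plotting code used for the chart; alpha is the decay weight):

```python
def ewma(values, alpha=0.9):
    """Exponentially-weighted moving average of a sequence of rewards."""
    smoothed, current = [], None
    for v in values:
        current = v if current is None else alpha * current + (1 - alpha) * v
        smoothed.append(current)
    return smoothed
```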

More info

More generally, main.py takes the following arguments:

| Arg | Short | Description | Required? |
|-----|-------|-------------|-----------|
| --config_path | -c | path to JSON configuration file | ✓ |
| --results_path | -r | path to directory in which to save results per trial (as CSV) | - |
| --ntrials | -n | number of trials to run | - |
| --start_index | -s | start index used as a suffix in result filenames | - |
| --verbose | -v | prints results per training epoch to stdout if set | - |
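
For example, to run five trials with verbose output and save per-trial CSVs (the results directory here is hypothetical):

$ python main.py -c config/switch_3_dial.json -r results/switch_3_dial -n 5 -v
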
Configuration

JSON configuration files passed to main.py should consist of the following key-value pairs:

| Key | Description | Type |
|-----|-------------|------|
| game | name of the game, e.g. "switch" | string |
| game_nagents | number of agents | int |
| game_action_space | number of valid game actions | int |
| game_comm_limited | true if only some agents can communicate at each step | bool |
| game_comm_bits | number of bits per message | int |
| game_comm_sigma | standard deviation of Gaussian noise applied by the DRU | float |
| game_comm_hard | true to use hard discretization, soft approximation otherwise | bool |
| nsteps | maximum number of game steps | int |
| gamma | reward discount factor for Q-learning | float |
| model_dial | true if agents should use DIAL | bool |
| model_comm_narrow | true if the DRU should use sigmoid for regularization, softmax otherwise | bool |
| model_target | true if learning should use a target Q-network | bool |
| model_bn | true if learning should use batch normalization | bool |
| model_know_share | true if agents should share parameters | bool |
| model_action_aware | true if each agent should know its last action | bool |
| model_rnn_size | dimension of the RNN hidden state | int |
| bs | batch size of episodes, run in parallel per epoch | int |
| learningrate | learning rate for the optimizer (RMSProp) | float |
| momentum | momentum for the optimizer (RMSProp) | float |
| eps | exploration rate for epsilon-greedy exploration | float |
| nepisodes | number of epochs, each consisting of parallel episodes | int |
| step_test | perform a test episode every this many steps | int |
| step_target | update the target network every this many steps | int |
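
For illustration, the snippet below builds a plausible configuration from these keys and writes it to disk. All values are assumptions for this sketch; the actual settings live in files such as config/switch_3_dial.json and may differ.

```python
import json

# Hypothetical configuration for a 3-agent switch game with DIAL;
# values are illustrative, not taken from the repository's configs.
config = {
    "game": "switch",
    "game_nagents": 3,
    "game_action_space": 2,
    "game_comm_limited": True,
    "game_comm_bits": 1,
    "game_comm_sigma": 2.0,
    "game_comm_hard": False,
    "nsteps": 6,
    "gamma": 1.0,
    "model_dial": True,
    "model_comm_narrow": True,
    "model_target": True,
    "model_bn": True,
    "model_know_share": True,
    "model_action_aware": True,
    "model_rnn_size": 128,
    "bs": 32,
    "learningrate": 0.0005,
    "momentum": 0.05,
    "eps": 0.05,
    "nepisodes": 5000,
    "step_test": 10,
    "step_target": 100,
}

with open("my_config.json", "w") as f:
    json.dump(config, f, indent=4)
```
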
Visualizing results

You can use analyze_results.py to graph the results output by main.py. The script plots the average results across all CSV files in each path specified after -r. In addition, -a takes an alpha value to plot results as exponentially-weighted moving averages, and -l takes an optional list of labels corresponding to the paths.

$ python util/analyze_results.py -r <paths to results> -a <weight for EWMA>
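
For example, to overlay labeled DIAL and RIAL reward curves as EWMA-smoothed averages (the result paths here are hypothetical):

$ python util/analyze_results.py -r results/dial results/rial -a 0.9 -l DIAL RIAL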

BibTeX

@inproceedings{foerster2016learning,
    title={Learning to communicate with deep multi-agent reinforcement learning},
    author={Foerster, Jakob and Assael, Yannis M and de Freitas, Nando and Whiteson, Shimon},
    booktitle={Advances in Neural Information Processing Systems},
    pages={2137--2145},
    year={2016} 
}

License

Code licensed under the Apache License v2.0
