
openai / Maddpg

License: MIT
Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Programming Languages

Python
139,335 projects; the #7 most used programming language

Labels

paper

Projects that are alternatives to or similar to Maddpg

Bert paper chinese translation
A Chinese translation of the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
Stars: ✭ 564 (-26.47%)
Mutual labels:  paper
Awesome Relation Extraction
📖 A curated list of awesome resources dedicated to Relation Extraction, one of the most important tasks in Natural Language Processing (NLP).
Stars: ✭ 656 (-14.47%)
Mutual labels:  paper
Random Network Distillation
Code for the paper "Exploration by Random Network Distillation"
Stars: ✭ 708 (-7.69%)
Mutual labels:  paper
Pl Compiler Resource
Materials on programming languages and compiler techniques (continuously updated)
Stars: ✭ 578 (-24.64%)
Mutual labels:  paper
All About The Gan
All About the GANs (Generative Adversarial Networks): summarized lists for GANs
Stars: ✭ 630 (-17.86%)
Mutual labels:  paper
Multiagent Competition
Code for the paper "Emergent Complexity via Multi-agent Competition"
Stars: ✭ 663 (-13.56%)
Mutual labels:  paper
Imitation
Code for the paper "Generative Adversarial Imitation Learning"
Stars: ✭ 555 (-27.64%)
Mutual labels:  paper
Awesome Face
😎 Face-related algorithms, datasets, and papers
Stars: ✭ 739 (-3.65%)
Mutual labels:  paper
Minecraftdev
Plugin for IntelliJ IDEA that gives special support for Minecraft modding projects.
Stars: ✭ 645 (-15.91%)
Mutual labels:  paper
Large Scale Curiosity
Code for the paper "Large-Scale Study of Curiosity-Driven Learning"
Stars: ✭ 703 (-8.34%)
Mutual labels:  paper
Dnc Tensorflow
A TensorFlow implementation of DeepMind's Differentiable Neural Computer (DNC)
Stars: ✭ 587 (-23.47%)
Mutual labels:  paper
Awesome Interaction Aware Trajectory Prediction
A selection of state-of-the-art research materials on trajectory prediction
Stars: ✭ 625 (-18.51%)
Mutual labels:  paper
Awesome Economics
A curated collection of links for economists
Stars: ✭ 688 (-10.3%)
Mutual labels:  paper
Deeptype
Code for the paper "DeepType: Multilingual Entity Linking by Neural Type System Evolution"
Stars: ✭ 571 (-25.55%)
Mutual labels:  paper
Paper collection
Academic papers related to fuzzing, binary analysis, and exploit dev, which I want to read or have already read
Stars: ✭ 710 (-7.43%)
Mutual labels:  paper
Cv paperdaily
Notes on computer vision (CV) papers
Stars: ✭ 555 (-27.64%)
Mutual labels:  paper
Dl Nlp Readings
My Reading Lists of Deep Learning and Natural Language Processing
Stars: ✭ 656 (-14.47%)
Mutual labels:  paper
Awesome Distributed Systems
A curated list to learn about distributed systems
Stars: ✭ 7,263 (+846.94%)
Mutual labels:  paper
Cv Arxiv Daily
Daily shares of computer vision papers from arXiv
Stars: ✭ 714 (-6.91%)
Mutual labels:  paper
Densenet
DenseNet implementation in Keras
Stars: ✭ 693 (-9.65%)
Mutual labels:  paper

Status: Archive (code is provided as-is, no updates expected)

Multi-Agent Deep Deterministic Policy Gradient (MADDPG)

This is the code for implementing the MADDPG algorithm presented in the paper: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. It is configured to be run in conjunction with environments from the Multi-Agent Particle Environments (MPE). Note: this codebase has been restructured since the original paper, and the results may vary from those reported in the paper.
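As a rough sketch of the idea (illustrative only, not this repository's code): each agent learns a decentralized actor that acts on its own observation, while training uses a centralized critic per agent that conditions on every agent's observation and action. The shapes involved, in plain NumPy:

import numpy as np

# Illustrative MADDPG information flow; all names and dimensions here are
# invented for this sketch and are not taken from this repository.
n_agents, obs_dim, act_dim = 3, 8, 5

observations = [np.random.randn(obs_dim) for _ in range(n_agents)]

# Decentralized actors: each policy sees only its own observation.
weights = [np.random.randn(act_dim, obs_dim) for _ in range(n_agents)]
actions = [np.tanh(W @ obs) for W, obs in zip(weights, observations)]

# Centralized critic (training only): each agent's Q-function takes the
# concatenation of ALL observations and ALL actions as input.
critic_input = np.concatenate(observations + actions)
assert critic_input.shape == (n_agents * (obs_dim + act_dim),)

# At execution time only the actors are needed, so each agent acts on
# local information alone.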

Update: the original implementation for policy ensemble and policy estimation can be found here. The code is provided as-is.

Installation

  • To install, cd into the root directory and type pip install -e .

  • Known dependencies: Python (3.5.4), OpenAI gym (0.10.5), tensorflow (1.8.0), numpy (1.14.5)
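After installing, a quick sanity check that the known-good versions listed above are importable (a minimal sketch; versions other than those listed are untested):

import sys
import gym, numpy, tensorflow

print(sys.version)             # expect 3.5.x
print(gym.__version__)         # expect 0.10.5
print(numpy.__version__)       # expect 1.14.5
print(tensorflow.__version__)  # expect 1.8.0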

Case study: Multi-Agent Particle Environments

We demonstrate here how the code can be used in conjunction with the Multi-Agent Particle Environments (MPE).

  • Download and install the MPE code here by following the README.

  • Ensure that multiagent-particle-envs has been added to your PYTHONPATH (e.g. in ~/.bashrc or ~/.bash_profile).
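For example, in ~/.bashrc (the clone path below is illustrative; use wherever you checked out the MPE repository):

export PYTHONPATH=$PYTHONPATH:$HOME/multiagent-particle-envs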

  • To run the code, cd into the experiments directory and run train.py:

python train.py --scenario simple

  • You can replace simple with any environment in the MPE you'd like to run.
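For instance (the scenario names below come from the MPE repository; see its README for the full list):

python train.py --scenario simple_speaker_listener

python train.py --scenario simple_tag --num-adversaries 3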

Command-line options

Environment options

  • --scenario: defines which environment in the MPE is to be used (default: "simple")

  • --max-episode-len: maximum length of each episode for the environment (default: 25)

  • --num-episodes: total number of training episodes (default: 60000)

  • --num-adversaries: number of adversaries in the environment (default: 0)

  • --good-policy: algorithm used for the 'good' (non-adversary) policies in the environment (default: "maddpg"; options: {"maddpg", "ddpg"})

  • --adv-policy: algorithm used for the adversary policies in the environment (default: "maddpg"; options: {"maddpg", "ddpg"})
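As an illustration, the options above can be combined; for example, to train MADDPG 'good' agents against a DDPG adversary on an adversarial MPE scenario (the scenario name comes from the MPE repository):

python train.py --scenario simple_adversary --num-adversaries 1 --good-policy maddpg --adv-policy ddpg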

Core training parameters

  • --lr: learning rate (default: 1e-2)

  • --gamma: discount factor (default: 0.95)

  • --batch-size: batch size (default: 1024)

  • --num-units: number of units in the MLP (default: 64)
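For example, a run that overrides these defaults (the values are illustrative, not tuned recommendations):

python train.py --scenario simple --lr 1e-3 --gamma 0.95 --batch-size 1024 --num-units 128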

Checkpointing

  • --exp-name: name of the experiment, used as the file name to save all results (default: None)

  • --save-dir: directory where intermediate training results and model will be saved (default: "/tmp/policy/")

  • --save-rate: model is saved every time this number of episodes has been completed (default: 1000)

  • --load-dir: directory where training state and model are loaded from (default: "")
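A typical pattern is to name the experiment and save checkpoints somewhere persistent (the paths below are illustrative):

python train.py --scenario simple --exp-name simple_run1 --save-dir ./policy/simple_run1/ --save-rate 1000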

Evaluation

  • --restore: restores previous training state stored in load-dir (or in save-dir if no load-dir has been provided), and continues training (default: False)

  • --display: displays to the screen the trained policy stored in load-dir (or in save-dir if no load-dir has been provided), but does not continue training (default: False)

  • --benchmark: runs benchmarking evaluations on saved policy, saves results to benchmark-dir folder (default: False)

  • --benchmark-iters: number of iterations to run benchmarking for (default: 100000)

  • --benchmark-dir: directory where benchmarking data is saved (default: "./benchmark_files/")

  • --plots-dir: directory where training curves are saved (default: "./learning_curves/")
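For example, assuming a model was previously saved under ./policy/simple_run1/ (an illustrative path), training can be resumed, visualized, or benchmarked as follows:

python train.py --scenario simple --load-dir ./policy/simple_run1/ --restore

python train.py --scenario simple --load-dir ./policy/simple_run1/ --display

python train.py --scenario simple --load-dir ./policy/simple_run1/ --benchmark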

Code structure

  • ./experiments/train.py: contains code for training MADDPG on the MPE

  • ./maddpg/trainer/maddpg.py: core code for the MADDPG algorithm

  • ./maddpg/trainer/replay_buffer.py: replay buffer code for MADDPG

  • ./maddpg/common/distributions.py: useful distributions used in maddpg.py

  • ./maddpg/common/tf_util.py: useful tensorflow functions used in maddpg.py
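To illustrate the role of ./maddpg/trainer/replay_buffer.py, here is a minimal, self-contained sketch of an experience replay buffer; it shows the general technique only, and the class and method names are invented for this example rather than taken from the repository:

import random

class ReplayBufferSketch:
    """Minimal illustration of experience replay; the actual
    implementation in ./maddpg/trainer/replay_buffer.py differs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.storage = []
        self.next_idx = 0

    def add(self, obs, action, reward, next_obs, done):
        transition = (obs, action, reward, next_obs, done)
        if self.next_idx >= len(self.storage):
            self.storage.append(transition)
        else:
            self.storage[self.next_idx] = transition  # overwrite the oldest entry
        self.next_idx = (self.next_idx + 1) % self.capacity

    def sample(self, batch_size):
        # Uniform random minibatch, as used by off-policy algorithms
        # such as DDPG and MADDPG.
        return random.sample(self.storage, batch_size)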

Paper citation

If you used this code for your experiments or found it helpful, consider citing the following paper:

@article{lowe2017multi,
  title={Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments},
  author={Lowe, Ryan and Wu, Yi and Tamar, Aviv and Harb, Jean and Abbeel, Pieter and Mordatch, Igor},
  journal={Neural Information Processing Systems (NIPS)},
  year={2017}
}