
younggyoseo / pytorch-nfsp

Licence: other
Implementation of Deep Reinforcement Learning from Self-Play in Imperfect-Information Games (Heinrich and Silver, 2016)

pytorch-nfsp

An implementation of DeepMind's Deep Reinforcement Learning from Self-Play in Imperfect-Information Games (Heinrich and Silver, 2016), trained on LaserTag-v0. The paper introduces Neural Fictitious Self-Play (NFSP), a deep-learning version of the Fictitious Self-Play (FSP) method from Fictitious Self-Play in Extensive-Form Games (Heinrich et al., 2015).
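To make the idea concrete, here is a minimal sketch of two NFSP ingredients described in the paper: a reservoir buffer that stores the agent's own best-response behaviour for supervised learning, and the anticipatory parameter eta that decides whether the agent follows its best-response policy or its average policy. This is not the code from this repository. The names ReservoirBuffer, pick_policy and act are hypothetical, the two policies are passed in as plain callables instead of neural networks, and eta = 0.1 is only an example value.

```python
import random


class ReservoirBuffer:
    """Fixed-capacity buffer holding a uniform sample of all items ever added."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of items offered so far
        self.rng = rng or random.Random()

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Reservoir sampling: the new item replaces a stored one
            # with probability capacity / seen.
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = item


def pick_policy(eta, rng):
    """Return True to follow the best response, False for the average policy."""
    return rng.random() < eta


def act(state, use_best_response, best_response, average_policy, sl_buffer):
    """Select an action; only best-response behaviour is stored for supervised learning."""
    if use_best_response:
        action = best_response(state)
        sl_buffer.add((state, action))
        return action
    return average_policy(state)


# Usage with stand-in policies (hypothetical; real agents would use networks).
rng = random.Random(0)
sl_buffer = ReservoirBuffer(capacity=3, rng=rng)
use_br = pick_policy(eta=0.1, rng=rng)
action = act("s0", use_br, lambda s: 1, lambda s: 0, sl_buffer)
```

In the paper the choice between the two policies is sampled at the start of an episode and kept for its duration; the sketch leaves that scheduling to the caller.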

Requirements

pytorch 0.4
gym
lasertag

Examples

python main.py --env 'LaserTag-small4-v0' to train agents.

python main.py --env 'LaserTag-small2-v0' --render to watch the rendered game on screen.

python main.py --env 'LaserTag-small2-v0' --render --evaluate to evaluate (or simply watch) a model that has already been trained. The repository includes models/LaserTag-small*-v0-dqn-model.pth, so you can see how trained agents play against each other.

For more details, see arguments.py.
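As a rough illustration of how the three flags used above could be declared with the standard argparse module, consider the sketch below. It is not the content of arguments.py, which defines the real options. The function name build_parser, the help strings and the default environment are assumptions made for this example.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the flags shown in the examples above."""
    parser = argparse.ArgumentParser(description="NFSP on LaserTag (illustrative sketch)")
    # Default value is an assumption; the real default lives in arguments.py.
    parser.add_argument("--env", type=str, default="LaserTag-small2-v0",
                        help="Gym environment id")
    parser.add_argument("--render", action="store_true",
                        help="draw the game on screen")
    parser.add_argument("--evaluate", action="store_true",
                        help="run a trained model instead of training")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["--env", "LaserTag-small2-v0", "--render", "--evaluate"]
)
```

Boolean flags declared with action="store_true" default to False, which matches the examples: training is the default mode, and rendering and evaluation are opt-in.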

LaserTag-small2-v0

small2.gif

LaserTag-small3-v0

small3.gif

LaserTag-small4-v0

small4.gif

Agents are trained with NFSP. An agent receives a unit reward each time it hits another agent with its laser beam. An agent that is hit twice is sent to a random respawn location. An episode consists of 1000 frames.
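The reward and respawn rules above can be sketched as a small bookkeeping class. This is only an illustration of the rules as stated, not the lasertag environment's code. The class name TagTracker, the helper episode_done and the respawn coordinates are hypothetical, and resetting the hit counter after a respawn is an assumption.

```python
import random

EPISODE_FRAMES = 1000  # episode length stated above
HITS_TO_RESPAWN = 2    # an agent respawns after being hit twice


class TagTracker:
    """Tracks rewards and hits for a fixed number of agents."""

    def __init__(self, n_agents, respawn_points, rng=None):
        self.rewards = [0.0] * n_agents
        self.hits_taken = [0] * n_agents
        self.respawn_points = list(respawn_points)
        self.rng = rng or random.Random()

    def register_hit(self, shooter, target):
        """Apply one laser hit; return a respawn point for the target, or None."""
        self.rewards[shooter] += 1.0  # unit reward for the shooter
        self.hits_taken[target] += 1
        if self.hits_taken[target] >= HITS_TO_RESPAWN:
            self.hits_taken[target] = 0  # assumption: counter resets on respawn
            return self.rng.choice(self.respawn_points)
        return None


def episode_done(frame):
    """True once the frame counter reaches the episode length."""
    return frame >= EPISODE_FRAMES


# Usage: agent 0 hits agent 1 twice, so agent 1 is respawned on the second hit.
tracker = TagTracker(2, [(0, 0), (3, 3)], rng=random.Random(0))
first = tracker.register_hit(shooter=0, target=1)
second = tracker.register_hit(shooter=0, target=1)
```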
