cycraig / MP-DQN

Licence: MIT license
Source code for the dissertation: "Multi-Pass Deep Q-Networks for Reinforcement Learning with Parameterised Action Spaces"

Multi-Pass Deep Q-Networks

This repository includes several reinforcement learning algorithms for parameterised action space MDPs:

  1. P-DQN [Xiong et al. 2018]

  2. PA-DDPG [Hausknecht & Stone 2016]

  3. Q-PAMDP [Masson et al. 2016]

Multi-Pass Deep Q-Networks (MP-DQN) fixes the over-parameterisation problem of P-DQN by splitting the action-parameter inputs to the Q-network across several passes, which are run together as one parallel batch. Split Deep Q-Networks (SP-DQN) is a much slower alternative that uses multiple Q-networks, with or without shared feature-extraction layers. A weighted-indexed action-parameter loss function is also provided for P-DQN.
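
The multi-pass idea can be illustrated with a minimal sketch in plain Python. It assumes that each pass keeps the action-parameters of one discrete action and zeroes the rest, and that the Q-value of action k is read from pass k. The function names and the toy network below are hypothetical and are not taken from this repository, which works on batched PyTorch tensors.

```python
def build_multipass_inputs(state, action_params, param_sizes):
    """Return one network input per discrete action.

    Pass k keeps only the parameters of action k and zeroes the others,
    so Q(s, k, x_k) cannot depend on the parameters of other actions.
    """
    inputs = []
    offset = 0
    for size in param_sizes:
        masked = [0.0] * len(action_params)
        masked[offset:offset + size] = action_params[offset:offset + size]
        inputs.append(list(state) + masked)
        offset += size
    return inputs


def multipass_q_values(q_network, state, action_params, param_sizes):
    """Run one pass per action and keep output k of pass k (the diagonal)."""
    batch = build_multipass_inputs(state, action_params, param_sizes)
    # Evaluated one at a time here; a tensor implementation would
    # evaluate the whole batch in a single forward pass.
    outputs = [q_network(x) for x in batch]
    return [outputs[k][k] for k in range(len(param_sizes))]


def toy_q_network(x):
    """Hypothetical stand-in for a Q-network with two discrete actions.

    Every output depends on the whole input, so without masking each
    Q-value would be affected by every action-parameter.
    """
    return [sum(x), 2.0 * sum(x)]


state = [1.0]
action_params = [0.5, 0.25, 0.25]  # action 0 has 1 parameter, action 1 has 2
param_sizes = [1, 2]
q_values = multipass_q_values(toy_q_network, state, action_params, param_sizes)
print(q_values)
```

Changing the parameters of action 1 in this sketch leaves the Q-value of action 0 unchanged, which is the dependency that the single-pass P-DQN input does not rule out.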

Dependencies

  • Python 3.5+ (tested with 3.5 and 3.6)
  • pytorch 0.4.1 (1.0+ should work but will be slower)
  • gym 0.10.5
  • numpy
  • click

Domains

Experiment scripts are provided to run each algorithm on the following domains with parameterised actions:

  • Platform (gym-platform)
  • Robot Soccer Goal (gym-goal)
  • Half Field Offense (gym-soccer)

The simplest installation method for the above OpenAI Gym environments is as follows:

pip install -e git+https://github.com/cycraig/gym-platform#egg=gym_platform
pip install -e git+https://github.com/cycraig/gym-goal#egg=gym_goal
pip install -e git+https://github.com/cycraig/gym-soccer#egg=gym_soccer 

If something goes wrong, follow the installation instructions given by the repositories above. Note that gym-soccer has been updated for a later gym version, and its reward function has been changed to match the one used in the code by Hausknecht & Stone [2016] (https://github.com/mhauskn/dqn-hfo). Use the version linked above rather than the one in the OpenAI repository.

Example Usage

Each run file has default flags in place; see the run_domain_algorithm.py files for more information. The click flags are configured to make it easier to run experiments and hyper-parameter searches in batches. This suits scripted runs but makes the commands more tedious to type by hand.
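
As a sketch of what a batched search might look like, the snippet below builds one command line per combination of flag values without running anything. The helper build_commands is hypothetical; the script name and the flags are the ones that appear in the examples in this README, and which values are worth sweeping is an assumption made for illustration.

```python
import itertools
import shlex


def build_commands(script, grid):
    """Return one command line per combination of flag values in `grid`."""
    flags = sorted(grid)
    commands = []
    for values in itertools.product(*(grid[flag] for flag in flags)):
        args = ["python", script]
        for flag, value in zip(flags, values):
            args += [flag, str(value)]
        # Quote each argument so the line is safe to paste into a shell.
        commands.append(" ".join(shlex.quote(arg) for arg in args))
    return commands


# Illustrative grid only; the flag names come from the usage examples.
grid = {
    "--multipass": [True],
    "--weighted": [True, False],
    "--indexed": [True, False],
}
commands = build_commands("run_soccer_pdqn.py", grid)
for command in commands:
    print(command)
```

The printed lines can be written to a shell script or handed to a job scheduler, so each configuration runs as a separate process.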

To run vanilla P-DQN on the Platform domain with default flags:

python run_platform_pdqn.py 

SP-DQN on the Robot Soccer Goal domain, rendering each episode:

python run_goal_pdqn.py --split True --visualise True --render-freq 1

MP-DQN on Half Field Offense with four hidden layers (note no spaces) and the weighted-indexed loss function:

python run_soccer_pdqn.py --multipass True --layers [1024,512,256,128] --weighted True --indexed True
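
The "no spaces" note follows from how a shell splits a command line into arguments. The sketch below shows this with the standard library. It assumes the value of --layers is parsed as a Python list literal, which is an assumption about the run scripts and not something confirmed here; parse_layers is a hypothetical helper.

```python
import ast
import shlex


def parse_layers(value):
    """Parse a list literal such as "[1024,512]" into a list of ints."""
    layers = ast.literal_eval(value)
    if not isinstance(layers, list) or not all(isinstance(n, int) for n in layers):
        raise ValueError("expected a list of integers, got %r" % (value,))
    return layers


# Without spaces the whole list arrives as a single argument.
no_spaces = shlex.split("--layers [1024,512,256,128]")
print(no_spaces)
print(parse_layers(no_spaces[1]))

# With spaces the shell splits the list into several arguments,
# and none of them is a complete list literal on its own.
with_spaces = shlex.split("--layers [1024, 512, 256, 128]")
print(with_spaces)
```

Quoting the value (for example '[1024, 512, 256, 128]') would also keep it as a single argument in most shells.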

Citing

If this repository has helped your research, please cite the following:

@article{bester2019mpdqn,
	author    = {Bester, Craig J. and James, Steven D. and Konidaris, George D.},
	title     = {Multi-Pass {Q}-Networks for Deep Reinforcement Learning with Parameterised Action Spaces},
	journal   = {arXiv preprint arXiv:1905.04388},
	year      = {2019},
	archivePrefix = {arXiv},
	eprinttype    = {arxiv},
	eprint    = {1905.04388},
	url       = {http://arxiv.org/abs/1905.04388},
}