
huangwl18 / modular-rl

License: other
[ICML 2020] PyTorch Code for "One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control"

Projects that are alternatives to or similar to modular-rl

Magnet
MAGNet: Multi-agent control using Graph Neural Networks
Stars: ✭ 88 (-30.16%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Rlai Exercises
Exercise Solutions for Reinforcement Learning: An Introduction [2nd Edition]
Stars: ✭ 97 (-23.02%)
Mutual labels:  jupyter-notebook, reinforcement-learning
60 days rl challenge
Chinese version of 60_Days_RL_Challenge
Stars: ✭ 92 (-26.98%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Reinforcement Learning
Reinforcement learning material, code and exercises for Udacity Nanodegree programs.
Stars: ✭ 77 (-38.89%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Reinforcementlearning Atarigame
PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe, using Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which is far more efficient than DQN and supersedes it. Can play many games.
Stars: ✭ 118 (-6.35%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Mathy
Tools for using computer algebra systems to solve math problems step-by-step with reinforcement learning
Stars: ✭ 79 (-37.3%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Ngsim env
Learning human driver models from NGSIM data with imitation learning.
Stars: ✭ 96 (-23.81%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Notebooks
Some notebooks
Stars: ✭ 53 (-57.94%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Coursera reinforcement learning
Coursera Reinforcement Learning Specialization by University of Alberta & Alberta Machine Intelligence Institute
Stars: ✭ 114 (-9.52%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Ctc Executioner
Master Thesis: Limit order placement with Reinforcement Learning
Stars: ✭ 112 (-11.11%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Rl Course Experiments
Stars: ✭ 73 (-42.06%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Advanced Deep Learning And Reinforcement Learning Deepmind
🎮 Advanced Deep Learning and Reinforcement Learning at UCL & DeepMind | YouTube videos 👉
Stars: ✭ 121 (-3.97%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Rl Workshop
Reinforcement Learning Workshop for Data Science BKK
Stars: ✭ 73 (-42.06%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Tensorflow Tutorials
TensorFlow Tutorials with YouTube Videos
Stars: ✭ 8,919 (+6978.57%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Reinforcement Learning
Implementation of Reinforcement Learning algorithms in Python, based on Sutton & Barto's book (2nd edition)
Stars: ✭ 55 (-56.35%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Rl Movie Recommender
The purpose of our research is to study reinforcement learning approaches to building a movie recommender system. We formulate the problem of interactive recommendation as a contextual multi-armed bandit.
Stars: ✭ 93 (-26.19%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Machine Learning From Scratch
Succinct Machine Learning algorithm implementations from scratch in Python, solving real-world problems (Notebooks and Book). Examples of Logistic Regression, Linear Regression, Decision Trees, K-means clustering, Sentiment Analysis, Recommender Systems, Neural Networks and Reinforcement Learning.
Stars: ✭ 42 (-66.67%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Policy Gradient Methods
Implementation of Algorithms from the Policy Gradient Family. Currently includes: A2C, A3C, DDPG, TD3, SAC
Stars: ✭ 54 (-57.14%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Tensorflow2.0 Examples
🙄 Difficult algorithm, Simple code.
Stars: ✭ 1,397 (+1008.73%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Pytorch Rl
Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]
Stars: ✭ 121 (-3.97%)
Mutual labels:  jupyter-notebook, reinforcement-learning

One Policy to Control Them All:
Shared Modular Policies for Agent-Agnostic Control

ICML 2020

[Project Page] [Paper] [Demo Video] [Long Oral Talk]

Wenlong Huang¹, Igor Mordatch², Deepak Pathak³ ⁴

¹University of California, Berkeley, ²Google Brain, ³Facebook AI Research, ⁴Carnegie Mellon University

This is a PyTorch-based implementation of our Shared Modular Policies. We take a step beyond the laborious training process of the conventional single-agent RL policy by tackling the possibility of learning general-purpose controllers for diverse robotic systems. Our approach trains a single policy for a wide variety of agents which can then generalize to unseen agent shapes at test-time without any further training.
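
For intuition, here is a toy sketch of the idea in PyTorch (this is not the repository's implementation; the module sizes, names, and single-direction message pass below are simplifying assumptions): one small network is shared by every limb of every agent, and limbs coordinate only through learned messages, so the same parameters can control morphologies with different numbers of limbs.

import torch
import torch.nn as nn

class SharedLimbPolicy(nn.Module):
    """One module reused for every limb, so the parameter count is independent of morphology."""
    def __init__(self, limb_obs_dim=11, msg_dim=32, action_dim=1):
        super().__init__()
        self.action_dim = action_dim
        self.net = nn.Sequential(
            nn.Linear(limb_obs_dim + msg_dim, 64),
            nn.Tanh(),
            nn.Linear(64, action_dim + msg_dim),
        )

    def forward(self, limb_obs, msg_in):
        out = self.net(torch.cat([limb_obs, msg_in], dim=-1))
        action = torch.tanh(out[..., :self.action_dim])    # torque for this limb
        msg_out = out[..., self.action_dim:]                # message for the next limb
        return action, msg_out

policy = SharedLimbPolicy()
msg = torch.zeros(32)                    # the root limb starts with an empty message
for limb_obs in torch.randn(5, 11):      # e.g. an agent with 5 limbs
    action, msg = policy(limb_obs, msg)  # the same weights act on every limb

In the actual code, messages are passed along the agent's kinematic tree (bottom-up with --bu, top-down with --td, or both), and the shared modules are trained with TD3.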

If you find this work useful in your research, please cite using the following BibTeX:

@inproceedings{huang2020smp,
  Author = {Huang, Wenlong and Mordatch, Igor and Pathak, Deepak},
  Title = {One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control},
  Booktitle = {ICML},
  Year = {2020}
}

Setup

Requirements

  • Python-3.6
  • PyTorch-1.1.0
  • CUDA-9.0
  • CUDNN-7.6
  • MuJoCo-200: download the binaries, put the license file inside, and add the path to .bashrc (one common layout is sketched below)
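
One common way to lay out MuJoCo 200 on Linux (the paths below follow the conventional mujoco-py layout and are not mandated by this repository; adjust them to your machine):

mkdir -p ~/.mujoco
unzip mujoco200_linux.zip -d ~/.mujoco               # binaries downloaded from the MuJoCo site
mv ~/.mujoco/mujoco200_linux ~/.mujoco/mujoco200
cp mjkey.txt ~/.mujoco/                              # MuJoCo license key file
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco200/bin' >> ~/.bashrc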

Setting up repository

git clone https://github.com/huangwl18/modular-rl.git
cd modular-rl/
python3.6 -m venv mrEnv
source $PWD/mrEnv/bin/activate

Installing Dependencies

pip install --upgrade pip
pip install -r requirements.txt
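
If MuJoCo and the license key are in place, the Python bindings should import cleanly (assuming mujoco-py is among the pinned dependencies in requirements.txt):

python -c "import mujoco_py"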

Running Code

Flags and Parameters Description
--morphologies <List of STRING> Find existing environments whose names match each keyword and train on all of them (e.g. walker, hopper, humanoid, and cheetah; see examples below)
--custom_xml <PATH> Path to a custom XML file for training the modular policy.
When <PATH> is a file, train on that XML morphology only.
When <PATH> is a directory, train on all XML morphologies found in the directory.
--td Enable top-down message passing (pass --td --bu for both-way message passing)
--bu Enable bottom-up message passing (pass --td --bu for both-way message passing)
--expID <INT> Experiment ID used to create the saving directory
--seed <INT> (Optional) Seed for Gym, PyTorch, and NumPy

Train with existing environment

  • Train both-way SMP on Walker++ (12 variants of walker):
python main.py --expID 001 --td --bu --morphologies walker
  • Train both-way SMP on Humanoid++ (8 variants of 2d humanoid):
python main.py --expID 002 --td --bu --morphologies humanoid
  • Train both-way SMP on Cheetah++ (15 variants of cheetah):
python main.py --expID 003 --td --bu --morphologies cheetah
  • Train both-way SMP on Hopper++ (3 variants of hopper):
python main.py --expID 004 --td --bu --morphologies hopper
  • To train both-way SMP for only one environment (e.g. walker_7_main), specify the full name of the environment without the .xml suffix:
python main.py --expID 005 --td --bu --morphologies walker_7_main

To run with one-way message passing, disable --td for bottom-up-only message passing or disable --bu for top-down-only message passing. To run without any message passing, disable both --td and --bu.
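
For example, following the same command pattern as above (the experiment IDs here are arbitrary placeholders):

python main.py --expID 008 --bu --morphologies walker    # bottom-up-only message passing
python main.py --expID 009 --td --morphologies walker    # top-down-only message passing
python main.py --expID 010 --morphologies walker         # no message passing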

Train with custom environment

  • Train both-way SMP for only one environment:
python main.py --expID 006 --td --bu --custom_xml <PATH_TO_XML_FILE>
  • Train both-way SMP for multiple environments (xml files must be in the same directory):
python main.py --expID 007 --td --bu --custom_xml <PATH_TO_XML_DIR>

Note that the current implementation assumes all custom MuJoCo agents are 2D planar and contain only one body tag with name torso attached to worldbody.

Visualization

  • To visualize all walker environments with the both-way SMP model from experiment expID 001:
python visualize.py --expID 001 --td --bu --morphologies walker
  • To visualize only walker_7_main environment with the both-way SMP model from experiment expID 001:
python visualize.py --expID 001 --td --bu --morphologies walker_7_main

Provided Environments

Walker

  • walker_2_main
  • walker_3_main
  • walker_4_main
  • walker_5_main
  • walker_6_main
  • walker_7_main
  • walker_2_flipped
  • walker_3_flipped
  • walker_4_flipped
  • walker_5_flipped
  • walker_6_flipped
  • walker_7_flipped

2D Humanoid

  • humanoid_2d_7_left_arm
  • humanoid_2d_7_left_leg
  • humanoid_2d_7_lower_arms
  • humanoid_2d_7_right_arm
  • humanoid_2d_7_right_leg
  • humanoid_2d_8_left_knee
  • humanoid_2d_8_right_knee
  • humanoid_2d_9_full

Cheetah

  • cheetah_2_back
  • cheetah_2_front
  • cheetah_3_back
  • cheetah_3_balanced
  • cheetah_3_front
  • cheetah_4_allback
  • cheetah_4_allfront
  • cheetah_4_back
  • cheetah_4_front
  • cheetah_5_back
  • cheetah_5_balanced
  • cheetah_5_front
  • cheetah_6_back
  • cheetah_6_front
  • cheetah_7_full

Hopper

  • hopper_3
  • hopper_4
  • hopper_5

Note that each walker agent has an identical instance of itself called flipped, for which SMP always flips the torso message passed to the two legs (e.g. the message that is passed to the left leg in the main instance is passed to the right leg in the flipped instance).

Acknowledgement

The TD3 code is based on this open-source implementation. The code for Dynamic Graph Neural Networks is adapted from Modular Assemblies (Pathak*, Lu* et al., NeurIPS 2019).
