evgenii-nikishin / omd

Licence: MIT license

JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation"

Optimal Model Design for Reinforcement Learning

This repository contains JAX code for the paper

Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation

by Evgenii Nikishin, Romina Abachi, Rishabh Agarwal, and Pierre-Luc Bacon.

Summary

Model-based reinforcement learning typically trains the dynamics and reward functions by minimizing prediction error. This error is only a proxy for maximizing the sum of rewards, the agent's ultimate goal, which leads to an objective mismatch. We propose an end-to-end algorithm called Optimal Model Design (OMD) that learns the model by optimizing the returns directly. OMD leverages the implicit function theorem to optimize the model parameters and forms the following computational graph:

[Figure: OMD computational graph]

Please cite our work if you find it useful in your research:

@article{nikishin2021control,
  title={Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation},
  author={Nikishin, Evgenii and Abachi, Romina and Agarwal, Rishabh and Bacon, Pierre-Luc},
  journal={arXiv preprint arXiv:2106.03273},
  year={2021}
}
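
As a rough illustration of the implicit differentiation idea mentioned in the Summary, the sketch below differentiates an outer loss through the solution of a fixed-point equation w* = f(theta, w*) using the implicit function theorem, dw*/dtheta = (I - df/dw)^{-1} df/dtheta, and compares the result with backpropagation through the unrolled iterations. This is not code from the repository: the operator f, the outer loss, and all function names are hypothetical stand-ins chosen only to keep the example small.

```python
import jax
import jax.numpy as jnp


def f(theta, w):
    # Hypothetical contraction in w (slope at most 0.5), standing in for a
    # Bellman-style operator parameterized by model parameters theta.
    return 0.5 * jnp.tanh(w) + theta


def solve_fixed_point(theta, w0, num_iters=100):
    # Plain fixed-point iteration w <- f(theta, w).
    w = w0
    for _ in range(num_iters):
        w = f(theta, w)
    return w


def outer_loss(w):
    # Hypothetical outer objective evaluated at the fixed point.
    return jnp.sum(w ** 2)


def implicit_grad(theta, w0):
    # Gradient of outer_loss(w*(theta)) obtained from the implicit function
    # theorem, without differentiating through the solver iterations.
    w_star = solve_fixed_point(theta, w0)
    df_dw = jax.jacobian(f, argnums=1)(theta, w_star)
    df_dtheta = jax.jacobian(f, argnums=0)(theta, w_star)
    # dw*/dtheta = (I - df/dw)^{-1} df/dtheta
    dw_dtheta = jnp.linalg.solve(jnp.eye(w_star.size) - df_dw, df_dtheta)
    dloss_dw = jax.grad(outer_loss)(w_star)
    return dloss_dw @ dw_dtheta


theta = jnp.array([0.3, -0.2])
w0 = jnp.zeros(2)

g_implicit = implicit_grad(theta, w0)
# Reference: differentiate through all unrolled solver iterations.
g_unrolled = jax.grad(lambda th: outer_loss(solve_fixed_point(th, w0)))(theta)

print("implicit:", g_implicit)
print("unrolled:", g_unrolled)
```

The two gradients should agree up to numerical tolerance. The implicit version needs only the fixed point itself, not the history of iterations that produced it, which is the property that makes this style of differentiation attractive when the inner problem is solved by many optimization steps.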

Installation

We assume that you use Python 3. To install the necessary dependencies, run the following commands:

virtualenv ~/env_omd
source ~/env_omd/bin/activate
pip install -r requirements.txt

To use JAX with a GPU, follow the official JAX installation instructions. To install MuJoCo, check its installation instructions.
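
After installation, a quick way to see which backend JAX picked up is to query it directly. The snippet below is a minimal check, not part of the repository; it only uses jax.default_backend() and jax.devices().

```python
import jax

# Name of the backend JAX selected, for example "cpu" or "gpu".
backend = jax.default_backend()
# List of devices visible to JAX on that backend.
devices = jax.devices()

print(f"Default backend: {backend}")
print(f"Devices: {devices}")
if backend == "cpu":
    print("JAX is running on CPU; a GPU-enabled jaxlib may not be installed.")
```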

Run

For historical reasons, the code is divided into three parts: tabular, CartPole, and MuJoCo.

Tabular

All results for the tabular experiments can be reproduced by running the tabular.ipynb notebook.

To open the notebook in Google Colab, use this link.

CartPole

To train the OMD agent on CartPole, use the following commands:

cd cartpole
python train.py --agent_type omd

We also provide implementations of the corresponding MLE and VEP baselines. To train them, change the --agent_type flag to mle or vep.

MuJoCo

To train the OMD agent on MuJoCo HalfCheetah-v2, use the following commands:

cd mujoco
python train.py --config.algo=omd

To train the MLE baseline, change the --config.algo flag to mle.
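
The CartPole and MuJoCo commands above can be scripted so that every agent is launched in turn. The sketch below is one possible way to do that, assuming it is run from the repository root; the flag spellings are copied from the commands above, while build_commands, run_all, and the dry_run switch are hypothetical helpers that do not exist in the repository.

```python
import subprocess

# Values taken from the commands documented above.
CARTPOLE_AGENTS = ("omd", "mle", "vep")
MUJOCO_ALGOS = ("omd", "mle")


def build_commands():
    """Return (working_directory, argv) pairs, one per training run."""
    commands = []
    for agent in CARTPOLE_AGENTS:
        commands.append(("cartpole", ["python", "train.py", "--agent_type", agent]))
    for algo in MUJOCO_ALGOS:
        commands.append(("mujoco", ["python", "train.py", f"--config.algo={algo}"]))
    return commands


def run_all(dry_run=True):
    # Print each command; execute it only when dry_run is False.
    for cwd, argv in build_commands():
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

With dry_run=True the script only prints the five commands, which makes it easy to check them before starting long training runs.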

Acknowledgements

Tabular experiments are based on the code from the library for fixed points in JAX
Code for MuJoCo is based on the implementation of SAC in JAX
Code for CartPole reuses parts of the SAC implementation in PyTorch
For experimentation, we used a modification of the slurm runner
