
omron-sinicx / ShinRL

Licence: other
ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives (Deep RL Workshop 2021)

Programming Languages

Jupyter Notebook
11667 projects
Python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to ShinRL

Trax
Trax — Deep Learning with Clear Code and Speed
Stars: ✭ 6,666 (+22120%)
Mutual labels:  jax
omd
JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation"
Stars: ✭ 43 (+43.33%)
Mutual labels:  jax
revisiting rainbow
Revisiting Rainbow
Stars: ✭ 71 (+136.67%)
Mutual labels:  jax
Foolbox
A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
Stars: ✭ 2,108 (+6926.67%)
Mutual labels:  jax
Transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+185706.67%)
Mutual labels:  jax
parallel-non-linear-gaussian-smoothers
Companion code in JAX for the paper Parallel Iterated Extended and Sigma-Point Kalman Smoothers.
Stars: ✭ 17 (-43.33%)
Mutual labels:  jax
Datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Stars: ✭ 3,094 (+10213.33%)
Mutual labels:  jax
uvadlc notebooks
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2022/Spring 2022
Stars: ✭ 901 (+2903.33%)
Mutual labels:  jax
graphsignal
Graphsignal Python agent
Stars: ✭ 158 (+426.67%)
Mutual labels:  jax
jax-models
Unofficial JAX implementations of deep learning research papers
Stars: ✭ 108 (+260%)
Mutual labels:  jax
Jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Stars: ✭ 15,579 (+51830%)
Mutual labels:  jax
Einops
Deep learning operations reinvented (for pytorch, tensorflow, jax and others)
Stars: ✭ 4,022 (+13306.67%)
Mutual labels:  jax
jax-rl
JAX implementations of core Deep RL algorithms
Stars: ✭ 61 (+103.33%)
Mutual labels:  jax
Flax
Flax is a neural network library for JAX that is designed for flexibility.
Stars: ✭ 2,447 (+8056.67%)
Mutual labels:  jax
mlp-gpt-jax
A GPT, made only of MLPs, in Jax
Stars: ✭ 53 (+76.67%)
Mutual labels:  jax
Pyprobml
Python code for "Machine learning: a probabilistic perspective" (2nd edition)
Stars: ✭ 4,197 (+13890%)
Mutual labels:  jax
ADAM
ADAM implements a collection of algorithms for calculating rigid-body dynamics in Jax, CasADi, PyTorch, and Numpy.
Stars: ✭ 51 (+70%)
Mutual labels:  jax
GPJax
A didactic Gaussian process package for researchers in Jax.
Stars: ✭ 159 (+430%)
Mutual labels:  jax
rA9
JAX-based Spiking Neural Network framework
Stars: ✭ 60 (+100%)
Mutual labels:  jax
jax-cfd
Computational Fluid Dynamics in JAX
Stars: ✭ 399 (+1230%)
Mutual labels:  jax

Status: Under development (expect bug fixes and major updates)

ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives

ShinRL is an open-source JAX library specialized for the evaluation of reinforcement learning (RL) algorithms from both theoretical and practical perspectives. Please take a look at the paper for details. Try ShinRL at experiments/QuickStart.ipynb.

QuickStart

import gym
from shinrl import DiscreteViSolver
import matplotlib.pyplot as plt

# make an env & a config
env = gym.make("ShinPendulum-v0")
config = DiscreteViSolver.DefaultConfig(explore="eps_greedy", approx="nn", steps_per_epoch=10000)

# make & run a solver
mixins = DiscreteViSolver.make_mixins(env, config)
dqn_solver = DiscreteViSolver.factory(env, config, mixins)
dqn_solver.run()

# plot performance
returns = dqn_solver.scalars["Return"]
plt.plot(returns["x"], returns["y"])

# plot learned q-values (action == 0)
q0 = dqn_solver.data["Q"][:, 0]
env.plot_S(q0, title="Learned")

(Example output of the QuickStart code.)

Key Modules

(Overview of ShinRL's key modules.)

🔬 ShinEnv for Oracle Analysis

  • ShinEnv provides small environments with oracle methods that can compute exact quantities.
  • Some environments support continuous action spaces and image observations, as the table below shows.
  • See the tutorial for details: experiments/Tutorials/ShinEnvTutorial.ipynb.

| Environment | Discrete action | Continuous action | Image observation | Tuple observation |
| ShinMaze | ✔️ | ❌ | ✔️ | ❌ |
| ShinMountainCar-v0 | ✔️ | ✔️ | ✔️ | ✔️ |
| ShinPendulum-v0 | ✔️ | ✔️ | ✔️ | ✔️ |
| ShinCartPole-v0 | ✔️ | ✔️ | ❌ | ✔️ |
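For example, the oracle interface lets you compare learned values against exact ones. The sketch below is an assumption-laden illustration: the method name calc_optimal_q is taken from the ShinEnv tutorial and should be checked there for the exact signature, while plot_S appears in the QuickStart above.

import gym
import shinrl  # assumption: importing shinrl registers the Shin* environments with gym

env = gym.make("ShinPendulum-v0")

# Exact optimal Q-values of the underlying MDP (assumed oracle method; see the tutorial).
q_opt = env.calc_optimal_q()

# plot_S visualizes a per-state quantity over the discretized state space,
# the same helper used in the QuickStart.
env.plot_S(q_opt[:, 0], title="Optimal (action == 0)")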

🏭 Flexible Solver by MixIn

  • A Solver solves an environment with a specified algorithm.
  • A "mixin" is a class that defines and implements a single feature. ShinRL's solvers are instantiated by combining several mixins (see the sketch below).
  • See the tutorial for details: experiments/Tutorials/SolverTutorial.ipynb.

(MixIn composition diagram.)
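As a concrete illustration of the mixin design, the same DiscreteViSolver can be assembled into tabular value iteration or into DQN purely by changing the config. This is a minimal sketch built from the QuickStart API; the approx and explore fields follow the algorithm table in the next section.

import gym
from shinrl import DiscreteViSolver

env = gym.make("ShinPendulum-v0")

# Tabular dynamic programming: oracle exploration, no function approximation.
vi_config = DiscreteViSolver.DefaultConfig(approx="tabular", explore="oracle")
vi_mixins = DiscreteViSolver.make_mixins(env, vi_config)
vi_solver = DiscreteViSolver.factory(env, vi_config, vi_mixins)

# DQN: epsilon-greedy exploration with a neural network. A different set of
# mixins is selected from the very same solver class.
dqn_config = DiscreteViSolver.DefaultConfig(approx="nn", explore="eps_greedy")
dqn_mixins = DiscreteViSolver.make_mixins(env, dqn_config)
dqn_solver = DiscreteViSolver.factory(env, dqn_config, dqn_mixins)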

Implemented Popular Algorithms

  • The table below lists the implemented popular algorithms.
  • Note that it does not list all the implemented algorithms (e.g., the DDP¹ version of the DQN algorithm). See the make_mixins functions of the solvers for the implemented variants.
  • Note that the implemented algorithms may differ from the original implementations for simplicity (e.g., Discrete SAC). See the solvers' source code for details.
| Algorithm | Solver | Configuration | Type¹ |
| Value Iteration (VI) | DiscreteViSolver | approx == "tabular" & explore == "oracle" | TDP |
| Policy Iteration (PI) | DiscretePiSolver | approx == "tabular" & explore == "oracle" | TDP |
| Conservative Value Iteration (CVI) | DiscreteViSolver | approx == "tabular" & explore == "oracle" & er_coef != 0 & kl_coef != 0 | TDP |
| Tabular Q Learning | DiscreteViSolver | approx == "tabular" & explore != "oracle" | TRL |
| SARSA | DiscretePiSolver | approx == "tabular" & explore != "oracle" & eps_decay_target_pol > 0 | TRL |
| Deep Q Network (DQN) | DiscreteViSolver | approx == "nn" & explore != "oracle" | DRL |
| Soft DQN | DiscreteViSolver | approx == "nn" & explore != "oracle" & er_coef != 0 | DRL |
| Munchausen-DQN | DiscreteViSolver | approx == "nn" & explore != "oracle" & er_coef != 0 & kl_coef != 0 | DRL |
| Double-DQN | DiscreteViSolver | approx == "nn" & explore != "oracle" & use_double_q == True | DRL |
| Discrete Soft Actor Critic | DiscretePiSolver | approx == "nn" & explore != "oracle" & er_coef != 0 | DRL |
| Deep Deterministic Policy Gradient (DDPG) | ContinuousDdpgSolver | approx == "nn" & explore != "oracle" | DRL |

¹ Algorithm Type:

  • TDP (approx == "tabular" & explore == "oracle"): Tabular Dynamic Programming algorithms. No exploration, no approximation, and the complete MDP specification is given.
  • TRL (approx == "tabular" & explore != "oracle"): Tabular Reinforcement Learning algorithms. No approximation, but the dynamics and reward functions are unknown.
  • DDP (approx == "nn" & explore == "oracle"): Deep Dynamic Programming algorithms. The same as TDP, except that neural networks approximate the computed values.
  • DRL (approx == "nn" & explore != "oracle"): Deep Reinforcement Learning algorithms. The same as TRL, except that neural networks approximate the computed values.
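To make the Configuration column concrete, here is how the Munchausen-DQN row could be instantiated. This is a sketch only: the coefficient values below are illustrative placeholders, not tuned recommendations.

import gym
from shinrl import DiscreteViSolver

env = gym.make("ShinPendulum-v0")

# Munchausen-DQN per the table: approx == "nn", explore != "oracle",
# er_coef != 0 and kl_coef != 0.
config = DiscreteViSolver.DefaultConfig(
    approx="nn",
    explore="eps_greedy",
    er_coef=0.01,  # entropy coefficient; illustrative value only
    kl_coef=0.01,  # KL coefficient; illustrative value only
)
mixins = DiscreteViSolver.make_mixins(env, config)
mdqn_solver = DiscreteViSolver.factory(env, config, mixins)
mdqn_solver.run()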

Installation

git clone git@github.com:omron-sinicx/ShinRL.git
cd ShinRL
pip install -e .
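A quick post-install sanity check (a sketch; it assumes that importing shinrl registers the Shin* environments with gym, as the QuickStart implies):

import gym
import shinrl  # assumed to register the Shin* environments on import

env = gym.make("ShinPendulum-v0")
env.reset()
print(env.action_space, env.observation_space)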

Test

cd ShinRL
make test

Format

cd ShinRL
make format

Docker

cd ShinRL
docker-compose up

Citation

# NeurIPS Deep RL Workshop 2021 version (pytorch branch)
@inproceedings{toshinori2021shinrl,
    author = {Kitamura, Toshinori and Yonetani, Ryo},
    title = {ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives},
    year = {2021},
    booktitle = {Proceedings of the NeurIPS Deep RL Workshop},
}

# Arxiv version (commit 2d3da)
@article{toshinori2021shinrlArxiv,
    author = {Kitamura, Toshinori and Yonetani, Ryo},
    title = {ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives},
    year = {2021},
    url = {https://arxiv.org/abs/2112.04123},
    journal={arXiv preprint arXiv:2112.04123},
}