License: MIT
Bayesian Reinforcement Learning in Tensorflow


Probabilistic Inference for Learning Control (PILCO)


A modern & clean implementation of the PILCO Algorithm in TensorFlow v2.

Unlike the original PILCO implementation, which was written as a self-contained MATLAB package, this repository aims to provide a clean implementation by making heavy use of modern machine learning libraries.

In particular, we use TensorFlow v2 to avoid the need for hardcoded gradients and to scale to GPU architectures. Moreover, we use GPflow v2 for Gaussian process regression.
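For intuition, the kind of Gaussian process regression that GPflow handles for the dynamics model can be sketched in plain NumPy. This is a minimal, noise-regularized posterior mean with an RBF kernel; all names below are illustrative and not part of this repository's or GPflow's API:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel: k(a, b) = v * exp(-(a - b)^2 / (2 * l^2))
    sq_dist = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-sq_dist / (2.0 * lengthscale ** 2))

def gp_posterior_mean(X, y, X_test, noise=1e-4):
    # Standard GP regression posterior mean: K_* (K + noise * I)^-1 y
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_star = rbf_kernel(X_test, X)
    return K_star @ np.linalg.solve(K, y)

# Fit a GP to a few samples of sin(x) and predict back at the training inputs.
X = np.linspace(0.0, np.pi, 8)
y = np.sin(X)
mean = gp_posterior_mean(X, y, X)
```

In PILCO itself the GP also propagates predictive *uncertainty* through rollouts, which is where GPflow's analytic kernels and TensorFlow's automatic differentiation do the heavy lifting.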

The core functionality is tested against the original MATLAB implementation.

Example of usage

Before using PILCO you have to install it by running:

git clone https://github.com/nrontsis/PILCO && cd PILCO
python setup.py develop

It is recommended to install everything in a fresh conda environment with python>=3.7.

The examples included in this repo use OpenAI Gym 0.15.3 and mujoco-py 2.0.2.7. These dependencies should be installed manually. Then, you can run one of the examples as follows:

python examples/inverted_pendulum.py
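The example scripts all follow the same high-level PILCO loop: interact with the environment to collect transitions, fit a probabilistic dynamics model to them, and optimize the policy against rollouts of the learned model. The toy sketch below illustrates that loop with a linear least-squares model standing in for the GP and a one-parameter linear policy; every name in it is illustrative and not part of this repository's API:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_dynamics(state, action):
    # Hypothetical 1-D system the agent interacts with.
    return 0.9 * state + 0.5 * action

def fit_model(states, actions, next_states):
    # Stand-in for GP regression: least-squares fit of next_state ~ [state, action].
    X = np.column_stack([states, actions])
    coef, *_ = np.linalg.lstsq(X, next_states, rcond=None)
    return coef

def rollout_cost(coef, gain, horizon=20):
    # Evaluate the linear policy u = -gain * x on the *learned* model.
    x, cost = 1.0, 0.0
    for _ in range(horizon):
        u = -gain * x
        x = coef[0] * x + coef[1] * u
        cost += x ** 2
    return cost

# 1. Collect data with random actions.
states = rng.normal(size=50)
actions = rng.normal(size=50)
next_states = true_dynamics(states, actions)

# 2. Fit the dynamics model to the observed transitions.
coef = fit_model(states, actions, next_states)

# 3. Optimize the policy parameter against model rollouts (grid search here;
#    PILCO uses gradient-based optimization through the GP predictions).
best_gain = min(np.linspace(0.0, 3.0, 61), key=lambda g: rollout_cost(coef, g))
```

In the real algorithm, steps 1–3 repeat: the optimized policy generates fresh data, the GP model is refit, and the policy is re-optimized, which is what makes PILCO so data-efficient.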

Example Extension: Safe PILCO

As an example of the extensibility of the framework, the folder safe_pilco_extension contains an extension of the standard PILCO algorithm that takes safety constraints (defined on the environment's state space) into account, as in https://arxiv.org/abs/1712.05556. The scripts safe_swimmer_run.py and safe_cars_run.py in the examples folder demonstrate the use of this extension.

Credits:

The following people have been involved in the development of this package:

References

See the following publications for a description of the algorithm: 1, 2, 3
