License: MIT
Bayesian Reinforcement Learning in Tensorflow


Probabilistic Inference for Learning Control (PILCO)


A modern & clean implementation of the PILCO Algorithm in TensorFlow v2.

Unlike the original PILCO implementation, which was written as a self-contained MATLAB package, this repository aims to provide a clean implementation by making heavy use of modern machine learning libraries.

In particular, we use TensorFlow v2 to avoid the need for hardcoded gradients and to scale to GPU architectures. Moreover, we use GPflow v2 for Gaussian process regression.
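For intuition, the kind of Gaussian process regression that GPflow handles for the dynamics model can be sketched in plain NumPy. This is a minimal, noise-regularized posterior mean with an RBF kernel; all names below are illustrative and not part of this repository's or GPflow's API:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel: k(a, b) = v * exp(-(a - b)^2 / (2 * l^2))
    sq_dist = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-sq_dist / (2.0 * lengthscale ** 2))

def gp_posterior_mean(X, y, X_test, noise=1e-4):
    # Standard GP regression posterior mean: K_* (K + noise * I)^-1 y
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_star = rbf_kernel(X_test, X)
    return K_star @ np.linalg.solve(K, y)

# Fit a GP to a few samples of sin(x) and predict back at the training inputs.
X = np.linspace(0.0, np.pi, 8)
y = np.sin(X)
mean = gp_posterior_mean(X, y, X)
```

In PILCO itself the GP also propagates predictive *uncertainty* through rollouts, which is where GPflow's analytic kernels and TensorFlow's automatic differentiation do the heavy lifting.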

The core functionality is tested against the original MATLAB implementation.

Example of usage

Before using PILCO you have to install it by running:

git clone https://github.com/nrontsis/PILCO && cd PILCO
python setup.py develop

It is recommended to install everything in a fresh conda environment with python>=3.7.

The examples included in this repo use OpenAI Gym 0.15.3 and mujoco-py 2.0.2.7. These dependencies should be installed manually. Then, you can run one of the examples as follows:

python examples/inverted_pendulum.py
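The example scripts all follow the same high-level PILCO loop: interact with the environment to collect transitions, fit a probabilistic dynamics model to them, and optimize the policy against rollouts of the learned model. The toy sketch below illustrates that loop with a linear least-squares model standing in for the GP and a one-parameter linear policy; every name in it is illustrative and not part of this repository's API:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_dynamics(state, action):
    # Hypothetical 1-D system the agent interacts with.
    return 0.9 * state + 0.5 * action

def fit_model(states, actions, next_states):
    # Stand-in for GP regression: least-squares fit of next_state ~ [state, action].
    X = np.column_stack([states, actions])
    coef, *_ = np.linalg.lstsq(X, next_states, rcond=None)
    return coef

def rollout_cost(coef, gain, horizon=20):
    # Evaluate the linear policy u = -gain * x on the *learned* model.
    x, cost = 1.0, 0.0
    for _ in range(horizon):
        u = -gain * x
        x = coef[0] * x + coef[1] * u
        cost += x ** 2
    return cost

# 1. Collect data with random actions.
states = rng.normal(size=50)
actions = rng.normal(size=50)
next_states = true_dynamics(states, actions)

# 2. Fit the dynamics model to the observed transitions.
coef = fit_model(states, actions, next_states)

# 3. Optimize the policy parameter against model rollouts (grid search here;
#    PILCO uses gradient-based optimization through the GP predictions).
best_gain = min(np.linspace(0.0, 3.0, 61), key=lambda g: rollout_cost(coef, g))
```

In the real algorithm, steps 1–3 repeat: the optimized policy generates fresh data, the GP model is refit, and the policy is re-optimized, which is what makes PILCO so data-efficient.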

Example Extension: Safe PILCO

As an example of the extensibility of the framework, the folder safe_pilco_extension contains an extension of the standard PILCO algorithm that takes safety constraints (defined on the environment's state space) into account, as in https://arxiv.org/abs/1712.05556. The scripts safe_swimmer_run.py and safe_cars_run.py in the examples folder demonstrate the use of this extension.

Credits:

The following people have been involved in the development of this package:

References

See the following publications for a description of the algorithm: 1, 2, 3
