
MrSyee / Pg Is All You Need

License: MIT
Policy Gradient is all you need! A step-by-step tutorial for well-known PG methods.

Projects that are alternatives of or similar to Pg Is All You Need

Handson Unsupervised Learning
Code for Hands-on Unsupervised Learning Using Python (O'Reilly Media)
Stars: ✭ 369 (-0.81%)
Mutual labels:  jupyter-notebook
Interesting Python
Some interesting Python crawlers and data analysis projects
Stars: ✭ 3,927 (+955.65%)
Mutual labels:  jupyter-notebook
Iclr2019 Openreviewdata
Script that crawls meta data from ICLR OpenReview webpage. Tutorials on installing and using Selenium and ChromeDriver on Ubuntu.
Stars: ✭ 376 (+1.08%)
Mutual labels:  jupyter-notebook
Tf 2.0 Hacks
Contains my explorations of TensorFlow 2.x
Stars: ✭ 369 (-0.81%)
Mutual labels:  jupyter-notebook
Matlab kernel
Jupyter Kernel for Matlab
Stars: ✭ 373 (+0.27%)
Mutual labels:  jupyter-notebook
Deep Learning
Repo for the Deep Learning Nanodegree Foundations program.
Stars: ✭ 3,782 (+916.67%)
Mutual labels:  jupyter-notebook
D2l Pytorch
This project reproduces the book Dive Into Deep Learning (https://d2l.ai/), adapting the code from MXNet into PyTorch.
Stars: ✭ 3,810 (+924.19%)
Mutual labels:  jupyter-notebook
Kind Pytorch Tutorial
Kind PyTorch Tutorial for beginners
Stars: ✭ 377 (+1.34%)
Mutual labels:  jupyter-notebook
Data Science
Collection of useful data science topics along with code and articles
Stars: ✭ 315 (-15.32%)
Mutual labels:  jupyter-notebook
Python Machine Learning Second Edition
Python Machine Learning - Second Edition, published by Packt
Stars: ✭ 376 (+1.08%)
Mutual labels:  jupyter-notebook
Causal Inference Tutorial
Repository with code and slides for a tutorial on causal inference.
Stars: ✭ 368 (-1.08%)
Mutual labels:  jupyter-notebook
Deep Learning Illustrated
Deep Learning Illustrated (2020)
Stars: ✭ 372 (+0%)
Mutual labels:  jupyter-notebook
Data Analysis
Data Science Using Python
Stars: ✭ 4,080 (+996.77%)
Mutual labels:  jupyter-notebook
Covid19pt Data
😷️🇵🇹 Data on the COVID-19 pandemic in Portugal
Stars: ✭ 362 (-2.69%)
Mutual labels:  jupyter-notebook
Nlp Python Deep Learning
NLP in Python with Deep Learning
Stars: ✭ 374 (+0.54%)
Mutual labels:  jupyter-notebook
Fbrs interactive segmentation
[CVPR2020] f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation https://arxiv.org/abs/2001.10331
Stars: ✭ 366 (-1.61%)
Mutual labels:  jupyter-notebook
Learn Python3
Jupyter notebooks for teaching/learning Python 3
Stars: ✭ 4,418 (+1087.63%)
Mutual labels:  jupyter-notebook
Deep Learning Nano Foundation
Udacity's Deep Learning Nano Foundation program.
Stars: ✭ 377 (+1.34%)
Mutual labels:  jupyter-notebook
Scikit Learn Book
Source code for "Learning scikit-learn: Machine Learning in Python"
Stars: ✭ 376 (+1.08%)
Mutual labels:  jupyter-notebook
Over9000
Over9000 optimizer
Stars: ✭ 375 (+0.81%)
Mutual labels:  jupyter-notebook

PG is all you need!


This is a step-by-step tutorial for Policy Gradient algorithms from A2C to SAC, including learning-acceleration methods that use demonstrations to handle real-world applications with sparse rewards. Every chapter contains both the theoretical background and an object-oriented implementation. Just pick any topic you are interested in and start learning! You can execute the notebooks right away with Colab, even on your smartphone.
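To give a flavor of what the notebooks contain, here is a minimal sketch of a Gaussian policy network for a continuous-action task like Pendulum, written in PyTorch. This is an illustrative assumption, not the exact code from the notebooks; the class name, layer sizes, and the state-independent standard deviation are all choices made for brevity.

import torch
import torch.nn as nn
from torch.distributions import Normal

class GaussianPolicy(nn.Module):
    """Illustrative actor: maps a state to a Gaussian distribution over actions."""

    def __init__(self, obs_dim: int, action_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        self.mu_layer = nn.Linear(hidden_dim, action_dim)
        self.log_std = nn.Parameter(torch.zeros(action_dim))  # state-independent std

    def forward(self, state: torch.Tensor):
        h = self.hidden(state)
        mu = torch.tanh(self.mu_layer(h)) * 2.0  # Pendulum actions lie in [-2, 2]
        dist = Normal(mu, self.log_std.exp())
        action = dist.sample()
        return action, dist.log_prob(action).sum(-1)  # log-prob drives the PG loss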

Please feel free to open an issue or a pull request if you have any ideas for making it better. :)

If you want a tutorial on the DQN series, please see Rainbow is All You Need.

Contents

  1. Advantage Actor-Critic (A2C) [NBViewer] [Colab]
  2. Proximal Policy Optimization Algorithms (PPO) [NBViewer] [Colab]
  3. Deep Deterministic Policy Gradient (DDPG) [NBViewer] [Colab]
  4. Twin Delayed Deep Deterministic Policy Gradient Algorithm (TD3) [NBViewer] [Colab]
  5. Soft Actor-Critic (SAC) [NBViewer] [Colab]
  6. DDPG from Demonstration (DDPGfD) [NBViewer] [Colab]
  7. Behavior Cloning (with DDPG) [NBViewer] [Colab]

Environment

Pendulum-v0

Reference: OpenAI gym Pendulum-v0

Observation

Type: Box(3)

Num  Observation  Min   Max
0    cos(theta)   -1.0  1.0
1    sin(theta)   -1.0  1.0
2    theta dot    -8.0  8.0

Actions

Type: Box(1)

Num  Action        Min   Max
0    Joint effort  -2.0  2.0
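
A quick way to confirm both spaces, assuming an older gym release that still ships Pendulum-v0 (newer versions renamed it Pendulum-v1) and the pre-0.26 step API:

import gym

env = gym.make("Pendulum-v0")
print(env.observation_space)  # Box(3,): [cos(theta), sin(theta), theta dot]
print(env.action_space)       # Box(1,): joint effort in [-2.0, 2.0]

state = env.reset()
next_state, reward, done, _ = env.step(env.action_space.sample())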

Reward

The precise equation for the reward:

-(theta^2 + 0.1*theta_dot^2 + 0.001*action^2)

Theta is normalized between -pi and pi, so the lowest cost is -(pi^2 + 0.1*8^2 + 0.001*2^2) = -16.2736044 and the highest cost is 0. In essence, the goal is to remain at zero angle (vertical), with the least rotational velocity and the least effort. The maximum number of steps per episode is 200.
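
A hand-rolled version of this cost, with the angle normalization made explicit, reproduces the bounds above. This is a sketch of the environment's formula, not code from this repository:

import numpy as np

def pendulum_reward(theta: float, theta_dot: float, action: float) -> float:
    """Negative cost: zero when upright, still, and effortless."""
    theta = ((theta + np.pi) % (2 * np.pi)) - np.pi  # normalize to [-pi, pi)
    return -(theta ** 2 + 0.1 * theta_dot ** 2 + 0.001 * action ** 2)

print(pendulum_reward(0.0, 0.0, 0.0))    # 0.0, the highest reward
print(pendulum_reward(np.pi, 8.0, 2.0))  # -16.2736044..., the lowest reward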

Prerequisites

This repository is tested in an Anaconda virtual environment with Python 3.6.1+.

$ conda create -n pg-is-all-you-need python=3.6.9
$ conda activate pg-is-all-you-need

Installation

First, clone the repository.

git clone https://github.com/MrSyee/pg-is-all-you-need.git
cd pg-is-all-you-need

Second, install the packages required to execute the code. Just type:

make dep

Development

Install the packages required to develop the code:

make dev

If you want to see diffs of the Jupyter notebooks you modified, use nbdime:

nbdiff-web
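
For example, to compare two versions of a notebook side by side in the browser (the file names below are placeholders):

nbdiff-web a2c_before.ipynb a2c_after.ipynb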

Related Papers

  1. M. Babaeizadeh et al., "Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU." International Conference on Learning Representations, 2017.
  2. J. Schulman et al., "Proximal Policy Optimization Algorithms." arXiv preprint arXiv:1707.06347, 2017.
  3. T. P. Lillicrap et al., "Continuous control with deep reinforcement learning." arXiv preprint arXiv:1509.02971, 2015.
  4. S. Fujimoto et al., "Addressing Function Approximation Error in Actor-Critic Methods." arXiv preprint arXiv:1802.09477, 2018.
  5. T. Haarnoja et al., "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor." arXiv preprint arXiv:1801.01290, 2018.
  6. M. Vecerik et al., "Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards." arXiv preprint arXiv:1707.08817, 2017.
  7. A. Nair et al., "Overcoming Exploration in Reinforcement Learning with Demonstrations." arXiv preprint arXiv:1709.10089, 2017.

Contributors

Thanks goes to these wonderful people (emoji key):


Kyunghwan Kim 💻 📖
Jinwoo Park (Curt) 💻 📖
Mincheol Kim 💻 📖
Fazl 🚧