Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, Q-Learning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Papers, Courses, etc.

Stars: ✭ 442 (+1909.09%)

Mutual labels: policy-gradient, imitation-learning

A2c

A Clearer and Simpler Synchronous Advantage Actor Critic (A2C) Implementation in TensorFlow

Stars: ✭ 169 (+668.18%)

Mutual labels: policy-gradient

Deep Reinforcement Learning With Pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

Stars: ✭ 1,345 (+6013.64%)

Mutual labels: policy-gradient

Reinforcement learning

Reinforcement learning tutorials

Stars: ✭ 82 (+272.73%)

Mutual labels: policy-gradient

Rl Course Experiments

Stars: ✭ 73 (+231.82%)

Mutual labels: policy-gradient

Pontryagin-Differentiable-Programming

A unified end-to-end learning and control framework that is able to learn a (neural) control objective function, dynamics equation, control policy, or/and optimal trajectory in a control system.

Stars: ✭ 111 (+404.55%)

Mutual labels: imitation-learning

Reinforcement Learning

Minimal and Clean Reinforcement Learning Examples

Stars: ✭ 2,863 (+12913.64%)

Mutual labels: policy-gradient

Policy Gradient

Minimal Monte Carlo Policy Gradient (REINFORCE) Algorithm Implementation in Keras

Stars: ✭ 135 (+513.64%)

Mutual labels: policy-gradient

Reinforcement learning

Implementations of basic reinforcement learning algorithms

Stars: ✭ 100 (+354.55%)

Mutual labels: policy-gradient

Deep Algotrading

A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading

Stars: ✭ 173 (+686.36%)

Mutual labels: policy-gradient

Deeprl algorithms

Easy-to-read DeepRL algorithm implementations in PyTorch and TensorFlow 2 (DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)

Stars: ✭ 97 (+340.91%)

Mutual labels: policy-gradient

SharkStock

Automate swing trading using deep reinforcement learning. The deep deterministic policy gradient-based neural network model trains to choose an action to sell, buy, or hold the stocks to maximize the gain in asset value. The paper also acknowledges the need for a system that predicts the trend in stock value to work along with the reinforcement …

Stars: ✭ 63 (+186.36%)

Mutual labels: policy-gradient

Codegan

[Deprecated] Source Code Generation using Sequence Generative Adversarial Networks

Stars: ✭ 73 (+231.82%)

Mutual labels: policy-gradient

Mlds2018spring

Machine Learning and having it Deep and Structured (MLDS) in 2018 spring

Stars: ✭ 124 (+463.64%)

Mutual labels: policy-gradient

Pytorch Rl

Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]

Stars: ✭ 121 (+450%)

Mutual labels: policy-gradient

Show Adapt And Tell

Code for "Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner" in ICCV 2017

Stars: ✭ 146 (+563.64%)

Mutual labels: policy-gradient

Deep-Reinforcement-Learning-With-Python

Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math

Stars: ✭ 222 (+909.09%)

Mutual labels: policy-gradient


Ranking Policy Gradient

Ranking Policy Gradient (RPG) is a sample-efficient, off-policy policy gradient method that learns an optimal ranking of actions to maximize the return. RPG has the following practical advantages:

It is a sample-efficient model-free algorithm for learning deterministic policies.
Any exploration algorithm can be incorporated with little effort to further improve the sample efficiency of RPG.
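To make the idea of "ranking actions" concrete, here is a minimal, standalone sketch. It is not taken from this codebase: the helper names `pairwise_rank_probs` and `top_ranked_action` are hypothetical, and the logistic form of the pairwise probability is an assumption chosen for illustration, which may differ from the formulation used in the paper. It assumes a model has already produced one ranking score per action for the current state.

```python
import math

def pairwise_rank_probs(scores):
    """Return p[i][j], an assumed logistic probability that action i outranks action j.

    p[i][j] = 1 / (1 + exp(-(scores[i] - scores[j])))
    """
    n = len(scores)
    return [
        [1.0 / (1.0 + math.exp(-(scores[i] - scores[j]))) for j in range(n)]
        for i in range(n)
    ]

def top_ranked_action(scores):
    """Deterministic policy: pick the action with the highest ranking score."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical ranking scores for three discrete actions in one state.
scores = [0.2, 1.5, -0.3]
probs = pairwise_rank_probs(scores)
action = top_ranked_action(scores)
print("chosen action:", action)
```

Only the relative order of the scores matters for the chosen action, which is why a ranking-based method can yield a deterministic policy without estimating exact action values.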

This codebase contains the implementation of RPG using the Dopamine framework. The preprint of the RPG paper is available on arXiv (arXiv:1906.09674).

Instructions

Install via source

Step 1.

Follow the installation instructions of the Dopamine framework for Ubuntu or Mac OS X.

Step 2.

Download the RPG source:

git clone git@github.com:illidanlab/rpg.git

Running the tests

cd ./rpg/dopamine 
python -um dopamine.atari.train \
  --agent_name=rpg \
  --base_dir=/tmp/dopamine \
  --random_seed 1 \
  --game_name=Pong \
  --gin_files='dopamine/agents/rpg/configs/rpg.gin'
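If you want to repeat the run above with several random seeds, a small helper can assemble the command lines for you. The sketch below only builds and prints the commands (a dry run). It reuses the flags shown above unchanged; the helper name `build_train_command` and the per-seed subdirectory under `--base_dir` are assumptions for illustration, not part of this repository.

```python
import shlex

def build_train_command(seed, game="Pong", base_dir="/tmp/dopamine"):
    """Assemble the training command shown above for one random seed.

    The per-seed subdirectory is an assumption, added so that runs
    with different seeds do not write into the same directory.
    """
    return [
        "python", "-um", "dopamine.atari.train",
        "--agent_name=rpg",
        f"--base_dir={base_dir}/seed_{seed}",
        "--random_seed", str(seed),
        f"--game_name={game}",
        "--gin_files=dopamine/agents/rpg/configs/rpg.gin",
    ]

# Dry run: print one command per seed instead of executing it.
commands = [build_train_command(seed) for seed in (1, 2, 3)]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually launch a run, pass one of the lists to `subprocess.run(cmd, check=True)` from inside the `rpg/dopamine` directory.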

Reproduce

To reproduce the results in the paper, please refer to the instructions here.

Reference

If you use this RPG implementation in your work, please consider citing the following paper:

@article{lin2019ranking,
  title={Ranking Policy Gradient},
  author={Lin, Kaixiang and Zhou, Jiayu},
  journal={arXiv preprint arXiv:1906.09674},
  year={2019}
}
