
cyoon1729 / Policy Gradient Methods

Implementation of Algorithms from the Policy Gradient Family. Currently includes: A2C, A3C, DDPG, TD3, SAC

Labels

jupyter-notebook pytorch reinforcement-learning a3c ddpg

Projects that are alternatives of or similar to Policy Gradient Methods

Reinforcement Learning With Tensorflow

Simple reinforcement learning tutorials, Morvan Python (莫烦Python) Chinese-language AI teaching

Stars: ✭ 6,948 (+12766.67%)

Mutual labels: reinforcement-learning, a3c, ddpg

Minimalrl

Implementations of basic RL algorithms with minimal lines of code! (PyTorch based)

Stars: ✭ 2,051 (+3698.15%)

Mutual labels: reinforcement-learning, a3c, ddpg

Easy Rl

Reinforcement learning tutorial in Chinese; read online at https://datawhalechina.github.io/easy-rl/

Stars: ✭ 3,004 (+5462.96%)

Mutual labels: reinforcement-learning, a3c, ddpg

Rlcycle

A library for ready-made reinforcement learning agents and reusable components for neat prototyping

Stars: ✭ 184 (+240.74%)

Mutual labels: reinforcement-learning, a3c, ddpg

Lagom

lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.

Stars: ✭ 364 (+574.07%)

Mutual labels: jupyter-notebook, reinforcement-learning, ddpg

Deep Rl Keras

Keras Implementation of popular Deep RL Algorithms (A3C, DDQN, DDPG, Dueling DDQN)

Stars: ✭ 395 (+631.48%)

Mutual labels: reinforcement-learning, a3c, ddpg

Machin

Reinforcement learning library (framework) designed for PyTorch; implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...

Stars: ✭ 145 (+168.52%)

Mutual labels: reinforcement-learning, a3c, ddpg

Reinforcementlearning Atarigame

PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which is more efficient than DQN and supersedes it. Can play many games.

Stars: ✭ 118 (+118.52%)

Mutual labels: jupyter-notebook, reinforcement-learning, a3c

Deeprl Tensorflow2

🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2

Stars: ✭ 319 (+490.74%)

Mutual labels: reinforcement-learning, a3c, ddpg

Deep Reinforcement Learning

Repo for the Deep Reinforcement Learning Nanodegree program

Stars: ✭ 4,012 (+7329.63%)

Mutual labels: jupyter-notebook, reinforcement-learning, ddpg

Reinforcement learning tutorial with demo

Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, QLearning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Papers, Courses, etc..

Stars: ✭ 442 (+718.52%)

Mutual labels: jupyter-notebook, reinforcement-learning, a3c

Btgym

Scalable, event-driven, deep-learning-friendly backtesting library

Stars: ✭ 765 (+1316.67%)

Mutual labels: reinforcement-learning, a3c

Hands On Meta Learning With Python

Learning to Learn using One-Shot Learning, MAML, Reptile, Meta-SGD and more with Tensorflow

Stars: ✭ 768 (+1322.22%)

Mutual labels: jupyter-notebook, reinforcement-learning

Deeprl Tutorials

Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch

Stars: ✭ 748 (+1285.19%)

Mutual labels: jupyter-notebook, reinforcement-learning

Super Mario Bros A3c Pytorch

Asynchronous Advantage Actor-Critic (A3C) algorithm for Super Mario Bros

Stars: ✭ 775 (+1335.19%)

Mutual labels: reinforcement-learning, a3c

Coursera

Quiz & Assignment of Coursera

Stars: ✭ 774 (+1333.33%)

Mutual labels: jupyter-notebook, reinforcement-learning

Tensorlayer

Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

Stars: ✭ 6,796 (+12485.19%)

Mutual labels: reinforcement-learning, a3c

Bombora

My experimentations with Reinforcement Learning in Pytorch

Stars: ✭ 18 (-66.67%)

Mutual labels: reinforcement-learning, a3c

Slm Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

Stars: ✭ 904 (+1574.07%)

Mutual labels: reinforcement-learning, a3c

Pytorch Rl

Deep Reinforcement Learning with pytorch & visdom

Stars: ✭ 745 (+1279.63%)

Mutual labels: reinforcement-learning, a3c


Policy-Gradient-Methods

Author: Chris Yoon

Implementations of important policy gradient algorithms in deep reinforcement learning.

Implementations

Advantage Actor-Critic (A2C)

Paper: "Asynchronous Methods for Deep Reinforcement Learning" (Mnih et al., 2016)

Asynchronous Advantage Actor-Critic (A3C)

Paper: "Asynchronous Methods for Deep Reinforcement Learning" (Mnih et al., 2016)

Deep Deterministic Policy Gradient (DDPG)

Paper: "Continuous control with deep reinforcement learning" (Lillicrap et al., 2015)

Twin Delayed Deep Deterministic Policy Gradient (TD3)

Paper: "Addressing Function Approximation Error in Actor-Critic Methods" (Fujimoto et al., 2018)

Soft Actor-Critic (SAC)

- Paper (sac2018.py): "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor" (Haarnoja et al., 2018)
- Paper (sac2019.py): "Soft Actor-Critic Algorithms and Applications" (Haarnoja et al., 2019)
- The algorithm in the 2018 paper uses a value network, double Q-networks, and a Gaussian policy. The algorithm in the 2019 paper uses double Q-networks and a Gaussian policy, and adds automatic entropy tuning.
- TODO: SAC for discrete action spaces

More implementations will be added soon.
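
The automatic entropy tuning listed for the 2019 SAC variant can be illustrated with a small sketch. This is not the code in sac2019.py: it is a dependency-free illustration that assumes the common implementation choice of optimizing `log_alpha` rather than `alpha` directly, and the names `update_log_alpha`, `log_probs`, and `target_entropy` are hypothetical. The target entropy of `-action_dim` follows the heuristic used in the 2019 paper.

```python
import math


def update_log_alpha(log_alpha, log_probs, target_entropy, lr=3e-4):
    """One gradient-descent step on the temperature loss.

    Loss (assumed form): L = mean(-log_alpha * (log_pi + target_entropy)),
    where log_pi are log-probabilities of actions sampled from the policy.
    """
    mean_term = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    grad = -mean_term  # dL / d(log_alpha)
    return log_alpha - lr * grad


# Heuristic target: negative dimensionality of the action space.
action_dim = 1
target_entropy = -float(action_dim)

log_alpha = 0.0  # alpha starts at exp(0) = 1.0

# Hypothetical batch of log pi(a|s): the entropy estimate (2.0) is above
# the target (-1.0), so the temperature should shrink.
log_probs = [-2.0, -2.0, -2.0, -2.0]
new_log_alpha = update_log_alpha(log_alpha, log_probs, target_entropy, lr=0.1)
alpha = math.exp(new_log_alpha)
print(new_log_alpha, alpha)
```

When the policy's entropy estimate is above the target, the step lowers `alpha` and weakens the entropy bonus; when it is below the target, the step raises `alpha`. A real agent would typically take this step with an optimizer such as Adam on a tensor-valued `log_alpha`.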

Known Dependencies

Python 3.6
PyTorch 1.0.2
gym 0.12.5

How to run:

Install package

git clone git@github.com:cyoon1729/Policy-Gradient-Methods.git
cd Policy-Gradient-Methods
pip install .

Example:

import gym

from policygradients.common.utils import mini_batch_train  # import training function
from policygradients.td3.td3 import TD3Agent  # import agent from algorithm of interest

# Create Gym environment
env = gym.make("Pendulum-v0")

# check agent class for initialization parameters and initialize agent
gamma = 0.99
tau = 1e-2
noise_std = 0.2
bound = 0.5
delay_step = 2
buffer_maxlen = 100000
critic_lr = 1e-3
actor_lr = 1e-3

agent = TD3Agent(env, gamma, tau, buffer_maxlen, delay_step, noise_std, bound, critic_lr, actor_lr)

# define training parameters
max_episodes = 100
max_steps = 500
batch_size = 32

# train agent with mini_batch_train function
episode_rewards = mini_batch_train(env, agent, max_episodes, max_steps, batch_size)
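
To make the hyperparameters in the example less opaque, here is a rough sketch of the role each one conventionally plays in TD3 as described by Fujimoto et al. (2018). It is an assumption that `noise_std`, `bound`, `delay_step`, and `tau` map onto the repository's `TD3Agent` in exactly this way; check the agent class before relying on it. The helper names (`td3_target`, `soft_update`, `should_update_actor`) are hypothetical, and the default action range of [-2, 2] is chosen to match Pendulum's torque limits.

```python
import random


def clip(x, low, high):
    """Clamp a scalar to the interval [low, high]."""
    return max(low, min(high, x))


def td3_target(reward, done, next_action, q1_target, q2_target,
               gamma=0.99, noise_std=0.2, bound=0.5,
               action_low=-2.0, action_high=2.0, rng=random):
    """Clipped double-Q target with target policy smoothing.

    next_action: the target actor's action for the next state.
    q1_target, q2_target: callables mapping an action to the target
    critics' Q-value estimates for the next state.
    """
    # Target policy smoothing: add clipped Gaussian noise to the action.
    noise = clip(rng.gauss(0.0, noise_std), -bound, bound)
    smoothed = clip(next_action + noise, action_low, action_high)
    # Clipped double-Q: take the smaller of the two critic estimates.
    q_next = min(q1_target(smoothed), q2_target(smoothed))
    return reward + gamma * (1.0 - float(done)) * q_next


def soft_update(target_params, params, tau=1e-2):
    """Polyak averaging: move target parameters a fraction tau toward params."""
    return [tau * p + (1.0 - tau) * tp for tp, p in zip(target_params, params)]


def should_update_actor(step, delay_step=2):
    """Delayed policy updates: update the actor once every delay_step steps."""
    return step % delay_step == 0


# Toy critics with constant outputs so the result does not depend on the noise.
y = td3_target(reward=1.0, done=False, next_action=0.0,
               q1_target=lambda a: 1.0, q2_target=lambda a: 3.0)
print(y)
```

With the constant toy critics above, the target uses the smaller estimate (1.0), so `y` is `1.0 + 0.99 * 1.0`. Taking the minimum of two critics is what counteracts Q-value overestimation, while the delayed actor update and the slow target update (`tau`) keep the policy from chasing a noisy critic.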


Stars: ✭ 54
