Kaixhin / Spinning Up Basic

Licence: MIT

spinning-up-basic

Basic versions of agents from Spinning Up in Deep RL written in PyTorch. Designed to run quickly on CPU on Pendulum-v0 from OpenAI Gym.

To see differences between algorithms, try running diff -y <file1> <file2>, e.g., diff -y ddpg.py td3.py.
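Where the diff utility is not available, a similar comparison can be sketched with Python's standard difflib module. The helper name changed_lines and the two short code strings below are made up for illustration; they are not the contents of ddpg.py or td3.py.

```python
import difflib


def changed_lines(old_text, new_text, old_name="a", new_name="b"):
    """Return a unified diff between two source texts as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",  # Input lines carry no newline, so headers should not either
        )
    )


# Made-up stand-ins for two algorithm files (NOT the real repository code)
old = "critic = Critic()\nupdate(actor)\n"
new = "critic_1 = Critic()\ncritic_2 = Critic()\nupdate(actor)\n"

diff = changed_lines(old, new, "old.py", "new.py")
print("\n".join(diff))
```

To compare real files, read each one with pathlib.Path(name).read_text() and pass the two strings to the helper.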

For MPI versions of on-policy algorithms, see the mpi branch.

Algorithms

Implementation Details

Note that implementation details can have a significant effect on performance, as discussed in "What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study". This codebase attempts to be as simple as possible, but a few choices are worth knowing about. The on-policy algorithms use separate actor and critic networks, a state-independent policy standard deviation, per-minibatch advantage normalisation, and several critic updates per minibatch. The deterministic off-policy algorithms use layer normalisation. Soft actor-critic uses a transformed Normal distribution by default, and the same transformation can also help the on-policy algorithms.
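Two of these details can be sketched without any framework. The code below is a minimal illustration in plain Python, not the repository's PyTorch code. The function names are hypothetical, and two assumptions are made: advantages are normalised with the population standard deviation, and the "transformed Normal" is taken to be a tanh-squashed Normal, as is common for soft actor-critic.

```python
import math
import random
import statistics


def normalise_advantages(advantages, eps=1e-8):
    """Rescale ONE minibatch of advantages to zero mean and unit std."""
    mean = statistics.fmean(advantages)
    std = statistics.pstdev(advantages)  # Assumption: population std
    return [(a - mean) / (std + eps) for a in advantages]


def minibatches(values, batch_size, rng):
    """Yield shuffled minibatches, each normalised separately by the caller."""
    indices = list(range(len(values)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [values[i] for i in indices[start:start + batch_size]]


def normal_log_prob(u, mean, std):
    """Log-density of a Normal(mean, std) at u."""
    return (-0.5 * ((u - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))


def tanh_normal_log_prob(action, mean, std):
    """Log-density of action = tanh(u), u ~ Normal(mean, std), action in (-1, 1).

    Change of variables: log p(a) = log N(u) - log(1 - tanh(u)^2).
    """
    u = math.atanh(action)
    return normal_log_prob(u, mean, std) - math.log(1.0 - action ** 2)


rng = random.Random(0)
advantages = [rng.gauss(2.0, 5.0) for _ in range(64)]
# Statistics are computed per minibatch, not over the whole batch
normalised = [normalise_advantages(b) for b in minibatches(advantages, 16, rng)]

# Sanity check: the squashed density should integrate to about 1 over (-1, 1)
n = 20000
h = 2.0 / n
total = sum(math.exp(tanh_normal_log_prob(-1.0 + (i + 0.5) * h, 0.3, 0.5)) * h
            for i in range(n))
print(len(normalised), round(total, 3))
```

Normalising per minibatch means each gradient step sees advantages on the same scale, at the cost of the scale depending on which samples happen to fall in the minibatch. The tanh squashing keeps sampled actions inside a bounded range, which suits environments with bounded action spaces.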

Results

Vanilla Policy Gradient/Advantage Actor-Critic

[Figure: VPG]

Trust Region Policy Optimization

[Figure: TRPO]

Proximal Policy Optimization

[Figure: PPO]

Deep Deterministic Policy Gradient

[Figure: DDPG]

Twin Delayed DDPG

[Figure: TD3]

Soft Actor-Critic

[Figure: SAC]

Deep Q-Network

[Figure: DQN]
