denisyarats / pytorch_sac

License: MIT
PyTorch implementation of Soft Actor-Critic (SAC)


Soft Actor-Critic (SAC) implementation in PyTorch

This is a PyTorch implementation of Soft Actor-Critic (SAC) [ArXiv].
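As a quick reference for what the critic in SAC learns, it regresses toward the soft Bellman target, which augments the usual reward-plus-discounted-value target with an entropy bonus and takes the minimum over twin Q-networks (the clipped double-Q trick). A minimal pure-Python sketch of that target computation; all numbers are illustrative and the function name is ours, not this repository's API:

```python
def soft_bellman_target(reward, discount, q1_next, q2_next,
                        log_prob_next, alpha, done):
    """SAC critic target:
    y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s')).

    alpha is the entropy temperature; taking the minimum of the two
    target Q-values mitigates overestimation bias.
    """
    soft_value = min(q1_next, q2_next) - alpha * log_prob_next
    return reward + discount * (1.0 - done) * soft_value

# Illustrative numbers only.
y = soft_bellman_target(reward=1.0, discount=0.99, q1_next=10.0, q2_next=9.5,
                        log_prob_next=-1.2, alpha=0.1, done=0.0)
print(round(y, 4))
```

Note that when `done=1.0` the bootstrapped term vanishes and the target collapses to the immediate reward.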

If you use this code in your research project, please cite us as:

@misc{pytorch_sac,
  author = {Yarats, Denis and Kostrikov, Ilya},
  title = {Soft Actor-Critic (SAC) implementation in PyTorch},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/denisyarats/pytorch_sac}},
}

Requirements

We assume you have access to a GPU that can run CUDA 9.2. The simplest way to install all required dependencies is to create an Anaconda environment and activate it:

conda env create -f conda_env.yml
source activate pytorch_sac

Instructions

To train an SAC agent on the cheetah_run task, run:

python train.py env=cheetah_run

This will produce an exp folder where all outputs are stored, including train/eval logs, TensorBoard logs, and evaluation episode videos. You can attach TensorBoard to monitor training by running:

tensorboard --logdir exp
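The env=cheetah_run argument is a key=value command-line override in the style of Hydra-based configs. As a rough illustration of how such overrides map onto configuration values, here is a deliberately simplified parser; real Hydra handles nested keys, type coercion, and much more, and this sketch is not this repository's code:

```python
def parse_overrides(args):
    """Parse key=value command-line overrides into a dict.

    Simplified sketch of Hydra-style overrides: each argument is split
    on the first '=' into a config key and its new value.
    """
    overrides = {}
    for arg in args:
        key, _, value = arg.partition("=")
        overrides[key] = value
    return overrides

# e.g. the arguments from "python train.py env=cheetah_run seed=1":
print(parse_overrides(["env=cheetah_run", "seed=1"]))
```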

Results

We benchmark SAC extensively on the DeepMind Control Suite against D4PG. We plot the average performance of SAC over 3 seeds, together with 95% confidence intervals. Importantly, we keep the hyperparameters fixed across all tasks. Note that the results for D4PG are taken from the original paper and are reported after 10^8 steps.
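For reference, a 95% confidence interval over a handful of seeds is typically computed as a t-interval around the mean. A minimal pure-Python sketch; the returns below are made-up numbers, not results from this benchmark, and 4.303 is the standard two-sided 95% t critical value for 3 - 1 = 2 degrees of freedom:

```python
import statistics

def t_confidence_interval(values, t_crit):
    """Mean +/- t_crit * standard error of the mean (small-sample t-interval)."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5  # standard error
    return mean - t_crit * sem, mean + t_crit * sem

# Made-up episode returns from 3 seeds.
low, high = t_confidence_interval([820.0, 860.0, 845.0], t_crit=4.303)
print(round(low, 1), round(high, 1))
```

With only 3 seeds the interval is wide, which is why fixing hyperparameters across tasks matters for a fair comparison.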
