denisyarats / pytorch_sac

License: MIT
PyTorch implementation of Soft Actor-Critic (SAC)


Soft Actor-Critic (SAC) implementation in PyTorch

This is a PyTorch implementation of Soft Actor-Critic (SAC) [ArXiv].
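As a quick reference for what the critic in SAC learns, it regresses toward the soft Bellman target, which augments the usual reward-plus-discounted-value target with an entropy bonus and takes the minimum over twin Q-networks (the clipped double-Q trick). A minimal pure-Python sketch of that target computation; all numbers are illustrative and the function name is ours, not this repository's API:

```python
def soft_bellman_target(reward, discount, q1_next, q2_next,
                        log_prob_next, alpha, done):
    """SAC critic target:
    y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s')).

    alpha is the entropy temperature; taking the minimum of the two
    target Q-values mitigates overestimation bias.
    """
    soft_value = min(q1_next, q2_next) - alpha * log_prob_next
    return reward + discount * (1.0 - done) * soft_value

# Illustrative numbers only.
y = soft_bellman_target(reward=1.0, discount=0.99, q1_next=10.0, q2_next=9.5,
                        log_prob_next=-1.2, alpha=0.1, done=0.0)
print(round(y, 4))
```

Note that when `done=1.0` the bootstrapped term vanishes and the target collapses to the immediate reward.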

If you use this code in your research project, please cite us as:

@misc{pytorch_sac,
  author = {Yarats, Denis and Kostrikov, Ilya},
  title = {Soft Actor-Critic (SAC) implementation in PyTorch},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/denisyarats/pytorch_sac}},
}

Requirements

We assume you have access to a GPU that can run CUDA 9.2. The simplest way to install all required dependencies is to create an Anaconda environment and activate it:

conda env create -f conda_env.yml
source activate pytorch_sac

Instructions

To train an SAC agent on the cheetah_run task, run:

python train.py env=cheetah_run

This will produce an exp folder where all outputs are stored, including train/eval logs, TensorBoard logs, and evaluation episode videos. You can attach TensorBoard to monitor training by running:

tensorboard --logdir exp
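The env=cheetah_run argument is a key=value command-line override in the style of Hydra-based configs. As a rough illustration of how such overrides map onto configuration values, here is a deliberately simplified parser; real Hydra handles nested keys, type coercion, and much more, and this sketch is not this repository's code:

```python
def parse_overrides(args):
    """Parse key=value command-line overrides into a dict.

    Simplified sketch of Hydra-style overrides: each argument is split
    on the first '=' into a config key and its new value.
    """
    overrides = {}
    for arg in args:
        key, _, value = arg.partition("=")
        overrides[key] = value
    return overrides

# e.g. the arguments from "python train.py env=cheetah_run seed=1":
print(parse_overrides(["env=cheetah_run", "seed=1"]))
```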

Results

We benchmark SAC extensively on the DeepMind Control Suite against D4PG. We plot the average performance of SAC over 3 seeds, together with 95% confidence intervals. Importantly, we keep the hyperparameters fixed across all tasks. Note that the results for D4PG are taken from the original paper and are reported after 10^8 steps.
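For reference, a 95% confidence interval over a handful of seeds is typically computed as a t-interval around the mean. A minimal pure-Python sketch; the returns below are made-up numbers, not results from this benchmark, and 4.303 is the standard two-sided 95% t critical value for 3 - 1 = 2 degrees of freedom:

```python
import statistics

def t_confidence_interval(values, t_crit):
    """Mean +/- t_crit * standard error of the mean (small-sample t-interval)."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5  # standard error
    return mean - t_crit * sem, mean + t_crit * sem

# Made-up episode returns from 3 seeds.
low, high = t_confidence_interval([820.0, 860.0, 845.0], t_crit=4.303)
print(round(low, 1), round(high, 1))
```

With only 3 seeds the interval is wide, which is why fixing hyperparameters across tasks matters for a fair comparison.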
