lcswillems / Torch Ac

Licence: MIT

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO

Programming Languages

python

Labels

pytorch reinforcement-learning deep-reinforcement-learning recurrent-neural-networks ppo actor-critic a3c

Projects that are alternatives of or similar to Torch Ac

Reinforcementlearning Atarigame

PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which the authors describe as superior to and more efficient than DQN. Can play many games.

Stars: ✭ 118 (+68.57%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Pytorch A3c

PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".

Stars: ✭ 879 (+1155.71%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Minimalrl

Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)

Stars: ✭ 2,051 (+2830%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo, a3c

Rl a3c pytorch

A3C LSTM Atari with Pytorch plus A3G design

Stars: ✭ 482 (+588.57%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Deeprl Tutorials

Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch

Stars: ✭ 748 (+968.57%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo, actor-critic

Easy Rl

Chinese-language reinforcement learning tutorial; read online at https://datawhalechina.github.io/easy-rl/

Stars: ✭ 3,004 (+4191.43%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo, a3c

Pytorch Rl

Deep Reinforcement Learning with pytorch & visdom

Stars: ✭ 745 (+964.29%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Deep Reinforcement Learning With Pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

Stars: ✭ 1,345 (+1821.43%)

Mutual labels: deep-reinforcement-learning, ppo, actor-critic, a3c

Reinforcement Learning With Tensorflow

Simple reinforcement learning tutorials, 莫烦Python Chinese-language AI tutorials

Stars: ✭ 6,948 (+9825.71%)

Mutual labels: reinforcement-learning, ppo, actor-critic, a3c

Reinforcement Learning

Minimal and Clean Reinforcement Learning Examples

Stars: ✭ 2,863 (+3990%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Reinforcement learning tutorial with demo

Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, QLearning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Papers, Courses, etc..

Stars: ✭ 442 (+531.43%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic, a3c

Slm Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

Stars: ✭ 904 (+1191.43%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo, a3c

Pytorch A2c Ppo Acktr Gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

Stars: ✭ 2,632 (+3660%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo, actor-critic

Pytorch Drl

PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.

Stars: ✭ 233 (+232.86%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo, actor-critic

Deep-Reinforcement-Learning-With-Python

Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math

Stars: ✭ 222 (+217.14%)

Mutual labels: deep-reinforcement-learning, a3c, actor-critic, ppo

Deeprl Tensorflow2

🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2

Stars: ✭ 319 (+355.71%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo, a3c

Tensorflow Reinforce

Implementations of Reinforcement Learning Models in Tensorflow

Stars: ✭ 480 (+585.71%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

Elegantrl

Lightweight, efficient and stable implementations of deep reinforcement learning algorithms using PyTorch.

Stars: ✭ 575 (+721.43%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo

Autonomous Learning Library

A PyTorch library for building deep reinforcement learning agents.

Stars: ✭ 425 (+507.14%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, ppo

Dissecting Reinforcement Learning

Python code, PDFs and resources for the series of posts on Reinforcement Learning which I published on my personal blog

Stars: ✭ 512 (+631.43%)

Mutual labels: reinforcement-learning, deep-reinforcement-learning, actor-critic

PyTorch Actor-Critic deep reinforcement learning algorithms: A2C and PPO

The torch_ac package contains PyTorch implementations of two Actor-Critic deep reinforcement learning algorithms: A2C and PPO.

Note: An example of use of this package is given in the rl-starter-files repository. More details below.

Features

Recurrent policies
Reward shaping
Handle observation spaces that are tensors or dicts of tensors
Handle discrete action spaces
Observation preprocessing
Multiprocessing
CUDA

Installation

pip3 install torch-ac

Note: If you want to modify the torch-ac algorithms, you will need to install a cloned version instead, i.e.:

git clone https://github.com/lcswillems/torch-ac.git
cd torch-ac
pip3 install -e .

Package components overview

A brief overview of the components of the package:

torch_ac.A2CAlgo and torch_ac.PPOAlgo classes for A2C and PPO algorithms
torch_ac.ACModel and torch_ac.RecurrentACModel abstract classes for non-recurrent and recurrent actor-critic models
torch_ac.DictList class for making dictionaries of lists list-indexable and hence batch-friendly

Package components details

The most important components of the package are detailed below.

torch_ac.A2CAlgo and torch_ac.PPOAlgo have 2 methods:

__init__, which may take, among other parameters:
- an acmodel actor-critic model, i.e. an instance of a class inheriting from either torch_ac.ACModel or torch_ac.RecurrentACModel.
- a preprocess_obss function that transforms a list of observations into a list-indexable object X (e.g. a PyTorch tensor). The default preprocess_obss function converts observations into a PyTorch tensor.
- a reshape_reward function that takes as parameters an observation obs, the action taken action, the reward received reward and the terminal status done, and returns a new reward. By default, the reward is not reshaped.
- a recurrence number that specifies over how many timesteps the gradient is backpropagated. This number is only taken into account if a recurrent model is used, and it must divide the num_frames_per_agent parameter and, for PPO, the batch_size parameter.
update_parameters that, given the collected experiences, updates the parameters and returns logs. In the usage example further below, the experiences are gathered beforehand with collect_experiences.
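
As an illustration of the reshape_reward and recurrence parameters described above, here is a minimal sketch. The scaling factor and the check_recurrence helper are hypothetical and are not part of torch_ac; only the argument order (obs, action, reward, done) and the divisibility rule come from the description above.

```python
def reshape_reward(obs, action, reward, done):
    """Hypothetical shaping rule: scale every reward by a constant factor."""
    return 20 * reward


def check_recurrence(recurrence, num_frames_per_agent, batch_size=None):
    """Hypothetical helper: raise ValueError unless recurrence divides the sizes."""
    if num_frames_per_agent % recurrence != 0:
        raise ValueError("recurrence must divide num_frames_per_agent")
    # batch_size only matters for PPO, so it is optional here.
    if batch_size is not None and batch_size % recurrence != 0:
        raise ValueError("recurrence must divide batch_size")


print(reshape_reward(obs=None, action=0, reward=0.5, done=False))  # 10.0
check_recurrence(recurrence=4, num_frames_per_agent=128, batch_size=256)
```

A function such as reshape_reward would then be passed to the algorithm constructor in place of the default, which leaves rewards unchanged.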

torch_ac.ACModel has 2 abstract methods:

__init__ that takes as parameters an observation_space and an action_space.
forward that takes as parameter N preprocessed observations obs and returns a PyTorch distribution dist and a tensor of values value. The tensor of values must be of size N, not N x 1.
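
The shape requirement on value is easy to get wrong, so here is a minimal sketch of a forward method that satisfies it. TinyACModel, obs_dim and num_actions are names made up for this illustration; the constructor takes plain integers instead of observation_space and action_space to keep the sketch self-contained, and a real model would also inherit from torch_ac.ACModel.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical


class TinyACModel(nn.Module):
    """Illustrative actor-critic network (not the package's own model)."""

    def __init__(self, obs_dim, num_actions):
        super().__init__()
        self.actor = nn.Linear(obs_dim, num_actions)
        self.critic = nn.Linear(obs_dim, 1)

    def forward(self, obs):
        # One categorical distribution per observation in the batch.
        dist = Categorical(logits=self.actor(obs))
        # The critic outputs N x 1; squeeze it to size N as required.
        value = self.critic(obs).squeeze(1)
        return dist, value


model = TinyACModel(obs_dim=4, num_actions=3)
dist, value = model(torch.zeros(5, 4))
print(value.shape)  # torch.Size([5])
```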

torch_ac.RecurrentACModel has 3 abstract methods:

__init__ that takes the same parameters as torch_ac.ACModel.
forward that takes the same parameters as torch_ac.ACModel along with a tensor of N memories memory of size N x M, where M is the size of a memory. It returns the same things as torch_ac.ACModel plus a tensor of N memories memory.
memory_size that returns the size M of a memory.
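
A recurrent model following this interface could look like the sketch below. It is only an illustration under assumptions: TinyRecurrentACModel and its constructor arguments are hypothetical, a GRU cell is an arbitrary choice of recurrent unit, memory_size is exposed as a property, and a real model would also inherit from torch_ac.RecurrentACModel.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical


class TinyRecurrentACModel(nn.Module):
    """Illustrative recurrent actor-critic network (not the package's own model)."""

    def __init__(self, obs_dim, num_actions, hidden_size=8):
        super().__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.GRUCell(obs_dim, hidden_size)
        self.actor = nn.Linear(hidden_size, num_actions)
        self.critic = nn.Linear(hidden_size, 1)

    @property
    def memory_size(self):
        # M, the size of one memory vector (assumption: exposed as a property).
        return self.hidden_size

    def forward(self, obs, memory):
        # memory has size N x M; the GRU cell returns the updated N x M memory.
        memory = self.rnn(obs, memory)
        dist = Categorical(logits=self.actor(memory))
        value = self.critic(memory).squeeze(1)  # size N, not N x 1
        return dist, value, memory


model = TinyRecurrentACModel(obs_dim=4, num_actions=3)
memory = torch.zeros(5, model.memory_size)
dist, value, memory = model(torch.zeros(5, 4), memory)
print(value.shape, memory.shape)  # torch.Size([5]) torch.Size([5, 8])
```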

Note: The preprocess_obss function must return a list-indexable object (e.g. a PyTorch tensor). If your observations are dictionaries, your preprocess_obss function may first convert a list of dictionaries into a dictionary of lists and then make it list-indexable using the torch_ac.DictList class as follows:

>>> from torch_ac import DictList
>>> d = DictList({"a": [[1, 2], [3, 4]], "b": [[5], [6]]})
>>> d.a
[[1, 2], [3, 4]]
>>> d[0]
DictList({"a": [1, 2], "b": [5]})
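
To make the conversion described above concrete, here is a sketch in plain Python. SimpleDictList is a simplified stand-in written for this illustration, not the torch_ac.DictList implementation, and the to_dict_of_lists helper and the observation keys are hypothetical.

```python
def to_dict_of_lists(obss):
    """Turn a list of dicts (one per observation) into a dict of lists (one per key)."""
    return {key: [obs[key] for obs in obss] for key in obss[0]}


class SimpleDictList:
    """Simplified stand-in for torch_ac.DictList, for illustration only."""

    def __init__(self, data):
        self.data = dict(data)

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails, e.g. batch.image.
        try:
            return self.__dict__["data"][key]
        except KeyError:
            raise AttributeError(key)

    def __getitem__(self, index):
        # Index every value with the same index, keeping the keys.
        return SimpleDictList({k: v[index] for k, v in self.data.items()})


def preprocess_obss(obss):
    return SimpleDictList(to_dict_of_lists(obss))


obss = [
    {"image": [1, 2], "mission": "go left"},
    {"image": [3, 4], "mission": "go right"},
]
batch = preprocess_obss(obss)
print(batch.image)       # [[1, 2], [3, 4]]
print(batch[0].mission)  # go left
```

In a real preprocess_obss function the lists would typically be converted to tensors before being wrapped, so that indexing a batch stays cheap.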

Note: If you use an RNN, you will need to set batch_first to True.

Examples

Examples of use of the package components are given in the rl-starter-files repository.

Example of use of `torch_ac.A2CAlgo` and `torch_ac.PPOAlgo`

...

algo = torch_ac.PPOAlgo(envs, acmodel, args.frames_per_proc, args.discount, args.lr, args.gae_lambda,
                        args.entropy_coef, args.value_loss_coef, args.max_grad_norm, args.recurrence,
                        args.optim_eps, args.clip_eps, args.epochs, args.batch_size, preprocess_obss)

...

exps, logs1 = algo.collect_experiences()
logs2 = algo.update_parameters(exps)

More details here.

Example of use of `torch_ac.DictList`

torch_ac.DictList({
    "image": preprocess_images([obs["image"] for obs in obss], device=device),
    "text": preprocess_texts([obs["mission"] for obs in obss], vocab, device=device)
})

More details here.

Example of implementation of `torch_ac.RecurrentACModel`

class ACModel(nn.Module, torch_ac.RecurrentACModel):
    ...

    def forward(self, obs, memory):
        ...

        return dist, value, memory

More details here.

Examples of `preprocess_obss` functions

More details here.

Stars: ✭ 70
