nikhilbarhate99 / Ppo Pytorch

License: MIT

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

Programming Languages

python

Labels

pytorch deep-reinforcement-learning pytorch-tutorial ppo policy-gradient

Projects that are alternatives to or similar to PPO-PyTorch

LWDRLC

Lightweight deep RL library for continuous control.

Stars: ✭ 14 (-95.69%)

Mutual labels: deep-reinforcement-learning, policy-gradient, ppo

Lagom

lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.

Stars: ✭ 364 (+12%)

Mutual labels: deep-reinforcement-learning, ppo, policy-gradient

imitation learning

PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.

Stars: ✭ 93 (-71.38%)

Mutual labels: deep-reinforcement-learning, policy-gradient, ppo

Hands On Reinforcement Learning With Python

Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow

Stars: ✭ 640 (+96.92%)

Mutual labels: deep-reinforcement-learning, ppo, policy-gradient

Deeprl algorithms

Easy-to-read DeepRL algorithm implementations in PyTorch and TensorFlow 2 (DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)

Stars: ✭ 97 (-70.15%)

Mutual labels: deep-reinforcement-learning, ppo, policy-gradient

Easy Rl

Chinese-language reinforcement learning tutorial. Read online at https://datawhalechina.github.io/easy-rl/

Stars: ✭ 3,004 (+824.31%)

Mutual labels: deep-reinforcement-learning, ppo, policy-gradient

Explorer

Explorer is a PyTorch reinforcement learning framework for exploring new ideas.

Stars: ✭ 54 (-83.38%)

Mutual labels: deep-reinforcement-learning, policy-gradient, ppo

Pytorch Rl

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

Stars: ✭ 658 (+102.46%)

Mutual labels: deep-reinforcement-learning, ppo, policy-gradient

Slm Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

Stars: ✭ 904 (+178.15%)

Mutual labels: deep-reinforcement-learning, ppo, policy-gradient

Deep Reinforcement Learning With Pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

Stars: ✭ 1,345 (+313.85%)

Mutual labels: deep-reinforcement-learning, ppo, policy-gradient

Deep-Reinforcement-Learning-With-Python

Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math

Stars: ✭ 222 (-31.69%)

Mutual labels: deep-reinforcement-learning, policy-gradient, ppo

Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020

Live Trading. Please star.

Stars: ✭ 1,251 (+284.92%)

Mutual labels: deep-reinforcement-learning, ppo

Openai lab

An experimentation framework for Reinforcement Learning using OpenAI Gym, Tensorflow, and Keras.

Stars: ✭ 313 (-3.69%)

Mutual labels: deep-reinforcement-learning, policy-gradient

Deep reinforcement learning course

Implementations from the free course Deep Reinforcement Learning with Tensorflow and PyTorch

Stars: ✭ 3,232 (+894.46%)

Mutual labels: deep-reinforcement-learning, ppo

reinforcement learning ppo rnd

Deep Reinforcement Learning by using Proximal Policy Optimization and Random Network Distillation in Tensorflow 2 and Pytorch with some explanation

Stars: ✭ 33 (-89.85%)

Mutual labels: deep-reinforcement-learning, ppo

Deep-Reinforcement-Learning-CS285-Pytorch

Solutions to the assignments of the Deep Reinforcement Learning course (CS285) at the University of California, Berkeley, implemented in PyTorch

Stars: ✭ 104 (-68%)

Mutual labels: deep-reinforcement-learning, policy-gradient

td-reg

TD-Regularized Actor-Critic Methods

Stars: ✭ 28 (-91.38%)

Mutual labels: policy-gradient, ppo

DRL in CV

A course on Deep Reinforcement Learning in Computer Vision. Visit Website:

Stars: ✭ 59 (-81.85%)

Mutual labels: deep-reinforcement-learning, policy-gradient

Reinforcement Learning

Deep Reinforcement Learning Algorithms implemented with Tensorflow 2.3

Stars: ✭ 61 (-81.23%)

Mutual labels: policy-gradient, ppo

Deep-rl-mxnet

Mxnet implementation of Deep Reinforcement Learning papers, such as DQN, PG, DDPG, PPO

Stars: ✭ 26 (-92%)

Mutual labels: deep-reinforcement-learning, policy-gradient

PPO-PyTorch

Minimal PyTorch implementation of Proximal Policy Optimization with clipped objective for OpenAI gym environments.
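
For readers new to the method, the clipped objective named above can be sketched as follows. This is a minimal illustration and not the code from PPO.py: the function name `clipped_surrogate_loss`, its argument names, and the default `eps_clip=0.2` are assumptions chosen for the example.

```python
import torch


def clipped_surrogate_loss(new_logprobs, old_logprobs, advantages, eps_clip=0.2):
    """Illustrative PPO clipped surrogate loss (hypothetical helper, not from the repo)."""
    # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
    ratios = torch.exp(new_logprobs - old_logprobs.detach())
    # Unclipped and clipped surrogate terms.
    surr1 = ratios * advantages
    surr2 = torch.clamp(ratios, 1.0 - eps_clip, 1.0 + eps_clip) * advantages
    # PPO maximizes the element-wise minimum; negate so an optimizer can minimize it.
    return -torch.min(surr1, surr2).mean()


# Toy batch of two transitions (made-up numbers, for illustration only).
new_lp = torch.log(torch.tensor([0.5, 0.9]))
old_lp = torch.log(torch.tensor([0.25, 0.9]))
adv = torch.tensor([1.0, -1.0])
loss = clipped_surrogate_loss(new_lp, old_lp, adv)
```

In the toy batch the first ratio is 2.0, which is clipped to 1.2 because the advantage is positive, so the policy gains nothing from moving further from the old policy on that sample.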

Usage

To test a pretrained network: run test.py or test_continuous.py
To train a new network: run PPO.py or PPO_continuous.py
All hyperparameters are set in PPO.py or PPO_continuous.py
If you train on an environment where the action dimension is 1, check the tensor dimensions in the update function of the PPO class, since it calls torch.squeeze() several times. torch.squeeze() removes every dimension of size 1 from a tensor, so a dimension you meant to keep can disappear.
The number of actors collecting experience is 1. This could be changed by creating multiple instances of the ActorCritic network in the PPO class and using them to collect experience (as in A3C and standard PPO).
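
The torch.squeeze() caveat above is easiest to see on small tensors. The snippet below is a sketch with made-up shapes (a batch of 4 and an action dimension of 1). The variable names are hypothetical and do not come from PPO.py.

```python
import torch

batch = 4  # assumed batch size, for illustration only

# A critic output of shape (batch, 1): squeezing drops the trailing dimension.
values = torch.zeros(batch, 1)
values_squeezed = torch.squeeze(values)           # shape (4,)

# With action_dim = 1 the action tensor has the same shape, so an
# unqualified squeeze also removes the action dimension.
actions = torch.zeros(batch, 1)
actions_squeezed = torch.squeeze(actions)         # shape (4,), action dim lost

# A batch of one collapses all the way to a 0-dimensional tensor.
single = torch.zeros(1, 1)
single_squeezed = torch.squeeze(single)           # shape ()

# Passing dim restricts the squeeze to that dimension only.
single_last = torch.squeeze(single, dim=-1)       # shape (1,)

# Mixing squeezed and unsqueezed tensors broadcasts silently:
# (4, 1) - (4,) yields (4, 4) instead of raising an error.
mismatch = values - values_squeezed
```

Passing an explicit `dim` is one way to keep the batch dimension intact. Whether that fits a given environment depends on the shapes the update function expects, so it is worth printing the shapes once before training.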

Dependencies

Trained and tested on:

Python 3.6
PyTorch 1.0
NumPy 1.15.3
gym 0.10.8
Pillow 5.3.0

Results

PPO Discrete LunarLander-v2 (1200 episodes)	PPO Continuous BipedalWalker-v2 (4000 episodes)

References

PPO paper
OpenAI Spinning Up

Stars: ✭ 325