nikhilbarhate99 / Ppo Pytorch

License: MIT
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

Programming Languages

python

Projects that are alternatives to or similar to Ppo Pytorch

LWDRLC
Lightweight deep RL library for continuous control.
Stars: ✭ 14 (-95.69%)
Mutual labels:  deep-reinforcement-learning, policy-gradient, ppo
Lagom
lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.
Stars: ✭ 364 (+12%)
Mutual labels:  deep-reinforcement-learning, ppo, policy-gradient
imitation learning
PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
Stars: ✭ 93 (-71.38%)
Mutual labels:  deep-reinforcement-learning, policy-gradient, ppo
Hands On Reinforcement Learning With Python
Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow
Stars: ✭ 640 (+96.92%)
Mutual labels:  deep-reinforcement-learning, ppo, policy-gradient
Deeprl algorithms
DeepRL algorithm implementations that are easy to read and understand, in PyTorch and TensorFlow 2 (DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)
Stars: ✭ 97 (-70.15%)
Mutual labels:  deep-reinforcement-learning, ppo, policy-gradient
Easy Rl
A reinforcement learning tutorial in Chinese; read online at https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+824.31%)
Mutual labels:  deep-reinforcement-learning, ppo, policy-gradient
Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
Stars: ✭ 54 (-83.38%)
Mutual labels:  deep-reinforcement-learning, policy-gradient, ppo
Pytorch Rl
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
Stars: ✭ 658 (+102.46%)
Mutual labels:  deep-reinforcement-learning, ppo, policy-gradient
Slm Lab
Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
Stars: ✭ 904 (+178.15%)
Mutual labels:  deep-reinforcement-learning, ppo, policy-gradient
Deep Reinforcement Learning With Pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
Stars: ✭ 1,345 (+313.85%)
Mutual labels:  deep-reinforcement-learning, ppo, policy-gradient
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (-31.69%)
Mutual labels:  deep-reinforcement-learning, policy-gradient, ppo
Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020
Live Trading. Please star.
Stars: ✭ 1,251 (+284.92%)
Mutual labels:  deep-reinforcement-learning, ppo
Openai lab
An experimentation framework for Reinforcement Learning using OpenAI Gym, Tensorflow, and Keras.
Stars: ✭ 313 (-3.69%)
Mutual labels:  deep-reinforcement-learning, policy-gradient
Deep reinforcement learning course
Implementations from the free course Deep Reinforcement Learning with Tensorflow and PyTorch
Stars: ✭ 3,232 (+894.46%)
Mutual labels:  deep-reinforcement-learning, ppo
reinforcement learning ppo rnd
Deep Reinforcement Learning by using Proximal Policy Optimization and Random Network Distillation in Tensorflow 2 and Pytorch with some explanation
Stars: ✭ 33 (-89.85%)
Mutual labels:  deep-reinforcement-learning, ppo
Deep-Reinforcement-Learning-CS285-Pytorch
Solutions of assignments of Deep Reinforcement Learning course presented by the University of California, Berkeley (CS285) in Pytorch framework
Stars: ✭ 104 (-68%)
Mutual labels:  deep-reinforcement-learning, policy-gradient
td-reg
TD-Regularized Actor-Critic Methods
Stars: ✭ 28 (-91.38%)
Mutual labels:  policy-gradient, ppo
DRL in CV
A course on Deep Reinforcement Learning in Computer Vision. Visit Website:
Stars: ✭ 59 (-81.85%)
Mutual labels:  deep-reinforcement-learning, policy-gradient
Reinforcement Learning
Deep Reinforcement Learning Algorithms implemented with Tensorflow 2.3
Stars: ✭ 61 (-81.23%)
Mutual labels:  policy-gradient, ppo
Deep-rl-mxnet
Mxnet implementation of Deep Reinforcement Learning papers, such as DQN, PG, DDPG, PPO
Stars: ✭ 26 (-92%)
Mutual labels:  deep-reinforcement-learning, policy-gradient

PPO-PyTorch

Minimal PyTorch implementation of Proximal Policy Optimization with clipped objective for OpenAI gym environments.
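
For readers unfamiliar with the clipped objective, the following is a minimal sketch of how such a loss is commonly written in PyTorch. It is not taken from this repository: the function name `clipped_surrogate_loss`, its arguments, and the default `eps_clip=0.2` are assumptions chosen for illustration.

```python
import torch


def clipped_surrogate_loss(new_logprobs, old_logprobs, advantages, eps_clip=0.2):
    """Hypothetical helper: negative clipped surrogate objective, batch-averaged."""
    # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
    ratios = torch.exp(new_logprobs - old_logprobs.detach())
    # Unclipped and clipped surrogate terms.
    surr1 = ratios * advantages
    surr2 = torch.clamp(ratios, 1.0 - eps_clip, 1.0 + eps_clip) * advantages
    # Take the smaller (pessimistic) term; negate so an optimizer can minimize it.
    return -torch.min(surr1, surr2).mean()


# Toy batch of two transitions (made-up numbers, for illustration only).
old_logprobs = torch.log(torch.tensor([0.5, 0.5]))
new_logprobs = torch.log(torch.tensor([0.9, 0.1]))
advantages = torch.tensor([1.0, -1.0])

loss = clipped_surrogate_loss(new_logprobs, old_logprobs, advantages)
print(loss.item())
```

In this toy batch the first ratio (1.8) is clipped to 1.2 because its advantage is positive. For the second transition the clipped term (0.8 × -1) is smaller than the unclipped one (0.2 × -1), so the minimum selects the clipped term there as well.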

Usage

  • To test a preTrained network: run test.py or test_continuous.py
  • To train a new network: run PPO.py or PPO_continuous.py
  • All hyperparameters are set in PPO.py or PPO_continuous.py
  • If you are training on an environment where the action dimension = 1, check the tensor dimensions in the update function of the PPO class, since I have used torch.squeeze() several times there. Called without a dim argument, torch.squeeze() removes every dimension of size 1 from a tensor.
  • Number of actors for collecting experience = 1. This could be changed by creating multiple instances of ActorCritic networks in the PPO class and using them to collect experience (like A3C and standard PPO).
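
The torch.squeeze() caveat above can be illustrated with a short sketch. The tensor names and shapes below are made up for illustration and are not taken from the update function in this repository.

```python
import torch

batch = 4

# Stand-in for a critic output with one value per state: shape (4, 1).
values = torch.zeros(batch, 1)
print(torch.squeeze(values).shape)      # torch.Size([4])

# With a batch of one, squeeze() without a dim also drops the batch axis.
single = torch.zeros(1, 1)
print(torch.squeeze(single).shape)      # torch.Size([])

# Passing an explicit dim removes only that axis.
print(torch.squeeze(single, -1).shape)  # torch.Size([1])

# A shape mismatch does not always raise an error: (4,) - (4, 1) broadcasts to (4, 4).
rewards = torch.ones(batch)
diff = rewards - values
print(diff.shape)                       # torch.Size([4, 4])
```

Printing or asserting tensor shapes inside the update step is a cheap way to catch this kind of silent broadcasting when the action dimension is 1.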

Dependencies

Trained and tested on:

Python 3.6
PyTorch 1.0
NumPy 1.15.3
gym 0.10.8
Pillow 5.3.0

Results

PPO Discrete LunarLander-v2 (1200 episodes)
PPO Continuous BipedalWalker-v2 (4000 episodes)
