
takuseno / Ppo

License: MIT
Proximal Policy Optimization implementation with TensorFlow

Programming Languages

python

Projects that are alternatives of or similar to Ppo

Reinforcement learning
Reinforcement learning tutorials
Stars: ✭ 82 (-10.87%)
Mutual labels:  reinforcement-learning
Mapleai
A curated collection of learning materials across AI fields. (A collection of the skills and knowledge needed to land an AI-related job offer, including online blogs, my personal blog posts, and copies of e-books.)
Stars: ✭ 89 (-3.26%)
Mutual labels:  reinforcement-learning
Applied Ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+19273.91%)
Mutual labels:  reinforcement-learning
Reinforcement Learning Wechat Jump
Reinforcement Learning for WeChat Jump
Stars: ✭ 85 (-7.61%)
Mutual labels:  reinforcement-learning
Magnet
MAGNet: Multi-agents control using Graph Neural Networks
Stars: ✭ 88 (-4.35%)
Mutual labels:  reinforcement-learning
Categorical Dqn
A working implementation of the Categorical DQN (Distributional RL).
Stars: ✭ 90 (-2.17%)
Mutual labels:  reinforcement-learning
Run Skeleton Run
Reason8.ai PyTorch solution for NIPS RL 2017 challenge
Stars: ✭ 83 (-9.78%)
Mutual labels:  reinforcement-learning
Magent
A Platform for Many-agent Reinforcement Learning
Stars: ✭ 1,306 (+1319.57%)
Mutual labels:  reinforcement-learning
Hand dapg
Repository to accompany RSS 2018 paper on dexterous hand manipulation
Stars: ✭ 88 (-4.35%)
Mutual labels:  reinforcement-learning
Grid2op
Grid2Op a testbed platform to model sequential decision making in power systems.
Stars: ✭ 91 (-1.09%)
Mutual labels:  reinforcement-learning
Simulator
A ROS/ROS2 Multi-robot Simulator for Autonomous Vehicles
Stars: ✭ 1,260 (+1269.57%)
Mutual labels:  reinforcement-learning
Stable Baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Stars: ✭ 1,263 (+1272.83%)
Mutual labels:  reinforcement-learning
Tnt
Simple tools for logging and visualizing, loading and training
Stars: ✭ 1,298 (+1310.87%)
Mutual labels:  reinforcement-learning
Maze
Maze Applied Reinforcement Learning Framework
Stars: ✭ 85 (-7.61%)
Mutual labels:  reinforcement-learning
Deep Learning Drizzle
Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
Stars: ✭ 9,717 (+10461.96%)
Mutual labels:  reinforcement-learning
Sc2aibot
Implementing reinforcement-learning algorithms for pysc2 -environment
Stars: ✭ 83 (-9.78%)
Mutual labels:  reinforcement-learning
Torchrl
Pytorch Implementation of Reinforcement Learning Algorithms ( Soft Actor Critic(SAC)/ DDPG / TD3 /DQN / A2C/ PPO / TRPO)
Stars: ✭ 90 (-2.17%)
Mutual labels:  reinforcement-learning
60 days rl challenge
Chinese version of 60_Days_RL_Challenge
Stars: ✭ 92 (+0%)
Mutual labels:  reinforcement-learning
Cs234
My Solution to Assignments of CS234
Stars: ✭ 91 (-1.09%)
Mutual labels:  reinforcement-learning
Safeopt
Safe Bayesian Optimization
Stars: ✭ 90 (-2.17%)
Mutual labels:  reinforcement-learning

PPO

Proximal Policy Optimization implementation with TensorFlow.

https://arxiv.org/pdf/1707.06347.pdf
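For readers new to the paper linked above, its central idea is the clipped surrogate objective: the probability ratio between the new and old policy is clipped to [1 - epsilon, 1 + epsilon] so that a single update cannot move the policy too far. The following is a minimal standard-library sketch of that objective only. It is not code from this repository, and the function name, argument names, and the default epsilon of 0.2 are assumptions made for illustration.

```python
import math


def clipped_surrogate(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    Hypothetical helper for illustration; not taken from this repository.
    All three inputs are equal-length sequences of floats.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
        clipped = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)
```

In training, this quantity is maximized (or its negative is minimized) with respect to the policy parameters. When the advantage is positive, the clip caps the gain from increasing the ratio. When the advantage is negative, the unclipped term is kept whenever it is more pessimistic.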

This repository has changed substantially since commit a4fbd383f0f89ce2d881a8b78d6b8a03294e5c7c. The new PPO requires a new dependency, rlsaber, my utility repository that can be shared across different algorithms.

Parts of my design follow OpenAI Baselines. Unlike Baselines, however, I used default TensorFlow packages as much as possible, which makes my code easier to read.

In addition, my PPO automatically switches between continuous and discrete action spaces depending on the environment. To change hyperparameters, check atari_constants.py or box_constants.py; the appropriate file is also loaded depending on the environment.
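One common way to implement this kind of automatic switching is to inspect the type of the environment's action space and pick the policy output accordingly. The sketch below illustrates the idea with two small stand-in classes named after Gym's Discrete and Box spaces, so it runs without Gym installed. The helper name describe_policy_head, the returned dictionary, and the choice of categorical and Gaussian distributions are all assumptions for illustration; the README does not state how this repository makes the decision.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Discrete:
    """Stand-in for a discrete action space with n possible actions."""
    n: int


@dataclass
class Box:
    """Stand-in for a continuous action space with the given shape."""
    shape: Tuple[int, ...]


def describe_policy_head(action_space):
    """Choose a policy output layout from the action-space type.

    Hypothetical helper; not taken from this repository.
    """
    if isinstance(action_space, Discrete):
        # One output per action, interpreted as categorical logits.
        return {"kind": "discrete", "distribution": "categorical",
                "num_outputs": action_space.n}
    if isinstance(action_space, Box):
        # One mean per action dimension of a Gaussian policy.
        dim = 1
        for size in action_space.shape:
            dim *= size
        return {"kind": "continuous", "distribution": "gaussian",
                "num_outputs": dim}
    raise TypeError("unsupported action space: %r" % (action_space,))
```

With a real Gym environment, the same check would typically be written against gym.spaces.Discrete and gym.spaces.Box using env.action_space.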

requirements

  • Python3

dependencies

usage

training

$ python train.py [--env env-id] [--render] [--logdir log-name]

example

$ python train.py --env BreakoutNoFrameskip-v4 --logdir breakout

playing

$ python train.py --demo --load results/path-to-model [--env env-id] [--render]

example

$ python train.py --demo --load results/breakout/model.ckpt-xxxx --env BreakoutNoFrameskip-v4 --render
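For reference, the flags used in the commands above could be parsed with argparse roughly as follows. This is a sketch, not the actual contents of train.py: only the flag names come from the usage lines, while the defaults, help strings, and the build_parser name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the usage lines.

    Hypothetical reconstruction; defaults are assumptions.
    """
    parser = argparse.ArgumentParser(description="Train or demo a PPO agent")
    # Environment id, e.g. BreakoutNoFrameskip-v4 (default is an assumption).
    parser.add_argument("--env", type=str, default="Pendulum-v0")
    # Render the environment while running.
    parser.add_argument("--render", action="store_true")
    # Name of the log directory, e.g. breakout.
    parser.add_argument("--logdir", type=str, default=None)
    # Play with a trained model instead of training.
    parser.add_argument("--demo", action="store_true")
    # Path to a saved model checkpoint to load.
    parser.add_argument("--load", type=str, default=None)
    return parser
```

For example, build_parser().parse_args(["--env", "BreakoutNoFrameskip-v4", "--logdir", "breakout"]) yields a namespace with env and logdir set and with render and demo left as False.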

performance examples

Pendulum-v0

image

BreakoutNoFrameskip-v4

image

implementation

This is inspired by the following projects.

License

This repository is MIT-licensed.
