Scitator / Run Skeleton Run
Licence: MIT
Reason8.ai PyTorch solution for NIPS RL 2017 challenge
Run-Skeleton-Run
Reason8.ai PyTorch solution that took 3rd place in the NIPS RL 2017 challenge.
Additional thanks to Mikhail Pavlov for collaboration.
Agent policies
no-flip-state-action
flip-state-action
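The two policy names above match the --flip-state-action training flag. A minimal sketch of the idea, assuming the flag refers to mirroring the left and right body sides of the skeleton (the cited paper lists "action and state reflecting" among its improvements): each transition can be stored a second time with left and right components swapped. The index lists and function names below are hypothetical and do not reflect the real OpenSim observation or action layout.

```python
# Hypothetical index layout, for illustration only.
# The real observation/action vectors of the environment are laid out differently.
LEFT_STATE_IDX = [0, 1]
RIGHT_STATE_IDX = [2, 3]
LEFT_ACTION_IDX = [0]
RIGHT_ACTION_IDX = [1]


def swap_sides(vec, left_idx, right_idx):
    """Return a copy of vec with left and right components exchanged."""
    flipped = list(vec)
    for l, r in zip(left_idx, right_idx):
        flipped[l], flipped[r] = vec[r], vec[l]
    return flipped


def flip_transition(state, action, reward, next_state):
    """Mirror one transition; the reward is assumed invariant under the flip."""
    return (
        swap_sides(state, LEFT_STATE_IDX, RIGHT_STATE_IDX),
        swap_sides(action, LEFT_ACTION_IDX, RIGHT_ACTION_IDX),
        reward,
        swap_sides(next_state, LEFT_STATE_IDX, RIGHT_STATE_IDX),
    )


if __name__ == "__main__":
    # Components not listed in the index tables (here the last one) stay in place.
    print(flip_transition([1, 2, 3, 4, 5], [0.1, 0.9], 1.0, [6, 7, 8, 9, 10]))
```

Adding the mirrored copy to the replay buffer alongside the original would double the number of stored transitions per simulated step, which matters when the simulator is slow.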
How to set up the environment?
1. sh setup_conda.sh
2. source activate opensim-rl
Would you like to test the baselines? (MPI support is required.)
3. sudo apt-get install openmpi-bin openmpi-doc libopenmpi-dev
4. sh setup_env_mpi.sh
Or would you rather use the DDPG agents?
3. sh setup_env.sh
Congrats! Now you are ready to check our agents.
Run DDPG agent
CUDA_VISIBLE_DEVICES="" PYTHONPATH=. python ddpg/train.py \
--logdir ./logs_ddpg \
--num-threads 4 \
--ddpg-wrapper \
--skip-frames 5 \
--fail-reward -0.2 \
--reward-scale 10 \
--flip-state-action \
--actor-layers 64-64 --actor-layer-norm --actor-parameters-noise \
--actor-lr 0.001 --actor-lr-end 0.00001 \
--critic-layers 64-32 --critic-layer-norm \
--critic-lr 0.002 --critic-lr-end 0.00001 \
--initial-epsilon 0.5 --final-epsilon 0.001 \
--tau 0.0001
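For readers unfamiliar with the --tau flag: in DDPG, tau is conventionally the soft (Polyak) update coefficient that moves the target networks slowly toward the online networks. The sketch below assumes that conventional meaning and uses plain Python lists in place of network parameters. The function name soft_update is illustrative and is not taken from this repository.

```python
def soft_update(target, source, tau):
    """Blend source parameters into target.

    Computes target <- (1 - tau) * target + tau * source, element by element.
    """
    return [(1.0 - tau) * t + tau * s for t, s in zip(target, source)]


if __name__ == "__main__":
    target_params = [0.0, 1.0]
    online_params = [1.0, 0.0]
    # With tau = 0.0001 the target moves only 0.01% of the way per update.
    print(soft_update(target_params, online_params, tau=0.0001))
```

With a value as small as 0.0001, the target network lags far behind the online network, which trades learning speed for stability.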
Evaluate DDPG agent
CUDA_VISIBLE_DEVICES="" PYTHONPATH=./ python ddpg/submit.py \
--restore-actor-from ./logs_ddpg/actor_state_dict.pkl \
--restore-critic-from ./logs_ddpg/critic_state_dict.pkl \
--restore-args-from ./logs_ddpg/args.json \
--num-episodes 10
Run TRPO/PPO agent
CUDA_VISIBLE_DEVICES="" PYTHONPATH=. python ddpg/train.py \
--agent ppo \
--logdir ./logs_baseline \
--baseline-wrapper \
--skip-frames 5 \
--fail-reward -0.2 \
--reward-scale 10
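The --skip-frames, --fail-reward and --reward-scale flags appear in both training commands. The sketch below shows one plausible reading of them, not the repository's actual wrapper: repeat each action for several simulator frames, add a penalty when the episode ends before a time limit, and multiply the result by a constant. The class names, the max_steps limit, the order of penalty and scaling, and the toy environment are all assumptions made for illustration.

```python
class ToyEnv:
    """Stand-in environment: pays 0.01 per frame and 'falls' after fall_at frames."""

    def __init__(self, fall_at):
        self.fall_at = fall_at
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.fall_at
        return self.t, 0.01, done, {}


class ShapedEnv:
    """Hypothetical wrapper combining frame skip, fail penalty and reward scaling."""

    def __init__(self, env, skip_frames=5, fail_reward=-0.2,
                 reward_scale=10.0, max_steps=1000):
        self.env = env
        self.skip_frames = skip_frames
        self.fail_reward = fail_reward
        self.reward_scale = reward_scale
        self.max_steps = max_steps  # assumed episode time limit
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        total = 0.0
        for _ in range(self.skip_frames):
            # Repeat the same action and accumulate the raw reward.
            obs, reward, done, info = self.env.step(action)
            self.steps += 1
            total += reward
            if done:
                break
        # Assumption: ending before the time limit counts as a failure,
        # and the penalty is applied before scaling.
        if done and self.steps < self.max_steps:
            total += self.fail_reward
        return obs, total * self.reward_scale, done, info


if __name__ == "__main__":
    env = ShapedEnv(ToyEnv(fall_at=7))
    env.reset()
    print(env.step(None))  # five frames survive
    print(env.step(None))  # falls on the seventh frame
```

Skipping frames reduces the number of policy decisions per episode, which helps when each simulator step is expensive.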
Citation
Please cite the following paper if you find this repository useful.
@article{run_skeleton,
title={Run, skeleton, run: skeletal model in a physics-based simulation},
author={Mikhail Pavlov and Sergey Kolesnikov and Sergey M.~Plis},
journal={AAAI Spring Symposium Series},
year={2018}
}