A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.

Stars: ✭ 50 (-49.49%)

Mutual labels: ppo

Pgdrive

PGDrive: an open-ended driving simulator with infinite scenes from procedural generation

Stars: ✭ 60 (-39.39%)

Mutual labels: imitation-learning

Run Skeleton Run

Reason8.ai PyTorch solution for NIPS RL 2017 challenge

Stars: ✭ 83 (-16.16%)

Mutual labels: ppo

Deeprl algorithms

DeepRL algorithm implementations that are easy to read and understand, in PyTorch and TensorFlow 2 (DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)

Stars: ✭ 97 (-2.02%)

Mutual labels: ppo

Slm Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

Stars: ✭ 904 (+813.13%)

Mutual labels: ppo

Hand dapg

Repository to accompany RSS 2018 paper on dexterous hand manipulation

Stars: ✭ 88 (-11.11%)

Mutual labels: imitation-learning

Torch Ac

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO

Stars: ✭ 70 (-29.29%)

Mutual labels: ppo

Dogtorch

Who Let The Dogs Out? Modeling Dog Behavior From Visual Data https://arxiv.org/pdf/1803.10827.pdf

Stars: ✭ 66 (-33.33%)

Mutual labels: imitation-learning

Mario rl

Stars: ✭ 60 (-39.39%)

Mutual labels: ppo

Sc2aibot

Implementing reinforcement-learning algorithms for the pysc2 environment

Stars: ✭ 83 (-16.16%)

Mutual labels: ppo

Learning2run

Our NIPS 2017: Learning to Run source code

Stars: ✭ 57 (-42.42%)

Mutual labels: ppo

Torchrl

PyTorch implementation of reinforcement learning algorithms (Soft Actor-Critic (SAC), DDPG, TD3, DQN, A2C, PPO, TRPO)

Stars: ✭ 90 (-9.09%)

Mutual labels: ppo

Deterministic Gail Pytorch

PyTorch implementation of Deterministic Generative Adversarial Imitation Learning (GAIL) for Off Policy learning

Stars: ✭ 44 (-55.56%)

Mutual labels: imitation-learning

Imitation Learning

Imitation learning algorithms

Stars: ✭ 85 (-14.14%)

Mutual labels: imitation-learning

On Policy

This is the official implementation of Multi-Agent PPO.

Stars: ✭ 63 (-36.36%)

Mutual labels: ppo

Imitation Learning

Autonomous driving: Tensorflow implementation of the paper "End-to-end Driving via Conditional Imitation Learning"

Stars: ✭ 60 (-39.39%)

Mutual labels: imitation-learning

Inverse rl

Adversarial Imitation Via Variational Inverse Reinforcement Learning

Stars: ✭ 79 (-20.2%)

Mutual labels: imitation-learning

Generative Adversarial Imitation Learning

Implementation of Generative Adversarial Imitation Learning (GAIL) using TensorFlow
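As background, GAIL trains a discriminator to tell expert (state, action) pairs from the policy's own pairs, and rewards the policy for pairs the discriminator mistakes for expert data. The following is a minimal plain-Python sketch of that idea, assuming the convention that the discriminator outputs the probability that a pair came from the expert. The function names and the `eps` clipping value are hypothetical and are not taken from this repository's code.

```python
import math


def discriminator_loss(expert_probs, agent_probs, eps=1e-8):
    """Binary cross-entropy with expert pairs labelled 1 and agent pairs labelled 0.

    Each argument is a list of discriminator outputs in [0, 1].
    """
    expert_term = sum(math.log(max(p, eps)) for p in expert_probs) / len(expert_probs)
    agent_term = sum(math.log(max(1.0 - p, eps)) for p in agent_probs) / len(agent_probs)
    return -(expert_term + agent_term)


def surrogate_reward(agent_prob, eps=1e-8):
    """Reward handed to the policy: larger when the discriminator is fooled."""
    return math.log(max(agent_prob, eps))
```

Under this convention a policy step that the discriminator scores at 0.9 earns a higher reward than one scored at 0.1, which is what pushes the policy toward expert-like behavior.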

Dependencies

python>=3.5
tensorflow>=1.4
gym>=0.9.3

Gym environment

Environment: CartPole-v0
State space: continuous
Action space: discrete

Usage

Train experts

python3 run_ppo.py

Sample trajectory using expert

python3 sample_trajectory.py

Run GAIL

python3 run_gail.py

Run supervised learning (behavior cloning)

python3 run_behavior_clone.py
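The training steps above appear in a fixed order: train the PPO expert, sample its trajectories, then fit either GAIL or behavior cloning. A minimal sketch that assembles those commands follows. The helper names `training_commands` and `run_all` are hypothetical, and the idea that each step depends on the previous one is inferred from the order of this README rather than confirmed from the scripts.

```python
import subprocess


def training_commands(learner="gail"):
    """Return the README's training commands, in order, as argument lists."""
    commands = [
        ["python3", "run_ppo.py"],            # train the expert with PPO
        ["python3", "sample_trajectory.py"],  # sample trajectories from the expert
    ]
    if learner == "gail":
        commands.append(["python3", "run_gail.py"])
    elif learner == "bc":
        commands.append(["python3", "run_behavior_clone.py"])
    else:
        raise ValueError("learner must be 'gail' or 'bc'")
    return commands


def run_all(commands):
    """Run each command in turn and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the plan only; call run_all(...) from the repository root to execute it.
    for cmd in training_commands("gail"):
        print(" ".join(cmd))
```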

Test trained policy

python3 test_policy.py

By default, the policy trained with GAIL is tested.
Pass --alg=bc or --alg=ppo to test a different policy.

To test a BC policy, also pass --model with the checkpoint number, taken from the model.ckpt-<number> files in the directory trained_models/bc.
Example:

python3 test_policy.py --alg=bc --model=1000
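To pick a value for --model, it can help to list which checkpoint numbers exist. The sketch below scans a directory for files named model.ckpt-<number>.<suffix> and builds the matching test command. It assumes the checkpoints follow that naming (the usual TensorFlow 1.x Saver layout of .index, .meta, and .data files); the helper names `available_checkpoints` and `build_test_command` are hypothetical.

```python
import os
import re

# Matches e.g. "model.ckpt-1000.index" and captures the number.
CKPT_PATTERN = re.compile(r"^model\.ckpt-(\d+)\.")


def available_checkpoints(directory):
    """Return the sorted checkpoint numbers found in `directory`."""
    numbers = set()
    for name in os.listdir(directory):
        match = CKPT_PATTERN.match(name)
        if match:
            numbers.add(int(match.group(1)))
    return sorted(numbers)


def build_test_command(alg=None, model=None):
    """Build the test_policy.py command; omitted flags keep the script defaults."""
    cmd = ["python3", "test_policy.py"]
    if alg is not None:
        cmd.append("--alg={}".format(alg))
    if model is not None:
        cmd.append("--model={}".format(model))
    return cmd


if __name__ == "__main__":
    bc_dir = os.path.join("trained_models", "bc")
    if os.path.isdir(bc_dir):
        found = available_checkpoints(bc_dir)
        if found:
            # Test the most recent BC checkpoint.
            print(" ".join(build_test_command(alg="bc", model=found[-1])))
```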

Tensorboard

tensorboard --logdir=log

Results


Fig. 1: Training results (figure and legend images not reproduced here)

LICENSE

MIT LICENSE

Stars: ✭ 99
