
Let's do Inverse RL


Introduction

This repository contains PyTorch (v0.4.1) implementations of Inverse Reinforcement Learning (IRL) algorithms.

  • Apprenticeship Learning via Inverse Reinforcement Learning [2]
  • Maximum Entropy Inverse Reinforcement Learning [4]
  • Generative Adversarial Imitation Learning [5]
  • Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow [6]

We have implemented and trained agents with these IRL algorithms in the following environments: MountainCar-v0 and MuJoCo Hopper-v2.

For reference, Korean-language reviews of the papers above are available in the Let's do Inverse RL Guide.

Table of Contents

  • Mountain car
  • Mujoco Hopper
  • Reference
  • Implementation team members

Mountain car

We have implemented APP and MaxEnt, using Q-learning as the RL step, in the MountainCar-v0 environment.
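
As a rough illustration of what using Q-learning as the RL step looks like, here is a minimal sketch of a tabular Q-update driven by an IRL-learned reward instead of the environment reward. The discretization size, hyperparameters, and function names are assumptions for illustration, not the repository's code.

```python
# Minimal sketch (assumed names/values): a tabular Q-learning update that
# uses an IRL-learned reward in place of the environment reward.
import numpy as np

n_states, n_actions = 400, 3           # assumed discretization of MountainCar-v0
q_table = np.zeros((n_states, n_actions))
alpha, gamma = 0.03, 0.99              # illustrative hyperparameters

def q_update(s, a, s_next, learned_reward):
    """One Q-learning step using the reward recovered by IRL."""
    td_target = learned_reward + gamma * np.max(q_table[s_next])
    q_table[s, a] += alpha * (td_target - q_table[s, a])
```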

1. Information

2. Expert's demonstrations

The expert demonstrations are provided as expert_demo.npy in lets-do-irl/mountaincar/app/expert_demo and lets-do-irl/mountaincar/maxent/expert_demo.

The expert demonstrations have shape (20, 130, 3): (number of demonstrations, length of each demonstration, state and action dimensions of each step).

If you want to make your own demonstrations, use make_expert.py in lets-do-irl/mountaincar/app/expert_demo or lets-do-irl/mountaincar/maxent/expert_demo.
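
To take a quick look at the data, the snippet below loads the file with NumPy; the exact path is assumed from the layout above.

```python
# Sketch: inspect the provided expert demonstrations (path assumed from the layout above).
import numpy as np

demos = np.load("lets-do-irl/mountaincar/app/expert_demo/expert_demo.npy")
print(demos.shape)  # expected: (20, 130, 3) -> (demos, steps per demo, state + action)
```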

3. Train & Test

APP

Navigate to lets-do-irl/mountaincar/app folder.

Train the agent with APP without rendering:

python train.py

To test APP, run the agent with the saved model app_q_table.npy in the app/results folder:

python test.py
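
For background, the reward update in APP is the projection step of Abbeel & Ng (2004): the reward weights point from the projected feature expectations of the policies found so far toward the expert's. The sketch below is a generic version of that step, not the repository's exact code.

```python
# Sketch of the APP projection step (generic, assumed variable names).
import numpy as np

def projection_step(mu_expert, mu_bar_prev, mu_new):
    # mu_bar_prev: previous projected feature expectations;
    # mu_new: feature expectations of the newest policy.
    direction = mu_new - mu_bar_prev
    coeff = np.dot(direction, mu_expert - mu_bar_prev) / np.dot(direction, direction)
    mu_bar = mu_bar_prev + coeff * direction
    w = mu_expert - mu_bar          # reward weights for the next RL step
    t = np.linalg.norm(w)           # margin; stop when t falls below a threshold
    return w, mu_bar, t
```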

MaxEnt

Navigate to lets-do-irl/mountaincar/maxent folder.

Train the agent with MaxEnt without rendering:

python train.py

To test MaxEnt, run the agent with the saved model maxent_q_table.npy in the maxent/results folder:

python test.py
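
For background, the MaxEnt IRL weight update follows the gradient that matches expert and policy state-visitation feature counts. The sketch below is a generic version over an assumed discretized state space, not the repository's exact code.

```python
# Sketch of the MaxEnt IRL weight update (generic, assumed variable names).
import numpy as np

def maxent_irl_step(theta, feature_matrix, expert_visits, policy_visits, lr=0.05):
    # feature_matrix: (n_states, n_features); *_visits: expected state-visitation counts.
    grad = feature_matrix.T @ (expert_visits - policy_visits)
    return theta + lr * grad
```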

4. Trained Agent

We have trained agents with these two IRL algorithms in the MountainCar-v0 environment.

Score-per-episode training GIFs for APP and MaxEnt are included in the repository.

Mujoco Hopper

We have implemented GAIL and VAIL, using PPO as the RL step, in the Hopper-v2 environment.

1. Installation

2. Expert's demonstrations

The expert demonstrations are provided as expert_demo.p in lets-do-irl/mujoco/gail/expert_demo and lets-do-irl/mujoco/vail/expert_demo.

The expert demonstrations have shape (50000, 14): (number of demonstration samples, state and action dimensions of each step).

We used demonstrations whose scores average roughly between 2200 and 2600.

If you want to make your own demonstrations, use main.py in the lets-do-irl/mujoco/ppo folder.

A detailed implementation write-up of PPO (in Korean) is also available in the PG Travel implementation story.
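
As with the MountainCar data, the snippet below is a quick way to inspect the pickled demonstrations; the exact path is assumed from the layout above.

```python
# Sketch: inspect the pickled expert demonstrations (path assumed from the layout above).
import pickle

with open("lets-do-irl/mujoco/gail/expert_demo/expert_demo.p", "rb") as f:
    demos = pickle.load(f)
print(getattr(demos, "shape", len(demos)))  # expected: (50000, 14)
```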

3. Train & Test

GAIL

Navigate to lets-do-irl/mujoco/gail folder.

Train the agent with GAIL without rendering:

python main.py

To continue training from a saved checkpoint:

python main.py --load_model ckpt_4000_gail.pth.tar
  • Note that the ckpt_4000_gail.pth.tar file should be in the mujoco/gail/save_model folder.

To test GAIL, run the agent with the saved model ckpt_4000_gail.pth.tar in the mujoco/gail/save_model folder:

python test.py --load_model ckpt_4000_gail.pth.tar
  • Note that the ckpt_4000_gail.pth.tar file should be in the mujoco/gail/save_model folder.
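
For intuition about what main.py trains, the sketch below shows a generic GAIL-style discriminator update in PyTorch; the network sizes, label convention, and names are illustrative assumptions, not the repository's exact code.

```python
# Sketch of a GAIL-style discriminator step: the discriminator separates
# expert (state, action) pairs from policy rollouts, and -log D(s, a)
# (or log(1 - D)) is then used as the imitation reward for PPO.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, states, actions):
        return self.net(torch.cat([states, actions], dim=1))

def discriminator_step(disc, optim, expert_s, expert_a, policy_s, policy_a):
    bce = nn.BCELoss()
    expert_prob = disc(expert_s, expert_a)
    policy_prob = disc(policy_s, policy_a)
    # Label convention (an assumption): expert pairs -> 1, policy pairs -> 0.
    loss = bce(expert_prob, torch.ones_like(expert_prob)) + \
           bce(policy_prob, torch.zeros_like(policy_prob))
    optim.zero_grad()
    loss.backward()
    optim.step()
    return loss.item()
```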

VAIL

Navigate to lets-do-irl/mujoco/vail folder.

Train the agent with VAIL without rendering:

python main.py

To continue training from a saved checkpoint:

python main.py --load_model ckpt_4000_vail.pth.tar
  • Note that the ckpt_4000_vail.pth.tar file should be in the mujoco/vail/save_model folder.

To test VAIL, run the agent with the saved model ckpt_4000_vail.pth.tar in the mujoco/vail/save_model folder:

python test.py --load_model ckpt_4000_vail.pth.tar
  • Note that the ckpt_4000_vail.pth.tar file should be in the mujoco/vail/save_model folder.
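
VAIL adds a variational discriminator bottleneck (VDB) on top of the GAIL objective: the discriminator encodes (s, a) into a stochastic latent and a KL term keeps that encoding close to N(0, I), weighted by a Lagrange multiplier. The sketch below shows those extra terms in generic form; the constraint value i_c and the dual step size are illustrative assumptions.

```python
# Sketch of the VDB terms used in VAIL (generic, assumed names/values).
import torch

def kl_to_standard_normal(mu, logstd):
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dims, averaged over the batch.
    return (0.5 * (mu.pow(2) + torch.exp(2 * logstd) - 1) - logstd).sum(dim=1).mean()

def vdb_discriminator_loss(bce_loss, mu, logstd, beta, i_c=0.5):
    kl = kl_to_standard_normal(mu, logstd)
    return bce_loss + beta * (kl - i_c), kl

# Dual ascent on the multiplier after each discriminator update:
# beta = max(0.0, beta + 1e-5 * (kl.item() - i_c))
```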

4. Tensorboard

Note that training results are automatically saved in the logs folder. TensorboardX is a TensorBoard-like visualization tool for PyTorch.

Navigate to the lets-do-irl/mujoco/gail or lets-do-irl/mujoco/vail folder.

tensorboard --logdir logs
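
If you want to log additional scalars yourself, TensorboardX's SummaryWriter writes to the same kind of logs directory; the tag name below is illustrative.

```python
# Sketch: writing a scalar with TensorboardX so it appears under `tensorboard --logdir logs`.
from tensorboardX import SummaryWriter

writer = SummaryWriter("logs")
for iteration in range(3):
    writer.add_scalar("score/average_return", 1000.0 + iteration, iteration)
writer.close()
```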

5. Trained Agent

We have trained agents with these two IRL algorithms in the Hopper-v2 environment.

Score-per-iteration plots (total sample size: 2048) are included for PPO (for comparison), GAIL, VAIL, and a combined plot of all three.

Reference

We referenced code from the repositories listed below.

Implementation team members

Dongmin Lee (project manager) : Github, Facebook

Seungje Yoon : Github, Facebook

Seunghyun Lee : Github, Facebook

Geonhee Lee : Github, Facebook
