
Maluuba / hra

License: other
Hybrid Reward Architecture

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives to or similar to hra

Stable Baselines
Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
Stars: ✭ 115 (+51.32%)
Mutual labels:  rl
Rl Baselines3 Zoo
A collection of pre-trained RL agents using Stable Baselines3, training and hyperparameter optimization included.
Stars: ✭ 161 (+111.84%)
Mutual labels:  rl
Gymfc
A universal flight control tuning framework
Stars: ✭ 210 (+176.32%)
Mutual labels:  rl
Pytorch Rl
Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]
Stars: ✭ 121 (+59.21%)
Mutual labels:  rl
Cherry
A PyTorch Library for Reinforcement Learning Research
Stars: ✭ 143 (+88.16%)
Mutual labels:  rl
Atari
AI research environment for the Atari 2600 games 🤖.
Stars: ✭ 174 (+128.95%)
Mutual labels:  rl
Dopamine
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
Stars: ✭ 9,681 (+12638.16%)
Mutual labels:  rl
mujoco-benchmark
Provides a full reinforcement learning benchmark on MuJoCo environments, including DDPG, SAC, TD3, PG, A2C, and PPO.
Stars: ✭ 101 (+32.89%)
Mutual labels:  rl
T 1000
⚡️ ⚡️ Deep RL Algotrading with Ray API
Stars: ✭ 143 (+88.16%)
Mutual labels:  rl
Alphazero gomoku
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
Stars: ✭ 2,570 (+3281.58%)
Mutual labels:  rl
Arel
Code for the ACL paper "No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling"
Stars: ✭ 124 (+63.16%)
Mutual labels:  rl
Real Time Ml Project
A curated list of applied machine learning and data science notebooks and libraries across different industries.
Stars: ✭ 143 (+88.16%)
Mutual labels:  rl
Rl trading
An environment for training high-frequency trading agents with reinforcement learning
Stars: ✭ 205 (+169.74%)
Mutual labels:  rl
Ros2learn
ROS 2-enabled machine learning algorithms
Stars: ✭ 119 (+56.58%)
Mutual labels:  rl
Pytorch Drl
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.
Stars: ✭ 233 (+206.58%)
Mutual labels:  rl
Aws Robomaker Sample Application Deepracer
Use AWS RoboMaker to run a simulation that trains a reinforcement learning (RL) model to drive a car around a track
Stars: ✭ 105 (+38.16%)
Mutual labels:  rl
Coach
Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state-of-the-art reinforcement learning algorithms
Stars: ✭ 2,085 (+2643.42%)
Mutual labels:  rl
Transformers-RL
An easy PyTorch implementation of "Stabilizing Transformers for Reinforcement Learning"
Stars: ✭ 107 (+40.79%)
Mutual labels:  rl
Learning To Communicate Pytorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
Stars: ✭ 236 (+210.53%)
Mutual labels:  rl
Rl Tutorial Jnrr19
Stable-Baselines tutorial for Journées Nationales de la Recherche en Robotique 2019
Stars: ✭ 204 (+168.42%)
Mutual labels:  rl

Hybrid Reward Architecture

This repository hosts the code published along with the following NIPS article (Experiment 4.1: Fruit Collection Task):

van Seijen et al., "Hybrid Reward Architecture for Reinforcement Learning", NIPS 2017 (arXiv:1706.04208).

For more information about this article, see the blog posts that accompanied its release.
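
In brief, HRA decomposes the reward function into components and learns a separate value function for each; the agent then acts on the aggregate of the heads. Schematically, for n reward components (a summary of the paper's idea, not a verbatim excerpt):

r(s, a) = r_1(s, a) + ... + r_n(s, a)
Q_HRA(s, a) = Q_1(s, a) + ... + Q_n(s, a)

In the Fruit Collection Task, each potential fruit location gets its own reward component; the GVF variants below additionally pair each component with a general value function.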

Dependencies

We strongly suggest using the Anaconda distribution.

  • Python 3.5 or higher
  • pygame 1.9.2+ (pip install pygame)
  • click (pip install click)
  • numpy (pip install numpy, or install the Anaconda distribution)
  • Keras 1.2.0+, but less than 2.0 (pip install keras==1.2)
  • Theano or TensorFlow (pip install theano); the code is fully tested on Theano
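
For a reproducible setup, here is a minimal sketch using Anaconda; the environment name hra-env is our choice, not something the repository defines:

# Hypothetical environment name; Python 3.5 matches the requirement above.
conda create -n hra-env python=3.5
source activate hra-env    # on newer conda versions: conda activate hra-env
pip install pygame click numpy keras==1.2 theano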

Usage

While a run is in progress, the results as well as the trained models are saved in the ./results subfolder. For a complete run (five experiments for each method), use the following command; it may take several hours depending on your machine:

./run.sh
  • NOTE: Because the state shape is relatively small, the deep RL methods in this code run faster on CPU.

Alternatively, for a single run use the following commands:

  • Tabular GVF:
ipython ./tabular/train.py -- -o use_gvf True -o folder_name tabular_gvf_ -o nb_experiments 1
  • Tabular no-GVF:
ipython ./tabular/train.py -- -o use_gvf False -o folder_name tabular_no-gvf_ -o nb_experiments 1
  • DQN:
THEANO_FLAGS="device=cpu" ipython ./dqn/train.py -- --mode hra+1 -o nb_experiments 1
  • --mode can be any of dqn, dqn+1, hra, hra+1, or all; a loop over the four single modes is sketched below.
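
As referenced above, the four single modes can also be run back to back. This is a minimal sketch assuming a POSIX shell, combining only the flags documented above:

# Run one experiment per single mode on CPU (flags as documented above).
for mode in dqn dqn+1 hra hra+1; do
    THEANO_FLAGS="device=cpu" ipython ./dqn/train.py -- --mode "$mode" -o nb_experiments 1
done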

Demo

We have also provided code to demo the tabular GVF and no-GVF methods. First train a model using one of the commands above (tabular GVF or no-GVF), then run the demo. For example:

ipython ./tabular/train.py -- -o use_gvf True -o folder_name tabular_gvf_ -o nb_experiments 1
ipython ./tabular/train.py -- --demo -o folder_name tabular_gvf_

If you would like to save the results, use the --save option:

ipython ./tabular/train.py -- --demo --save -o folder_name tabular_gvf_

The rendered images will be saved in the ./render directory by default.
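
To stitch the saved frames into a video, something like the following works; the glob pattern is an assumption, so adjust it to the file names the demo actually writes to ./render:

# Assumes numbered PNG frames in ./render; adjust the glob to the actual file names.
ffmpeg -framerate 10 -pattern_type glob -i './render/*.png' -pix_fmt yuv420p demo.mp4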

License

Please refer to LICENSE.txt.
