nnaisense / Max

Code for reproducing experiments in Model-Based Active Exploration, ICML 2019

Projects that are alternatives of or similar to Max

Pytorch A3c
PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
Stars: ✭ 879 (+1340.98%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Awesome Deep Rl
For deep RL and the future of AI.
Stars: ✭ 985 (+1514.75%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Drlkit
A High Level Python Deep Reinforcement Learning library. Great for beginners, prototyping and quickly comparing algorithms
Stars: ✭ 29 (-52.46%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Slm Lab
Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
Stars: ✭ 904 (+1381.97%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Deterministic Gail Pytorch
PyTorch implementation of Deterministic Generative Adversarial Imitation Learning (GAIL) for Off Policy learning
Stars: ✭ 44 (-27.87%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Paac.pytorch
Pytorch implementation of the PAAC algorithm presented in Efficient Parallel Methods for Deep Reinforcement Learning https://arxiv.org/abs/1705.04862
Stars: ✭ 22 (-63.93%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Rlcard
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
Stars: ✭ 980 (+1506.56%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Deeprl Tutorials
Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch
Stars: ✭ 748 (+1126.23%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Deep Q Learning
Minimal Deep Q Learning (DQN & DDQN) implementations in Keras
Stars: ✭ 1,013 (+1560.66%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Deepbootcamp
Solved lab problems, slides and notes of the Deep Reinforcement Learning bootcamp 2017 held at UC Berkeley
Stars: ✭ 39 (-36.07%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Pygame Learning Environment
PyGame Learning Environment (PLE) -- Reinforcement Learning Environment in Python.
Stars: ✭ 828 (+1257.38%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Ml In Tf
Get started with Machine Learning in TensorFlow with a selection of good reads and implemented examples!
Stars: ✭ 45 (-26.23%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Btgym
Scalable, event-driven, deep-learning-friendly backtesting library
Stars: ✭ 765 (+1154.1%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Rl algos
Reinforcement Learning Algorithms
Stars: ✭ 14 (-77.05%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Osim Rl
Reinforcement learning environments with musculoskeletal models
Stars: ✭ 763 (+1150.82%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Left Shift
Using deep reinforcement learning to tackle the game 2048.
Stars: ✭ 35 (-42.62%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Pytorch Rl
Deep Reinforcement Learning with pytorch & visdom
Stars: ✭ 745 (+1121.31%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Tensorflow Tutorial
TensorFlow and Deep Learning Tutorials
Stars: ✭ 748 (+1126.23%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Deepqlearning.jl
Implementation of the Deep Q-learning algorithm to solve MDPs
Stars: ✭ 38 (-37.7%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning
Async Deeprl
Playing Atari games with TensorFlow implementation of Asynchronous Deep Q-Learning
Stars: ✭ 44 (-27.87%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning

Model-Based Active Exploration (MAX)

Code for reproducing experiments in Model-Based Active Exploration, ICML 2019

Written in PyTorch v1.0.

The code relies on sacred for managing experiments and hyper-parameters.
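
The "with" syntax used in the Commands section below is sacred's mechanism for selecting a named config (such as max_explore or random_explore) and overriding individual hyper-parameters from the command line. A minimal sketch of that mechanism, assuming a config layout like the one below (the keys mirror options used in the Commands section; the actual definitions live in main.py and may differ):

    # Illustrative sketch of a sacred experiment, not the repo's actual main.py.
    from sacred import Experiment

    ex = Experiment('max')

    @ex.config
    def default_config():
        env_noise_stdev = 0.0        # std-dev of noise added to transitions
        exploration_mode = 'active'  # 'active' or 'reactive'

    @ex.named_config
    def max_explore():
        # hypothetical named config standing in for the repo's max_explore mode
        exploration_mode = 'active'

    @ex.automain
    def main(env_noise_stdev, exploration_mode):
        print(env_noise_stdev, exploration_mode)

With such a layout, python main.py with max_explore env_noise_stdev=0.02 first applies the named config and then overrides env_noise_stdev, which is exactly how the commands below are composed.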

Overview:

  • envs/: contains the environments used.
  • main.py: contains the main algorithm and the baselines, selected through modes.
  • models.py: a fast, parallel implementation of an ensemble of models trained with a negative log-likelihood loss (a simplified sketch follows this list).
  • utilities.py: contains all the utilities (exploration objectives) used in the paper.
  • imagination.py: contains code that constructs a virtual MDP using the model ensemble.
  • sac.py: contains a simple Soft Actor-Critic implementation.
  • sacred_fetcher.py: script to download experiment artifacts stored in MongoDB.
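
For orientation, here is a minimal sketch of an ensemble of probabilistic dynamics models trained with a Gaussian negative log-likelihood loss, in the spirit of models.py. It is illustrative only: the class and function names are made up, and the actual implementation vectorizes all ensemble members for speed rather than looping over them.

    # Illustrative sketch (not the repo's models.py) of an ensemble of
    # probabilistic dynamics models trained with Gaussian NLL.
    import torch
    import torch.nn as nn

    class GaussianDynamicsModel(nn.Module):
        """Predicts mean and log-variance of the next-state change."""
        def __init__(self, state_dim, action_dim, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * state_dim),   # concatenated mean and log-variance
            )

        def forward(self, state, action):
            mu, log_var = self.net(torch.cat([state, action], dim=-1)).chunk(2, dim=-1)
            return mu, log_var.clamp(-10.0, 5.0)    # clamp for numerical stability

    def gaussian_nll(mu, log_var, target):
        # negative log-likelihood of target under N(mu, exp(log_var)), up to a constant
        return 0.5 * (log_var + (target - mu) ** 2 / log_var.exp()).mean()

    # One training step per ensemble member on a (dummy) batch of transitions
    ensemble = [GaussianDynamicsModel(state_dim=4, action_dim=2) for _ in range(5)]
    optimizers = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in ensemble]
    for model, optimizer in zip(ensemble, optimizers):
        states, actions, next_states = torch.randn(32, 4), torch.randn(32, 2), torch.randn(32, 4)
        mu, log_var = model(states, actions)
        loss = gaussian_nll(mu, log_var, next_states - states)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

The exploration objectives in utilities.py (for example the Renyi divergence, traj_stdev and pred_err measures referenced in the Commands section) are, roughly, computed from the disagreement between such ensemble predictions.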

Installation

  • Install required dependencies:

    sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf
    
  • Create conda environment with required dependencies:

    conda env create -f conda_env.yml
    
  • Download and set up the MuJoCo binaries. The project uses MuJoCo and mujoco_py version 1.50.

    mkdir ~/.mujoco/
    cd ~/.mujoco/
    wget -c https://www.roboti.us/download/mjpro150_linux.zip
    unzip mjpro150_linux.zip
    rm mjpro150_linux.zip
    

    Obtain a MuJoCo license key and place it in the ~/.mujoco/ directory created above, with the filename mjkey.txt.

  • Append the following to ~/.bashrc:

    # MuJoCo
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<USER>/.mujoco/mjpro150/bin
    
    if [ -f /usr/lib/x86_64-linux-gnu/libGLEW.so ]; then
        export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<USER>/.mujoco/mjpro150/bin:/usr/lib/nvidia-390
        export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
        export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia-375
    fi
    
    
  • Quick test of the MuJoCo installation:

    >>> import gym
    >>> gym.make('HalfCheetah-v2')
    

Commands

Execute the commands listed below from the code directory to reproduce the results.

Half Cheetah

  • MAX:
    python main.py with max_explore env_noise_stdev=0.02
  • Trajectory Variance Active Exploration:
    python main.py with max_explore utility_measure=traj_stdev policy_explore_alpha=0.2 env_noise_stdev=0.02
  • Renyi Divergence Reactive Exploration:
    python main.py with max_explore exploration_mode=reactive env_noise_stdev=0.02
  • Prediction Error Reactive Exploration:
    python main.py with max_explore exploration_mode=reactive utility_measure=pred_err policy_explore_alpha=0.2 env_noise_stdev=0.02
  • Random Exploration:
    python main.py with random_explore env_noise_stdev=0.02

Ant

  • MAX:
    python main.py with max_explore env_name=MagellanAnt-v2 env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True
  • Trajectory Variance Active Exploration:
    python main.py with max_explore env_name=MagellanAnt-v2 utility_measure=traj_stdev policy_explore_alpha=0.2 env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True
  • Renyi Divergence Reactive Exploration:
    python main.py with max_explore env_name=MagellanAnt-v2 exploration_mode=reactive env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True
  • Prediction Error Reactive Exploration:
    python main.py with max_explore env_name=MagellanAnt-v2 exploration_mode=reactive utility_measure=pred_err policy_explore_alpha=0.2 env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True
  • Random Exploration:
    python main.py with random_explore env_name=MagellanAnt-v2 env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True

Magellan

Magellan is the internal code name of the project, inspired by the life of Ferdinand Magellan.
