pathak22 / noreward-rl

Licence: other
[ICML 2017] TensorFlow code for Curiosity-driven Exploration for Deep Reinforcement Learning

Projects that are alternatives of or similar to Noreward Rl

Rad
RAD: Reinforcement Learning with Augmented Data
Stars: ✭ 268 (-77.21%)
Mutual labels:  deep-neural-networks, deep-reinforcement-learning, rl
Mushroom Rl
Python library for Reinforcement Learning.
Stars: ✭ 442 (-62.41%)
Mutual labels:  deep-reinforcement-learning, openai-gym, rl
Rl Book
Source codes for the book "Reinforcement Learning: Theory and Python Implementation"
Stars: ✭ 464 (-60.54%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Rl a3c pytorch
A3C LSTM Atari with Pytorch plus A3G design
Stars: ✭ 482 (-59.01%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Trending Deep Learning
Top 100 trending deep learning repositories sorted by the number of stars gained on a specific day.
Stars: ✭ 543 (-53.83%)
Mutual labels:  deep-neural-networks, deep-reinforcement-learning
Detection 2016 Nipsws
Hierarchical Object Detection with Deep Reinforcement Learning
Stars: ✭ 401 (-65.9%)
Mutual labels:  deep-neural-networks, deep-reinforcement-learning
Rl Portfolio Management
Attempting to replicate "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" https://arxiv.org/abs/1706.10059 (and an openai gym environment)
Stars: ✭ 447 (-61.99%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Deep Reinforcement Learning For Automated Stock Trading Ensemble Strategy Icaif 2020
Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy. ICAIF 2020. Please star.
Stars: ✭ 518 (-55.95%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Pytorch Ddpg
Implementation of the Deep Deterministic Policy Gradient (DDPG) using PyTorch
Stars: ✭ 272 (-76.87%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Btgym
Scalable, event-driven, deep-learning-friendly backtesting library
Stars: ✭ 765 (-34.95%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Softlearning
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.
Stars: ✭ 713 (-39.37%)
Mutual labels:  deep-neural-networks, deep-reinforcement-learning
Rl Baselines Zoo
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
Stars: ✭ 839 (-28.66%)
Mutual labels:  openai-gym, rl
Pytorch Rl
This repository contains model-free deep reinforcement learning algorithms implemented in Pytorch
Stars: ✭ 394 (-66.5%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Deep Reinforcement Learning
Repo for the Deep Reinforcement Learning Nanodegree program
Stars: ✭ 4,012 (+241.16%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Curl
CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning
Stars: ✭ 346 (-70.58%)
Mutual labels:  deep-neural-networks, deep-reinforcement-learning
Awesome Deep Trading
List of awesome resources for machine learning-based algorithmic trading
Stars: ✭ 514 (-56.29%)
Mutual labels:  deep-neural-networks, deep-reinforcement-learning
Deterministic Gail Pytorch
PyTorch implementation of Deterministic Generative Adversarial Imitation Learning (GAIL) for Off Policy learning
Stars: ✭ 44 (-96.26%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Drq
DrQ: Data regularized Q
Stars: ✭ 268 (-77.21%)
Mutual labels:  deep-reinforcement-learning, rl
Hands On Reinforcement Learning With Python
Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow
Stars: ✭ 640 (-45.58%)
Mutual labels:  deep-reinforcement-learning, openai-gym
Rlcard
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
Stars: ✭ 980 (-16.67%)
Mutual labels:  deep-reinforcement-learning, openai-gym

Curiosity-driven Exploration by Self-supervised Prediction

In ICML 2017 [Project Website] [Demo Video]

Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell
University of California, Berkeley

This is a TensorFlow-based implementation of our ICML 2017 paper on curiosity-driven exploration for reinforcement learning. The idea is to train an agent with an intrinsic curiosity-based motivation (ICM) when external rewards from the environment are sparse. Surprisingly, ICM can be used even when no rewards are available from the environment at all, in which case the agent learns to explore purely out of curiosity: 'RL without rewards'. If you find this work useful in your research, please cite:

@inproceedings{pathakICMl17curiosity,
    Author = {Pathak, Deepak and Agrawal, Pulkit and
              Efros, Alexei A. and Darrell, Trevor},
    Title = {Curiosity-driven Exploration by Self-supervised Prediction},
    Booktitle = {International Conference on Machine Learning ({ICML})},
    Year = {2017}
}
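
The intrinsic reward at the heart of ICM is the forward model's prediction error in a learned feature space: the agent is rewarded for reaching transitions it cannot yet predict. A minimal NumPy sketch of that computation, where `eta` and the feature vectors are illustrative stand-ins for the learned scaling factor and feature/forward networks in the actual code:

```python
import numpy as np

def intrinsic_reward(phi_next, phi_next_pred, eta=0.01):
    """Curiosity reward: scaled squared error between the forward
    model's predicted next-state features and the true next-state
    features (larger error = more 'surprising' = more reward)."""
    return 0.5 * eta * np.sum((phi_next_pred - phi_next) ** 2)

# Toy example: a poorly predicted transition earns a larger bonus,
# pushing the agent toward states it does not yet understand.
phi_true = np.array([1.0, 0.0, 2.0])
good_pred = np.array([1.0, 0.1, 2.0])
bad_pred = np.array([0.0, 1.0, 0.0])
assert intrinsic_reward(phi_true, bad_pred) > intrinsic_reward(phi_true, good_pred)
```

In the no-reward setting described above, this bonus is the only learning signal the policy receives.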

1) Installation and Usage

  1. This code is based on TensorFlow. To install, run these commands:
# you might not need many of these, e.g., fceux is only for mario
sudo apt-get install -y python-numpy python-dev cmake zlib1g-dev libjpeg-dev xvfb \
libav-tools xorg-dev python-opengl libboost-all-dev libsdl2-dev swig python3-dev \
python3-venv make golang libjpeg-turbo8-dev gcc wget unzip git fceux virtualenv \
tmux

# install the code
git clone -b master --single-branch https://github.com/pathak22/noreward-rl.git
cd noreward-rl/
virtualenv curiosity
source $PWD/curiosity/bin/activate
pip install numpy
pip install -r src/requirements.txt
python curiosity/src/go-vncdriver/build.py

# download models
bash models/download_models.sh

# setup customized doom environment
cd doomFiles/
# then follow commands in doomFiles/README.md
  2. Running demo
cd noreward-rl/src/
python demo.py --ckpt ../models/doom/doom_ICM
python demo.py --env-id SuperMarioBros-1-1-v0 --ckpt ../models/mario/mario_ICM
  3. Training code
cd noreward-rl/src/
# For Doom: doom or doomSparse or doomVerySparse
python train.py --default --env-id doom

# For Mario, change src/constants.py as follows:
# PREDICTION_BETA = 0.2
# ENTROPY_BETA = 0.0005
python train.py --default --env-id mario --noReward

xvfb-run -s "-screen 0 1400x900x24" bash  # only for remote desktops
# useful xvfb link: http://stackoverflow.com/a/30336424
python inference.py --default --env-id doom --record
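
For orientation on the two constants changed above: judging by their names and the paper's formulation (treat this as an assumption; src/constants.py is the authoritative source), PREDICTION_BETA scales the curiosity bonus added to the reward and ENTROPY_BETA weights the A3C entropy regularizer. A hypothetical sketch of what they control:

```python
import numpy as np

PREDICTION_BETA = 0.2    # Mario setting: weight on the curiosity bonus
ENTROPY_BETA = 0.0005    # Mario setting: weight on policy entropy

def shaped_reward(extrinsic, prediction_error, use_extrinsic=True):
    """Total reward the learner sees: optional environment reward plus
    the scaled intrinsic bonus (--noReward drops the extrinsic term)."""
    bonus = PREDICTION_BETA * prediction_error
    return (extrinsic if use_extrinsic else 0.0) + bonus

def entropy_regularizer(policy_probs):
    """Entropy term added to the A3C objective to sustain exploration."""
    p = np.asarray(policy_probs)
    return ENTROPY_BETA * -np.sum(p * np.log(p + 1e-8))

# With --noReward (as in the Mario command above), learning is driven
# by the curiosity bonus alone:
assert shaped_reward(1.0, 0.5, use_extrinsic=False) == PREDICTION_BETA * 0.5
```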

2) Other helpful pointers

3) Acknowledgement

Vanilla A3C code is based on the open source implementation of universe-starter-agent.
