
MishaLaskin / Rad

RAD: Reinforcement Learning with Augmented Data

Projects that are alternatives to or similar to Rad

Hands On Reinforcement Learning With Python
Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow
Stars: ✭ 640 (+138.81%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo, deep-learning-algorithms, deep-q-network
Deeprl Tutorials
Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch
Stars: ✭ 748 (+179.1%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo, deep-q-network
Pytorch Drl
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single-agent and multi-agent settings.
Stars: ✭ 233 (-13.06%)
Mutual labels:  reinforcement-learning, deep-reinforcement-learning, ppo, rl, deep-q-network
Curl
CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning
Stars: ✭ 346 (+29.1%)
Mutual labels:  reinforcement-learning, deep-neural-networks, deep-reinforcement-learning, deep-learning-algorithms, deep-q-network
Deep Reinforcement Learning
Repo for the Deep Reinforcement Learning Nanodegree program
Stars: ✭ 4,012 (+1397.01%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo
Drq
DrQ: Data regularized Q
Stars: ✭ 268 (+0%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, rl
Rl Course Experiments
Stars: ✭ 73 (-72.76%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, deep-q-network
Reinforcement Learning
Learn Deep Reinforcement Learning in 60 days! Lectures & Code in Python. Reinforcement Learning + Deep Learning
Stars: ✭ 3,329 (+1142.16%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo
Deep reinforcement learning course
Implementations from the free course Deep Reinforcement Learning with Tensorflow and PyTorch
Stars: ✭ 3,232 (+1105.97%)
Mutual labels:  jupyter-notebook, deep-reinforcement-learning, ppo, deep-q-network
Lagom
lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.
Stars: ✭ 364 (+35.82%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, ppo
2048 Deep Reinforcement Learning
Trained a convolutional neural network to play 2048 using deep reinforcement learning.
Stars: ✭ 169 (-36.94%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning, deep-q-network
My Journey In The Data Science World
📢 Ready to learn or review your knowledge!
Stars: ✭ 1,175 (+338.43%)
Mutual labels:  jupyter-notebook, deep-neural-networks, deep-learning-algorithms
Intro To Deep Learning
A collection of materials to help you learn about deep learning
Stars: ✭ 103 (-61.57%)
Mutual labels:  jupyter-notebook, deep-neural-networks, deep-reinforcement-learning
Reinforcementlearning Atarigame
PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe, using Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which is far more efficient than DQN and largely supersedes it. Can play many games.
Stars: ✭ 118 (-55.97%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning
Tensorflow2.0 Examples
🙄 Difficult algorithm, Simple code.
Stars: ✭ 1,397 (+421.27%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-neural-networks
Advanced Deep Learning And Reinforcement Learning Deepmind
🎮 Advanced Deep Learning and Reinforcement Learning at UCL & DeepMind | YouTube videos 👉
Stars: ✭ 121 (-54.85%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning
Rl Quadcopter
Teach a Quadcopter How to Fly!
Stars: ✭ 124 (-53.73%)
Mutual labels:  jupyter-notebook, reinforcement-learning, deep-reinforcement-learning
Amazon Sagemaker Examples
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
Stars: ✭ 6,346 (+2267.91%)
Mutual labels:  jupyter-notebook, reinforcement-learning, rl
Pytorch Rl
Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]
Stars: ✭ 121 (-54.85%)
Mutual labels:  jupyter-notebook, reinforcement-learning, rl
Deep Reinforcement Learning Algorithms
31 projects in the framework of Deep Reinforcement Learning algorithms: Q-learning, DQN, PPO, DDPG, TD3, SAC, A2C and others. Each project is provided with a detailed training log.
Stars: ✭ 167 (-37.69%)
Mutual labels:  jupyter-notebook, deep-reinforcement-learning, ppo

Reinforcement Learning with Augmented Data (RAD)

Official codebase for Reinforcement Learning with Augmented Data. This codebase was originally forked from CURL.

Additionally, here is the codebase link for ProcGen experiments.

BibTex

@unpublished{laskin_lee2020rad,
  title={Reinforcement Learning with Augmented Data},
  author={Laskin, Michael and Lee, Kimin and Stooke, Adam and Pinto, Lerrel and Abbeel, Pieter and Srinivas, Aravind},
  note={arXiv:2004.14990}
}

Installation

All of the dependencies are in the conda_env.yml file. They can be installed manually or with the following command:

conda env create -f conda_env.yml

Instructions

To train a RAD agent on the cartpole swingup task from image-based observations, run bash script/run.sh from the root of this directory. The run.sh file contains the following command, which you can modify to try different environments, augmentations, and hyperparameters.

CUDA_VISIBLE_DEVICES=0 python train.py \
    --domain_name cartpole \
    --task_name swingup \
    --encoder_type pixel --work_dir ./tmp/cartpole \
    --action_repeat 8 --num_eval_episodes 10 \
    --pre_transform_image_size 100 --image_size 84 \
    --agent rad_sac --frame_stack 3 --data_augs flip  \
    --seed 23 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 --num_train_steps 200000 &
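
Two of these flags interact in a way worth spelling out: --action_repeat 8 applies each agent action for 8 consecutive simulator steps (so 200000 environment steps correspond to 25000 agent decisions), and --frame_stack 3 feeds the agent the 3 most recent frames concatenated along the channel axis. The sketch below illustrates both wrappers for a gym-style environment; it is a minimal illustration, not the codebase's actual wrapper implementations, which may differ in detail.

from collections import deque
import numpy as np

class ActionRepeat:
    """Apply each action for `repeat` simulator steps, summing rewards."""
    def __init__(self, env, repeat=8):
        self.env, self.repeat = env, repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward, done, info = 0.0, False, {}
        for _ in range(self.repeat):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info

class FrameStack:
    """Concatenate the `k` most recent frames along the channel axis."""
    def __init__(self, env, k=3):
        self.env, self.k = env, k
        self.frames = deque(maxlen=k)

    def reset(self):
        obs = self.env.reset()
        for _ in range(self.k):
            self.frames.append(obs)
        return np.concatenate(self.frames, axis=0)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.frames.append(obs)
        return np.concatenate(self.frames, axis=0), reward, done, info

With 3 stacked RGB frames at --image_size 84, the observation the agent sees has shape (9, 84, 84).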

Data Augmentations

Augmentations can be specified through the --data_augs flag. This codebase supports the augmentations implemented in data_augs.py. To chain multiple data augmentations, separate the augmentation strings with a dash (-). For example, to apply crop -> rotate -> flip, pass --data_augs crop-rotate-flip.
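
Concretely, the chaining amounts to splitting the flag on -, looking each name up in a table of augmentation functions, and applying them to the observation batch in order. Below is a minimal numpy sketch of that dispatch with stand-in crop and flip functions; the real, optimized implementations live in data_augs.py, and the batch shapes here just mirror the flags used above.

import numpy as np

def random_crop(imgs, out=84):
    # imgs: (B, C, H, W) batch; crop each image to out x out at a random offset
    b, c, h, w = imgs.shape
    ys = np.random.randint(0, h - out + 1, size=b)
    xs = np.random.randint(0, w - out + 1, size=b)
    return np.stack([img[:, y:y + out, x:x + out]
                     for img, y, x in zip(imgs, ys, xs)])

def random_flip(imgs, p=0.5):
    # horizontally flip each image in the batch with probability p
    out = imgs.copy()
    mask = np.random.rand(len(imgs)) < p
    out[mask] = out[mask][..., ::-1]
    return out

AUGS = {'crop': random_crop, 'flip': random_flip}

def apply_augs(imgs, data_augs):
    for name in data_augs.split('-'):
        imgs = AUGS[name](imgs)
    return imgs

# 128 observations of 3 stacked RGB frames, rendered at 100x100:
batch = np.random.randint(0, 256, size=(128, 9, 100, 100), dtype=np.uint8)
augmented = apply_augs(batch, 'crop-flip')  # shape (128, 9, 84, 84)

This is also the role of the --pre_transform_image_size 100 / --image_size 84 pair in the command above: for crop-style augmentations, frames are rendered at 100x100 so that a random 84x84 crop has room to shift within the image.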

All data augmentations can be visualized in All_Data_Augs.ipynb. You can also test the efficiency of our modules by running python data_augs.py.

Logging

In your console, you should see printouts that look like this:

| train | E: 13 | S: 2000 | D: 9.1 s | R: 48.3056 | BR: 0.8279 | A_LOSS: -3.6559 | CR_LOSS: 2.7563
| train | E: 17 | S: 2500 | D: 9.1 s | R: 146.5945 | BR: 0.9066 | A_LOSS: -5.8576 | CR_LOSS: 6.0176
| train | E: 21 | S: 3000 | D: 7.7 s | R: 138.7537 | BR: 1.0354 | A_LOSS: -7.8795 | CR_LOSS: 7.3928
| train | E: 25 | S: 3500 | D: 9.0 s | R: 181.5103 | BR: 1.0764 | A_LOSS: -10.9712 | CR_LOSS: 8.8753
| train | E: 29 | S: 4000 | D: 8.9 s | R: 240.6485 | BR: 1.2042 | A_LOSS: -13.8537 | CR_LOSS: 9.4001

The above output decodes as:

train - training episode
E - total number of episodes 
S - total number of environment steps
D - duration in seconds to train 1 episode
R - episode reward
BR - average reward of sampled batch
A_LOSS - average loss of actor
CR_LOSS - average loss of critic
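
These console lines are pipe-delimited, so they are easy to parse if you want to plot training curves without TensorBoard. Here is a quick sketch, assuming exactly the field labels listed above; the run's own log files in the work directory remain the authoritative record.

def parse_log_line(line):
    # '| train | E: 13 | S: 2000 | ... ' -> {'mode': 'train', 'E': 13.0, ...}
    fields = [f.strip() for f in line.strip().strip('|').split('|')]
    record = {'mode': fields[0]}
    for field in fields[1:]:
        key, value = field.split(':', 1)
        record[key.strip()] = float(value.replace('s', '').strip())
    return record

parse_log_line('| train | E: 13 | S: 2000 | D: 9.1 s | R: 48.3056 '
               '| BR: 0.8279 | A_LOSS: -3.6559 | CR_LOSS: 2.7563')
# {'mode': 'train', 'E': 13.0, 'S': 2000.0, 'D': 9.1, 'R': 48.3056,
#  'BR': 0.8279, 'A_LOSS': -3.6559, 'CR_LOSS': 2.7563}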

All data related to the run is stored in the directory specified with --work_dir. To enable model or video saving, use the --save_model or --save_video flags. For all available flags, inspect train.py. To visualize progress with TensorBoard, run:

tensorboard --logdir log --port 6006

and go to localhost:6006 in your browser. If you're running headlessly, try port forwarding with ssh.
