
openai / Train Procgen

License: MIT
Code for the paper "Leveraging Procedural Generation to Benchmark Reinforcement Learning"

Programming Languages

python
139,335 projects - #7 most used programming language

Labels

paper

Projects that are alternatives of or similar to Train Procgen

Sparsely Grouped Gan
Code for paper "Sparsely Grouped Multi-task Generative Adversarial Networks for Facial Attribute Manipulation"
Stars: ✭ 68 (-26.88%)
Mutual labels:  paper
Ai Reading Materials
Some of the ML- and DL-related reading materials and research papers that I've read
Stars: ✭ 79 (-15.05%)
Mutual labels:  paper
Snapaper
📰 Past Papers Sharing Platform Based on Vue.js & GCE Guide | CAIE past-paper sharing and download platform
Stars: ✭ 90 (-3.23%)
Mutual labels:  paper
Codegan
[Deprecated] Source Code Generation using Sequence Generative Adversarial Networks
Stars: ✭ 73 (-21.51%)
Mutual labels:  paper
Minecraft Optimization
Minecraft server optimization guide
Stars: ✭ 77 (-17.2%)
Mutual labels:  paper
Delora
Self-supervised Deep LiDAR Odometry for Robotic Applications
Stars: ✭ 81 (-12.9%)
Mutual labels:  paper
Distributedsystems
My Distributed Systems references
Stars: ✭ 67 (-27.96%)
Mutual labels:  paper
Awesome Computer Vision
Awesome Resources for Advanced Computer Vision Topics
Stars: ✭ 92 (-1.08%)
Mutual labels:  paper
Psychics
Minecraft psychic plugin for Paper
Stars: ✭ 79 (-15.05%)
Mutual labels:  paper
Autonomous Drone
This repository intends to enable autonomous drone delivery with the Intel Aero RTF drone and PX4 autopilot. The code can be executed either on the real drone or in simulation on a PC using Gazebo. Its core is a Robot Operating System (ROS) node, which communicates with the PX4 autopilot through mavros. It uses SVO 2.0 for visual odometry, WhyCon for visual marker localization, and Ewok for trajectory planning with collision avoidance.
Stars: ✭ 87 (-6.45%)
Mutual labels:  paper
Awesome System For Machine Learning
A curated list of research in machine learning systems. I also summarize some papers I find particularly interesting.
Stars: ✭ 1,185 (+1174.19%)
Mutual labels:  paper
Acm Icpc Resource
ACM-ICPC resources (in Chinese)
Stars: ✭ 76 (-18.28%)
Mutual labels:  paper
Neural Mmo
Code for the paper "Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents"
Stars: ✭ 1,265 (+1260.22%)
Mutual labels:  paper
Snipit
Snipit allows you to capture and save interesting sections from any source of information. Be it textbooks, journals, computer screens, photographs, flyers, writings on a whiteboard, etc.
Stars: ✭ 70 (-24.73%)
Mutual labels:  paper
C2ae Multilabel Classification
TensorFlow implementation for the paper 'Learning Deep Latent Spaces for Multi-Label Classification' in AAAI 2017
Stars: ✭ 90 (-3.23%)
Mutual labels:  paper
Nlp Paper
Dialogue and speech topics within natural language processing: curated papers (with reading notes), model reproductions, and data processing (code provided in both TensorFlow and PyTorch)
Stars: ✭ 67 (-27.96%)
Mutual labels:  paper
Recursive Cnns
Implementation of my paper "Real-time Document Localization in Natural Images by Recursive Application of a CNN."
Stars: ✭ 80 (-13.98%)
Mutual labels:  paper
Deeplpf
Code for CVPR 2020 paper "Deep Local Parametric Filters for Image Enhancement"
Stars: ✭ 91 (-2.15%)
Mutual labels:  paper
Core50
CORe50: a new Dataset and Benchmark for Continual Learning
Stars: ✭ 91 (-2.15%)
Mutual labels:  paper
Bit Rnn
Quantize weights and activations in Recurrent Neural Networks.
Stars: ✭ 86 (-7.53%)
Mutual labels:  paper

Status: Archive (code is provided as-is, no updates expected)

Leveraging Procedural Generation to Benchmark Reinforcement Learning

[Blog Post] [Paper]

This is code for training agents for some of the experiments in Leveraging Procedural Generation to Benchmark Reinforcement Learning (see the Citation section below). The code for the environments is in the Procgen Benchmark repo.

We're currently running a competition that uses these environments to measure sample efficiency and generalization in RL. You can learn more and register here.

Supported platforms:

  • macOS 10.14 (Mojave)
  • Ubuntu 16.04

Supported Pythons:

  • 3.7 64-bit

Install

You can get miniconda from https://docs.conda.io/en/latest/miniconda.html if you don't have it, or install the dependencies from environment.yml manually.

git clone https://github.com/openai/train-procgen.git
conda env update --name train-procgen --file train-procgen/environment.yml
conda activate train-procgen
pip install https://github.com/openai/baselines/archive/9ee399f5b20cd70ac0a871927a6cf043b478193f.zip
pip install -e train-procgen
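
If the install succeeded, a quick sanity check is to create a Procgen environment directly (a minimal sketch, not part of this repo; it assumes the procgen package installed as a dependency registers Gym environments under the procgen: prefix, as described in its own README):

import gym

# Create the StarPilot environment through its Gym registration and take one step.
env = gym.make("procgen:procgen-starpilot-v0", distribution_mode="easy")
obs = env.reset()
print(obs.shape)  # a 64x64x3 RGB observation
obs, reward, done, info = env.step(env.action_space.sample())
env.close()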

Try it out

Train an agent using PPO on the environment StarPilot:

python -m train_procgen.train --env_name starpilot

Train an agent using PPO on the environment StarPilot using the easy difficulty:

python -m train_procgen.train --env_name starpilot --distribution_mode easy

Run parallel training using MPI:

mpiexec -np 8 python -m train_procgen.train --env_name starpilot

Train an agent on a fixed set of N levels:

python -m train_procgen.train --env_name starpilot --num_levels N

Train an agent on the same 500 levels used in the paper:

python -m train_procgen.train --env_name starpilot --num_levels 500

Train an agent on a different set of 500 levels:

python -m train_procgen.train --env_name starpilot --num_levels 500 --start_level 1000
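
The same level-set semantics can be reproduced when constructing an environment directly (a sketch using the procgen Gym kwargs; num_levels=0 is that package's convention for sampling from the full level distribution). Levels [start_level, start_level + num_levels) form the available set:

import gym

# 500 training levels starting at level 0, as used in the paper.
train_env = gym.make("procgen:procgen-starpilot-v0", num_levels=500, start_level=0)

# A disjoint set of 500 levels, useful for held-out evaluation.
eval_env = gym.make("procgen:procgen-starpilot-v0", num_levels=500, start_level=1000)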

Run simultaneous training and testing using MPI. One in every four workers will be a test worker, and the rest will be training workers:

mpiexec -np 8 python -m train_procgen.train --env_name starpilot --num_levels 500 --test_worker_interval 4
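
The train/test split can be pictured as rank-based selection (a hypothetical sketch of the interval logic, not necessarily the repo's exact code):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
test_worker_interval = 4  # mirrors --test_worker_interval 4

# With -np 8 and an interval of 4, ranks 3 and 7 become test workers that
# evaluate on unseen levels; the other six ranks train on the 500 levels.
is_test_worker = (
    test_worker_interval > 0
    and rank % test_worker_interval == test_worker_interval - 1
)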

Train an agent using PPO on a level in Jumper that requires hard exploration:

python -m train_procgen.train --env_name jumper --distribution_mode exploration

Train an agent using PPO on a variant of CaveFlyer that requires memory:

python -m train_procgen.train --env_name caveflyer --distribution_mode memory

View training options:

python -m train_procgen.train --help

Reproduce and Visualize Results

Sample efficiency on hard environments (results/hard-all-runN):

mpiexec -np 4 python -m train_procgen.train --env_name ENV_NAME --distribution_mode hard
python -m train_procgen.graph --distribution_mode hard

Sample efficiency on easy environments (results/easy-all-runN):

python -m train_procgen.train --env_name ENV_NAME --distribution_mode easy
python -m train_procgen.graph --distribution_mode easy

Generalization on hard environments using 500 training levels (results/hard-500-runN):

mpiexec -np 8 python -m train_procgen.train --env_name ENV_NAME --num_levels 500 --distribution_mode hard --test_worker_interval 2
python -m train_procgen.graph --distribution_mode hard --restrict_training_set

Generalization on easy environments using 200 training levels (results/easy-200-runN):

mpiexec -np 2 python -m train_procgen.train --env_name ENV_NAME --num_levels 200 --distribution_mode easy --test_worker_interval 2
python -m train_procgen.graph --distribution_mode easy --restrict_training_set

Pass --normalize_and_reduce to compute and visualize the mean normalized return with train_procgen.graph.
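
Normalization maps each game's raw return onto a common [0, 1] scale before averaging across games (a sketch of the formula from the paper; the per-game constants R_min and R_max are listed there):

def normalized_return(r, r_min, r_max):
    # Map a raw episode return onto [0, 1] using per-game bounds.
    return (r - r_min) / (r_max - r_min)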

Citation

Please cite using the following bibtex entry:

@article{cobbe2019procgen,
  title={Leveraging Procedural Generation to Benchmark Reinforcement Learning},
  author={Cobbe, Karl and Hesse, Christopher and Hilton, Jacob and Schulman, John},
  journal={arXiv preprint arXiv:1912.01588},
  year={2019}
}