
google-research / Seed_rl

License: Apache-2.0
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to Seed_rl

Magic Modules
Automatically generate Google Cloud Platform support for OSS IaaC Projects
Stars: ✭ 358 (-36.52%)
Mutual labels:  gcp
Cloud Functions Go
Unofficial Native Go Runtime for Google Cloud Functions
Stars: ✭ 427 (-24.29%)
Mutual labels:  gcp
Firebase Gcp Examples
🔥 Firebase app architectures, languages, tools & some GCP things! React w Next.js, Svelte w Sapper, Cloud Functions, Cloud Run.
Stars: ✭ 470 (-16.67%)
Mutual labels:  gcp
Rump
Hot sync two Redis servers using dumps.
Stars: ✭ 382 (-32.27%)
Mutual labels:  gcp
External Dns
Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services
Stars: ✭ 4,749 (+742.02%)
Mutual labels:  gcp
Terracognita
Reads from existing Cloud Providers (reverse Terraform) and generates your infrastructure as code on Terraform configuration
Stars: ✭ 452 (-19.86%)
Mutual labels:  gcp
Rpc Websockets
JSON-RPC 2.0 implementation over WebSockets for Node.js and JavaScript/TypeScript
Stars: ✭ 344 (-39.01%)
Mutual labels:  gcp
Rosettastone
Hearthstone simulator using C++ with some reinforcement learning
Stars: ✭ 510 (-9.57%)
Mutual labels:  rl
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (-28.01%)
Mutual labels:  gcp
React Firebase Starter
Boilerplate (seed) project for creating web apps with React.js, GraphQL.js and Relay
Stars: ✭ 4,366 (+674.11%)
Mutual labels:  gcp
Terratag
Terratag is a CLI tool that enables users of Terraform to automatically create and maintain tags across their entire set of AWS, Azure, and GCP resources
Stars: ✭ 385 (-31.74%)
Mutual labels:  gcp
Vue Crud X
Vue+Express Cookbook & CRUD Component (with Vite and Web Components)
Stars: ✭ 393 (-30.32%)
Mutual labels:  gcp
Gbt
Highly configurable prompt builder for Bash, ZSH and PowerShell written in Go.
Stars: ✭ 457 (-18.97%)
Mutual labels:  gcp
Gifee
Google's Infrastructure for Everyone Else
Stars: ✭ 370 (-34.4%)
Mutual labels:  gcp
Porter
Kubernetes powered PaaS that runs in your own cloud.
Stars: ✭ 498 (-11.7%)
Mutual labels:  gcp
Cloud Custodian
Rules engine for cloud security, cost optimization, and governance, DSL in yaml for policies to query, filter, and take actions on resources
Stars: ✭ 3,926 (+596.1%)
Mutual labels:  gcp
Mushroom Rl
Python library for Reinforcement Learning.
Stars: ✭ 442 (-21.63%)
Mutual labels:  rl
Awesome Cloudrun
👓 ⏩ A curated list of resources about all things Cloud Run
Stars: ✭ 521 (-7.62%)
Mutual labels:  gcp
Click To Deploy
Source for Google Click to Deploy solutions listed on Google Cloud Marketplace.
Stars: ✭ 509 (-9.75%)
Mutual labels:  gcp
Terraformer
CLI tool to generate terraform files from existing infrastructure (reverse Terraform). Infrastructure to Code
Stars: ✭ 6,316 (+1019.86%)
Mutual labels:  gcp

SEED

This repository contains an implementation of a distributed reinforcement learning agent in which both training and inference are performed on the learner.
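
A minimal sketch of this central-inference idea, with in-process Python queues standing in for SEED's gRPC streaming and a random policy standing in for batched neural-network inference (all names below are hypothetical, not SEED's actual API):

# Toy sketch of SEED's central-inference loop, NOT SEED's actual API:
# real SEED streams observations from actors to the learner over gRPC and
# batches inference on an accelerator. All names here are hypothetical.
import queue
import random
import threading

NUM_ACTORS = 4

request_q = queue.Queue()  # actors -> learner: (actor_id, observation)
action_qs = [queue.Queue() for _ in range(NUM_ACTORS)]  # learner -> actor

def policy(observations):
    # Stand-in for one batched forward pass of the policy network.
    return [random.choice([0, 1]) for _ in observations]

def actor(actor_id, num_steps=5):
    obs = 0.0  # stand-in for env.reset()
    for _ in range(num_steps):
        request_q.put((actor_id, obs))      # ship observation to the learner
        action = action_qs[actor_id].get()  # block until the learner replies
        obs += action                       # stand-in for env.step(action)

def learner(num_rounds):
    for _ in range(num_rounds):
        batch = [request_q.get() for _ in range(NUM_ACTORS)]  # gather a batch
        actions = policy([obs for _, obs in batch])           # central inference
        for (actor_id, _), action in zip(batch, actions):
            action_qs[actor_id].put(action)                   # send actions back

threads = [threading.Thread(target=actor, args=(i,)) for i in range(NUM_ACTORS)]
for t in threads:
    t.start()
learner(num_rounds=5)
for t in threads:
    t.join()

In SEED proper, the learner runs the policy on an accelerator while actors are cheap processes that only step environments, which is what makes the architecture scalable.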

Architecture

Four agents are implemented:

  • V-trace (IMPALA)
  • R2D2
  • SAC
  • PPO

The code is already interfaced with the following environments:

  • ATARI games
  • DeepMind Lab
  • Google Research Football
  • MuJoCo

However, any reinforcement learning environment using the gym API can be used.
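
As a hypothetical illustration, a minimal environment exposing the classic gym API (reset() and step() returning an (observation, reward, done, info) tuple) could look like the sketch below; SEED's actual wrappers and preprocessing are not shown:

# Hypothetical example environment, not part of SEED. Any environment
# exposing this classic gym interface can, in principle, be plugged in.
import gym
import numpy as np
from gym import spaces

class CountingEnv(gym.Env):
    # Toy task: reward +1 for action 1; the episode ends after 10 steps.

    def __init__(self):
        self.observation_space = spaces.Box(0.0, 10.0, shape=(1,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)
        self._t = 0

    def reset(self):
        self._t = 0
        return np.zeros(1, dtype=np.float32)

    def step(self, action):
        self._t += 1
        obs = np.array([self._t], dtype=np.float32)
        reward = float(action == 1)
        done = self._t >= 10
        return obs, reward, done, {}  # classic 4-tuple gym API

env = CountingEnv()
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())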

For a detailed description of the architecture, please read our paper. Please cite the paper if you use code from this repository in your work.

Bibtex

@article{espeholt2019seed,
    title={SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference},
    author={Lasse Espeholt and Rapha{\"e}l Marinier and Piotr Stanczyk and Ke Wang and Marcin Michalski},
    year={2019},
    eprint={1910.06591},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Pull Requests

At this time, we do not accept pull requests. We are happy to link to forks that add interesting functionality.

Prerequisites

There are a few steps you need to take before playing with SEED. The instructions below assume you are running Ubuntu.

  • Install git:
apt-get install git
  • Clone the SEED git repository:
git clone https://github.com/google-research/seed_rl.git
cd seed_rl

Local Machine Training on a Single Level

To easily start with SEED we provide a way of running it on a local machine. You just need to run one of the following commands, adjusting the number of actors and the number of environments per actor to your machine:

./run_local.sh [Game] [Agent] [number of actors] [number of envs. per actor]
./run_local.sh atari r2d2 4 4
./run_local.sh football vtrace 4 1
./run_local.sh dmlab vtrace 4 4
./run_local.sh mujoco ppo 4 32 --gin_config=/seed_rl/mujoco/gin/ppo.gin

This builds a Docker image from the SEED source code and starts training inside the container. Note that hyperparameters are not tuned in the runs above. TensorBoard is started as part of the training and can be viewed at http://localhost:6006 by default.

We also provide a sample script that runs training with tuned parameters for HalfCheetah-v2. This setup uses 8x32=256 parallel environments to make training faster. Sample complexity can be improved, at the cost of slower training, by running fewer environments and increasing the unroll_length parameter.

./mujoco/local_baseline_HalfCheetah-v2.sh
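
As a back-of-the-envelope illustration of that tradeoff (the unroll_length value and the assumption that envs times unroll_length stays constant are hypothetical, not taken from the script):

# Sketch of the sample-complexity/speed tradeoff described above. The
# baseline numbers (8 actors x 32 envs) come from the HalfCheetah-v2
# setup; the unroll_length value is a hypothetical illustration.
actors, envs_per_actor, unroll_length = 8, 32, 10

parallel_envs = actors * envs_per_actor           # 8 * 32 = 256
frames_per_round = parallel_envs * unroll_length  # frames collected per unroll round

# Halving the environments while doubling the unroll length collects a
# comparable amount of experience per round, with less parallelism
# (better sample complexity, slower wall-clock training):
assert (parallel_envs // 2) * (unroll_length * 2) == frames_per_round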

Distributed Training using AI Platform

Note that training with AI Platform results in charges for using compute resources.

The first step is to configure GCP and a Cloud project you will use for training:

gcloud auth login
gcloud config set project [YOUR_PROJECT]

Then you just need to execute one of the provided scenarios:

gcp/train_[scenario_name].sh

This builds the Docker image, pushes it to a repository that AI Platform can access, and starts the training process on the Cloud. Follow the output of the command for progress. You can also view running training jobs at https://console.cloud.google.com/ml/jobs.

DeepMind Lab Level Cache

By default, the majority of DeepMind Lab's CPU usage is generated by creating new scenarios. This cost can be eliminated by enabling the level cache. To enable it, set the level_cache_dir flag in dmlab/config.py. As there are many unique episodes, it is a good idea to share the same cache across multiple experiments. On AI Platform, you can add --level_cache_dir=gs://${BUCKET_NAME}/dmlab_cache to the list of parameters passed to the experiment in gcp/submit.sh.

Baseline data on ATARI-57

We provide baseline training data for SEED's R2D2 trained on ATARI games, in the form of training curves, checkpoints, and TensorBoard event files. We provide data for 4 independent seeds run up to 40e9 environment frames.

The hyperparameters and evaluation procedure are the same as in section A.3.1 in the paper.

Training curves

Training curves are available on this page.

Checkpoints and TensorBoard event files

Checkpoints and TensorBoard event files can be downloaded individually here or as a single 70 GB zip file.

Additional links

SEED was used as a core infrastructure piece for the What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study paper. A Colab notebook that reproduces plots from the paper can be found here.
