mbchang / decentralized-rl

License: MIT
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

Programming Languages

python

Projects that are alternatives of or similar to decentralized-rl

ml course
"Learning Machine Learning" Course, Bogotá, Colombia 2019 #LML2019
Stars: ✭ 22 (-45%)
Mutual labels:  deep-reinforcement-learning
awesome-machine-learning-robotics
A curated list of resources about Machine Learning for Robotics
Stars: ✭ 52 (+30%)
Mutual labels:  deep-reinforcement-learning
pytorch-noreward-rl
pytorch implementation of Curiosity-driven Exploration by Self-supervised Prediction
Stars: ✭ 79 (+97.5%)
Mutual labels:  deep-reinforcement-learning
deep rl acrobot
TensorFlow A2C to solve Acrobot, with synchronized parallel environments
Stars: ✭ 32 (-20%)
Mutual labels:  deep-reinforcement-learning
reinforcement learning ppo rnd
Deep Reinforcement Learning by using Proximal Policy Optimization and Random Network Distillation in Tensorflow 2 and Pytorch with some explanation
Stars: ✭ 33 (-17.5%)
Mutual labels:  deep-reinforcement-learning
interp-e2e-driving
Interpretable End-to-end Urban Autonomous Driving with Latent Deep Reinforcement Learning
Stars: ✭ 159 (+297.5%)
Mutual labels:  deep-reinforcement-learning
deep-blueberry
If you've always wanted to learn about deep-learning but don't know where to start, then you might have stumbled upon the right place!
Stars: ✭ 17 (-57.5%)
Mutual labels:  deep-reinforcement-learning
Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020
Live Trading. Please star.
Stars: ✭ 1,251 (+3027.5%)
Mutual labels:  deep-reinforcement-learning
Super-Meta-MarIO
Mario AI Ensemble
Stars: ✭ 15 (-62.5%)
Mutual labels:  deep-reinforcement-learning
deeprl-continuous-control
Learning Continuous Control in Deep Reinforcement Learning
Stars: ✭ 14 (-65%)
Mutual labels:  deep-reinforcement-learning
DeepCubeA
Code for DeepCubeA, a Deep Reinforcement Learning algorithm that can learn to solve the Rubik's cube.
Stars: ✭ 92 (+130%)
Mutual labels:  deep-reinforcement-learning
DRL in CV
A course on Deep Reinforcement Learning in Computer Vision. Visit Website:
Stars: ✭ 59 (+47.5%)
Mutual labels:  deep-reinforcement-learning
Pytorch-PCGrad
Pytorch reimplementation for "Gradient Surgery for Multi-Task Learning"
Stars: ✭ 179 (+347.5%)
Mutual labels:  deep-reinforcement-learning
revisiting rainbow
Revisiting Rainbow
Stars: ✭ 71 (+77.5%)
Mutual labels:  deep-reinforcement-learning
motion-planner-reinforcement-learning
End to end motion planner using Deep Deterministic Policy Gradient (DDPG) in gazebo
Stars: ✭ 99 (+147.5%)
Mutual labels:  deep-reinforcement-learning
DI-star
An artificial intelligence platform for the StarCraft II with large-scale distributed training and grand-master agents.
Stars: ✭ 1,335 (+3237.5%)
Mutual labels:  deep-reinforcement-learning
alphastone
Using self-play, MCTS, and a deep neural network to create a hearthstone ai player
Stars: ✭ 24 (-40%)
Mutual labels:  deep-reinforcement-learning
pomdp-baselines
Simple (but often Strong) Baselines for POMDPs in PyTorch - ICML 2022
Stars: ✭ 162 (+305%)
Mutual labels:  deep-reinforcement-learning
Master-Thesis
Deep Reinforcement Learning in Autonomous Driving: the A3C algorithm used to make a car learn to drive in TORCS; Python 3.5, Tensorflow, tensorboard, numpy, gym-torcs, ubuntu, latex
Stars: ✭ 33 (-17.5%)
Mutual labels:  deep-reinforcement-learning
Ramudroid
Ramudroid, autonomous solar-powered robot to clean roads, realtime object detection and webrtc based streaming
Stars: ✭ 22 (-45%)
Mutual labels:  deep-reinforcement-learning

Decentralized Reinforcement Learning

MIT license

This is the code complementing the paper Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions by Michael Chang, Sid Kaushik, Matt Weinberg, Tom Griffiths, and Sergey Levine, accepted to the International Conference on Machine Learning, 2020.

Check out the accompanying blog post.


Setup

Set the PYTHONPATH: export PYTHONPATH=$PWD.

Create a conda environment with python version 3.6.

Install dependencies: pip install -r requirements.txt. This should also install babyai==0.1.0 from https://github.com/sidk99/babyai.git and gym-minigrid==1.0.1.

For the TwoRooms environment, comment out

if self.step_count >= self.max_steps:
    done = True

in gym_minigrid/minigrid.py in your gym-minigrid installation. By handling time-outs on the algorithm side rather than the environment side, we can treat the environment as an infinite-horizon problem. Otherwise, we'd have to put the time-step into the state to preserve the Markov property.
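The algorithm-side handling described above might look like the following sketch (hypothetical code, not taken from this repo): because the patched environment never signals `done` on a time-out, the collection loop truncates the rollout itself and bootstraps the tail return from a value estimate instead of treating the cut-off as a true terminal state.

```python
# Hypothetical sketch of algorithm-side time-out handling (names are ours,
# not the repo's). The environment only returns done=True on genuine task
# termination; the collector enforces the step budget itself.
def collect_rollout(env, policy, value_fn, max_steps=100):
    obs = env.reset()
    transitions = []
    for t in range(max_steps):
        action = policy(obs)
        next_obs, reward, done, info = env.step(action)  # env never times out
        transitions.append((obs, action, reward, done))
        if done:
            # True terminal state: no future return to bootstrap.
            bootstrap = 0.0
            break
        obs = next_obs
    else:
        # Time-out: the episode is truncated, not terminated, so estimate the
        # tail return with the value function. This preserves the Markov
        # property without putting the time step into the state.
        bootstrap = value_fn(obs)
    return transitions, bootstrap
```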

For GPU, set OMP_NUM_THREADS to 1: export OMP_NUM_THREADS=1.

Training

Run python runner.py --<experiment-name> to print example commands for the environments in the paper; add the --for-real flag to actually run those commands. You can enable parallel data collection with the --parallel_collect flag, and you can also specify the GPU ids. For example, in runner.py, the methods that launch bandit, chain, and duality do not use a GPU, while the others use GPU 0.

For the TwoRooms environment, you first need to pre-train the subpolicies, and then specify the experiment folders of the pre-trained primitives when training the society. Instructions are in run_tworooms_pretrain_task and run_tworooms_transfer_task of runner.py.

Visualization

You can view the training curves in <exp_folder>/<seed_folder>/group_0/<env-name>_train/quantitative. For environments with image observations, visualizations are in <exp_folder>/<seed_folder>/group_0/<env-name>_test/qualitative.
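As a small illustration of this layout, a helper like the following (the function name is ours, hypothetical) assembles both result directories for a given experiment folder, seed folder, and environment name:

```python
import os

def result_dirs(exp_folder, seed_folder, env_name):
    """Build the quantitative (training curves) and qualitative (rollout
    visualizations) result directories from the layout described above."""
    base = os.path.join(exp_folder, seed_folder, "group_0")
    return {
        "quantitative": os.path.join(base, env_name + "_train", "quantitative"),
        "qualitative": os.path.join(base, env_name + "_test", "qualitative"),
    }
```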

Credits

The PPO update is based on this repo.
