lexfridman / Deeptraffic

Licence: mit
DeepTraffic is a deep reinforcement learning competition, part of the MIT Deep Learning series.

Programming Languages

javascript

Projects that are alternatives of or similar to Deeptraffic

Mit Deep Learning
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
Stars: ✭ 8,912 (+483.25%)
Mutual labels:  mit, deep-reinforcement-learning, self-driving-cars, deep-rl
Object-Goal-Navigation
Pytorch code for NeurIPS-20 Paper "Object Goal Navigation using Goal-Oriented Semantic Exploration"
Stars: ✭ 107 (-93%)
Mutual labels:  deep-reinforcement-learning, deep-rl
Introtodeeplearning
Lab Materials for MIT 6.S191: Introduction to Deep Learning
Stars: ✭ 4,955 (+224.28%)
Mutual labels:  mit, deep-reinforcement-learning
Cs234 Reinforcement Learning Winter 2019
My Solutions of Assignments of CS234: Reinforcement Learning Winter 2019
Stars: ✭ 93 (-93.91%)
Mutual labels:  deep-reinforcement-learning
Awesome Deep Reinforcement Learning
Curated list for Deep Reinforcement Learning (DRL): software frameworks, models, datasets, gyms, baselines...
Stars: ✭ 95 (-93.78%)
Mutual labels:  deep-reinforcement-learning
Top Deep Learning
Top 200 deep learning Github repositories sorted by the number of stars.
Stars: ✭ 1,365 (-10.67%)
Mutual labels:  deep-reinforcement-learning
Easy Rl
Chinese-language reinforcement learning tutorial; read online at https://datawhalechina.github.io/easy-rl/
Stars: ✭ 3,004 (+96.6%)
Mutual labels:  deep-reinforcement-learning
Cs234
My Solution to Assignments of CS234
Stars: ✭ 91 (-94.04%)
Mutual labels:  deep-reinforcement-learning
Reinforcement Learning
🤖 Implements of Reinforcement Learning algorithms.
Stars: ✭ 104 (-93.19%)
Mutual labels:  deep-reinforcement-learning
Exploration By Disagreement
[ICML 2019] TensorFlow Code for Self-Supervised Exploration via Disagreement
Stars: ✭ 99 (-93.52%)
Mutual labels:  deep-reinforcement-learning
Samsung Drl Code
Repository for codes of Deep Reinforcement Learning (DRL) lectured at Samsung
Stars: ✭ 99 (-93.52%)
Mutual labels:  deep-reinforcement-learning
Cloudcmd
✨☁️📁✨ Cloud Commander file manager for the web with console and editor.
Stars: ✭ 1,332 (-12.83%)
Mutual labels:  mit
Deep Reinforcement Learning Notes
Deep Reinforcement Learning Notes
Stars: ✭ 101 (-93.39%)
Mutual labels:  deep-reinforcement-learning
Drl Rec
Deep reinforcement learning for recommendation system
Stars: ✭ 92 (-93.98%)
Mutual labels:  deep-reinforcement-learning
Macad Gym
Multi-Agent Connected Autonomous Driving (MACAD) Gym environments for Deep RL. Code for the paper presented in the Machine Learning for Autonomous Driving Workshop at NeurIPS 2019:
Stars: ✭ 106 (-93.06%)
Mutual labels:  deep-reinforcement-learning
Pytorch sac ae
PyTorch implementation of Soft Actor-Critic + Autoencoder(SAC+AE)
Stars: ✭ 94 (-93.85%)
Mutual labels:  deep-reinforcement-learning
Intro To Deep Learning
A collection of materials to help you learn about deep learning
Stars: ✭ 103 (-93.26%)
Mutual labels:  deep-reinforcement-learning
Deep Reinforcement Learning With Pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
Stars: ✭ 1,345 (-11.98%)
Mutual labels:  deep-reinforcement-learning
Deeprl algorithms
DeepRL algorithms implementation easy for understanding and reading with Pytorch and Tensorflow 2(DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)
Stars: ✭ 97 (-93.65%)
Mutual labels:  deep-reinforcement-learning
Mit Deep Learning Book Pdf
MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville
Stars: ✭ 9,859 (+545.22%)
Mutual labels:  mit

DeepTraffic: MIT Deep Reinforcement Learning Competition

DeepTraffic - Visualization - Leaderboard - Documentation - Paper - MIT Deep Learning [ GitHub | Website ]

DeepTraffic is a deep reinforcement learning competition hosted as part of the MIT Deep Learning courses. The goal is to create a neural network that drives a vehicle (or multiple vehicles) as fast as possible through dense highway traffic. Top 10 submissions are listed on the leaderboard and you'll be able to visualize your submission in the following way:

DeepTraffic visualization

If you find the work useful in your research, please cite the DeepTraffic paper:

@inproceedings{fridman2018deeptraffic,
author = {Lex Fridman and Jack Terwilliger and Benedikt Jenik},
title = {DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation},
booktitle = {Neural Information Processing Systems (NIPS 2018) Deep Reinforcement Learning Workshop},
year = {2018},
url = {http://arxiv.org/abs/1801.02805},
doi = {10.5281/zenodo.2530457},
archivePrefix = {arXiv}
}

To get started right away, this repository provides a code snippet to insert into the code box on the DeepTraffic site. We'll add additional agents as the course progresses:

network_basic.js: A basic network that achieves ~66.8mph.

And now let's return to the problem of traffic:

Problem Statement: Traffic is Terrible

"Americans will put up with anything provided it doesn’t block traffic." - Dan Rather

"Traffic is soul-destroying." - Elon Musk

In the U.S. alone, we spend 6.9 billion hours sitting in traffic each year [1] — roughly 10,000 human lifetimes [2]. Autonomous vehicles will be able to alleviate part (but not all) of the problem. Already, they show promise in reducing phantom traffic jams [3,4].

We’ve designed DeepTraffic to let people (from beginners to experts) explore the design of motion planning algorithms for autonomous vehicles and to inspire the next generation of traffic engineering. We thank the thousands of competitors who have submitted solutions and are actively participating.

DeepTraffic Layout

DeepTraffic

The game page consists of four different areas:

  • On the left, you can find a real time simulation of the road with different display options.

  • On the upper half of the page, you can find (1) a coding area where you can change the design of the neural network which controls the agents and (2) some buttons for applying your changes, saving/loading, and making a submission.

  • Below the coding area, you can find (1) a graph showing a moving average of the center red car’s reward, (2) a visualization of the neural network activations, and (3) buttons for training and testing your network.

  • Between the simulated roadway and the graphs, you can find the current image of your vehicle, some options to customize it, and a way to create a visualization of your best submission.

The simulation area shows some basic information like the current speed of the car and the number of cars that have been passed since you opened the site. It also allows you to change the way the simulation is displayed.

Display selection

DeepTraffic Simulation & Game

In short, DeepTraffic is a game in which you (a competitor) design your own motion planning algorithm in order to drive a vehicle as fast as possible through dense traffic.

Your algorithm will operate on a 7-lane highway. There are 20 vehicles on the road. Your algorithm controls some of them, and the game controls the rest.

Each autonomous agent runs a copy of your algorithm. Every 30 frames, your algorithm selects 1 of 5 actions:

  1. accelerate
  2. decelerate
  3. change into the left lane
  4. change into the right lane
  5. do nothing, i.e. maintain speed in present lane.
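
The decision loop described above can be sketched roughly as follows. This is a minimal illustration, not the code the DeepTraffic site runs: the action names, their index order, the `FRAMES_PER_DECISION` constant, the `policy` callback, and the choice to hold an action between decisions are all assumptions made for the example.

```javascript
// Hypothetical names and index order for the five actions listed above.
const ACTIONS = ['accelerate', 'decelerate', 'changeLeft', 'changeRight', 'noAction'];
const FRAMES_PER_DECISION = 30; // from the text: one decision every 30 frames

// Calls `policy` once every FRAMES_PER_DECISION frames and records the result.
// `policy(frame)` must return an index into ACTIONS.
// Assumption: the first decision is taken at frame 0.
function simulateDecisions(totalFrames, policy) {
  const log = [];
  for (let frame = 0; frame < totalFrames; frame++) {
    if (frame % FRAMES_PER_DECISION !== 0) continue; // keep the previous action
    const index = policy(frame);
    if (!Number.isInteger(index) || index < 0 || index >= ACTIONS.length) {
      throw new RangeError('policy returned an invalid action index: ' + index);
    }
    log.push({ frame: frame, action: ACTIONS[index] });
  }
  return log;
}

// Usage: a trivial policy that always accelerates, run for 90 frames.
const decisions = simulateDecisions(90, () => 0);
console.log(decisions); // decisions are taken at frames 0, 30, and 60
```

A learned policy would replace the constant callback with a function that maps the current observation to one of the five indices.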

Your algorithm will receive, as input, an occupancy grid, representing the free space around the agent. The value of unoccupied cells is set to 80mph. The value of occupied cells is set to the speed of the occupying vehicle. For example here's an occupancy grid (lanesSide = 1; patchesAhead=10):

learning input
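
To make the input format concrete, here is a small sketch of how such a grid could be assembled. It assumes the grid covers `lanesSide` lanes on each side of the agent's lane and `patchesAhead + patchesBehind` cells along the road; the `buildOccupancyGrid` function, the shape of the `others` list, and the offset convention are hypothetical and only serve the illustration.

```javascript
const FREE_CELL_SPEED = 80; // mph, the value of unoccupied cells per the text

// Builds a [lane][cell] grid around the agent.
// Assumptions: the grid is (2 * lanesSide + 1) lanes wide and
// (patchesAhead + patchesBehind) cells long. `others` is a hypothetical list of
// { laneOffset, cellOffset, speed } relative to the agent, where cellOffset
// runs from -patchesBehind (farthest behind) to patchesAhead - 1 (farthest ahead).
function buildOccupancyGrid(lanesSide, patchesAhead, patchesBehind, others) {
  const lanes = 2 * lanesSide + 1;
  const cells = patchesAhead + patchesBehind;
  const grid = Array.from({ length: lanes }, () => new Array(cells).fill(FREE_CELL_SPEED));
  for (const v of others) {
    const lane = v.laneOffset + lanesSide;
    const cell = v.cellOffset + patchesBehind;
    // Vehicles outside the observed window are simply not visible.
    if (lane >= 0 && lane < lanes && cell >= 0 && cell < cells) {
      grid[lane][cell] = v.speed; // occupied cells hold the occupant's speed
    }
  }
  return grid;
}

// Usage with the settings from the example above (lanesSide = 1, patchesAhead = 10).
const grid = buildOccupancyGrid(1, 10, 0, [{ laneOffset: 1, cellOffset: 3, speed: 55 }]);
const inputs = grid.flat(); // flattened vector a network could consume
console.log(inputs.length); // 30 = 3 lanes x 10 cells
```

Widening the window with larger `lanesSide` or `patchesAhead` values gives the network more context but also more inputs to learn from.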

There are a few quirks to DeepTraffic’s dynamics:

Safety System

Each vehicle has a safety system which prevents it from colliding with other vehicles. This has 2 implications for how you will design your algorithm. First, your algorithm does not need to consider collision avoidance. Second, your path will be overridden when the safety system is activated.

For example, here, the red car cannot accelerate or change into the right lane, because the collision avoidance system has detected vehicles in the way:

safety system

A vehicle that is 4 cells behind another will immediately slow to match the lead vehicle, regardless of what its algorithm tries to do (see the diagram above).

A vehicle driving beside another will be unable to change into its neighbor’s lane until there is a sufficient gap, regardless of what its algorithm tries to do (see the diagram above).
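
The two override rules can be sketched as a filter applied to the requested action. This is an illustration under stated assumptions rather than the simulator's actual logic: the `applySafetySystem` function, the fields of `state`, and the reading of "4 cells behind" as "4 cells or fewer" are all hypothetical.

```javascript
const SAFETY_GAP_CELLS = 4; // from the text: a follower 4 cells behind slows down

// Returns the action and speed that are actually applied.
// `state` is a hypothetical snapshot:
//   { speed, gapAhead, leadSpeed, leftClear, rightClear }
function applySafetySystem(requested, state) {
  let action = requested;
  let speed = state.speed;

  // Lane changes are blocked until the neighboring lane has a sufficient gap.
  if (action === 'changeLeft' && !state.leftClear) action = 'noAction';
  if (action === 'changeRight' && !state.rightClear) action = 'noAction';

  // Assumption: the rule triggers at a gap of 4 cells or fewer.
  if (state.gapAhead <= SAFETY_GAP_CELLS) {
    if (action === 'accelerate') action = 'noAction';
    speed = Math.min(speed, state.leadSpeed); // slow to match the lead vehicle
  }

  return { action: action, speed: speed, overridden: action !== requested || speed !== state.speed };
}

// Usage: a car that is boxed in ahead and on the right, as in the example above.
const blocked = { speed: 70, gapAhead: 4, leadSpeed: 60, leftClear: true, rightClear: false };
console.log(applySafetySystem('accelerate', blocked));
console.log(applySafetySystem('changeRight', blocked));
console.log(applySafetySystem('changeLeft', blocked));
```

In this sketch only the left lane change goes through unchanged as an action, although the speed is still capped by the lead vehicle.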

Multiple Agents

In version 2.0, the current version, you have the option to deploy a copy of your algorithm on 11 vehicles. Your algorithm won’t do multi-agent planning; rather, each vehicle makes a greedy choice. The challenge is to design an algorithm which does not get in its own way when controlling several vehicles.

Where the Highway Ends

DeepTraffic follows just one of the vehicles (the ego vehicle), so you’ll notice some of the vehicles fall off the highway when they drive slower or faster than the ego vehicle. What happens to these vehicles?

When vehicles fall off the road, they are replaced by new vehicles on the opposite end of the highway. When a vehicle is replaced, its speed and lane are chosen randomly.
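
As a rough sketch of this replacement rule, the snippet below swaps a vehicle that has left the visible stretch of road for a new one at the opposite end. The `ROAD` description, its cell count and speed range, and the `respawn` function are hypothetical; only the 7 lanes and the random choice of speed and lane come from the text.

```javascript
// Hypothetical road description; lengthCells and the speed range are assumptions.
const ROAD = { lanes: 7, lengthCells: 70, minSpeed: 40, maxSpeed: 80 };

// Returns the vehicle unchanged while it is on the road; otherwise returns a
// replacement at the opposite end with a random lane and speed.
// `random` is injectable so the behavior can be checked deterministically.
function respawn(vehicle, road, random = Math.random) {
  const offFront = vehicle.cell >= road.lengthCells; // pulled ahead of the view
  const offRear = vehicle.cell < 0; // fell behind the view
  if (!offFront && !offRear) return vehicle;
  return {
    cell: offFront ? 0 : road.lengthCells - 1, // opposite end of the highway
    lane: Math.floor(random() * road.lanes),
    speed: road.minSpeed + random() * (road.maxSpeed - road.minSpeed),
  };
}

// Usage: a fast vehicle that has driven off the front of the road.
const replaced = respawn({ cell: 75, lane: 2, speed: 78 }, ROAD, () => 0.5);
console.log(replaced); // re-enters at cell 0 with a newly drawn lane and speed
```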

Hyperparameters

To do well in DeepTraffic using DQN, you’ll have to choose good hyperparameters. This can be tricky because (1) the full hyperparameter space is rather large and (2) the bigger your network gets, the longer it takes to train, which means you’ll explore less of the hyperparameter space. Therefore, it helps to understand how changing the hyperparameters will change performance prior to training.

parameters
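
One way to get a feel for the size trade-off before training is to count trainable parameters for a candidate configuration. The sketch below assumes a plain fully connected network whose input is the flattened occupancy grid and whose output has one unit per action; `countParameters` and the `hiddenLayers` option are hypothetical, and the real network may take additional inputs that this estimate ignores.

```javascript
// Counts weights and biases of a fully connected network.
// Assumptions: inputs = (2 * lanesSide + 1) * (patchesAhead + patchesBehind),
// every layer is dense, and there is one output per action (5 by default).
function countParameters(config) {
  const numActions = config.numActions === undefined ? 5 : config.numActions;
  const numInputs = (2 * config.lanesSide + 1) * (config.patchesAhead + config.patchesBehind);
  const sizes = [numInputs].concat(config.hiddenLayers, [numActions]);
  let total = 0;
  for (let i = 1; i < sizes.length; i++) {
    total += sizes[i - 1] * sizes[i] + sizes[i]; // weights plus biases
  }
  return total;
}

// Usage: compare a narrow view with one small layer to a wide view with two layers.
const small = countParameters({ lanesSide: 1, patchesAhead: 10, patchesBehind: 0, hiddenLayers: [16] });
const large = countParameters({ lanesSide: 3, patchesAhead: 30, patchesBehind: 10, hiddenLayers: [64, 64] });
console.log(small, large); // 581 22469
```

Under these assumptions the larger configuration has roughly 39 times as many parameters, which is one reason it leaves less time to explore other settings.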

Results

Progress

The plot below shows how the competition progressed over time:

progress

The Structure of Submissions

Below is a t-SNE plot. Each submission, originally represented as a vector of its patchesAhead, patchesBehind, l2_decay, layer_count, gamma, learning_rate, lanesSide, and train_iterations values, is projected into a 2-dimensional space that preserves the neighborhood structure of the points. The color of each dot corresponds to the submission's score. An interesting feature of this plot is that several clusters emerge: competitors stumbled upon similar solutions.

tsne

Help and Documentation

See the Documentation page for more details, hints, and instructions on how to submit to the competition.

Team

References
