highway-env

A collection of environments for autonomous driving and tactical decision-making tasks

An episode of one of the environments available in highway-env.

Try it on Google Colab!

The environments

Highway

env = gym.make("highway-v0")

In this task, the ego-vehicle is driving on a multilane highway populated with other vehicles. The agent's objective is to reach a high speed while avoiding collisions with neighbouring vehicles. Driving on the right side of the road is also rewarded.
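The three reward terms described above (high speed, no collision, rightmost lane) could be combined as in the following minimal sketch. The function name, the weights, and the speed range are all hypothetical values picked for illustration; they are not taken from highway-env.

```python
def highway_reward(speed, lane_index, lane_count, crashed,
                   speed_range=(20.0, 30.0),
                   high_speed_bonus=0.4,
                   right_lane_bonus=0.1,
                   collision_penalty=-1.0):
    """Illustrative reward: faster is better, right lanes are better, crashes are bad.

    All default weights and the speed range are assumptions made for this sketch.
    """
    low, high = speed_range
    # Position of the speed inside the rewarded range, clipped to [0, 1].
    speed_fraction = min(max((speed - low) / (high - low), 0.0), 1.0)
    # Lanes are assumed to be indexed from 0 (leftmost) to lane_count - 1 (rightmost).
    lane_fraction = lane_index / max(lane_count - 1, 1)
    reward = high_speed_bonus * speed_fraction + right_lane_bonus * lane_fraction
    if crashed:
        reward += collision_penalty
    return reward
```

With these assumed weights, a single collision outweighs the best possible speed and lane bonus, so an agent cannot profit from crashing at high speed.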

The highway-v0 environment.

Merge

env = gym.make("merge-v0")

In this task, the ego-vehicle starts on a main highway but soon approaches a road junction with incoming vehicles on the access ramp. The agent's objective is now to maintain a high speed while making room for the vehicles so that they can safely merge into the traffic.

The merge-v0 environment.

Roundabout

env = gym.make("roundabout-v0")

In this task, the ego-vehicle is approaching a roundabout with flowing traffic. It will follow its planned route automatically, but has to handle lane changes and longitudinal control to pass the roundabout as fast as possible while avoiding collisions.

The roundabout-v0 environment.

Parking

env = gym.make("parking-v0")

A goal-conditioned continuous control task in which the ego-vehicle must park in a given space with the appropriate heading.
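In a goal-conditioned task, the reward is typically a function of how far the achieved goal is from the desired one. The sketch below assumes a goal encoded as `(x, y, cos(heading), sin(heading))` and a weighted distance raised to a power `p`. The encoding, the weights, and the success tolerance are illustrative assumptions rather than the environment's actual settings.

```python
def goal_reward(achieved, desired, weights, p=0.5):
    """Negative weighted L1 distance between achieved and desired goal, raised to power p."""
    distance = sum(w * abs(a - d) for a, d, w in zip(achieved, desired, weights))
    return -(distance ** p)


def is_success(achieved, desired, weights, tolerance=0.12):
    """Hypothetical success test: the reward is close enough to zero."""
    return goal_reward(achieved, desired, weights) > -tolerance
```

Including the heading in the goal vector is what makes parking with the wrong orientation score worse than parking with the right one, even at the same position.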

The parking-v0 environment.

Intersection

env = gym.make("intersection-v0")

An intersection negotiation task with dense traffic.

The intersection-v0 environment.

Examples of agents

Agents solving the highway-env environments are available in the rl-agents and stable-baselines repositories.

pip install --user git+https://github.com/eleurent/rl-agents

Deep Q-Network

The DQN agent solving highway-v0.

This model-free value-based reinforcement learning agent performs Q-learning with function approximation, using a neural network to represent the state-action value function Q.
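The Q-learning rule at the core of this agent can be illustrated without a neural network. The sketch below is a tabular simplification (a dictionary instead of a function approximator), and the function name and hyperparameter defaults are assumptions for illustration. A DQN replaces the table lookup with a network forward pass and the in-place update with a gradient step on the squared TD error.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, done,
                      n_actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update to the table `q` and return the TD error."""
    # Bootstrapped value of the best next action; zero at the end of an episode.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    td_error = td_target - q[(state, action)]
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * td_error
    return td_error


# Usage: the table starts at zero for every (state, action) pair.
q_table = defaultdict(float)
q_learning_update(q_table, state=0, action=1, reward=1.0,
                  next_state=1, done=False, n_actions=2)
```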

Deep Deterministic Policy Gradient

The DDPG agent solving parking-v0.

This model-free policy-based reinforcement learning agent is optimized directly by gradient ascent. It uses Hindsight Experience Replay to efficiently learn how to solve a goal-conditioned task.
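The idea behind Hindsight Experience Replay is to reuse a failed episode by pretending that the goal actually reached was the intended one. The following sketch shows the simplest ("final") relabeling strategy. The transition layout (dictionaries with `achieved_goal` and `desired_goal` keys) and the helper names are assumptions made for this example.

```python
def relabel_with_hindsight(episode, reward_fn):
    """Return a copy of `episode` whose desired goal is the goal finally achieved.

    `episode` is a list of transition dicts; `reward_fn(achieved, desired)`
    recomputes the reward for the substituted goal.
    """
    final_goal = episode[-1]["achieved_goal"]
    relabeled = []
    for transition in episode:
        new_transition = dict(transition)  # shallow copy, the original is untouched
        new_transition["desired_goal"] = final_goal
        new_transition["reward"] = reward_fn(transition["achieved_goal"], final_goal)
        relabeled.append(new_transition)
    return relabeled


def sparse_reward(achieved, desired):
    """Hypothetical sparse reward: 0 on success, -1 otherwise."""
    return 0.0 if achieved == desired else -1.0
```

Both the original and the relabeled transitions would then be stored in the replay buffer, so that the agent sees at least some successful outcomes even when the real goal is rarely reached.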

Value Iteration

The Value Iteration agent solving highway-v0.

Value Iteration is only compatible with finite discrete MDPs, so the environment is first approximated by a finite-mdp environment using env.to_finite_mdp(). This simplified state representation describes the nearby traffic in terms of predicted Time-To-Collision (TTC) on each lane of the road. The transition model is simplistic and assumes that each vehicle will keep driving at a constant speed without changing lanes. This model bias can be a source of mistakes.

The agent then performs Value Iteration to compute the corresponding optimal state-value function.
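Under the constant-speed assumption, a Time-To-Collision with a vehicle ahead in the same lane reduces to a gap divided by a closing speed. The function below is a minimal sketch of that computation. Its name, its arguments, and the default vehicle length are assumptions, and it does not reproduce how env.to_finite_mdp() discretizes the result.

```python
import math


def time_to_collision(ego_position, ego_speed, other_position, other_speed,
                      vehicle_length=5.0):
    """Seconds until the ego-vehicle reaches a vehicle ahead of it in the same lane.

    Both vehicles are assumed to keep a constant speed and stay in their lane.
    Positions are longitudinal coordinates in metres, speeds are in m/s, and
    the other vehicle is assumed to be ahead of the ego-vehicle.
    """
    gap = other_position - ego_position - vehicle_length
    closing_speed = ego_speed - other_speed
    if gap <= 0:
        return 0.0  # already overlapping
    if closing_speed <= 0:
        return math.inf  # the vehicle ahead is at least as fast: no collision
    return gap / closing_speed
```

If the vehicle ahead brakes or changes lanes, this prediction is wrong, which is the model bias mentioned above.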
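As an illustration of this step, here is a small Value Iteration routine for a finite MDP with deterministic transitions, which matches the constant-speed assumption above. The table layout (`transition[s][a]` and `reward[s][a]`) and the function names are choices made for this sketch, not the data structures of the finite-mdp package.

```python
def value_iteration(transition, reward, gamma=0.9, tolerance=1e-6):
    """Return the optimal state values of a deterministic finite MDP.

    transition[s][a] is the next state and reward[s][a] the immediate reward.
    """
    n_states = len(transition)
    values = [0.0] * n_states
    while True:
        # Bellman optimality backup applied to every state.
        new_values = [
            max(reward[s][a] + gamma * values[transition[s][a]]
                for a in range(len(transition[s])))
            for s in range(n_states)
        ]
        delta = max(abs(new - old) for new, old in zip(new_values, values))
        values = new_values
        if delta < tolerance:
            return values


def greedy_policy(transition, reward, values, gamma=0.9):
    """Pick, in each state, the action that maximizes the one-step lookahead."""
    return [
        max(range(len(transition[s])),
            key=lambda a: reward[s][a] + gamma * values[transition[s][a]])
        for s in range(len(transition))
    ]
```

The agent then acts greedily with respect to the computed values, which is only as good as the approximate model they were computed on.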

Monte-Carlo Tree Search

This agent leverages a transition model and a reward model to perform a stochastic tree search (Coulom, 2006) of the optimal trajectory. No particular assumption is required on the state representation or transition model.

The MCTS agent solving highway-v0.
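To make the tree search more concrete, the sketch below implements a compact UCT-style planner on top of a generative model `step(state, action) -> (next_state, reward)`. It is not the rl-agents implementation. The class and function names, the hyperparameters, and the choice to return the action with the highest mean return are assumptions, and child nodes are indexed by action only, which is exact only for a deterministic model.

```python
import math
import random


class Node:
    """Per-action visit counts and mean returns for one node of the tree."""

    def __init__(self, actions):
        self.count = {a: 0 for a in actions}
        self.value = {a: 0.0 for a in actions}
        self.children = {}


def uct_search(root_state, step, actions, horizon=5, iterations=200,
               gamma=0.9, exploration=1.0, seed=0):
    """Plan from root_state and return the action with the best mean return."""
    rng = random.Random(seed)
    root = Node(actions)

    def rollout(state, depth):
        # Estimate the value of a leaf with uniformly random actions.
        total, discount = 0.0, 1.0
        while depth < horizon:
            state, reward = step(state, rng.choice(actions))
            total += discount * reward
            discount *= gamma
            depth += 1
        return total

    def simulate(node, state, depth):
        if depth == horizon:
            return 0.0
        untried = [a for a in actions if node.count[a] == 0]
        if untried:
            # Expansion: try a new action, then evaluate it with a rollout.
            action = rng.choice(untried)
            next_state, reward = step(state, action)
            ret = reward + gamma * rollout(next_state, depth + 1)
        else:
            # Selection: upper confidence bound over the tried actions.
            total = sum(node.count.values())
            action = max(actions, key=lambda a: node.value[a]
                         + exploration * math.sqrt(math.log(total) / node.count[a]))
            next_state, reward = step(state, action)
            child = node.children.setdefault(action, Node(actions))
            ret = reward + gamma * simulate(child, next_state, depth + 1)
        # Backup: incremental mean of the returns observed for this action.
        node.count[action] += 1
        node.value[action] += (ret - node.value[action]) / node.count[action]
        return ret

    for _ in range(iterations):
        simulate(root, root_state, 0)
    return max(actions, key=lambda a: root.value[a])
```

Because the planner only calls `step`, it needs no assumption on what a state looks like, in line with the description above.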

Installation

pip install highway-env

Usage

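import gym
import highway_env  # importing the package registers the environments with gym

env = gym.make("highway-v0")

obs = env.reset()  # the environment must be reset before the first step
done = False
while not done:
    # Placeholder policy: replace the random sample with your agent's action
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    env.render()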

Documentation

Read the documentation online.

Citing

If you use the project in your work, please consider citing it with:

@misc{highway-env,
  author = {Leurent, Edouard},
  title = {An Environment for Autonomous Driving Decision-Making},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/eleurent/highway-env}},
}

List of publications & preprints using highway-env (please open a pull request to add missing entries):

Approximate Robust Control of Uncertain Dynamical Systems (Dec 2018)
Interval Prediction for Continuous-Time Systems with Parametric Uncertainties (Apr 2019)
Practical Open-Loop Optimistic Planning (Apr 2019)
α^α-Rank: Practically Scaling α-Rank through Stochastic Optimisation (Sep 2019)
Social Attention for Autonomous Decision-Making in Dense Traffic (Nov 2019)
Budgeted Reinforcement Learning in Continuous State Space (Dec 2019)
Multi-View Reinforcement Learning (Dec 2019)
Reinforcement learning for Dialogue Systems optimization with user adaptation (Dec 2019)
Distributional Soft Actor Critic for Risk Sensitive Learning (Apr 2020)
Bi-Level Actor-Critic for Multi-Agent Coordination (Apr 2020)
Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes (Jun 2020)
Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities (Jul 2020)
Robust-Adaptive Interval Predictive Control for Linear Uncertain Systems (Jul 2020)
SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction (Jul 2020)
B-GAP: Behavior-Guided Action Prediction for Autonomous Navigation (Nov 2020)
Assessing and Accelerating Coverage in Deep Reinforcement Learning (Dec 2020)
Distributionally Consistent Simulation of Naturalistic Driving Environment for Autonomous Vehicle Testing (Jan 2021)
Interpretable Policy Specification and Synthesis through Natural Language and RL (Jan 2021)
Corner Case Generation and Analysis for Safety Assessment of Autonomous Vehicles (Feb 2021)
