Unity-Technologies / Q-GridWorld

Licence: other
Demo project using tabular Q-learning algorithm

Programming Languages

C#
18002 projects

Projects that are alternatives of or similar to Q-GridWorld

Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+80.49%)
Mutual labels:  q-learning
tictactoe-reinforcement-learning
Train a tic-tac-toe agent using reinforcement learning.
Stars: ✭ 36 (-70.73%)
Mutual labels:  q-learning
Chrome-Dino-Reinforcement-Learning
An RL implementation in Keras
Stars: ✭ 98 (-20.33%)
Mutual labels:  q-learning
Flow-Shop-Scheduling-Based-On-Reinforcement-Learning-Algorithm
Operations Research Application Project - Flow Shop Scheduling Based On Reinforcement Learning Algorithm
Stars: ✭ 73 (-40.65%)
Mutual labels:  q-learning
Implicit-Q-Learning
PyTorch implementation of the implicit Q-learning algorithm (IQL)
Stars: ✭ 27 (-78.05%)
Mutual labels:  q-learning
marley
A framework for multi-agent reinforcement learning.
Stars: ✭ 261 (+112.2%)
Mutual labels:  q-learning
Grid royale
A life simulation for exploring social dynamics
Stars: ✭ 252 (+104.88%)
Mutual labels:  q-learning
natural-gradient-deep-q-learning
arxiv.org/abs/1803.07482
Stars: ✭ 21 (-82.93%)
Mutual labels:  q-learning
mentalRL
Code for our AAMAS 2020 paper: "A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry".
Stars: ✭ 22 (-82.11%)
Mutual labels:  q-learning
RL
Reinforcement Learning Demos
Stars: ✭ 66 (-46.34%)
Mutual labels:  q-learning
LearnSnake
🐍 AI that learns to play Snake using Q-Learning (Reinforcement Learning)
Stars: ✭ 69 (-43.9%)
Mutual labels:  q-learning
DRL in CV
A course on Deep Reinforcement Learning in Computer Vision.
Stars: ✭ 59 (-52.03%)
Mutual labels:  q-learning
java-reinforcement-learning
Package provides a Java implementation of reinforcement learning algorithms such as Q-Learn, R-Learn, SARSA, and Actor-Critic
Stars: ✭ 90 (-26.83%)
Mutual labels:  q-learning
pacman-ai
A.I. plays the original 1980 Pacman using Neuroevolution of Augmenting Topologies and Deep Q Learning
Stars: ✭ 26 (-78.86%)
Mutual labels:  q-learning
flow
High frequency AI based algorithmic trading module.
Stars: ✭ 57 (-53.66%)
Mutual labels:  q-learning
king-pong
Deep Reinforcement Learning Pong Agent, King Pong, he's the best
Stars: ✭ 23 (-81.3%)
Mutual labels:  q-learning
Warehouse Robot Path Planning
A multi agent path planning solution under a warehouse scenario using Q learning and transfer learning.🤖️
Stars: ✭ 59 (-52.03%)
Mutual labels:  q-learning
Paddle-RLBooks
Paddle-RLBooks is a reinforcement learning code study guide based on pure PaddlePaddle.
Stars: ✭ 113 (-8.13%)
Mutual labels:  q-learning
Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
Stars: ✭ 54 (-56.1%)
Mutual labels:  q-learning
VREP-RL-bot
Reinforcement Learning in Vrep
Stars: ✭ 14 (-88.62%)
Mutual labels:  q-learning


Q-GridWorld Demo

Simple Unity project demonstrating the Q-learning algorithm in a tabular setting. For an in-browser WebGL version, follow the link here.

Overview

In the simplest scenario, we have a 5x5 grid world with an agent (blue block), a goal (green block), and obstacles (red blocks). For each run of the demo, the positions of the agent, goal, and obstacles are selected at random (but remain fixed throughout that run). In this grid world, the agent's goal is to learn a strategy for navigating from its start position to the goal position efficiently while avoiding obstacles. It achieves this by learning the best action to take in every state it can be in (a mapping typically called a policy in reinforcement learning). An action here is a direction to move (north, south, east, or west), while a state is the agent's position in the grid world. In effect, the agent learns the shortest obstacle-free path from its start position to the goal position.
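To make the state and action spaces concrete, here is a minimal C# sketch of one possible encoding, where a (row, column) position is flattened into a single state index. The identifiers are illustrative, not necessarily those used in the project:

```csharp
// Hypothetical encoding (not necessarily the project's): flatten a
// (row, col) grid position into one discrete state index, and enumerate
// the four movement actions.
public enum GridAction { North, South, East, West }

public static class GridState
{
    // A 5x5 grid yields state indices 0..24, a 10x10 grid 0..99, and so on.
    public static int ToIndex(int row, int col, int gridSize)
    {
        return row * gridSize + col;
    }
}
```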

The Q-learning algorithm implemented here learns the policy by maintaining a numeric value for each action-state pair, representing how favourable it is to take that action in that state. These values are updated incrementally as the agent explores the grid world. Intuitively, the agent performs a series of trials, where a trial is a sequence of actions that ends either at an obstacle or at the goal position. For each action-state pair visited during a trial, the agent increments the pair's value if the trial was positive (reached the goal position) and decrements it if the trial was negative (collided with an obstacle). The agent also receives a small negative reward for each step it takes, encouraging it to discover the shortest path.
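The rule behind this intuition is the standard tabular Q-learning update: Q(s, a) is nudged toward r + gamma * max_a' Q(s', a'), where r is the reward and gamma discounts future value. Below is a minimal C# sketch of that update; the class and member names are hypothetical, not the ones used in InternalAgent.cs:

```csharp
using System.Linq;

// Minimal tabular Q-learning sketch. Identifiers are illustrative,
// not those used in InternalAgent.cs.
public class TabularQAgent
{
    private readonly float[][] q;       // q[state][action], initialized to 0
    private const float Alpha = 0.5f;   // learning rate (illustrative value)
    private const float Gamma = 0.99f;  // discount factor (illustrative value)

    public TabularQAgent(int numStates, int numActions)
    {
        q = new float[numStates][];
        for (int s = 0; s < numStates; s++)
            q[s] = new float[numActions];
    }

    // Nudge Q(state, action) toward reward + Gamma * max_a' Q(nextState, a').
    // With a positive reward at the goal, a negative reward at obstacles, and
    // a small negative step reward, this yields the increment/decrement
    // behaviour described above.
    public void Update(int state, int action, float reward, int nextState, bool done)
    {
        float bestNext = done ? 0f : q[nextState].Max();
        q[state][action] += Alpha * (reward + Gamma * bestNext - q[state][action]);
    }
}
```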

As in our earlier multi-armed bandit demo, there is an exploration-exploitation trade-off here. When running through a trial, the agent mixes randomly chosen actions with its current best guess of the action to take in a given state. This trade-off is controlled by an epsilon parameter that begins at 1 (full exploration) and is slowly decreased over the trials to 0.1 (limited exploration). Consequently, as the demo runs and the epsilon value declines, the agent's actions become more and more predictable, converging to the optimal path from its start position to the goal position.
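A minimal sketch of epsilon-greedy action selection with this kind of annealing schedule is below; the decay constant and names are illustrative, and the project may anneal epsilon differently:

```csharp
using System;
using System.Linq;

// Epsilon-greedy action selection with epsilon annealed from 1.0 toward 0.1,
// as the demo describes. The per-trial decay constant is illustrative.
public class EpsilonGreedyPolicy
{
    private readonly Random rng = new Random();
    private float epsilon = 1.0f;
    private const float MinEpsilon = 0.1f;
    private const float DecayPerTrial = 0.99f;

    public int ChooseAction(float[] qValues)
    {
        if (rng.NextDouble() < epsilon)
            return rng.Next(qValues.Length);           // explore: random action
        return Array.IndexOf(qValues, qValues.Max());  // exploit: greedy action
    }

    // Call once per trial so actions become steadily more predictable.
    public void DecayEpsilon()
    {
        epsilon = Math.Max(MinEpsilon, epsilon * DecayPerTrial);
    }
}
```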

For more information on how the agent learns the strategy, check out our corresponding post in the Unity AI blog.

Beyond this demo, check out our Unity ML Agents repo which contains an SDK for applying more advanced methods to training behaviors within Unity.

In-game Settings

The goal of this Unity project is to provide an informative visualization of the Q-learning algorithm, letting you explore several grid sizes: small (5x5), medium (10x10), and large (15x15). When you run the project, a demo automatically starts on the medium grid size, but you can change the grid size and click the Start New Environment button at any time.

Set-up

To get started with this project:

  • Download and install Unity if you don't already have it.
  • Download or clone this GitHub repository.
  • Open the game.unity file under the Assets/ subdirectory.

Within the project:

  • InternalAgent.cs contains all of the Q-learning logic.
  • GridEnvironment.cs contains all of the environment-specific logic.