Unity-Technologies / Q-GridWorld

Licence: other
Demo project using tabular Q-learning algorithm

Programming Languages

C#
18002 projects

Projects that are alternatives of or similar to Q-GridWorld

Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+80.49%)
Mutual labels:  q-learning
tictactoe-reinforcement-learning
Train a tic-tac-toe agent using reinforcement learning.
Stars: ✭ 36 (-70.73%)
Mutual labels:  q-learning
Chrome-Dino-Reinforcement-Learning
An RL implementation in Keras
Stars: ✭ 98 (-20.33%)
Mutual labels:  q-learning
Flow-Shop-Scheduling-Based-On-Reinforcement-Learning-Algorithm
Operations Research Application Project - Flow Shop Scheduling Based On Reinforcement Learning Algorithm
Stars: ✭ 73 (-40.65%)
Mutual labels:  q-learning
Implicit-Q-Learning
PyTorch implementation of the implicit Q-learning algorithm (IQL)
Stars: ✭ 27 (-78.05%)
Mutual labels:  q-learning
marley
A framework for multi-agent reinforcement learning.
Stars: ✭ 261 (+112.2%)
Mutual labels:  q-learning
Grid royale
A life simulation for exploring social dynamics
Stars: ✭ 252 (+104.88%)
Mutual labels:  q-learning
natural-gradient-deep-q-learning
arxiv.org/abs/1803.07482
Stars: ✭ 21 (-82.93%)
Mutual labels:  q-learning
mentalRL
Code for our AAMAS 2020 paper: "A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry".
Stars: ✭ 22 (-82.11%)
Mutual labels:  q-learning
RL
Reinforcement Learning Demos
Stars: ✭ 66 (-46.34%)
Mutual labels:  q-learning
LearnSnake
🐍 AI that learns to play Snake using Q-Learning (Reinforcement Learning)
Stars: ✭ 69 (-43.9%)
Mutual labels:  q-learning
DRL in CV
A course on Deep Reinforcement Learning in Computer Vision.
Stars: ✭ 59 (-52.03%)
Mutual labels:  q-learning
java-reinforcement-learning
Package provides a Java implementation of reinforcement learning algorithms such as Q-Learn, R-Learn, SARSA, and Actor-Critic
Stars: ✭ 90 (-26.83%)
Mutual labels:  q-learning
pacman-ai
A.I. plays the original 1980 Pacman using Neuroevolution of Augmenting Topologies and Deep Q Learning
Stars: ✭ 26 (-78.86%)
Mutual labels:  q-learning
flow
High frequency AI based algorithmic trading module.
Stars: ✭ 57 (-53.66%)
Mutual labels:  q-learning
king-pong
Deep Reinforcement Learning Pong Agent, King Pong, he's the best
Stars: ✭ 23 (-81.3%)
Mutual labels:  q-learning
Warehouse Robot Path Planning
A multi agent path planning solution under a warehouse scenario using Q learning and transfer learning.🤖️
Stars: ✭ 59 (-52.03%)
Mutual labels:  q-learning
Paddle-RLBooks
Paddle-RLBooks is a reinforcement learning code study guide based on pure PaddlePaddle.
Stars: ✭ 113 (-8.13%)
Mutual labels:  q-learning
Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
Stars: ✭ 54 (-56.1%)
Mutual labels:  q-learning
VREP-RL-bot
Reinforcement Learning in Vrep
Stars: ✭ 14 (-88.62%)
Mutual labels:  q-learning


Q-GridWorld Demo

Simple Unity project demonstrating the Q-learning algorithm in a tabular setting. For an in-browser WebGL version, follow the link here.

Overview

In the simplest scenario, we have a 5x5 grid world with an agent (blue block), a goal (green block), and obstacles (red blocks). For each run of the demo, the positions of the agent, goal, and obstacles are selected at random (but remain fixed throughout that run). In this grid world, the agent's goal is to learn a strategy for navigating from its start position to the goal position efficiently while avoiding obstacles. It achieves this by learning the best action to take in every state it can be in (a mapping typically called a policy in reinforcement learning). An action here is a direction to move (north, south, east, or west), while a state is the agent's position in the grid world. In effect, the agent learns the shortest obstacle-free path from its start position to the goal position.
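To make the state and action spaces concrete, here is a minimal C# sketch of one possible encoding, where a (row, column) position is flattened into a single state index. The identifiers are illustrative, not necessarily those used in the project:

```csharp
// Hypothetical encoding (not necessarily the project's): flatten a
// (row, col) grid position into one discrete state index, and enumerate
// the four movement actions.
public enum GridAction { North, South, East, West }

public static class GridState
{
    // A 5x5 grid yields state indices 0..24, a 10x10 grid 0..99, and so on.
    public static int ToIndex(int row, int col, int gridSize)
    {
        return row * gridSize + col;
    }
}
```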

The Q-learning algorithm implemented here learns the policy by maintaining a numeric value for each action-state pair, representing how favourable it is to take that action in that state. These values are updated incrementally as the agent explores the grid world. Intuitively, the agent performs a series of trials, where a trial is a sequence of actions that ends either at an obstacle or at the goal position. For each action-state pair visited during a trial, the agent increments the pair's value if the trial was positive (reached the goal position) and decrements it if the trial was negative (collided with an obstacle). The agent also receives a small negative reward for each step it takes, encouraging it to discover the shortest path.
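The rule behind this intuition is the standard tabular Q-learning update: Q(s, a) is nudged toward r + gamma * max_a' Q(s', a'), where r is the reward and gamma discounts future value. Below is a minimal C# sketch of that update; the class and member names are hypothetical, not the ones used in InternalAgent.cs:

```csharp
using System.Linq;

// Minimal tabular Q-learning sketch. Identifiers are illustrative,
// not those used in InternalAgent.cs.
public class TabularQAgent
{
    private readonly float[][] q;       // q[state][action], initialized to 0
    private const float Alpha = 0.5f;   // learning rate (illustrative value)
    private const float Gamma = 0.99f;  // discount factor (illustrative value)

    public TabularQAgent(int numStates, int numActions)
    {
        q = new float[numStates][];
        for (int s = 0; s < numStates; s++)
            q[s] = new float[numActions];
    }

    // Nudge Q(state, action) toward reward + Gamma * max_a' Q(nextState, a').
    // With a positive reward at the goal, a negative reward at obstacles, and
    // a small negative step reward, this yields the increment/decrement
    // behaviour described above.
    public void Update(int state, int action, float reward, int nextState, bool done)
    {
        float bestNext = done ? 0f : q[nextState].Max();
        q[state][action] += Alpha * (reward + Gamma * bestNext - q[state][action]);
    }
}
```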

As in our earlier multi-armed bandit demo, there is an exploration-exploitation trade-off here. When running through a trial, the agent mixes randomly chosen actions with its current best guess of the action to take in a given state. This trade-off is controlled by an epsilon parameter that begins at 1 (full exploration) and is slowly decreased over the trials to 0.1 (limited exploration). Consequently, as the demo runs and the epsilon value declines, the agent's actions become more and more predictable, converging to the optimal path from its start position to the goal position.
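A minimal sketch of epsilon-greedy action selection with this kind of annealing schedule is below; the decay constant and names are illustrative, and the project may anneal epsilon differently:

```csharp
using System;
using System.Linq;

// Epsilon-greedy action selection with epsilon annealed from 1.0 toward 0.1,
// as the demo describes. The per-trial decay constant is illustrative.
public class EpsilonGreedyPolicy
{
    private readonly Random rng = new Random();
    private float epsilon = 1.0f;
    private const float MinEpsilon = 0.1f;
    private const float DecayPerTrial = 0.99f;

    public int ChooseAction(float[] qValues)
    {
        if (rng.NextDouble() < epsilon)
            return rng.Next(qValues.Length);           // explore: random action
        return Array.IndexOf(qValues, qValues.Max());  // exploit: greedy action
    }

    // Call once per trial so actions become steadily more predictable.
    public void DecayEpsilon()
    {
        epsilon = Math.Max(MinEpsilon, epsilon * DecayPerTrial);
    }
}
```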

For more information on how the agent learns the strategy, check out our corresponding post in the Unity AI blog.

Beyond this demo, check out our Unity ML Agents repo which contains an SDK for applying more advanced methods to training behaviors within Unity.

In-game Settings

The goal of this Unity project is to provide an informative visualization of the Q-learning algorithm, letting you explore several grid sizes: small (5x5), medium (10x10), and large (15x15). When you run the project, a demo automatically starts on the medium grid size, but you can change the grid size and click the Start New Environment button at any time.

Set-up

To get started with this project:

  • Download and install Unity if you don't already have it.
  • Download or clone this GitHub repository.
  • Open the game.unity file under the Assets/ subdirectory.

Within the project:

  • InternalAgent.cs contains all of the Q-learning logic.
  • GridEnvironment.cs contains all of the environment-specific logic.