
italohdc / LearnSnake

Licence: MIT License
🐍 AI that learns to play Snake using Q-Learning (Reinforcement Learning)

Programming Languages

javascript
184084 projects - #8 most used programming language
CSS
56736 projects
HTML
75241 projects

Projects that are alternatives of or similar to LearnSnake

relearn
A Reinforcement Learning Library for C++11/14
Stars: ✭ 22 (-68.12%)
Mutual labels:  q-learning, markov-decision-processes
cs7641-assignment4
CS7641 - Machine Learning - Assignment 4 - Markov Decision Processes
Stars: ✭ 14 (-79.71%)
Mutual labels:  q-learning, markov-decision-processes
Deep Qlearning Agent For Traffic Signal Control
A framework where a deep Q-Learning Reinforcement Learning agent tries to choose the correct traffic light phase at an intersection to maximize traffic efficiency.
Stars: ✭ 136 (+97.1%)
Mutual labels:  q-learning
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+221.74%)
Mutual labels:  q-learning
Learningx
Deep & Classical Reinforcement Learning + Machine Learning Examples in Python
Stars: ✭ 241 (+249.28%)
Mutual labels:  q-learning
Accel Brain Code
The purpose of this repository is to make prototypes as case study in the context of proof of concept(PoC) and research and development(R&D) that I have written in my website. The main research topics are Auto-Encoders in relation to the representation learning, the statistical machine learning for energy-based models, adversarial generation networks(GANs), Deep Reinforcement Learning such as Deep Q-Networks, semi-supervised learning, and neural network language model for natural language processing.
Stars: ✭ 166 (+140.58%)
Mutual labels:  q-learning
pomdp-py
A framework to build and solve POMDP problems. Documentation: https://h2r.github.io/pomdp-py/
Stars: ✭ 131 (+89.86%)
Mutual labels:  markov-decision-processes
Ctc Executioner
Master Thesis: Limit order placement with Reinforcement Learning
Stars: ✭ 112 (+62.32%)
Mutual labels:  q-learning
Python.io
Snake game inspired from Slither.io but features a python instead of a snake. Made in Python 3
Stars: ✭ 15 (-78.26%)
Mutual labels:  snake-game
Data Science Free
Free Resources For Data Science created by Shubham Kumar
Stars: ✭ 232 (+236.23%)
Mutual labels:  q-learning
Console-Snake
Snake Game in Console Implemented Using C++
Stars: ✭ 99 (+43.48%)
Mutual labels:  snake-game
Deep Rl Trading
playing idealized trading games with deep reinforcement learning
Stars: ✭ 228 (+230.43%)
Mutual labels:  q-learning
2048 Deep Reinforcement Learning
Trained A Convolutional Neural Network To Play 2048 using Deep-Reinforcement Learning
Stars: ✭ 169 (+144.93%)
Mutual labels:  q-learning
king-pong
Deep Reinforcement Learning Pong Agent, King Pong, he's the best
Stars: ✭ 23 (-66.67%)
Mutual labels:  q-learning
Gym Fx
Forex trading simulator environment for OpenAI Gym, observations contain the order status, performance and timeseries loaded from a CSV file containing rates and indicators. Work In Progress
Stars: ✭ 151 (+118.84%)
Mutual labels:  q-learning
pacman-ai
A.I. plays the original 1980 Pacman using Neuroevolution of Augmenting Topologies and Deep Q Learning
Stars: ✭ 26 (-62.32%)
Mutual labels:  q-learning
Reinforcement learning in python
Implementing Reinforcement Learning, namely Q-learning and Sarsa algorithms, for global path planning of mobile robot in unknown environment with obstacles. Comparison analysis of Q-learning and Sarsa
Stars: ✭ 134 (+94.2%)
Mutual labels:  q-learning
Rl trading
An environment to high-frequency trading agents under reinforcement learning
Stars: ✭ 205 (+197.1%)
Mutual labels:  q-learning
Grid royale
A life simulation for exploring social dynamics
Stars: ✭ 252 (+265.22%)
Mutual labels:  q-learning
Flow-Shop-Scheduling-Based-On-Reinforcement-Learning-Algorithm
Operations Research Application Project - Flow Shop Scheduling Based On Reinforcement Learning Algorithm
Stars: ✭ 73 (+5.8%)
Mutual labels:  q-learning

LearnSnake: italolelis.com/snake

This is an implementation of an artificial intelligence, fully written in JavaScript, that learns to play Snake using Reinforcement Learning (Q-Learning).

Snake autonomously trying to catch the fruit...

It runs directly in the browser: you don't need to set up an environment, and no installation is necessary.

See it working and tweak some variables: Try it yourself!

How it works

The first question I was asked when I came up with this idea was: "Why do you want to use JavaScript instead of Python?". I know Python has some libraries for scientific computing (like NumPy and TensorFlow), but I wanted something that worked right out of the box: no installation, running directly in the browser. This way, anyone can test and play with it without worrying about setting up a proper environment.

Snake (the game itself)

Snake is a game in which a snake needs to explore an environment and catch the fruit without hitting any obstacle or itself. Every time the snake catches a fruit, its size increases.

I could have forked another Snake repository, but since I didn't know JavaScript (and I would need it for the next steps), I thought that developing the Snake game from scratch would be a good way to learn more about the language.

To develop this part of the project, I used these resources for guidance:

See game.js for the source code.

Q-Learning

According to Christopher Watkins, "Q-Learning is a simple way for agents to learn how to act optimally in controlled Markovian domains" (Watkins, 1989). In simple terms, it models the problem as an MDP (Markov Decision Process) to control and make decisions in an environment. It revolves around a Q-Table that is constantly updated.

A Q-Table is a matrix that maps each state to a value for every available action, estimating how promising that action is in that state. As the agent explores the environment, the table is updated. The action with the highest value is considered the best action to take.

Example of a QTable (the best action is highlighted)

In this guide, I will explain how I applied Q-Learning to the Snake game. If you want a deeper (yet still simple) understanding of Q-Learning and Reinforcement Learning, I suggest this Medium post by Vishal Maini.

See q-learning.js for the source code.

Algorithm

The algorithm consists of:

function                       explanation
s = game.state()               get state s
act = best-action( Q(s) )      execute best action act given state s
rew = game.reward()            receive reward rew
s' = game.state()              get next state s'
Q(s, act) = update-qTable()    update Q-Table value of state s and action act
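
As a minimal JavaScript sketch, one training step could look like the function below. The game object and the bestAction / updateQTable helpers are assumptions for illustration (possible versions of the last two are sketched in the sections that follow); the actual code lives in game.js and q-learning.js.

  // One training step, mirroring the table above. The helpers are passed in as
  // parameters so the sketch stays self-contained; names are illustrative only.
  function step(game, qTable, bestAction, updateQTable) {
    const s = game.state();                   // get state s
    const act = bestAction(qTable, s);        // best action act given state s
    game.move(act);                           // apply the action to the game
    const rew = game.reward();                // receive reward rew
    const sNext = game.state();               // get next state s'
    updateQTable(qTable, s, act, rew, sNext); // update Q(s, act)
  }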

Update Q-Table

The new Q-Table value for the action taken is given by this formula (taken from the article by Vishal Maini):

It's executed after the action is taken and the reward is known.
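
As a rough JavaScript sketch of that standard update rule (alpha is the learning rate and gamma the discount factor; the function shape and the default values here are my own assumptions, not necessarily what q-learning.js uses):

  // Standard Q-Learning update:
  //   Q(s, act) <- Q(s, act) + alpha * (rew + gamma * max(Q(s', *)) - Q(s, act))
  // Assumes both states already have an entry in the table.
  function updateQTable(qTable, s, act, rew, sNext, alpha = 0.1, gamma = 0.9) {
    const current = qTable[s][act];
    const maxNext = Math.max(...Object.values(qTable[sNext])); // best value in s'
    qTable[s][act] = current + alpha * (rew + gamma * maxNext - current);
  }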

Actions

Available actions are "UP", "DOWN", "LEFT" and "RIGHT", simulating user interaction with the game.

  ↑    ↓     ←     →
  UP   DOWN  LEFT  RIGHT

The best action is selected by choosing the action with the highest value for the current state in the Q-Table. If the maximum value equals 0, a random action is selected instead.
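
A sketch of that selection in JavaScript (the helper name and the Q-Table layout shown in the comment are illustrative):

  const ACTIONS = ['UP', 'DOWN', 'LEFT', 'RIGHT'];

  // Pick the action with the highest value for state s; if the best value is
  // still 0 (nothing learned yet), fall back to a random action.
  function bestAction(qTable, s) {
    const values = qTable[s]; // e.g. { UP: 0, DOWN: 0.3, LEFT: 0, RIGHT: -0.1 }
    const best = ACTIONS.reduce((a, b) => (values[a] >= values[b] ? a : b));
    if (values[best] === 0) {
      return ACTIONS[Math.floor(Math.random() * ACTIONS.length)];
    }
    return best;
  }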

Reward

The only positive reward is given when the snake grabs the fruit ( +1 ). I also tried giving a small reward (approximately +0.1) when it successfully explored the environment without dying, but the result was that the snake just moved around in circles without really caring about the fruit.

The penalty ( -1 ) happens whenever the game resets, that is, when the snake hits its tail or a wall.

For anything else, there's a minor negative reward ( -0.1 ). This helps minimize the path taken to catch the fruit and makes the training process faster.

Action             Reward
Catch the fruit     +1
Hits tail           -1
Hits wall           -1
Else                -0.1
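
Put together, the reward scheme could be sketched as a small JavaScript function (the event flags are assumptions for illustration; the real game code reports these events in its own way):

  // Reward for the last move, following the table above.
  function reward(caughtFruit, hitTailOrWall) {
    if (caughtFruit) return 1;     // catch the fruit
    if (hitTailOrWall) return -1;  // game resets: hit the tail or a wall
    return -0.1;                   // anything else: small penalty to shorten paths
  }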

States

At first, I tried creating a dictionary of states based on all the snake positions and tail shapes. Although it worked, the high number of states meant the code had to train for a very long time. Since this project was intended to give a quick explanation and show a fast result to the user (and the results are not saved across sessions), storing states this way was not the best approach.

To work around this limitation, the dictionary of states is based only on the position of the fruit relative to the head of the snake and the position of the last tail section relative to the head of the snake.

In this way, the dictionary of states stores each state name like this:

The Q-Table is stored as a JavaScript Object and looks like this:

Example of a QTable in the code
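
Purely as an illustration (the key format and the numbers below are made up, not copied from q-learning.js), the state key and the Q-Table Object could be sketched like this:

  // Build a reduced state key from the fruit and last tail section positions,
  // both relative to the snake's head. The format is illustrative only.
  function stateKey(head, fruit, tail) {
    return `${fruit.x - head.x},${fruit.y - head.y}:${tail.x - head.x},${tail.y - head.y}`;
  }

  // The Q-Table is a plain Object mapping each state key to one value per action.
  const qTable = {
    '2,-1:0,1': { UP: 0.12, DOWN: -0.1, LEFT: 0, RIGHT: 0.34 },
  };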


If you want to test this algorithm with a full set of states, you can clone the project on GitHub and change a few lines to see how it behaves.

Interesting Links

  • Q-Learning-Python-Example by Hasan Iqbal: an implementation of Q-Learning for the game "Catch the Ball", which I used to understand the algorithm steps.

  • IAMDinosaur by Ivan Seidel: an artificial intelligence that teaches Google's Dinosaur to jump over cacti. Not directly related to this project, but an awesome project!

License

This project is distributed under the MIT License.

References

Watkins, C. J. C. H. (1989). Learning from Delayed Rewards. PhD thesis, King's College, University of Cambridge.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].