pawel-kieliszczyk / snake-reinforcement-learning

Licence: other
AI (A2C agent) mastering the game of Snake with TensorFlow 2.0

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to snake-reinforcement-learning

Snake
With Snake, Android can easily implement an iOS-style swipe-back effect
Stars: ✭ 605 (+1535.14%)
Mutual labels:  snake
Slither.io Clone
Learn how to make Slither.io with JavaScript and Phaser! This game clones all the core features of Slither.io, including mouse-following controls, snake collisions, food, snake growth, eyes, and more. Progress through each part of the source code with our Slither.io tutorial series.
Stars: ✭ 168 (+354.05%)
Mutual labels:  snake
Wortuhr ESP8266
Word clock built with an ESP8266 WeMos D1 mini and NeoPixel WS2812B LEDs, with mp3 sounds, animations, transitions, events, and games
Stars: ✭ 33 (-10.81%)
Mutual labels:  snake
Team Snake
A Discord bot that lets you play Snake with your friends
Stars: ✭ 20 (-45.95%)
Mutual labels:  snake
Sharedfonttool
3DS SharedFontTool
Stars: ✭ 140 (+278.38%)
Mutual labels:  snake
Case
String case utility: convert, identify, flip, extend
Stars: ✭ 237 (+540.54%)
Mutual labels:  snake
Snake
🚀 A basic admin backend with RBAC, built on thinkphp5.1 + layui, for convenient rapid development
Stars: ✭ 526 (+1321.62%)
Mutual labels:  snake
Rainy
☔ Deep RL agents with PyTorch☔
Stars: ✭ 39 (+5.41%)
Mutual labels:  a2c
Dotnet Console Games
Game examples implemented in .NET console applications primarily for educational purposes.
Stars: ✭ 157 (+324.32%)
Mutual labels:  snake
snake-server
Snake-Server is a pure Go implementation of the famous arcade game 🐍
Stars: ✭ 31 (-16.22%)
Mutual labels:  snake
Tastysnake
A two-player (Bluetooth) game on Android.
Stars: ✭ 61 (+64.86%)
Mutual labels:  snake
Snake
Artificial intelligence for the Snake game.
Stars: ✭ 1,241 (+3254.05%)
Mutual labels:  snake
3dstool
An all-in-one tool for extracting/creating 3DS ROMs.
Stars: ✭ 246 (+564.86%)
Mutual labels:  snake
Snake
🐍 A small development framework built with Go for quickly building API services or websites, following SOLID design principles
Stars: ✭ 615 (+1562.16%)
Mutual labels:  snake
go-snake-telnet
Snake Game over telnet protocol in Go
Stars: ✭ 22 (-40.54%)
Mutual labels:  snake
Android Snake Menu
Imitates Tumblr's menu; the dragging animations look like a snake
Stars: ✭ 584 (+1478.38%)
Mutual labels:  snake
Snek
🐍 ‎ A terminal-based Snake implementation written in JavaScript.
Stars: ✭ 210 (+467.57%)
Mutual labels:  snake
Snake
An AI-driven Snake game developed in Java
Stars: ✭ 62 (+67.57%)
Mutual labels:  snake
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+500%)
Mutual labels:  a2c
Cpp-Snake
A simple Snake game written in C++
Stars: ✭ 35 (-5.41%)
Mutual labels:  snake

SnakeAI

Training an AI to play the game of Snake. Using reinforcement learning (distributed A2C), it learns to play a perfect game and score the maximum number of points.

Overview

An AI that learns to play Snake "from pixels" with TensorFlow 2.0.

Requirements

Python 3 and TensorFlow 2.0 (Beta or later)

Usage

To train the AI, simply type:

$ python src/train.py

The agent can be trained multiple times; it keeps improving, and its state is saved automatically between runs.

To watch your trained AI play the game:

$ python src/play.py

The repository contains a pre-trained AI (trained on 1 GPU + 12 CPUs). To watch it play, type:

$ python src/play_pretrained.py

Implementation details

The implementation uses a distributed version of the Advantage Actor-Critic (A2C) method. It consists of two types of processes:

  • master process (1 instance): It owns the neural network model. It broadcasts the network's weights to all "worker" processes (see below) and waits for mini-batches of experiences. It then combines all the mini-batches, performs a network update using SGD, and broadcasts the updated weights to the workers again.
  • worker process (one per CPU core): Each worker has its own copy of the A2C agent. It receives the network's weights from the "master" process (see above), plays sample Snake games, collects a mini-batch of experiences, and sends it back to the master. Each worker then waits for an updated set of weights.
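
The round-trip between master and workers can be sketched as follows. This is a minimal, single-process simulation of the protocol, not the repository's actual code: the game-playing and gradient logic are stubbed out, and the weight vector, batch size, and learning rate are arbitrary choices for illustration.

```python
import random

random.seed(0)  # deterministic for illustration


def play_games(weights, batch_size=8):
    """Worker role: play Snake games with the given weights and return a
    mini-batch of experiences (stubbed here as random gradient samples)."""
    return [random.uniform(-1.0, 1.0) for _ in range(batch_size)]


def sgd_update(weights, minibatches, lr=0.01):
    """Master role: combine all workers' mini-batches and take one SGD step."""
    combined = [g for batch in minibatches for g in batch]
    grad = sum(combined) / len(combined)
    return [w - lr * grad for w in weights]


weights = [0.0] * 4   # the master owns the model parameters
num_workers = 4       # one worker per CPU core
for step in range(10):
    # 1) master broadcasts weights; 2) each worker plays games and returns
    # a mini-batch; 3) master combines the batches and updates the network.
    minibatches = [play_games(list(weights)) for _ in range(num_workers)]
    weights = sgd_update(weights, minibatches)
```

In the real implementation each worker runs in its own process and the broadcast/collect steps are inter-process messages; the control flow per training round is the same.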

Neural network architecture:

  • Layers shared by actor and critic: 4x convolutional layer (filters: 3x3, channels: 64).
  • Actor's head (policy head): 1x convolutional layer (filters: 1x1, channels: 2), followed by a fully connected layer (4 units, one per move: up, down, left, right).
  • Critic's head (value head): 1x convolutional layer (filters: 1x1, channels: 1), followed by a fully connected layer (64 units), followed by a fully connected layer (1 unit: the state's value).
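
As a rough sanity check, the tensor shapes implied by this architecture can be traced in plain Python. This is a sketch only, not the repository's code; the 10x10 board size and "same" convolution padding are assumptions made for illustration.

```python
def conv_shape(h, w, c_out, k, padding="same"):
    """Output shape of a conv layer: 'same' padding keeps spatial dims,
    'valid' shrinks them by k - 1."""
    if padding == "same":
        return (h, w, c_out)
    return (h - k + 1, w - k + 1, c_out)


def network_shapes(h=10, w=10, c=1):
    shape = (h, w, c)
    for _ in range(4):                     # shared trunk: 4x conv 3x3, 64 channels
        shape = conv_shape(*shape[:2], 64, 3)
    policy = conv_shape(*shape[:2], 2, 1)  # policy head: conv 1x1, 2 channels
    policy_logits = 4                      # dense layer: one logit per move
    value = conv_shape(*shape[:2], 1, 1)   # value head: conv 1x1, 1 channel
    value_hidden, value_out = 64, 1        # dense 64 units, then dense 1 (state value)
    return {"trunk": shape, "policy_conv": policy, "policy_logits": policy_logits,
            "value_conv": value, "value_hidden": value_hidden, "value_out": value_out}


print(network_shapes())
```

The 1x1 convolutions in the heads cheaply reduce the trunk's 64 channels to 2 (policy) and 1 (value) feature maps before the small fully connected layers, a common pattern in actor-critic networks on grid-shaped inputs.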