
mdibaiee / Flappy Es

Flappy Bird AI using Evolution Strategies

Programming Languages

python

Projects that are alternatives of or similar to Flappy Es

Free Ai Resources
🚀 FREE AI Resources - 🎓 Courses, 👷 Jobs, 📝 Blogs, 🔬 AI Research, and many more - for everyone!
Stars: ✭ 192 (+37.14%)
Mutual labels:  artificial-intelligence, reinforcement-learning, unsupervised-learning
He4o
He (和, for Objective-C) — an "information entropy reduction machine" system
Stars: ✭ 284 (+102.86%)
Mutual labels:  artificial-intelligence, reinforcement-learning, unsupervised-learning
Awesome Artificial Intelligence
A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers.
Stars: ✭ 6,516 (+4554.29%)
Mutual labels:  artificial-intelligence, reinforcement-learning, unsupervised-learning
Complete Life Cycle Of A Data Science Project
Complete-Life-Cycle-of-a-Data-Science-Project
Stars: ✭ 140 (+0%)
Mutual labels:  reinforcement-learning, unsupervised-learning
Data Science Best Resources
Carefully curated resource links for data science in one place
Stars: ✭ 1,104 (+688.57%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Awesome Decision Making Reinforcement Learning
A selection of state-of-the-art research materials on decision making and motion planning.
Stars: ✭ 68 (-51.43%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Php Ml
PHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+5542.86%)
Mutual labels:  artificial-intelligence, unsupervised-learning
Mapleai
Curated learning materials for every area of AI: the skills and knowledge needed to land an AI-related job, including online blogs, personal blog posts, and electronic book copies.
Stars: ✭ 89 (-36.43%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Snake
Artificial intelligence for the Snake game.
Stars: ✭ 1,241 (+786.43%)
Mutual labels:  artificial-intelligence, reinforcement-learning
60 days rl challenge
Chinese translation of 60_Days_RL_Challenge
Stars: ✭ 92 (-34.29%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Chemgan Challenge
Code for the paper: Benhenda, M. 2017. ChemGAN challenge for drug discovery: can AI reproduce natural chemical diversity? arXiv preprint arXiv:1708.08227.
Stars: ✭ 98 (-30%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Hypergan
Composable GAN framework with api and user interface
Stars: ✭ 1,104 (+688.57%)
Mutual labels:  artificial-intelligence, unsupervised-learning
Learning2run
Our NIPS 2017: Learning to Run source code
Stars: ✭ 57 (-59.29%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Ai Reading Materials
Some of the ML and DL related reading materials, research papers that I've read
Stars: ✭ 79 (-43.57%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Notebooks
Some notebooks
Stars: ✭ 53 (-62.14%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Simulator
A ROS/ROS2 Multi-robot Simulator for Autonomous Vehicles
Stars: ✭ 1,260 (+800%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Rlai Exercises
Exercise Solutions for Reinforcement Learning: An Introduction [2nd Edition]
Stars: ✭ 97 (-30.71%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Reinforcement Learning Cheat Sheet
Reinforcement Learning Cheat Sheet
Stars: ✭ 104 (-25.71%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Toycarirl
Implementation of Inverse Reinforcement Learning Algorithm on a toy car in a 2D world problem, (Apprenticeship Learning via Inverse Reinforcement Learning Abbeel & Ng, 2004)
Stars: ✭ 128 (-8.57%)
Mutual labels:  artificial-intelligence, reinforcement-learning
Student Teacher Anomaly Detection
Student–Teacher Anomaly Detection with Discriminative Latent Embeddings
Stars: ✭ 43 (-69.29%)
Mutual labels:  artificial-intelligence, unsupervised-learning

Playing Flappy Bird using Evolution Strategies

After reading Evolution Strategies as a Scalable Alternative to Reinforcement Learning, I wanted to experiment with Evolution Strategies myself, and Flappy Bird has always been one of my favorites when it comes to game experiments: a simple yet challenging game.

The model learns to play very well after 3000 epochs, though not completely flawlessly: it occasionally loses in difficult cases (a large height difference between two consecutive wall openings). The training process is quite fast since there is no backpropagation, and it is not very costly in terms of memory since there is no need to record actions, as there is in policy gradients.
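For context, the core of an ES update looks roughly like the sketch below, following the paper linked above: sample Gaussian perturbations of the parameters, score each perturbed copy by playing the game, and step the parameters towards the perturbations that scored well. Names such as fitness, npop, sigma and alpha are illustrative and may not match this repository's code.

    import numpy as np

    def evolution_step(theta, fitness, npop=50, sigma=0.1, alpha=0.001):
        # theta: flat parameter vector, fitness: plays one game and returns the total reward
        noise = np.random.randn(npop, theta.size)                 # one Gaussian perturbation per individual
        rewards = np.array([fitness(theta + sigma * eps) for eps in noise])
        advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # normalise rewards
        # weighted sum of the perturbations -- no backpropagation, no recorded actions
        return theta + alpha / (npop * sigma) * noise.T.dot(advantages)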

Here is a demonstration of the model after 3000 epochs (~5 minutes on an Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz):

after training

Before training:

Before training

There is also a web version available for ease of access.

For each frame the bird stays alive, it receives a reward of +0.1. For each wall it passes, it receives a reward of +10.
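Expressed as code, the per-frame reward would look roughly like this (a hypothetical sketch of the reward scheme above, not the repository's actual game loop):

    def frame_reward(alive, passed_wall):
        # +0.1 for every frame survived, +10 for every wall cleared
        reward = 0.0
        if alive:
            reward += 0.1
        if passed_wall:
            reward += 10.0
        return reward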

Demonstration of rewards for individuals and the mean reward over time (y axis is logarithmic): reward chart

Try it yourself

You need Python 3.5 and pip to install and run the code.

First, install dependencies (you might want to create a virtualenv):

pip install -r requirements

The pretrained parameters are in a file named load.npy and will be loaded when you run train.py or demo.py.
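If you want to inspect or back up the pretrained parameters, plain numpy is enough; the snippet below is a sketch and assumes the file holds numpy-serialised arrays (allow_pickle is only needed if Python objects are stored inside):

    import numpy as np

    # load the pretrained parameters shipped with the repository
    params = np.load('load.npy', allow_pickle=True)

    # back up the current parameters before overwriting load.npy with another save
    np.save('load-backup.npy', params)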

train.py will train the model, saving the parameters to saves/<TIMESTAMP>/save-<ITERATION>.

demo.py shows the game in a GTK window so you can see how the AI actually plays (like the GIF above).

play.py lets you play the game yourself: space to jump; once you lose, press enter to play again. 😁

pro tip: reach a score of 100 and you will become THUG FOR LIFE 🚬

Notes

It seems that training past a certain point reduces performance; learning rate decay might help with that. My interpretation is that after finding a local maximum of accumulated reward and starting to receive high rewards, the updates become quite large and pull the model too far from side to side, so the model enters a state of oscillation.
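One simple way to experiment with this would be an exponential decay of the learning rate between epochs, along these lines (a sketch, not something the current code does; the values are illustrative):

    ALPHA_INITIAL = 0.001   # starting learning rate (illustrative)
    DECAY = 0.999           # decay factor applied once per epoch

    def learning_rate(epoch):
        # shrink the update step as training progresses to damp the oscillation described above
        return ALPHA_INITIAL * DECAY ** epoch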

To see this yourself, rename the included long.npy file to load.npy (back up load.npy before doing so) and run demo.py; you will see the bird failing more often than not. long.npy was trained for only 100 more epochs than load.npy.
