
omkarv / pong-from-pixels

License: MIT
Training a Neural Network to play Pong from pixels

Programming Languages

python

Projects that are alternatives to or similar to pong-from-pixels

A3c continuous
A continuous action space version of A3C LSTM in pytorch plus A3G design
Stars: ✭ 223 (+792%)
Mutual labels:  openai-gym
ddp-gym
Differential Dynamic Programming controller operating in OpenAI Gym environment.
Stars: ✭ 70 (+180%)
Mutual labels:  openai-gym
deep rl acrobot
TensorFlow A2C to solve Acrobot, with synchronized parallel environments
Stars: ✭ 32 (+28%)
Mutual labels:  openai-gym
awesome-isaac-gym
A curated list of awesome NVIDIA Isaac Gym frameworks, papers, software, and resources
Stars: ✭ 373 (+1392%)
Mutual labels:  openai-gym
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+788%)
Mutual labels:  openai-gym
a3c-super-mario-pytorch
Reinforcement Learning for Super Mario Bros using A3C on GPU
Stars: ✭ 35 (+40%)
Mutual labels:  openai-gym
Gymfc
A universal flight control tuning framework
Stars: ✭ 210 (+740%)
Mutual labels:  openai-gym
rl trading
No description or website provided.
Stars: ✭ 14 (-44%)
Mutual labels:  openai-gym
deep-rl-docker
Docker image with OpenAI Gym, Baselines, MuJoCo and Roboschool, utilizing TensorFlow and JupyterLab.
Stars: ✭ 44 (+76%)
Mutual labels:  openai-gym
gym-mtsim
A general-purpose, flexible, and easy-to-use simulator alongside an OpenAI Gym trading environment for MetaTrader 5 trading platform (Approved by OpenAI Gym)
Stars: ✭ 196 (+684%)
Mutual labels:  openai-gym
yarll
Combining deep learning and reinforcement learning.
Stars: ✭ 84 (+236%)
Mutual labels:  openai-gym
RLGC
An open-source platform for applying Reinforcement Learning for Grid Control (RLGC)
Stars: ✭ 85 (+240%)
Mutual labels:  openai-gym
mario
Super Mario Reinforcement Learning from Demonstration
Stars: ✭ 25 (+0%)
Mutual labels:  openai-gym
Ma Gym
A collection of multi agent environments based on OpenAI gym.
Stars: ✭ 226 (+804%)
Mutual labels:  openai-gym
gym-R
An R package providing access to the OpenAI Gym API
Stars: ✭ 21 (-16%)
Mutual labels:  openai-gym
Ns3 Gym
ns3-gym - The Playground for Reinforcement Learning in Networking Research
Stars: ✭ 221 (+784%)
Mutual labels:  openai-gym
gym-rs
OpenAI's Gym written in pure Rust for blazingly fast performance
Stars: ✭ 34 (+36%)
Mutual labels:  openai-gym
gym-cartpole-swingup
A simple, continuous-control environment for OpenAI Gym
Stars: ✭ 20 (-20%)
Mutual labels:  openai-gym
iroko
A platform to test reinforcement learning policies in the datacenter setting.
Stars: ✭ 55 (+120%)
Mutual labels:  openai-gym
robo-gym-robot-servers
Repository containing Robot Servers ROS packages
Stars: ✭ 25 (+0%)
Mutual labels:  openai-gym

Introduction

This repo trains a Reinforcement Learning Neural Network so that it's able to play Pong from raw pixel input.

I've written up a blog post (linked here) which walks through the code and the basic principles of Reinforcement Learning, with Pong as the guiding example.

It is largely based on a Gist by Andrej Karpathy, which in turn is based on the Playing Atari with Deep Reinforcement Learning paper by Mnih et al.

This script uses the OpenAI Gym environment to run the Atari Pong emulator, and currently uses no external ML framework, only numpy.
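For orientation, the core of the numpy-only approach is a small two-layer policy network whose output is the probability of moving the paddle up. The sketch below follows the structure of Karpathy's gist (the layer size, weight names, and UP/DOWN action codes are taken from the gist and may differ slightly from this repo's script):

```python
import numpy as np

H = 200          # hidden layer size used in Karpathy's gist
D = 80 * 80      # input dimensionality: one flattened 80x80 difference frame

# two-layer policy network held in plain numpy arrays ("Xavier"-style init)
model = {
    'W1': np.random.randn(H, D) / np.sqrt(D),
    'W2': np.random.randn(H) / np.sqrt(H),
}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def policy_forward(x):
    """Return the probability of taking the UP action, plus the hidden state."""
    h = np.dot(model['W1'], x)
    h[h < 0] = 0                       # ReLU nonlinearity
    logit = np.dot(model['W2'], h)
    return sigmoid(logit), h

# sample an action from the policy (in Pong's action space, 2 is UP and 3 is DOWN)
up_prob, hidden = policy_forward(np.random.randn(D))
action = 2 if np.random.uniform() < up_prob else 3
```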

The AI Agent Pong in action

Prior to training (mostly random actions)


After training base repo + learning rate modification


The agent that played this game was trained for ~12000 episodes (each episode being a full game played to 21 points) over a period of ~15 hours, on a 2018 MacBook Pro with a 2.6GHz i7 (6 cores). The running mean score per episode, over the trailing 100 episodes, at the point I stopped training was -5, i.e. the CPU would win each episode 21-16 on average.

Hyperparameters:

  • Default except for learning-rate 1e-3
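In the Karpathy-style script, the learning rate is a top-of-file hyperparameter that feeds an RMSProp update applied once per batch of episodes. Roughly, as a sketch: the variable names follow the gist, and the dummy shapes here are for illustration only; the exact buffers and batch size in this repo may differ.

```python
import numpy as np

learning_rate = 1e-3   # this repo's setting; the gist defaults to 1e-4
decay_rate = 0.99      # RMSProp decay factor for the squared-gradient cache

# dummy model and buffers, shaped as in the sketch above, for illustration only
model = {'W1': np.random.randn(200, 80 * 80) * 0.01, 'W2': np.random.randn(200) * 0.01}
grad_buffer = {k: np.zeros_like(v) for k, v in model.items()}    # gradients summed over a batch
rmsprop_cache = {k: np.zeros_like(v) for k, v in model.items()}  # moving average of squared grads

def rmsprop_update():
    """Apply one RMSProp parameter update and reset the batch gradient buffer."""
    for k, v in model.items():
        g = grad_buffer[k]
        rmsprop_cache[k] = decay_rate * rmsprop_cache[k] + (1 - decay_rate) * g ** 2
        model[k] += learning_rate * g / (np.sqrt(rmsprop_cache[k]) + 1e-5)
        grad_buffer[k] = np.zeros_like(v)
```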

After training base repo + learning rate modification + a bugfix

A minor fix was added which crops more of the image than the base repo does, removing noisy parts of the image where the ball's motion can safely be ignored. This boosted the observed performance and reduced the time it took for the AI to beat the CPU on average (i.e. for the average reward per episode to exceed 0).
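The cropping happens in the frame-preprocessing step. Karpathy's original preprocessing looks roughly like the sketch below; the fix described here amounts to tightening the slice so that more of the noisy border is discarded. The exact bounds used in this repo may differ from the gist's 35:195 shown here:

```python
import numpy as np

def prepro(frame):
    """Crop and downsample a 210x160x3 uint8 Atari frame into a flat 80x80 binary vector."""
    frame = frame[35:195]        # crop rows: drop the scoreboard and the strip below the paddles
    frame = frame[::2, ::2, 0]   # downsample by a factor of 2 and keep a single colour channel
    frame[frame == 144] = 0      # erase background colour (type 1)
    frame[frame == 109] = 0      # erase background colour (type 2)
    frame[frame != 0] = 1        # everything left (paddles, ball) becomes 1
    return frame.astype(float).ravel()

# smoke test on a random frame of the right shape
print(prepro(np.random.randint(0, 255, (210, 160, 3), dtype=np.uint8)).shape)  # (6400,)
```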

Hyperparameters:

  • Default except for learning-rate 1e-3

The agent that played this game was trained for ~10000 episodes (each episode being a full game played to 21 points) over a period of ~13 hours, on a 2018 MacBook Pro with a 2.6GHz i7 (6 cores). The running mean score per episode, over the trailing 100 episodes, at the point I stopped training was 2.5, i.e. the trained AI Agent would win each episode 21 points to 18.5.

Training for another 10 hours and another 5000 episodes allowed the trained AI Agent to reach a running mean score per episode of 5, i.e. the trained AI Agent would win each episode 21 points to 16.
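For reference, the score arithmetic above: each episode's score is the agent's points minus the CPU's points, so a trailing mean of -5 is a 16-21 loss on average, +2.5 is a 21-18.5 win, and +5 is a 21-16 win. A trailing-100 mean could be tracked with something like the snippet below (the repo's actual bookkeeping may differ; Karpathy's gist uses an exponential running average rather than a fixed window):

```python
from collections import deque
import numpy as np

episode_scores = deque(maxlen=100)   # trailing window of per-episode scores

def record_episode(agent_points, cpu_points):
    """Log one finished game and return the running mean over the last 100 episodes."""
    episode_scores.append(agent_points - cpu_points)
    return np.mean(episode_scores)

print(record_episode(16, 21))   # -5.0: a 16-21 loss contributes -5 to the window
```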

Graph of reward over time - first 10000 episodes of training

Graph of reward over time - 10000 to 15000 episodes of training


Modifications vs Source Gist

  • Records output video of the play (see the recording sketch after this list)
  • Modified learning rate from 1e-4 to 1e-3
  • Comments for clarity
  • Minor fix which crops more of the image vs the base repo
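For the video-recording modification, Gym releases of this era expose a Monitor wrapper that writes finished episodes to disk as video files (this is why ffmpeg appears in the requirements below). A minimal sketch, with a hypothetical output directory and a random policy just to exercise the recorder; the wiring in this repo's script may differ:

```python
import gym
from gym import wrappers

env = gym.make('Pong-v0')
# wrap the environment so finished episodes are saved as video files (requires ffmpeg)
env = wrappers.Monitor(env, './pong-videos', force=True)

observation = env.reset()
done = False
while not done:
    action = env.action_space.sample()               # random actions, recording only
    observation, reward, done, info = env.step(action)
env.close()
```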

Installation Requirements

The instructions below are for macOS and assume you have Homebrew installed.

  • You'll need to run the code with Python 2.7 - I recommend using conda to manage Python environments
  • Install OpenAI Gym: pip install gym (Gym is a Python package installed with pip, not Homebrew; the Pong environment also needs the Atari extras, pip install gym[atari])
  • Install CMake: brew install cmake
  • Install ffmpeg: brew install ffmpeg - required for monitoring / videos