AndersonJo / Dqn Pytorch

Deep Q Learning via Pytorch

Deep Q-Learning with Pytorch

An implementation of Deep Q-Learning with PyTorch.

  • Replay Memory
  • Simple Deep Q-Learning (no A3C or Dueling architecture)
  • Supports the original DQN (from the DeepMind paper published in Nature) and an LSTM-based DQN
  • Built with PyTorch
  • Frame Skipping
  • Target Network (for training stability)
  • Python 3.x (tested with Python 3.6)
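
Two of the features listed above, replay memory and frame skipping, can be illustrated with a minimal standard-library sketch. This is not the code in dqn.py: the `Transition` fields, the `ReplayMemory` class, the `env_step` callable and the default `skip=4` are all assumptions made for illustration.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition layout; the repository may store different fields.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayMemory:
    """Fixed-size buffer that discards the oldest transitions first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, *args):
        self.buffer.append(Transition(*args))

    def sample(self, batch_size):
        # Uniform sampling without replacement, so a training batch is not
        # made of consecutive, highly correlated frames.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def step_with_frame_skip(env_step, action, skip=4):
    """Repeat `action` for up to `skip` frames and sum the rewards.

    `env_step` is an assumed callable returning (state, reward, done).
    """
    total_reward = 0.0
    state, done = None, False
    for _ in range(skip):
        state, reward, done = env_step(action)
        total_reward += reward
        if done:
            break
    return state, total_reward, done
```

With a capacity of 3, pushing five transitions keeps only the last three, and `sample(2)` draws two of them at random.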

DQN Algorithm

The linked article explains how DQN works in detail.
http://andersonjo.github.io/artificial-intelligence/2017/06/03/Deep-Reinforcement-Learning/
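
As a quick reminder of the core idea, the standard DQN training target is y = r + γ · max Q_target(s', a'), with y = r for terminal transitions. The sketch below computes it for a small batch using plain Python lists. The function name `dqn_targets` and the default `gamma=0.99` are assumptions for illustration; the repository's own code operates on PyTorch tensors and may differ in detail.

```python
def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute y = r + gamma * max_a' Q_target(s', a') for a batch.

    next_q_values[i] holds the target network's Q-values for every action
    in the i-th next state. Terminal transitions (done=True) use y = r.
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        # No bootstrapping from the next state once the episode has ended.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets
```

The Q-values passed as `next_q_values` come from the target network, which is why the feature list mentions it as a stabilizing component.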

Actual gameplay

The image below shows an actual result produced by this code.

[Image: gameplay result]

Watch the video

Snake Game

Installation

Requirements

  1. Python 3
  2. PyTorch
  3. TorchVision
  4. gym
  5. OpenCV (the cv2 module)
  6. SciPy
  7. ffmpeg (optional; only needed to record gameplay)

Install PyGame

sudo pip3 install pygame

Install PyGame-Learning-Environment

git clone https://github.com/ntasfi/PyGame-Learning-Environment.git
cd PyGame-Learning-Environment/
sudo pip3 install -e .

Install Gym-Ple

git clone https://github.com/lusob/gym-ple.git
cd gym-ple/
sudo pip3 install -e .

Comparison

Algorithm        Game         Best Score
DQN              FlappyBird   65
LSTM-based DQN   FlappyBird   83
  • Best Score is the average score over 10 games.

How to use

Training

Before training, create a "dqn_checkpoints" directory. The model is saved there automatically.

mkdir dqn_checkpoints
python3 dqn.py --mode=train

Training LSTM-based DQN

mkdir dqn_checkpoints
python3 dqn.py --mode=train --model=lstm

Playing

Play mode automatically loads the latest checkpoint (the saved model parameters).
You need to train the model first.
If there is no checkpoint (for example, because you have not trained yet), the agent simply plays at random.

python3 dqn.py --mode=play

To play with the LSTM-based DQN:

python3 dqn.py --mode=play --model=lstm
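
The "load the latest checkpoint" behaviour described in this section could be sketched as follows. This is only an illustration: the helper name `latest_checkpoint`, the use of file modification time and the catch-all `pattern="*"` are assumptions, since the README does not say how dqn.py names or selects its checkpoint files.

```python
import glob
import os


def latest_checkpoint(directory="dqn_checkpoints", pattern="*"):
    """Return the most recently modified file in `directory`, or None."""
    candidates = [
        path
        for path in glob.glob(os.path.join(directory, pattern))
        if os.path.isfile(path)
    ]
    if not candidates:
        # No checkpoint yet: the caller would fall back to random play.
        return None
    return max(candidates, key=os.path.getmtime)
```

A caller would check for `None` before trying to restore model parameters from the returned path.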

Recording

To record gameplay, add the --record flag.

python3 dqn.py --mode=play --record 

How to convert video to GIF file

mkdir frames
ffmpeg -i flappybird.mp4 -qscale:v 2  -r 25 'frames/frame-%05d.jpg'
cd frames
convert -delay 4 -loop 0 *.jpg flappybird.gif

FFmpeg and ImageMagick (the convert command) accept the following options.

-r 5 sets the frame rate (FPS); the command above uses -r 25
    for better quality choose a bigger number
    adjust the -delay value in the 2nd step accordingly
    to keep the same animation speed

-delay 20 means the time between frames is 0.2 seconds,
   which matches the 5 fps example (-delay 4 matches the 25 fps used above).
   When choosing this value
       1 = 100 fps
       2 = 50 fps
       4 = 25 fps
       5 = 20 fps
       10 = 10 fps
       20 = 5 fps
       25 = 4 fps
       50 = 2 fps
       100 = 1 fps
       in general 100/delay = fps

-qscale:v n means a video quality level where n is a number from 1-31, 
   with 1 being highest quality/largest filesize and 
   31 being the lowest quality/smallest filesize.

-loop 0 means repeat forever
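
The relation 100/delay = fps from the table above can be expressed as two small helpers for picking a matching pair of -r and -delay values. The function names are made up for this sketch, and the rounding to whole ticks is a simplifying assumption.

```python
def delay_for_fps(fps):
    """ImageMagick -delay value (ticks of 1/100 s) matching a frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return round(100 / fps)


def fps_for_delay(delay):
    """Frame rate implied by an ImageMagick -delay value."""
    if delay <= 0:
        raise ValueError("delay must be positive")
    return 100 / delay
```

For the commands shown earlier, `delay_for_fps(25)` gives 4, which is the value passed to `-delay`.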