
duggalrahul / Alexnet Experiments Keras

License: MIT
Code examples for training AlexNet using Keras and Theano


My experiments with AlexNet, using Keras and Theano

A blog post accompanying this project can be found here.

Contents

  1. Motivation
  2. Requirements
  3. Experiments
  4. Results
  5. TO-DO
  6. License

Motivation

When I first started exploring deep learning (DL) in July 2016, many of the papers I read established their baseline performance using the standard AlexNet model. In part, this could be attributed to the code examples readily available in all major deep learning libraries. Despite its significance, I could not find ready-made code examples for training AlexNet in the Keras framework. Through this project, I am sharing my experience of training AlexNet in three very useful scenarios:

  1. Training AlexNet end-to-end - also known as training from scratch.
  2. Fine-tuning the pre-trained AlexNet - extendable to transfer learning.
  3. Using AlexNet as a feature extractor - useful for training a classifier such as an SVM on top of "deep" CNN features.

I have reused code from many online resources, the two most significant being:

  1. This blog post by the creator of Keras, François Chollet.
  2. This project by Heuritech, which implements the AlexNet architecture.

Requirements

This project is compatible with Python 2.7-3.5. Make sure you have the following libraries installed.

  1. Keras - a high-level neural network library written in Python. To install, follow the instructions available here.
  2. Theano - a Python library for efficiently evaluating and optimizing mathematical expressions. To install, follow the instructions available here.
  3. Anaconda - a Python distribution that bundles many libraries useful for machine learning and data science. To install, follow the instructions available here.

Note: if you have a GPU in your machine, you might want to configure Keras and Theano to use it. For me, running the code on a K20 GPU resulted in a 10-12x speedup.
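One minimal way to do this is sketched below; the exact keys depend on your Theano and Keras versions, so treat these files as a starting point rather than the definitive configuration.

~/.theanorc (runs Theano on the GPU in float32; newer Theano releases use device=cuda instead of device=gpu):

[global]
device = gpu
floatX = float32

~/.keras/keras.json (selects the Theano backend; Keras 2 renamed image_dim_ordering to image_data_format):

{
    "backend": "theano",
    "image_dim_ordering": "th"
}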

Experiments

  • To perform the three tasks outlined in the motivation, we first need the dataset. We run our experiments on the Dogs vs. Cats training dataset available here.
  • We use 1000 images from each class for training and evaluate on 400 images from each class. Ensure that the images are placed as in the following directory structure (a copying sketch follows this list).
Data/
    Train/
         cats/
            cat.0.jpg
            cat.1.jpg
            .
            .
            .
            cat.999.jpg
         dogs/
            dog.0.jpg
            dog.1.jpg
            .
            .
            .
            dog.999.jpg
    Test/
         cats/
            cat.0.jpg
            cat.1.jpg
            .
            .
            .
            cat.399.jpg
         dogs/
            dog.0.jpg
            dog.1.jpg
            .
            .
            .
            dog.399.jpg
  • Download the pre-trained weights for AlexNet from here and place them in convnets-keras/weights/.
  • Once the dataset and weights are in place, navigate to the project root directory and run the command jupyter notebook in your shell. This will open a new tab in your browser. Navigate to Code/ and open the file AlexNet_Experiments.ipynb.
  • Now you can execute each code cell using Shift+Enter to generate its output.
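For reference, here is a minimal sketch of the copying step, assuming the Kaggle archive has been unpacked into a flat train/ folder with files named cat.0.jpg ... dog.12499.jpg. Sending source images 1000-1399 to Test/ is an arbitrary choice made here purely to keep the two splits disjoint.

import os
import shutil

SRC = 'train'   # flat Kaggle folder (assumed layout: cat.0.jpg ... dog.12499.jpg)
DST = 'Data'    # target root matching the tree above

# (split name, range of source image indices to copy for each class)
plan = [('Train', range(0, 1000)), ('Test', range(1000, 1400))]

for split, idxs in plan:
    for cls in ('cat', 'dog'):
        out_dir = os.path.join(DST, split, cls + 's')
        if not os.path.exists(out_dir):
            os.makedirs(out_dir)
        for new_i, src_i in enumerate(idxs):
            src = os.path.join(SRC, '%s.%d.jpg' % (cls, src_i))
            dst = os.path.join(out_dir, '%s.%d.jpg' % (cls, new_i))
            shutil.copyfile(src, dst)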

Results

Task 1 : Training from scratch

  1. Training AlexNet using stochastic gradient descent with a fixed learning rate of 0.01 for 80 epochs, we achieve a test accuracy of ~84.5% (a training-loop sketch follows the plot below).
  2. In the accuracy plot shown below, notice the large gap between the training and testing curves. This suggests that our model is overfitting, which is a common problem when training on few examples (~2000 in our case). However, it can be partially addressed by fine-tuning a pre-trained network, as we will see in the next subsection.

[Figure: accuracy_scratch]
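The model itself is built in Code/AlexNet_Experiments.ipynb; the sketch below shows only the training loop, assuming model already holds the AlexNet graph. It uses Keras 1.x signatures (Keras 2 renamed samples_per_epoch/nb_epoch/nb_val_samples to steps_per_epoch/epochs/validation_steps), and 227x227 is the usual AlexNet input resolution.

from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator

# `model` is assumed to be the AlexNet graph built earlier in the notebook;
# mean subtraction happens inside the model (see the TO-DO section).
model.compile(optimizer=SGD(lr=0.01),        # fixed learning rate, no decay
              loss='binary_crossentropy',    # two classes: cats vs. dogs
              metrics=['accuracy'])

datagen = ImageDataGenerator()
train_gen = datagen.flow_from_directory('Data/Train', target_size=(227, 227),
                                        batch_size=32, class_mode='binary')
test_gen = datagen.flow_from_directory('Data/Test', target_size=(227, 227),
                                       batch_size=32, class_mode='binary')

model.fit_generator(train_gen, samples_per_epoch=2000, nb_epoch=80,
                    validation_data=test_gen, nb_val_samples=800)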

Task 2 : Fine tuning a pre-trained AlexNet

  1. CNNs trained on small datasets usually suffer from the problem of overfitting. One solution is to initialize your CNN with weights learnt on a very large dataset and then fine-tune those weights on your own dataset.
  2. Several papers discuss strategies for fine-tuning. In this project, I execute the strategy proposed in this recent paper. The basic idea is to train layer-wise: if our network has 5 layers, L1,L2,...,L5, then in the first round we freeze L1-L4 and tune only L5. In the second round we include L4, so L4-L5 are tuned for some epochs. The third round includes L3, so L3-L5 are tuned, and the training percolates back to earlier layers in the same fashion (a code sketch follows the plot below).
  3. Training for 80 epochs with this strategy, we reach a test accuracy of ~89% - almost a 5% jump over training from scratch. The test accuracy plot is shown below.

[Figure: accuracy_finetune]
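In code, the round structure can be sketched as below. This is a simplification under stated assumptions: each "layer" here is a single Keras layer (in practice you may unfreeze whole blocks at a time), and the learning rate of 0.001 is an illustrative value, not necessarily the one used in the notebook.

from keras.optimizers import SGD

def finetune_layerwise(model, rounds, epochs_per_round, train_gen, test_gen):
    # Each round unfreezes one more layer from the top, then trains.
    n = len(model.layers)
    for r in range(1, rounds + 1):
        # Round r: only the top r layers are trainable; the rest stay frozen.
        for i, layer in enumerate(model.layers):
            layer.trainable = (i >= n - r)
        # Recompile so that the new trainable flags take effect.
        model.compile(optimizer=SGD(lr=0.001), loss='binary_crossentropy',
                      metrics=['accuracy'])
        model.fit_generator(train_gen, samples_per_epoch=2000,
                            nb_epoch=epochs_per_round,
                            validation_data=test_gen, nb_val_samples=800)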

  1. To compare fine-tuning against training from scratch, we plot the test accuracies for fine-tuning (Task 2) and training from scratch (Task 1) below. Notice how the accuracy curve for fine-tuning stays well above the curve for Task 1.

[Figure: finetune_vs_scratch_accuracy1]

Task 3 : Using AlexNet as a feature extractor

  1. We train a small ANN of 256 neurons on the features extracted from the last convolutional layer. After training for 80 epochs, we achieve a test accuracy of ~83% - almost as high as AlexNet trained from scratch (a sketch of the extraction step follows the plot below).
  2. The test accuracy plot shown below reveals massive overfitting, as was the case in Task 1.

[Figure: feature_extraction_convpool_5_accuracy1]
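A minimal sketch of the extraction step is below. The layer name 'convpool_5' is taken from the plot filename above (adjust it if your graph names the layer differently); alexnet, x_train/y_train, and x_test/y_test are assumed to be the loaded model and preprocessed image/label arrays. Keras 1.x keyword arguments are used (Keras 2 renamed input/output to inputs/outputs and nb_epoch to epochs).

from keras.models import Model, Sequential
from keras.layers import Dense, Flatten

# Chop AlexNet at the last convolutional block and run the data through it.
extractor = Model(input=alexnet.input,
                  output=alexnet.get_layer('convpool_5').output)
train_features = extractor.predict(x_train)
test_features = extractor.predict(x_test)

# Train the small 256-neuron ANN on top of the frozen features.
clf = Sequential([
    Flatten(input_shape=train_features.shape[1:]),
    Dense(256, activation='relu'),
    Dense(1, activation='sigmoid'),
])
clf.compile(optimizer='rmsprop', loss='binary_crossentropy',
            metrics=['accuracy'])
clf.fit(train_features, y_train, nb_epoch=80,
        validation_data=(test_features, y_test))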

TO-DO

  1. The mean subtraction layer (look inside Code/alexnet_base.py) currently uses a Theano function, set_subtensor, which introduces a dependency on Theano. I would ideally like to use a Keras wrapper function that works for both the Theano and TensorFlow backends, but I'm not sure such a wrapper exists. Any suggestions for the corresponding TensorFlow function, so that I could write the Keras wrapper myself? (One possible direction is sketched after this list.)
  2. Use this code to demonstrate performance on a dataset that is significantly different from ImageNet. Maybe a medical imaging dataset?
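On point 1, one possible backend-agnostic direction - an assumption, not the repo's current code - is to express the subtraction with keras.backend ops inside a Lambda layer, avoiding set_subtensor entirely. Note this only covers the mean subtraction; if the current layer also reorders channels, that part still needs its own backend-agnostic equivalent.

from keras import backend as K
from keras.layers import Lambda

def subtract_imagenet_mean(x):
    # Per-channel ImageNet means (RGB order assumed), broadcast over a batch
    # of 'th'-ordered image tensors shaped (batch, 3, height, width).
    mean = K.variable([123.68, 116.779, 103.939])
    return x - K.reshape(mean, (1, 3, 1, 1))

mean_layer = Lambda(subtract_imagenet_mean, output_shape=lambda s: s)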

License

This code is released under the MIT License (refer to the LICENSE file for details).
