
mattwang44 / LeNet-from-Scratch

Licence: other
Implementation of LeNet5 without any auto-differentiation tools or deep learning frameworks. An accuracy of 98.6% is achieved on the MNIST dataset.

Programming Languages

Jupyter Notebook
11667 projects
Python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to LeNet-from-Scratch

digitRecognition
Implementation of a digit recognition using my Neural Network with the MNIST data set.
Stars: ✭ 21 (-4.55%)
Mutual labels:  mnist
tensorflow-example
Tensorflow-example: train a model on MNIST and recognize handwritten digit images
Stars: ✭ 26 (+18.18%)
Mutual labels:  mnist
tensorflow-mnist-MLP-batch normalization-weight initializers
MNIST classification using a Multi-Layer Perceptron (MLP) with 2 hidden layers. Several weight initializers and batch normalization are implemented.
Stars: ✭ 49 (+122.73%)
Mutual labels:  mnist
KerasMNIST
Keras MNIST for Handwriting Detection
Stars: ✭ 25 (+13.64%)
Mutual labels:  mnist
cnn open
A hardware implementation of CNN, written by Verilog and synthesized on FPGA
Stars: ✭ 157 (+613.64%)
Mutual labels:  lenet
catacomb
The simplest machine learning library for launching UIs, running evaluations, and comparing model performance.
Stars: ✭ 13 (-40.91%)
Mutual labels:  mnist
handwritten-digit-recognition-tensorflowjs
In-browser digit recognition with Tensorflow.js and React using the MNIST dataset
Stars: ✭ 40 (+81.82%)
Mutual labels:  mnist
cDCGAN
PyTorch implementation of Conditional Deep Convolutional Generative Adversarial Networks (cDCGAN)
Stars: ✭ 49 (+122.73%)
Mutual labels:  mnist
tensorflow-mnist-AAE
Tensorflow implementation of adversarial auto-encoder for MNIST
Stars: ✭ 86 (+290.91%)
Mutual labels:  mnist
Pytorch-PCGrad
Pytorch reimplementation for "Gradient Surgery for Multi-Task Learning"
Stars: ✭ 179 (+713.64%)
Mutual labels:  mnist
Bounding-Box-Regression-GUI
This program shows how bounding-box regression works in a visual form: Intersection over Union (IoU), Non-Maximum Suppression (NMS), and object detection.
Stars: ✭ 16 (-27.27%)
Mutual labels:  mnist
gans-2.0
Generative Adversarial Networks in TensorFlow 2.0
Stars: ✭ 76 (+245.45%)
Mutual labels:  mnist
digit recognizer
CNN digit recognizer implemented in Keras Notebook, Kaggle/MNIST (0.995).
Stars: ✭ 27 (+22.73%)
Mutual labels:  mnist
VAE-Latent-Space-Explorer
Interactive exploration of MNIST variational autoencoder latent space with React and tensorflow.js.
Stars: ✭ 30 (+36.36%)
Mutual labels:  mnist
cluttered-mnist
Experiments on the cluttered MNIST dataset with Tensorflow.
Stars: ✭ 20 (-9.09%)
Mutual labels:  mnist
Fun-with-MNIST
Playing with MNIST. Machine Learning. Generative Models.
Stars: ✭ 23 (+4.55%)
Mutual labels:  mnist
Hand-Digits-Recognition
Recognize your own handwritten digits with Tensorflow, embedded in a PyQT5 GUI. The Neural Network was trained on MNIST.
Stars: ✭ 11 (-50%)
Mutual labels:  mnist
BP-Network
Multi-class classification on the MNIST dataset
Stars: ✭ 72 (+227.27%)
Mutual labels:  mnist
DCGAN-Pytorch
A Pytorch implementation of "Deep Convolutional Generative Adversarial Networks"
Stars: ✭ 23 (+4.55%)
Mutual labels:  mnist
pytorch-siamese-triplet
One-Shot Learning with Triplet CNNs in Pytorch
Stars: ✭ 74 (+236.36%)
Mutual labels:  mnist

LeNet5 Implementation FROM SCRATCH

This is an implementation of LeNet5 from Yann LeCun's 1998 paper, using NumPy & OOP only (without any auto-differentiation tools or deep learning frameworks).
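
To give a concrete sense of the "NumPy & OOP only" approach, here is a minimal, hypothetical sketch of the layer-object pattern such an implementation relies on; the class and method names are illustrative, not the repo's actual code:

import numpy as np

class FCLayer:
    # A hand-written fully connected layer: cache the input on the forward
    # pass, return the input gradient on the backward pass -- no autodiff.
    def __init__(self, n_in, n_out, lr=0.01):
        self.W = np.random.randn(n_in, n_out) * np.sqrt(1.0 / n_in)
        self.b = np.zeros(n_out)
        self.lr = lr

    def forward(self, x):
        self.x = x                      # cached for the backward pass
        return x @ self.W + self.b

    def backward(self, d_out):
        dW = self.x.T @ d_out           # gradient w.r.t. weights
        db = d_out.sum(axis=0)          # gradient w.r.t. bias
        dx = d_out @ self.W.T           # gradient for the previous layer
        self.W -= self.lr * dW          # plain SGD update
        self.b -= self.lr * db
        return dx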

Yann LeCun's demo in 1993:

LeNet demo

Result of Training

The highest accuracy of 98.6% on the MNIST test set was achieved within 20 epochs of training (93.5% after the 1st epoch). Training (20 epochs, batch size = 256) takes about 2 hours using CPU only (3.5 hours if evaluating after each epoch).

Feature maps in each layer:

File Structure

LeNet5_from_scratch/
├── LeNet5_train.ipynb                 # Notebook for training and showing the results
├── RBF_initial_weight.ipynb           # Notebook showing the fixed weights (ASCII bitmap) in the RBF layer
├── ExeSpeedTest.ipynb                 # Comparison of different versions of the Conv. & Pooling functions
├── Best_model.pkl                     # The model with 98.6% accuracy on both training and testing data
│                                      # Please download at [tinyurl.com/mrybvje9] or train one by yourself :)
│
├── MNIST_auto_Download.py             # Python script that auto-downloads the MNIST dataset (into the folder below)
├── MNIST/                             # Folder contains MNIST training and testing data
│   ├── train-images-idx3-ubyte        # Training images
│   ├── train-labels-idx1-ubyte        # Training labels
│   ├── t10k-images-idx3-ubyte         # Testing images
│   └── t10k-labels-idx1-ubyte         # Testing labels
│
└── utils/
    ├── __init__.py 
    ├── Convolution_util.py            # Convolution forward and backward
    ├── Pooling_util.py                # Pooling forward and backward
    ├── Activation_util.py             # Activation functions
    ├── utils_func.py                  # Other functions like normalize(), initialize(), zero_pad(), etc.
    ├── RBF_initial_weight.py          # Setting fixed weight (ASCII bitmap) in the RBF layer
    └── LayerObjects.py                # All the layer objects
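
For context on the RBF layer mentioned above: in LeNet-5 the output layer computes squared Euclidean distances between the 84 F6 activations and a fixed 7x12 ASCII-bitmap code for each digit class. A minimal sketch of that computation (variable names are illustrative, not the repo's actual code):

import numpy as np

def rbf_forward(x, bitmap_codes):
    # x            : (batch, 84) activations from the F6 layer
    # bitmap_codes : (10, 84) fixed +1/-1 codes from the 7x12 ASCII bitmaps
    # Returns (batch, 10) squared distances; the smallest value picks the
    # class, unlike a softmax layer where the largest logit wins.
    diff = x[:, None, :] - bitmap_codes[None, :, :]   # (batch, 10, 84)
    return np.sum(diff ** 2, axis=2)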

Structure of ConvNet

The structure in the original paper is:

The structure used in this repo has a few modifications:

  1. The sub-sampling layers are substituted with average pooling, a more acceptable choice since it has no trainable parameters and doesn't need to be followed by an activation function. (I've tried using max-pooling, but it blurs the feature maps in this case and gives low accuracy.)

  2. A momentum optimizer (momentum = 0.9) is used to accelerate the training process for faster convergence. A rough sketch of both modifications follows this list.
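
As an illustration of these two changes, here is a hedged NumPy sketch of an average-pooling forward/backward pass and a momentum update; the function names and shapes are illustrative, not taken from the repo:

import numpy as np

def avg_pool_forward(x, size=2):
    # 2x2 average pooling on (N, C, H, W); no trainable parameters.
    n, c, h, w = x.shape
    return x.reshape(n, c, h // size, size, w // size, size).mean(axis=(3, 5))

def avg_pool_backward(d_out, size=2):
    # Spread each output gradient evenly over its size*size input window.
    return np.repeat(np.repeat(d_out, size, axis=2), size, axis=3) / (size * size)

def momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    # Classic momentum update: accumulate a velocity, then move the weights.
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity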

Bug Alert

The Stochastic Diagonal Levenberg-Marquardt (SDLM) method from the original paper is also used in this implementation to determine the learning rate for each trainable layer. However, the resulting range of learning rates is much smaller than the one given in the paper (there may be bugs in the SDLM code). Therefore, 100x the original global learning rates are applied, and it then works fine.
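
For reference, the per-parameter step size that SDLM prescribes in the paper is eta / (mu + h_kk), where h_kk is a running estimate of the diagonal second derivative. A hedged sketch of that rule (constants follow the paper's notation, not values from this repo):

def sdlm_learning_rate(h, curvature_estimate, eta=0.001, mu=0.02, gamma=0.01):
    # h is the running estimate of the diagonal Hessian term h_kk;
    # curvature_estimate is a Gauss-Newton style approximation of the
    # instantaneous second derivative for this parameter.
    h = (1.0 - gamma) * h + gamma * curvature_estimate  # running average
    return eta / (mu + h), h                            # per-parameter step size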

Reference

  1. Yann LeCun's paper
    • A masterpiece of CNN research. There's still so much in it that I don't fully understand, even after this project.
  2. Marcel Wang's blog
    • Special thanks to Marcel Wang for encouraging everyone to do this project.
  3. Deep Learning Specialization by Andrew Ng
    • Epic lectures & inspiring assignments. I couldn't have done this if I hadn't taken the courses.
  4. agjayant's repo
  5. HiCraigChen's repo

Todo list

  1. Compare the RBF layer with a softmax layer (cross entropy) or simply an FC layer
  2. Accelerate with Cython or PyCuda
  3. Try using a sub-sampling layer