submission2019 / Cnn Quantization

Quantization of Convolutional Neural networks.


cnn-quantization

Dependencies

HW requirements

NVIDIA GPU / cuda support

Data

  • To run this code you need the validation set from the ILSVRC2012 dataset.
  • Configure your dataset path by passing --data "PATH_TO_ILSVRC", or copy the ILSVRC directory to ~/datasets/ILSVRC2012.
  • To get the ILSVRC2012 data, register for access on the ImageNet site: http://www.image-net.org/

Prepare environment

  • Clone source code
git clone https://github.com/submission2019/cnn-quantization.git
cd cnn-quantization
  • Create virtual environment for python3 and activate:
virtualenv --system-site-packages -p python3 venv3
. ./venv3/bin/activate
  • Install dependencies
pip install torch torchvision bokeh pandas scikit-learn mlflow tqdm

Building cuda kernels for GEMMLOWP

To improve performance, GEMMLOWP quantization is implemented in CUDA, so the kernels must be compiled before running the experiments.

  • Build kernels
cd kernels
./build_all.sh
cd ../

Run inference experiments

Post-training quantization of Res50

Note that accuracy results may vary by about 0.5% between runs due to data shuffling.

  • Experiment W4A4 naive:
python inference/inference_sim.py -a resnet50 -b 512 -pcq_w -pcq_a -sh --qtype int4 -qw int4
  • Experiment W4A4 + ACIQ + Bit Alloc(A) + Bit Alloc(W) + Bias correction:
python inference/inference_sim.py -a resnet50 -b 512 -pcq_w -pcq_a -sh --qtype int4 -qw int4 -c laplace -baa -baw -bcw

experiments

ACIQ: Analytical Clipping for Integer Quantization

We solve eq. 6 numerically to find the optimal clipping value α for both Laplace and Gaussian priors.
eq-6

Solving eq. 6 numerically for bit-widths 2, 3 and 4 gives optimal clipping values of 2.83b, 3.86b and 5.03b respectively, where b is the deviation of the activation from its expected value.

Numerical solution source code: mse_analysis.py aciq-mse
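
As a rough illustration of what such a numerical solution can look like, the sketch below minimizes an assumed MSE model for a Laplace prior: clipping noise 2b²·exp(-α/b) plus uniform quantization noise α²/(3·2^(2M)). This form is an assumption made for the example, since the equation image is not reproduced here, and the helper names are hypothetical. It is not the code in mse_analysis.py. The printed roots should land close to the values quoted above; small differences in the second decimal are possible if the assumed formula does not match eq. 6 exactly.

```python
import math


def clipping_mse(alpha, b, num_bits):
    """Assumed MSE model: Laplace clipping noise plus uniform quantization noise."""
    clip_noise = 2.0 * b ** 2 * math.exp(-alpha / b)
    quant_noise = alpha ** 2 / (3.0 * 2 ** (2 * num_bits))
    return clip_noise + quant_noise


def optimal_alpha(b, num_bits, tol=1e-9):
    """Find the alpha where d(MSE)/d(alpha) = 0 using bisection."""

    def derivative(alpha):
        return (-2.0 * b * math.exp(-alpha / b)
                + 2.0 * alpha / (3.0 * 2 ** (2 * num_bits)))

    # The derivative is negative at 0 and positive for large alpha,
    # and it increases monotonically, so there is a single root.
    lo, hi = 0.0, 50.0 * b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if derivative(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for bits in (2, 3, 4):
        # Clipping value expressed in units of b (b = 1.0 here).
        print(bits, round(optimal_alpha(1.0, bits), 2))
```

Because the assumed model depends on α only through α/b, the optimum scales linearly with b, which is why the clipping values can be quoted as multiples of b.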

Per-channel bit allocation

Given a quota on the total number of bits allowed to be written to memory, the optimal bit-width assignment Mi for channel i is given by eq. 11.
eq-11
bit_allocation_synthetic.py
bit-alloc
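
The allocation idea can be sketched as follows. The example assumes that each channel receives a share of the total bin budget proportional to α_i^(2/3), where α_i is the channel's clipping value, and that the bit width is the rounded base-2 logarithm of that share. Both the rule and the function name allocate_bits are assumptions for illustration, since the equation image is not reproduced here. This is not the code in bit_allocation_synthetic.py.

```python
import math


def allocate_bits(alphas, avg_bits):
    """Split a bin budget across channels in proportion to alpha_i ** (2/3).

    alphas must be positive. The budget is assumed to be 2**avg_bits bins
    per channel on average.
    """
    total_bins = len(alphas) * 2 ** avg_bits
    weights = [a ** (2.0 / 3.0) for a in alphas]
    norm = sum(weights)
    bits = []
    for w in weights:
        bins = total_bins * w / norm            # fractional bin share
        bits.append(max(0, round(math.log2(bins))))  # whole bit width
    return bits


if __name__ == "__main__":
    # Channels with a wider range (larger alpha) receive more bits.
    print(allocate_bits([0.5, 1.0, 4.0, 8.0], avg_bits=4))
```

Rounding to whole bit widths means the final allocation may not meet the quota exactly; a real implementation would need a policy for that case.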

Bias correction

We observe an inherent bias in the mean and the variance of the weight values following their quantization.
bias_correction.ipynb
bias-err
We calculate this bias using equation 12.
eq-12
Then, we compensate for the bias for each channel of W as follows:
eq-13
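
A minimal sketch of per-channel bias correction is shown below. It is a simplified variant, not a reproduction of equations 12 and 13: it shifts and rescales the quantized weights of one channel so that their mean and standard deviation match those of the original weights. The quantizer and all function names are hypothetical and exist only to produce a biased example.

```python
import statistics


def quantize_uniform(values, num_bits):
    """Symmetric uniform quantizer, used here only to create a biased example."""
    max_abs = max(abs(v) for v in values)  # assumes at least one non-zero value
    levels = 2 ** (num_bits - 1) - 1
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def correct_channel(weights, quantized):
    """Shift and rescale quantized weights to match the original mean and std."""
    mean_w = statistics.fmean(weights)
    mean_q = statistics.fmean(quantized)
    std_w = statistics.pstdev(weights)
    std_q = statistics.pstdev(quantized)
    xi = std_w / std_q if std_q > 0 else 1.0  # variance correction factor
    return [mean_w + xi * (q - mean_q) for q in quantized]


if __name__ == "__main__":
    channel = [0.31, -0.12, 0.07, 0.45, -0.38, 0.02, 0.19, -0.05]
    quantized = quantize_uniform(channel, num_bits=3)
    corrected = correct_channel(channel, quantized)
    # Means before quantization, after quantization, and after correction.
    print(statistics.fmean(channel),
          statistics.fmean(quantized),
          statistics.fmean(corrected))
```

Note that the corrected values no longer sit on the integer grid; in an integer pipeline the correction would typically be folded into per-channel scale and offset terms.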

Quantization

We use the GEMMLOWP quantization scheme described here, which we implemented in PyTorch. We optimize this scheme by applying ACIQ to reduce the quantization range and by optimally allocating bits to each channel.

Quantization code can be found in int_quantizer.py
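
For readers unfamiliar with the scheme, the sketch below shows a GEMMLOWP-style affine mapping in plain Python, where a real value is approximated as scale × (q − zero_point). It is an illustration only and does not mirror int_quantizer.py; the function names and the optional clip_min / clip_max parameters are assumptions. Passing a tighter clipping range, for example one derived from ACIQ, shrinks the scale at the cost of saturating outliers.

```python
def quantize_affine(values, num_bits, clip_min=None, clip_max=None):
    """Map floats to unsigned integers using a scale and a zero point."""
    lo = min(values) if clip_min is None else clip_min
    hi = max(values) if clip_max is None else clip_max
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # keep 0.0 exactly representable
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    # Values outside [lo, hi] saturate at 0 or qmax.
    q = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_affine(q, scale, zero_point):
    """Recover approximate real values from the integer codes."""
    return [scale * (x - zero_point) for x in q]


if __name__ == "__main__":
    data = [-1.0, -0.25, 0.0, 0.4, 2.0]
    q, scale, zp = quantize_affine(data, num_bits=4)
    print(q, scale, zp)
    print(dequantize_affine(q, scale, zp))
```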

Additional use cases and experiments

Inference using offline statistics

Collect statistics on 32 images

python inference/inference_sim.py -a resnet50 -b 1 --qtype int8 -sm collect -ac -cs 32

Run inference experiment W4A4 + ACIQ + Bit Alloc(A) + Bit Alloc(W) + Bias correction using offline statistics.

python inference/inference_sim.py -a resnet50 -b 512 -pcq_w -pcq_a --qtype int4 -qw int4 -c laplace -baa -baw -bcw -sm use

4-bit quantization with clipping thresholds of 2 std

python inference/inference_sim.py -a resnet50 -b 512 -pcq_w -pcq_a -sh --qtype int4 -c 2std

ACIQ with layer-wise quantization

python inference/inference_sim.py -a resnet50 -b 512 --qtype int4 -c laplace -sm use

Bin allocation and Variable Length Coding

Given a quota on the total number of bits allowed to be written to memory, the optimal number of bins Bi for channel i is derived from eq. 10.
eq-10

We evaluate the effect of Huffman coding on activations and weights by measuring the average entropy across all layers.

python inference/inference_sim.py -a vgg16 -b 32 --device_ids 4 -pcq_w -pcq_a -sh --qtype int4 -qw int4 -c laplace -baa -baw -bcw -bata 5.3 -batw 5.3 -mtq -me -ss 1024

Average bit rate: avg.entropy.act - 2.215521374096473
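
The reported figure is an entropy in bits per value. Entropy is a lower bound on the average code length of any prefix code, and a Huffman code stays within one bit per symbol of it, so it serves as an estimate of the achievable bit rate. A minimal sketch of the measurement on a list of quantized integer codes follows; the function name is hypothetical and the sample data is made up for illustration.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Made-up 4-bit codes; a skewed histogram gives entropy well below 4 bits.
    codes = [7, 7, 7, 8, 8, 7, 6, 7, 8, 7, 9, 7, 7, 6, 8, 7]
    print(entropy_bits(codes))
```

Averaging this quantity over the activation tensors of all layers gives a single number comparable to the one reported above.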
