xu-peng-tao / SSD-Pruning-and-quantization

License: MIT
Pruning and quantization for SSD. Model compression.

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives to or similar to SSD-Pruning-and-quantization

Aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Stars: ✭ 453 (+2284.21%)
Mutual labels:  compression, pruning, quantization
Nncf
PyTorch*-based Neural Network Compression Framework for enhanced OpenVINO™ inference
Stars: ✭ 218 (+1047.37%)
Mutual labels:  compression, pruning, quantization
Model Optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Stars: ✭ 992 (+5121.05%)
Mutual labels:  compression, pruning, quantization
Lq Nets
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
Stars: ✭ 195 (+926.32%)
Mutual labels:  compression, quantization
Zeroq
[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework
Stars: ✭ 150 (+689.47%)
Mutual labels:  compression, quantization
Hrank
Pytorch implementation of our CVPR 2020 (Oral) -- HRank: Filter Pruning using High-Rank Feature Map
Stars: ✭ 164 (+763.16%)
Mutual labels:  compression, pruning
sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Stars: ✭ 264 (+1289.47%)
Mutual labels:  pruning, quantization
prunnable-layers-pytorch
Prunable nn layers for pytorch.
Stars: ✭ 47 (+147.37%)
Mutual labels:  compression, pruning
nuxt-prune-html
🔌⚡ Nuxt module that prunes HTML before sending it to the browser (removing elements that match CSS selectors); useful for boosting performance by serving different HTML to bots/audits, e.g. removing all scripts when using dynamic rendering
Stars: ✭ 69 (+263.16%)
Mutual labels:  pruning, prune
neural-compressor
Intel® Neural Compressor (formerly Intel® Low Precision Optimization Tool) aims to provide unified APIs for network compression techniques, such as low-precision quantization, sparsity, pruning, and knowledge distillation, across different deep learning frameworks in pursuit of optimal inference performance.
Stars: ✭ 666 (+3405.26%)
Mutual labels:  pruning, quantization
torch-model-compression
针对pytorch模型的自动化模型结构分析和修改工具集,包含自动分析模型结构的模型压缩算法库
Stars: ✭ 126 (+563.16%)
Mutual labels:  pruning, quantization
fasterai1
FasterAI: A repository for making smaller and faster models with the FastAI library.
Stars: ✭ 34 (+78.95%)
Mutual labels:  compression, pruning
Model Quantization
Collections of model quantization algorithms
Stars: ✭ 118 (+521.05%)
Mutual labels:  compression, quantization
Model Compression And Acceleration Progress
Repository to track the progress in model compression and acceleration
Stars: ✭ 63 (+231.58%)
Mutual labels:  compression, pruning
DNNAC
All about acceleration and compression of Deep Neural Networks
Stars: ✭ 29 (+52.63%)
Mutual labels:  compression, quantization
torchprune
A research library for pytorch-based neural network pruning, compression, and more.
Stars: ✭ 133 (+600%)
Mutual labels:  compression, pruning
Awesome Ai Infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Stars: ✭ 223 (+1073.68%)
Mutual labels:  pruning, quantization
Training extensions
Trainable models and NN optimization tools
Stars: ✭ 857 (+4410.53%)
Mutual labels:  ssd, quantization
Dynamic Model Pruning with Feedback
Implementation of Dynamic Model Pruning with Feedback in PyTorch
Stars: ✭ 25 (+31.58%)
Mutual labels:  pruning, prune
ATMC
[NeurIPS'2019] Shupeng Gui, Haotao Wang, Haichuan Yang, Chen Yu, Zhangyang Wang, Ji Liu, “Model Compression with Adversarial Robustness: A Unified Optimization Framework”
Stars: ✭ 41 (+115.79%)
Mutual labels:  pruning, quantization

SSD-Pruning and quantization

1. Implements model compression on SSD: pruning and quantization.

2. Model compression supports multiple backbones (currently mobile-netv2-SSD and vgg16-BN-SSD) and is easy to extend to other backbones.

3. The SSD source code comes from lufficc/SSD; the pruning method follows SpursLipu/YOLOv3-ModelCompression-MultidatasetTraining-Multibackbone, and the quantization method follows 666DZY666/model-compression. Many thanks to their authors.

4. Project environment: Python 3.6.10, Torch 1.4.0. For the rest of the environment setup, see lufficc/SSD.

Dataset

COCO

Microsoft COCO: Common Objects in Context

Download COCO 2014

sh ssd/data/datasets/scripts/COCO2014.sh

VOC Dataset

PASCAL VOC: Visual Object Classes

Download VOC2007 trainval & test

sh ssd/data/datasets/scripts/VOC2007.sh

Download VOC2012 trainval

sh ssd/data/datasets/scripts/VOC2012.sh

oxford hand

The original dataset can be downloaded from the official website; this project converts it to a different format. The converted dataset can be downloaded from Baidu Netdisk (extraction code: w4av). After downloading and extracting you will get two folders, images and labels; then replace the corresponding paths in configs/oxfordhand.data with the paths to the extracted files.

For more datasets

Convert the new dataset to the oxford hand dataset format and create the corresponding .names and .data files.
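
For reference, here is a minimal sketch of what the two files might contain, modeled on the darknet-style .data/.names convention of the YOLOv3 repository this project borrows from (the file names, paths, and class name are illustrative assumptions, not files shipped with this repo):

configs/mydata.data:

classes=1
train=/path_to/mydata/train.txt
valid=/path_to/mydata/test.txt
names=configs/mydata.names

configs/mydata.names (one class name per line):

hand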

Backbone

The original lufficc/SSD implements multiple backbones, and this code remains compatible with them. To make new backbones easy to add and models easy to prune, backbones are defined here with cfg files in the style of ultralytics/yolov3. vgg16-BN, vgg-BN-fpga, and mobilenet_v2 are currently supported, and new cfg files can be added to support more backbones (any new layer structure must also be registered in ssd/modeling/backbone/backbone_cfg.py). Once a new cfg file is defined, define a matching yaml file, then use test_model_structure.py to print the model structure and get_model_size to obtain the model size. A sketch of the cfg format follows.
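
As a rough illustration, a layer block in an ultralytics/yolov3-style cfg file looks like the following (this is a sketch of the general format, not a block copied from this repo; check the shipped cfg files for the exact keys this project expects):

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=relu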

Train

one gpu:
CUDA_VISIBLE_DEVICES="2" python train.py --config-file configs/*.yaml
two gpus:
export NGPUS=2
CUDA_VISIBLE_DEVICES="2,3" python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/*.yaml SOLVER.WARMUP_FACTOR 0.03333 SOLVER.WARMUP_ITERS 1000 

Evaluate

TEST.BN_FUSE True means BN layers are fused at test time.

CUDA_VISIBLE_DEVICES="2" python test.py --config-file configs/*.yaml TEST.BN_FUSE True

Demo

CUDA_VISIBLE_DEVICES="2" python demo.py --config-file configs/*.yaml --ckpt /path_to/*.pth --dataset_type oxfordhand --score_threshold 0.4 TEST.BN_FUSE True

Prune

The pruning method comes from the paper Learning Efficient Convolutional Networks through Network Slimming; the pruning-without-fine-tuning method comes from Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers.
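
Concretely, network slimming adds an L1 penalty on the BatchNorm scale factors (gamma) during sparsity training, pushing unimportant channels toward zero so they can be pruned by threshold. A minimal sketch of the usual subgradient implementation, where sr is the SR sparsity factor from the yaml file (this illustrates the method, not this repo's exact code):

import torch

def add_sparsity_grad(model, sr):
    # Call after loss.backward(): adds the subgradient of sr * |gamma|
    # to every BN scale-factor gradient.
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            m.weight.grad.data.add_(sr * torch.sign(m.weight.data))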

Pruning takes three steps:

1. Define PRUNE's TYPE and SR in the yaml file. TYPE is the pruning type, chosen from 'normal' and 'shortcut': 'normal' is regular pruning (shortcut connections are not pruned), while 'shortcut' is extreme pruning (shortcut connections are pruned too, giving a higher pruning rate). SR is the sparsity factor. See configs/mobile_v2_ssd_hand_normal_sparse.yaml and configs/mobile_v2_ssd_hand_shortcut_sparse.yaml, and the yaml sketch after step 3.

2. Run sparsity training:

You can train from scratch or start from previously trained non-sparse weights; set MODEL.FINE_TUNE and MODEL.WEIGHTS in the yaml file accordingly.

one gpu:
CUDA_VISIBLE_DEVICES="3" python train.py --config-file configs/*_sparse.yaml
two gpus:
export NGPUS=2
CUDA_VISIBLE_DEVICES="2,3" python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/*_sparse.yaml SOLVER.WARMUP_FACTOR 0.03333 SOLVER.WARMUP_ITERS 1000 

3. Prune:

--percent is the pruning ratio. For a high pruning ratio, sparsity training must be done well (long enough training and a properly chosen sparsity factor).

--regular takes 0 or 1: 0 means no regularized pruning; 1 enables regularized pruning, which makes every pruned layer's channel count a multiple of 8, mainly for hardware deployment. Only effective with 'normal' pruning.

--max takes 0 or 1. For a particular dataset, the backbone sometimes does not need to feed all of its branches to the predict head; when every BN weight in a layer is below the threshold, the layers after it are considered useless. With 1, those later layers are pruned away.

CUDA_VISIBLE_DEVICES="3" python prune.py --config-file configs/*_sparse.yaml --regular 0 --max 0 --percent 0.1 --model model_final.pth

Note: in theory, the pruning strategy provided here requires no fine-tuning after pruning. In practice, however, when a large pruning ratio makes mAP drop substantially, fine-tuning still plays an important role.

Quantization

Reference papers:

Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1

XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

Ternary weight networks

DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Quantizing deep convolutional networks for efficient inference: A whitepaper

Quantization takes three steps:

1. Create a cfg file for the quantized backbone network and set quantization=1 on the layers to be quantized, as in configs/vgg_bn_ssd300_fpga_quan.cfg.

2. Define QUANTIZATION's TYPE, FINAL, WBITS, and ABITS in the yaml file (see the sketch after step 3). TYPE is the quantization type, chosen from 'dorefa', 'IAO', and 'BWN'. FINAL indicates whether the predict head is quantized. WBITS and ABITS are the number of quantization bits (dorefa/IAO) or the number of quantization levels (BWN supports binary/ternary weights and binary activations). See configs/vgg_bn_ssd300_hand_fpga_sparse_quan_w8a8.yaml, among others.

3. Run quantization training:

one gpu:
CUDA_VISIBLE_DEVICES="3" python train.py --config-file configs/*.yaml
two gpus:
export NGPUS=2
CUDA_VISIBLE_DEVICES="2,3" python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/*.yaml SOLVER.WARMUP_FACTOR 0.03333 SOLVER.WARMUP_ITERS 1000 

Pruning and quantization

Quantize then prune: run sparsity training at the same time as quantization training, as in configs/vgg_bn_ssd300_hand_fpga_sparse_quan_w8a8.yaml, then prune directly. Pruning after quantization currently only supports dorefa paired with the 'normal' pruning method.

Prune then quantize: pruning produces the pruned network's cfg file and txt file (in the pruned_configs folder) and its weight file (in the pruned_model_weights folder); define a yaml file based on them and run quantization training. The full pipeline is sketched below.
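
Putting the prune-then-quantize route together, the pipeline is roughly as follows (the pruning ratio and the final yaml name, configs/my_pruned_quan.yaml, are illustrative; that yaml must reference the pruned cfg and weight files produced by prune.py):

CUDA_VISIBLE_DEVICES="0" python train.py --config-file configs/*_sparse.yaml
CUDA_VISIBLE_DEVICES="0" python prune.py --config-file configs/*_sparse.yaml --regular 0 --max 0 --percent 0.3 --model model_final.pth
CUDA_VISIBLE_DEVICES="0" python train.py --config-file configs/my_pruned_quan.yaml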

Get weights

Use get_weights.py and get_weights_bin.py to extract model parameters for model deployment.
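
As a point of reference, this kind of export usually amounts to dumping each tensor of a PyTorch state dict to a raw binary. A generic sketch of that pattern (this is not the repo's get_weights.py; the checkpoint path and output layout are assumptions):

import os
import torch

state = torch.load("pruned_model_weights/model_final.pth", map_location="cpu")
os.makedirs("exported_weights", exist_ok=True)
for name, tensor in state.items():
    if torch.is_tensor(tensor):
        # One float32 raw binary per parameter tensor.
        tensor.numpy().astype("float32").tofile(os.path.join("exported_weights", name + ".bin"))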

Experiment

Specific training and testing commands for some of the experiments can be found in experiment.md; some experimental results can be found in result.md.
