
Zhen-Dong / Hawq

License: MIT
A quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.

Programming Languages

Python
139,335 projects; the #7 most used programming language

Projects that are alternatives to or similar to Hawq

Tf2
An Open Source Deep Learning Inference Engine Based on FPGA
Stars: ✭ 113 (+4.63%)
Mutual labels:  quantization, model-compression
Awesome Ai Infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Stars: ✭ 223 (+106.48%)
Mutual labels:  quantization, model-compression
Awesome Ml Model Compression
Awesome machine learning model compression research papers, tools, and learning material.
Stars: ✭ 166 (+53.7%)
Mutual labels:  quantization, model-compression
Pretrained Language Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Stars: ✭ 2,033 (+1782.41%)
Mutual labels:  model-compression, quantization
ATMC
[NeurIPS'2019] Shupeng Gui, Haotao Wang, Haichuan Yang, Chen Yu, Zhangyang Wang, Ji Liu, “Model Compression with Adversarial Robustness: A Unified Optimization Framework”
Stars: ✭ 41 (-62.04%)
Mutual labels:  quantization, model-compression
Kd lib
A Pytorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
Stars: ✭ 173 (+60.19%)
Mutual labels:  quantization, model-compression
Micronet
micronet, a model compression and deployment library. Compression: 1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); 2) pruning: normal, regular, and group convolutional channel pruning; 3) group convolution structure; 4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape
Stars: ✭ 1,232 (+1040.74%)
Mutual labels:  quantization, model-compression
ZAQ-code
CVPR 2021 : Zero-shot Adversarial Quantization (ZAQ)
Stars: ✭ 59 (-45.37%)
Mutual labels:  quantization, model-compression
torch-model-compression
A toolkit for automated analysis and modification of PyTorch model structures, including a model compression algorithm library that analyzes model structure automatically.
Stars: ✭ 126 (+16.67%)
Mutual labels:  quantization, model-compression
BitPack
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.
Stars: ✭ 36 (-66.67%)
Mutual labels:  quantization, model-compression
Awesome Automl And Lightweight Models
A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
Stars: ✭ 691 (+539.81%)
Mutual labels:  quantization, model-compression
Paddleslim
PaddleSlim is an open-source library for deep model compression and architecture search.
Stars: ✭ 677 (+526.85%)
Mutual labels:  quantization, model-compression
Model Optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Stars: ✭ 992 (+818.52%)
Mutual labels:  quantization, model-compression
Jacinto Ai Devkit
Training & Quantization of embedded friendly Deep Learning / Machine Learning / Computer Vision models
Stars: ✭ 49 (-54.63%)
Mutual labels:  quantization
Frostnet
FrostNet: Towards Quantization-Aware Network Architecture Search
Stars: ✭ 85 (-21.3%)
Mutual labels:  quantization
Awesome Knowledge Distillation
Awesome Knowledge-Distillation. Knowledge distillation papers (2014-2021), organized by category.
Stars: ✭ 1,031 (+854.63%)
Mutual labels:  model-compression
Awesome Pruning
A curated list of neural network pruning resources.
Stars: ✭ 1,017 (+841.67%)
Mutual labels:  model-compression
Jupiter
Jupiter is a high-performance, lightweight distributed service framework.
Stars: ✭ 1,372 (+1170.37%)
Mutual labels:  hessian
Pyepr
Powerful, automated analysis and design of quantum microwave chips & devices [Energy-Participation Ratio and more]
Stars: ✭ 81 (-25%)
Mutual labels:  quantization
Compress
Compressing Representations for Self-Supervised Learning
Stars: ✭ 43 (-60.19%)
Mutual labels:  model-compression



HAWQ: Hessian AWare Quantization

HAWQ is an advanced quantization library written for PyTorch. HAWQ enables low-precision and mixed-precision uniform quantization, with direct hardware implementation through TVM.
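
For intuition, a k-bit symmetric uniform quantizer maps a floating-point tensor to integers in [-2^(k-1), 2^(k-1)-1] through a single scale factor. The snippet below is a generic sketch of that idea, not HAWQ's internal API:

import torch

def uniform_symmetric_quantize(x, num_bits=8):
    # One scale per tensor; illustrative only, not HAWQ's implementation.
    qmax = 2 ** (num_bits - 1) - 1                 # 127 for 8-bit
    scale = x.abs().max().clamp(min=1e-8) / qmax
    x_int = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return x_int, scale, x_int * scale             # integer codes, scale, dequantized tensor

w = torch.randn(64, 128)
w_int, scale, w_dequant = uniform_symmetric_quantize(w, num_bits=8)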

For more details, please see the HAWQ-V3 paper, Dyadic Neural Network Quantization.

Installation

  • PyTorch version >= 1.4.0
  • Python version >= 3.6
  • For training new models, you'll also need NVIDIA GPUs and NCCL
  • To install HAWQ and develop locally:
git clone https://github.com/Zhen-Dong/HAWQ.git
cd HAWQ
pip install -r requirements.txt
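
A quick sanity check that the environment meets the requirements above (an illustrative snippet, not part of HAWQ's tooling):

import sys
import torch

assert sys.version_info >= (3, 6), "Python >= 3.6 is required"
major, minor = (int(v) for v in torch.__version__.split(".")[:2])
assert (major, minor) >= (1, 4), "PyTorch >= 1.4.0 is required"
print("CUDA available:", torch.cuda.is_available())  # GPUs and NCCL are needed to train new models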

Getting Started

Quantization-Aware Training

An example of running uniform 8-bit quantization-aware training for ResNet50 on ImageNet:

export CUDA_VISIBLE_DEVICES=0
python quant_train.py -a resnet50 --epochs 1 --lr 0.0001 --batch-size 128 --data /path/to/imagenet/ --pretrained --save-path /path/to/checkpoints/ --act-range-momentum=0.99 --wd 1e-4 --data-percentage 0.0001 --fix-BN --checkpoint-iter -1 --quant-scheme uniform8
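
Conceptually, quantization-aware training runs the forward pass through "fake" quantizers and lets gradients flow with a straight-through estimator, while activation ranges are tracked with a running momentum (cf. the --act-range-momentum flag above). The following is a generic sketch of that mechanism, assumed for illustration rather than HAWQ's actual modules:

import torch

class FakeQuantize(torch.autograd.Function):
    # Quantize in the forward pass; pass gradients straight through in the backward pass.
    @staticmethod
    def forward(ctx, x, scale, qmin, qmax):
        return torch.clamp(torch.round(x / scale), qmin, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None, None, None  # straight-through estimator

def fake_quantize_activation(x, running_max, num_bits=8, momentum=0.99):
    # Track the activation range with an exponential moving average (no gradient through the range).
    running_max = momentum * running_max + (1 - momentum) * x.abs().max().detach()
    qmax = 2 ** (num_bits - 1) - 1
    scale = running_max / qmax
    return FakeQuantize.apply(x, scale, -qmax - 1, qmax), running_max

act = torch.randn(8, 256, requires_grad=True)
q_act, new_max = fake_quantize_activation(act, running_max=torch.tensor(1.0))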

The commands for other quantization schemes and for other networks are shown in the model zoo.

Inference Acceleration

Experimental Results

The results below correspond to Table I and Table II in HAWQ-V3: Dyadic Neural Network Quantization.

ResNet18 on ImageNet

Model    | Quantization    | Model Size (MB) | BOPS (G) | Accuracy (%) | Inference Speed (batch=8, ms) | Download
ResNet18 | Floating Points | 44.6            | 1858     | 71.47        | 9.7 (1.0x)                    | resnet18_baseline
ResNet18 | W8A8            | 11.1            | 116      | 71.56        | 3.3 (3.0x)                    | resnet18_uniform8
ResNet18 | Mixed Precision | 6.7             | 72       | 70.22        | 2.7 (3.6x)                    | resnet18_bops0.5
ResNet18 | W4A4            | 5.8             | 34       | 68.45        | 2.2 (4.4x)                    | resnet18_uniform4

ResNet50 on ImageNet

Model    | Quantization    | Model Size (MB) | BOPS (G) | Accuracy (%) | Inference Speed (batch=8, ms) | Download
ResNet50 | Floating Points | 97.8            | 3951     | 77.72        | 26.2 (1.0x)                   | resnet50_baseline
ResNet50 | W8A8            | 24.5            | 247      | 77.58        | 8.5 (3.1x)                    | resnet50_uniform8
ResNet50 | Mixed Precision | 18.7            | 154      | 75.39        | 6.9 (3.8x)                    | resnet50_bops0.5
ResNet50 | W4A4            | 13.1            | 67       | 74.24        | 5.8 (4.5x)                    | resnet50_uniform4
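
As a rough consistency check, weight storage scales with bit width: moving from 32-bit floating point to 8-bit integers should shrink the weights by about 4x, which lines up with the W8A8 rows above. A back-of-the-envelope estimate (baseline sizes taken from the tables):

# Weights dominate model size, and storage scales roughly with bit width.
fp32_size_mb = {"ResNet18": 44.6, "ResNet50": 97.8}   # baselines from the tables above
for name, size in fp32_size_mb.items():
    for bits in (8, 4):
        print(f"{name} @ {bits}-bit weights: ~{size * bits / 32:.1f} MB")
# The 8-bit estimates land close to the W8A8 sizes above (about 11 MB and 24 MB).

The W4A4 sizes reported in the tables sit somewhat above a pure 4-bit estimate, which is expected when not every tensor is stored at 4 bits.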

More results for different quantization schemes and models, together with the corresponding commands and important notes, are available in the model zoo.
To download the quantized models with wget, refer to the example command in the model zoo.
Checkpoints in the model zoo are saved in floating-point precision. To shrink the memory footprint, BitPack can be applied to the weight_integer tensors, or directly to the quantized_checkpoint.pth.tar file.
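
Bit-packing stores several low-precision integers per byte, e.g., two 4-bit values per byte rather than one value per 32-bit float. The sketch below illustrates the general idea for unsigned 4-bit values; it is not BitPack's actual API:

import numpy as np

def pack_4bit(values):
    # Pack unsigned 4-bit integers (0..15), two per byte.
    v = values.astype(np.uint8)
    if v.size % 2:
        v = np.concatenate([v, np.zeros(1, dtype=np.uint8)])  # pad to an even count
    return (v[0::2] << 4) | v[1::2]

def unpack_4bit(packed, count):
    # Recover the original 4-bit values from the packed bytes.
    out = np.empty(packed.size * 2, dtype=np.uint8)
    out[0::2] = packed >> 4
    out[1::2] = packed & 0x0F
    return out[:count]

w_int4 = np.random.randint(0, 16, size=9)
assert np.array_equal(unpack_4bit(pack_4bit(w_int4), w_int4.size), w_int4)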

Related Works

License

HAWQ is released under the MIT license.
