Local-search Quantization

State-of-the-art method for large-scale approximate nearest neighbor (ANN) search as of October 2016. Presented at ECCV 2016.

This is the code for the papers:

  • Julieta Martinez, Joris Clement, Holger H. Hoos, James J. Little. "Revisiting additive quantization", ECCV 2016.
  • Julieta Martinez, Holger H. Hoos, James J. Little. "Solving multi-codebook quantization in the GPU", 4th Workshop on Web-scale Vision and Social Media (VSM), at ECCV 2016.

The code in this repository was mostly written by Julieta Martinez and Joris Clement.

Dependencies

Our code is mostly written in Julia, and should run under version 0.6 or later. To get Julia, go to the Julia downloads page and install the latest stable release.

We depend on a number of Julia packages, each of which you can install with Pkg.add( "package_name" ).
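For example, from a Julia 0.6 session (the package name below is a placeholder, not one of this repository's actual dependencies):

Pkg.add("SomePackage")   # install one dependency; repeat for each package you need
Pkg.status()             # list the packages and versions currently installed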

To run encoding on a GPU, you have to compile Julia from source (I know this sucks, but it will no longer be necessary with Julia 1.0). You will also need

  • CUDAdrv -- the CUDA driver API
  • CUBLAS -- for fast matrix multiplication in the GPU
  • A CUDA-enabled GPU with compute capability 3.5 or higher. We have tested our code on K40 and Titan X GPUs

Finally, to run the sparse encoding demo you will need Matlab to run the SPGL1 solver by van den Berg and Friedlander, as well as the MATLAB.jl package to call Matlab functions from Julia.
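For reference, MATLAB.jl evaluates MATLAB code from within a Julia session. A minimal sketch, independent of this repository's scripts (it assumes a working local MATLAB installation):

using MATLAB
y = mxcall(:sqrt, 1, 2.0)          # call a MATLAB function, requesting 1 output argument
mat"disp('hello from MATLAB')"     # or evaluate a MATLAB statement directly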

Demos

First, clone this repository and download the SIFT1M dataset. To do so run the following commands:

git clone git@github.com:jltmtz/local-search-quantization.git
cd local-search-quantization
mkdir data
cd data
wget ftp://ftp.irisa.fr/local/texmex/corpus/sift.tar.gz
tar -xvzf sift.tar.gz
rm sift.tar.gz
cd ..
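The .fvecs files use the texmex binary format: each vector is stored as a 4-byte integer d (the dimensionality) followed by d 32-bit floats. As an aside, a hypothetical standalone reader in Julia (0.6 syntax; this is not the loader the demos use) could look as follows:

# Read an .fvecs file (e.g. data/sift/sift_base.fvecs) into a d x n Float32 matrix.
function read_fvecs(path::AbstractString)
    open(path, "r") do io
        d = Int(read(io, Int32))             # dimensionality, stored before every vector
        seekend(io)
        n = div(position(io), 4 * (d + 1))   # each record holds 1 + d 32-bit words
        seekstart(io)
        X = Array{Float32}(d, n)
        for i in 1:n
            read(io, Int32)                  # skip the per-vector dimension field
            X[:, i] = read(io, Float32, d)
        end
        X
    end
end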

Also, compile the auxiliary C++ search code:

cd src/linscan/cpp/
./compile.sh
cd ../../../

For expedience, the following demos train on the first 10K vectors of the SIFT1M dataset. To reproduce the paper results you will have to use the full training set with 100K vectors.

This code showcases three main functionalities:

1) Baselines and LSQ demo with encoding on the CPU

Simply run

julia demos/demo_pq.jl
julia demos/demo_opq.jl
julia demos/demo_lsq.jl

This will train PQ, OPQ, and LSQ on a subset of SIFT1M, encode the base set, and compute a recall@N curve. To speed up LSQ, you can also run the code in parallel on multiple cores using

julia -p n demos/demo_lsq.jl

Where n is the number of CPU cores on your machine.
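For reference, recall@N is the fraction of queries whose true nearest neighbor appears among the top N candidates returned by the approximate search. A minimal sketch of how such a curve can be computed (this is not the repository's evaluation code; the array layout is an assumption):

# ids[:, q] holds the ranked candidate indices returned for query q (best first);
# gt[q] is the index of the true (exhaustive-search) nearest neighbor of query q.
function recall_at_N(ids::Matrix{Int}, gt::Vector{Int}, Ns = [1, 10, 100])
    nq = length(gt)
    [count(q -> gt[q] in ids[1:N, q], 1:nq) / nq for N in Ns]
end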

2) LSQ demo with encoding on the GPU

If you have a CUDA-enabled GPU, you might want to try out encoding on the GPU.

First, compile the CUDA code:

cd src/encodings/cuda
./compile.sh
cd ../../../

and then run

julia demos/demo_lsq_gpu.jl

or

julia -p n demos/demo_lsq_gpu.jl

Where n is the number of CPU cores on your machine.

3) LSQ demo with sparse encoding

This is very similar to demo #1, but the learned codebooks will be sparse.

First of all, you have to download the SPGL1 solver by van den Berg and Friedlander, and add the function that implements Expression 8 to the package:

cd matlab
git clone git@github.com:mpf/spgl1.git
mv sparse_lsq_fun.m spgl1/
mv splitarray.m spgl1/
cd ..

Now you should be able to run the demo

julia -p n demos/demo_lsq_sparse.jl

Where n is the number of CPU cores on your machine.

Note that you need MATLAB installed on your computer to run this demo, as well as the MATLAB.jl package to call MATLAB functions from Julia. Granted, getting all this to work can be a bit of a pain -- if at this point you (like me) love Julia more than any other language, please consider porting SPGL1 to Julia.

Citing

Thanks for your interest in our research! If you find this code useful, please consider citing our paper:

Julieta Martinez, Joris Clement, Holger H. Hoos, James J. Little. "Revisiting
additive quantization", ECCV 2016.

If you use our GPU implementation, please consider citing:

Julieta Martinez, Holger H. Hoos, James J. Little. "Solving multi-codebook
quantization in the GPU", 4th Workshop on Web-scale Vision and Social Media
(VSM), at ECCV 2016.

FAQ

  • Q: What is ChainQ?

    A: ChainQ is a quantization method inspired by optimized tree quantization (OTQ). Instead of learning the dimension splitting and sharing among codebooks (which OTQ finds using Gurobi), we simply take the natural splitting and sharing given by contiguous dimensions. Therefore, our codebooks form a chain, not a general tree. This means we can solve encoding optimally using the Viterbi algorithm (a generic sketch of chain decoding appears after this FAQ).

  • Q: LSQ is very slow...?

    A: Compared to PQ and OPQ, yes, but (a) it achieves considerably better recall for the same code length, and (b) it is much better in both quality and speed than additive quantization (AQ), our most similar baseline. The authors of AQ have made their code available, so you can compare for yourself :)

  • Q: The code does not reproduce the results of the paper...?

    A: The demos train on 10K vectors and for 10 iterations. To reproduce the results of the paper, train on the full 100K training vectors and run for 100 iterations. You can also control the number of ILS iterations used for database encoding in the LSQ demos, which corresponds to LSQ-16 and LSQ-32 in the paper.

  • Q: Why do I see all those warnings when I run your code?

    A: Julia 0.5 issues a warning when a method is redefined more than once in the Main scope. This is annoying for many people and will disappear in Julia 0.6 (see JuliaLang/julia#18725).
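As mentioned in the ChainQ answer above, when only consecutive codebooks interact, the minimum-cost assignment can be found exactly with the Viterbi algorithm. A generic sketch of such a chain decoder (this is not the repository's implementation; the unary/pairwise cost layout is an assumption):

# Exact encoding over a chain of M codebooks, each with K entries.
# unary[m][k]        : data cost of choosing entry k in codebook m
# pairwise[m][k, k2] : interaction cost between entry k of codebook m and entry k2 of codebook m+1
#                      (so pairwise has length M-1)
function viterbi_encode(unary::Vector{Vector{Float64}}, pairwise::Vector{Matrix{Float64}})
    M, K = length(unary), length(unary[1])
    cost = copy(unary[1])        # best cost of any partial assignment ending at entry k
    back = zeros(Int, M, K)      # backpointers
    for m in 2:M
        newcost = fill(Inf, K)
        for k in 1:K, kprev in 1:K
            c = cost[kprev] + pairwise[m-1][kprev, k] + unary[m][k]
            if c < newcost[k]
                newcost[k] = c
                back[m, k] = kprev
            end
        end
        cost = newcost
    end
    # Backtrack the optimal chain of codewords
    codes = zeros(Int, M)
    codes[M] = indmin(cost)      # Julia 0.6; use argmin on Julia >= 0.7
    for m in M:-1:2
        codes[m-1] = back[m, codes[m]]
    end
    return codes, minimum(cost)
end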

Acknowledgments

Some of our evaluation code and our OPQ implementation have been adapted from Cartesian k-means by Mohamad Norouzi and optimized product quantization by Kaiming He.

License

MIT
