xinyandai / product-quantization

Licence: other
🙃Implementation of vector quantization algorithms, codes for Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to product-quantization

pqlite
⚡ A fast embedded library for approximate nearest neighbor search
Stars: ✭ 141 (+252.5%)
Mutual labels:  approximate-nearest-neighbor-search, product-quantization, vector-quantization
elasticsearch-approximate-nearest-neighbor
Plugin to integrate approximate nearest neighbor (ANN) search with Elasticsearch
Stars: ✭ 53 (+32.5%)
Mutual labels:  approximate-nearest-neighbor-search, product-quantization
lshensemble
LSH index for approximate set containment search
Stars: ✭ 48 (+20%)
Mutual labels:  lsh, approximate-nearest-neighbor-search
acoustic-keylogger
Pipeline of a keylogging attack using just an audio signal and unsupervised learning.
Stars: ✭ 80 (+100%)
Mutual labels:  clustering
consul-cluster-manager
Consul-based cluster manager that can be plugged into the Vert.x ecosystem.
Stars: ✭ 17 (-57.5%)
Mutual labels:  clustering
image-ndd-lsh
Near-duplicate image detection using Locality Sensitive Hashing
Stars: ✭ 42 (+5%)
Mutual labels:  lsh
JPQ
CIKM'21: JPQ substantially improves the efficiency of Dense Retrieval with 30x compression ratio, 10x CPU speedup and 2x GPU speedup.
Stars: ✭ 39 (-2.5%)
Mutual labels:  product-quantization
Leaflet.MarkerCluster.LayerSupport
Sub-plugin for Leaflet.markercluster plugin; brings compatibility with Layers Control and other plugins
Stars: ✭ 53 (+32.5%)
Mutual labels:  clustering
point-cloud-clusters
A catkin workspace in ROS which uses DBSCAN to identify which points in a point cloud belong to the same object.
Stars: ✭ 43 (+7.5%)
Mutual labels:  clustering
autoplait
Python implementation of AutoPlait (SIGMOD'14) without smoothing algorithm. NOTE: This repository is for my personal use.
Stars: ✭ 24 (-40%)
Mutual labels:  clustering
permute-quantize-finetune
Using ideas from product quantization for state-of-the-art neural network compression.
Stars: ✭ 131 (+227.5%)
Mutual labels:  vector-quantization
opensvc
The OpenSVC node agent
Stars: ✭ 27 (-32.5%)
Mutual labels:  clustering
Deep-multimodal-subspace-clustering-networks
Tensorflow implementation of "Deep Multimodal Subspace Clustering Networks"
Stars: ✭ 62 (+55%)
Mutual labels:  clustering
dtw-python
Python port of R's Comprehensive Dynamic Time Warp algorithms package
Stars: ✭ 139 (+247.5%)
Mutual labels:  clustering
GrouProx
FedGroup, A Clustered Federated Learning framework based on Tensorflow
Stars: ✭ 20 (-50%)
Mutual labels:  clustering
Machine-learning
This repository will contain all the material required for beginners in ML and DL; follow and star this repo for regular updates
Stars: ✭ 27 (-32.5%)
Mutual labels:  clustering
react-map-gl-cluster
Urbica React Cluster Component for Mapbox GL JS
Stars: ✭ 27 (-32.5%)
Mutual labels:  clustering
inet ssh dist
SSH distribution for erlang
Stars: ✭ 46 (+15%)
Mutual labels:  clustering
topometry
A comprehensive dimensional reduction framework to recover the latent topology from high-dimensional data.
Stars: ✭ 64 (+60%)
Mutual labels:  clustering
pyclustertend
A python package to assess cluster tendency
Stars: ✭ 38 (-5%)
Mutual labels:  clustering

product-quantization

A general framework for vector quantization in Python.

NEQ, AAAI 2020, Oral

Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search.

  • Abstract

    Vector quantization (VQ) techniques are widely used in similarity search for data compression, fast metric computation, etc. Originally designed for Euclidean distance, existing VQ techniques (e.g., PQ, AQ) explicitly or implicitly minimize the quantization error. In this paper, we present a new angle to analyze the quantization error, which decomposes the quantization error into norm error and direction error. We show that quantization errors in norm have much higher influence on inner products than quantization errors in direction, and that small quantization error does not necessarily lead to good performance in maximum inner product search (MIPS). Based on this observation, we propose norm-explicit quantization (NEQ) --- a general paradigm that improves existing VQ techniques for MIPS. NEQ quantizes the norms of items in a dataset explicitly to reduce errors in norm, which is crucial for MIPS. For the direction vectors, NEQ can simply reuse an existing VQ technique to quantize them without modification. We conducted extensive experiments on a variety of datasets and parameter configurations. The experimental results show that NEQ improves the performance of various VQ techniques for MIPS, including PQ, OPQ, RQ and AQ.
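
The decomposition behind NEQ is easy to verify in a few lines of numpy. The sketch below is illustrative only (it is not the repository's code): it checks that splitting an item into norm and direction loses nothing, and shows how a relative error in the norm propagates directly into the inner product estimate.

import numpy as np

rng = np.random.default_rng(0)
q = rng.normal(size=16)          # query
x = rng.normal(size=16)          # database item

norm = np.linalg.norm(x)
direction = x / norm             # unit-norm direction vector

# The decomposition is lossless: <q, x> = ||x|| * <q, x / ||x||>.
print(np.isclose(q @ x, norm * (q @ direction)))   # True

# A NEQ-style estimate multiplies a (quantized) norm by the inner product
# between the query and a (quantized) direction; a relative error in the
# norm rescales the whole estimate, which is why NEQ represents norms
# explicitly.
approx_norm = norm * 1.05        # stand-in for a norm quantized with 5% error
estimate = approx_norm * (q @ direction)
print(abs(estimate - q @ x))     # equals 0.05 * |<q, x>| here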

Datasets

The Netflix dataset is included in this repository; you can download more datasets from here and then compute the ground truth with the following script:

python run_ground_truth.py  --dataset netflix --topk 50 --metric product
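
For reference, the ground truth for the product metric boils down to a brute-force top-k search over inner products. The sketch below is an assumption about what such a computation looks like in numpy (run_ground_truth.py additionally handles loading and saving the actual dataset files); mips_ground_truth and the random arrays are made up for illustration.

import numpy as np

def mips_ground_truth(queries, base, topk=50):
    # For each query, indices of the topk items with the largest inner
    # product, sorted in descending order of score.
    scores = queries @ base.T                                   # (n_queries, n_items)
    idx = np.argpartition(-scores, topk - 1, axis=1)[:, :topk]  # unsorted top-k
    order = np.take_along_axis(scores, idx, axis=1).argsort(axis=1)[:, ::-1]
    return np.take_along_axis(idx, order, axis=1)

rng = np.random.default_rng(0)
base = rng.normal(size=(10000, 300)).astype(np.float32)    # stand-in for a dataset
queries = rng.normal(size=(100, 300)).astype(np.float32)
print(mips_ground_truth(queries, base, topk=50).shape)     # (100, 50)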

Run examples

python run_pq.py --dataset netflix --topk 20 --metric product --num_codebook 4 --Ks 256
python run_opq.py --dataset netflix --topk 20 --metric product --num_codebook 4 --Ks 256
python run_rq.py --dataset netflix --topk 20 --metric product --num_codebook 4 --Ks 256
python run_aq.py --dataset netflix --topk 20 --metric product --num_codebook 4 --Ks 256 # very slow
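
As a rough mental model of what these scripts train, here is a minimal product quantization sketch. TinyPQ is a hypothetical illustration, not the repository's PQ/OPQ/RQ/AQ implementation; M plays the role of --num_codebook and Ks of --Ks, and scikit-learn's KMeans stands in for codebook training.

import numpy as np
from sklearn.cluster import KMeans

class TinyPQ:
    """Toy product quantizer: M sub-spaces, Ks centroids per sub-space."""

    def __init__(self, M=4, Ks=256):
        self.M, self.Ks = M, Ks
        self.codebooks = []                       # M arrays of shape (Ks, d // M)

    def fit(self, X):
        # One k-means codebook per contiguous chunk of dimensions.
        for chunk in np.split(X, self.M, axis=1):
            km = KMeans(n_clusters=self.Ks, n_init=4).fit(chunk)
            self.codebooks.append(km.cluster_centers_)
        return self

    def encode(self, X):
        # Nearest centroid id per chunk -> integer codes of shape (n, M).
        codes = []
        for chunk, cb in zip(np.split(X, self.M, axis=1), self.codebooks):
            dists = ((chunk[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2)
            codes.append(dists.argmin(axis=1))
        return np.stack(codes, axis=1)

    def inner_products(self, q, codes):
        # Asymmetric computation: per-chunk lookup tables of <q_m, centroid>,
        # then sum the table entries selected by each item's codes.
        tables = [cb @ qc for cb, qc in zip(self.codebooks, np.split(q, self.M))]
        return sum(tables[m][codes[:, m]] for m in range(self.M))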

Reproduce results of NEQ

python run_norm_pq.py --dataset netflix --topk 20 --metric product --num_codebook 4 --Ks 256
python run_norm_opq.py --dataset netflix --topk 20 --metric product --num_codebook 4 --Ks 256
python run_norm_rq.py --dataset netflix --topk 20 --metric product --num_codebook 4 --Ks 256
python run_norm_aq.py --dataset netflix --topk 20 --metric product --num_codebook 4 --Ks 256 # very slow
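
The norm-explicit variants follow the recipe described in the abstract: quantize norms explicitly, then reuse any VQ technique for the unit directions. The self-contained sketch below is a hedged illustration of that workflow (plain k-means stands in for PQ/OPQ/RQ/AQ on the directions; all names and sizes are made up), not the code in run_norm_pq.py and friends.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 32)).astype(np.float32)        # stand-in dataset
norms = np.linalg.norm(X, axis=1)
directions = X / norms[:, None]

# Explicit norm quantization: a small one-dimensional codebook.
norm_km = KMeans(n_clusters=256, n_init=4).fit(norms.reshape(-1, 1))
norm_centers, norm_codes = norm_km.cluster_centers_.ravel(), norm_km.labels_

# Any existing VQ technique can quantize the directions; plain k-means here.
dir_km = KMeans(n_clusters=256, n_init=4).fit(directions)
dir_centers, dir_codes = dir_km.cluster_centers_, dir_km.labels_

# Estimated inner products: quantized norm times <query, quantized direction>.
q = rng.normal(size=32).astype(np.float32)
estimates = norm_centers[norm_codes] * (dir_centers @ q)[dir_codes]
print(np.corrcoef(estimates, X @ q)[0, 1])                 # rough agreement check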

Reference

If you use this code, please cite the following paper:

@article{xinyandai,
  title={Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search},
  author={Dai, Xinyan and Yan, Xiao and Ng, Kelvin KW and Liu, Jie and Cheng, James},
  journal={arXiv preprint arXiv:1911.04654},
  year={2019}
}