
gsamaras / Dolphinn

Licence: other
High Dimensional Approximate Near(est) Neighbor

Programming Languages

C++
Makefile

Projects that are alternatives of or similar to Dolphinn

lshensemble
LSH index for approximate set containment search
Stars: ✭ 48 (+50%)
Mutual labels:  lsh, nearest-neighbor-search
image-ndd-lsh
Near-duplicate image detection using Locality Sensitive Hashing
Stars: ✭ 42 (+31.25%)
Mutual labels:  lsh
Datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble
Stars: ✭ 1,635 (+5009.38%)
Mutual labels:  lsh
pynanoflann
Unofficial python wrapper to the nanoflann k-d tree
Stars: ✭ 24 (-25%)
Mutual labels:  nearest-neighbor-search
scikit-hubness
A Python package for hubness analysis and high-dimensional data mining
Stars: ✭ 41 (+28.13%)
Mutual labels:  nearest-neighbor-search
awesome-vector-search
Collections of vector search related libraries, service and research papers
Stars: ✭ 460 (+1337.5%)
Mutual labels:  nearest-neighbor-search
Mrpt
Fast and lightweight header-only C++ library (with Python bindings) for approximate nearest neighbor search
Stars: ✭ 210 (+556.25%)
Mutual labels:  nearest-neighbor-search
annoy.rb
annoy-rb provides Ruby bindings for Annoy (Approximate Nearest Neighbors Oh Yeah).
Stars: ✭ 23 (-28.12%)
Mutual labels:  nearest-neighbor-search
MoTIS
Mobile(iOS) Text-to-Image search powered by multimodal semantic representation models(e.g., OpenAI's CLIP). Accepted at NAACL 2022.
Stars: ✭ 60 (+87.5%)
Mutual labels:  lsh
Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Stars: ✭ 18 (-43.75%)
Mutual labels:  lsh
wordvector be
Web service: uses Tencent's 8-million-word vector model and the Spotify annoy engine to find similar keywords
Stars: ✭ 92 (+187.5%)
Mutual labels:  nearest-neighbor-search
lsh
Locality Sensitive Hashing for Go (Multi-probe LSH, LSH Forest, basic LSH)
Stars: ✭ 92 (+187.5%)
Mutual labels:  lsh
pqtable
Fast search algorithm for product-quantized codes via hash-tables
Stars: ✭ 48 (+50%)
Mutual labels:  nearest-neighbor-search
quick-adc
Quick ADC
Stars: ✭ 20 (-37.5%)
Mutual labels:  nearest-neighbor-search
Rayuela.jl
Code for my PhD thesis. Library of quantization-based methods for fast similarity search in high dimensions. Presented at ECCV 18.
Stars: ✭ 54 (+68.75%)
Mutual labels:  nearest-neighbor-search
minhash-lsh
Minhash LSH in Golang
Stars: ✭ 20 (-37.5%)
Mutual labels:  lsh
lsh-rs
Locality Sensitive Hashing in Rust with Python bindings
Stars: ✭ 64 (+100%)
Mutual labels:  lsh
H2 ALSH
Accurate and Fast ALSH for Maximum Inner Product Search (KDD 2018)
Stars: ✭ 18 (-43.75%)
Mutual labels:  lsh
product-quantization
🙃Implementation of vector quantization algorithms, codes for Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search.
Stars: ✭ 40 (+25%)
Mutual labels:  lsh
kdtree-rs
K-dimensional tree in Rust for fast geospatial indexing and lookup
Stars: ✭ 137 (+328.13%)
Mutual labels:  nearest-neighbor-search

DOLPHINN

DOLPHINN is a C++ header-only library for: Dimension reductiOn and LookuPs on a Hypercube for effIcient Near Neighbor.

How to use DOLPHINN?

Just include DOLPHINN's header file. src/main.cpp contains a representative example.

Note: If you are interested in Nearest Neighbor, use DolphinnPy.

DOLPHINN is generic yet fast!

On data sets with more than 1 million points in around 128 dimensions, DOLPHINN typically requires only a few milliseconds per query.


DOLPHINN provides a simple yet efficient method for computing an (approximate) nearest neighbor in high dimensions. The algorithm is based on our paper, Practical linear-space Approximate Near Neighbors in high dimension [Avarikioti, Prof. Emiris, Psarros (original idea) and Samaras], where we show linear space and sublinear query time for a specific setting of parameters. Part of the Data Science Master Thesis of George Samaras, National and Kapodistrian University of Athens, 2016.

First, N points are randomly mapped to keys in {0,1}^K, for K <= log N, using the Hyperplane LSH family. Then, for a given query, the candidate nearest neighbors are the points whose keys lie within a small Hamming radius of the query's key. Our approach resembles multi-probe LSH, but it differs in how the list of candidates is computed.
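
The two steps above can be illustrated with a minimal C++ sketch. It is not DOLPHINN's actual interface: the class name HypercubeIndex and its methods are hypothetical, the hyperplanes are assumed to pass through the origin with Gaussian normal vectors, and candidates are gathered by scanning every occupied bucket, which is a simplification and may differ from how DOLPHINN computes its candidate list.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_map>
#include <vector>

using Point = std::vector<double>;

// Hypothetical illustration of hyperplane LSH onto the hypercube {0,1}^K.
class HypercubeIndex {
public:
    HypercubeIndex(const std::vector<Point>& points, std::size_t K, unsigned seed = 42)
        : points_(points), K_(K) {
        assert(!points.empty());
        assert(K >= 1 && K < 64);  // keys are packed into one 64-bit word
        std::mt19937 gen(seed);
        std::normal_distribution<double> gauss(0.0, 1.0);
        // Assumption: K hyperplanes through the origin with Gaussian normals.
        planes_.assign(K, Point(points[0].size()));
        for (Point& normal : planes_)
            for (double& c : normal) c = gauss(gen);
        // Group point indices by their K-bit key (a hypercube vertex).
        for (std::size_t i = 0; i < points_.size(); ++i)
            buckets_[key(points_[i])].push_back(i);
    }

    // Bit j is 1 iff p lies on the non-negative side of hyperplane j.
    std::uint64_t key(const Point& p) const {
        std::uint64_t k = 0;
        for (std::size_t j = 0; j < K_; ++j) {
            assert(p.size() == planes_[j].size());
            double dot = 0.0;
            for (std::size_t d = 0; d < p.size(); ++d) dot += planes_[j][d] * p[d];
            if (dot >= 0.0) k |= (std::uint64_t{1} << j);
        }
        return k;
    }

    // Number of differing bits between two keys.
    static std::size_t hamming(std::uint64_t a, std::uint64_t b) {
        return std::bitset<64>(a ^ b).count();
    }

    // Indices of points whose key is within `radius` bits of the query's key.
    std::vector<std::size_t> candidates(const Point& q, std::size_t radius) const {
        std::vector<std::size_t> out;
        const std::uint64_t qk = key(q);
        for (const auto& bucket : buckets_)
            if (hamming(bucket.first, qk) <= radius)
                out.insert(out.end(), bucket.second.begin(), bucket.second.end());
        return out;
    }

    // Closest candidate by Euclidean distance, or -1 if there are no candidates.
    std::ptrdiff_t nearest(const Point& q, std::size_t radius) const {
        std::ptrdiff_t best = -1;
        double best_d2 = 0.0;
        for (std::size_t i : candidates(q, radius)) {
            double d2 = 0.0;
            for (std::size_t d = 0; d < q.size(); ++d) {
                const double diff = points_[i][d] - q[d];
                d2 += diff * diff;
            }
            if (best < 0 || d2 < best_d2) {
                best = static_cast<std::ptrdiff_t>(i);
                best_d2 = d2;
            }
        }
        return best;
    }

private:
    std::vector<Point> points_;
    std::size_t K_;
    std::vector<Point> planes_;
    std::unordered_map<std::uint64_t, std::vector<std::size_t>> buckets_;
};
```

In this sketch, a radius of 0 inspects only the query's own bucket, while a radius of K returns every stored point; intermediate radii trade query time for accuracy.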
