All Projects → mathetake → Gann

mathetake / Gann

Licence: mit
gann(go-approximate-nearest-neighbor) is a library for Approximate Nearest Neighbor Search written in Go

Programming Languages

go
31211 projects - #10 most used programming language
golang
3204 projects

Projects that are alternatives of or similar to Gann

docarray
The data structure for unstructured data
Stars: ✭ 561 (+648%)
Mutual labels:  nearest-neighbor-search
Soundfingerprinting
Open source audio fingerprinting in .NET. An efficient algorithm for acoustic fingerprinting written purely in C#.
Stars: ✭ 554 (+638.67%)
Mutual labels:  nearest-neighbor-search
Deep Mihash
Code for papers "Hashing with Mutual Information" (TPAMI 2019) and "Hashing with Binary Matrix Pursuit" (ECCV 2018)
Stars: ✭ 13 (-82.67%)
Mutual labels:  nearest-neighbor-search
pgvector
Open-source vector similarity search for Postgres
Stars: ✭ 482 (+542.67%)
Mutual labels:  nearest-neighbor-search
N2
TOROS N2 - lightweight approximate Nearest Neighbor library which runs fast even with large datasets
Stars: ✭ 457 (+509.33%)
Mutual labels:  nearest-neighbor-search
Milvus
An open-source vector database for embedding similarity search and AI applications.
Stars: ✭ 9,015 (+11920%)
Mutual labels:  nearest-neighbor-search
kdtree
A k-d tree implementation in Go.
Stars: ✭ 98 (+30.67%)
Mutual labels:  nearest-neighbor-search
Ggnn
GGNN: State of the Art Graph-based GPU Nearest Neighbor Search
Stars: ✭ 63 (-16%)
Mutual labels:  nearest-neighbor-search
Lopq
Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (+606.67%)
Mutual labels:  nearest-neighbor-search
Falconn
FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)
Stars: ✭ 919 (+1125.33%)
Mutual labels:  nearest-neighbor-search
lbvh
an implementation of parallel linear BVH (LBVH) on GPU
Stars: ✭ 67 (-10.67%)
Mutual labels:  nearest-neighbor-search
Pynndescent
A Python nearest neighbor descent for approximate nearest neighbors
Stars: ✭ 377 (+402.67%)
Mutual labels:  nearest-neighbor-search
Ngt
Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data
Stars: ✭ 636 (+748%)
Mutual labels:  nearest-neighbor-search
adventures-with-ann
All the code for a series of Medium articles on Approximate Nearest Neighbors
Stars: ✭ 40 (-46.67%)
Mutual labels:  nearest-neighbor-search
Fast Near Duplicate Image Search
Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree.
Stars: ✭ 54 (-28%)
Mutual labels:  nearest-neighbor-search
graphgrove
A framework for building (and incrementally growing) graph-based data structures used in hierarchical or DAG-structured clustering and nearest neighbor search
Stars: ✭ 29 (-61.33%)
Mutual labels:  nearest-neighbor-search
Smile
Statistical Machine Intelligence & Learning Engine
Stars: ✭ 5,412 (+7116%)
Mutual labels:  nearest-neighbor-search
Annoy
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
Stars: ✭ 9,262 (+12249.33%)
Mutual labels:  nearest-neighbor-search
Awesome Cbir Papers
📝Awesome and classical image retrieval papers
Stars: ✭ 1,114 (+1385.33%)
Mutual labels:  nearest-neighbor-search
Pointcloudutilities
Utilities for point cloud processing. read ply, write ply, search nearest neighbors using octree ...
Stars: ✭ 17 (-77.33%)
Mutual labels:  nearest-neighbor-search

gann

CircleCI MIT License

portfolio_view

gann (go-approximate-nearest-neighbor) is a library for approximate nearest neighbor search purely written in golang.

The implemented algorithm is truly inspired by Annoy (https://github.com/spotify/annoy).

feature

  1. purely written in Go: no dependencies out of Go world.
  2. easy to tune with a bit of parameters

installation

go get github.com/mathetake/gann

parameters

setup phase parameters

name type description run-time complexity space complexity accuracy
dim int dimension of target vectors the larger, the more expensive the larger, the more expensive N/A
nTree int # of trees the larger, the more expensive the larger, the more expensive the larger, the more accurate
k int maximum # of items in a single leaf the larger, the less expensive N/A the larger, the less accurate

runtime (search phase) parameters

name type description time complexity accuracy
searchNum int # of requested neighbors the larger, the more expensive N/A
bucketScale float64 affects the size of bucket the larger, the more expensive the larger, the more accurate

bucketScale affects the size of bucket which consists of items for exact distance calculation. The actual size of the bucket is calculated by int(searchNum * bucketScale).

In the search phase, we traverse index trees and continuously put items on reached leaves to the bucket until the bucket becomes full. Then we calculate the exact distances between a item in the bucket and the query vector to get approximate nearest neighbors.

Therefore, the larger bucketScale, the more computational complexity while the more accurate result to be produced.

example

package main

import (
	"fmt"
	"math/rand"
	"time"

	"github.com/mathetake/gann"
	"github.com/mathetake/gann/metric"
)

var (
	dim    = 3
	nTrees = 2
	k      = 10
	nItem  = 1000
)

func main() {
	rawItems := make([][]float64, 0, nItem)
	rand.Seed(time.Now().UnixNano())

	for i := 0; i < nItem; i++ {
		item := make([]float64, 0, dim)
		for j := 0; j < dim; j++ {
			item = append(item, rand.Float64())
		}
		rawItems = append(rawItems, item)
	}

	m, err := metric.NewCosineMetric(dim)
	if err != nil {
		// err handling
		return
	}

	// create index
	idx, err := gann.CreateNewIndex(rawItems, dim, nTrees, k, m)
	if err != nil {
		// error handling
		return
	}

	// search
	var searchNum = 5
	var bucketScale float64 = 10
	q := []float64{0.1, 0.02, 0.001}
	res, err := idx.GetANNbyVector(q, searchNum, bucketScale)
	if err != nil {
		// error handling
		return
	}

	fmt.Printf("res: %v\n", res)
}

references

License

MIT

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].