All Projects → steakknife → Bloomfilter

steakknife / Bloomfilter

Licence: mit
Face-meltingly fast, thread-safe, marshalable, unionable, probability- and optimal-size-calculating Bloom filter in go

Programming Languages

go
31211 projects - #10 most used programming language
golang
3204 projects

Projects that are alternatives of or similar to Bloomfilter

Algorithms
My Algorithms and Data Structures studies. https://leandrotk.github.io/series/algorithms-problem-solving
Stars: ✭ 275 (-12.7%)
Mutual labels:  algorithms
Simdcompressionandintersection
A C++ library to compress and intersect sorted lists of integers using SIMD instructions
Stars: ✭ 289 (-8.25%)
Mutual labels:  algorithms
Gdrl
Grokking Deep Reinforcement Learning
Stars: ✭ 304 (-3.49%)
Mutual labels:  algorithms
Tspvis
🗺️ Visualize and control algorithms for the traveling salesman problem
Stars: ✭ 279 (-11.43%)
Mutual labels:  algorithms
Tryalgo
Algorithms and data structures for preparing programming competitions: basic and advanced
Stars: ✭ 288 (-8.57%)
Mutual labels:  algorithms
R
All Algorithms implemented in R
Stars: ✭ 294 (-6.67%)
Mutual labels:  algorithms
Javascript Patterns
A collection of javascript algorithms, patterns, and techniques
Stars: ✭ 273 (-13.33%)
Mutual labels:  algorithms
450 Dsa
450-DSA helps you track your progress in solving 400+ DSA questions and keeps you engaging based on DSA-Cracker Sheet ⚡
Stars: ✭ 301 (-4.44%)
Mutual labels:  algorithms
Awesome Golang Algorithm
📝 LeetCode of algorithms with golang solution(updating).
Stars: ✭ 3,217 (+921.27%)
Mutual labels:  algorithms
Machinelearning Samples
Samples for ML.NET, an open source and cross-platform machine learning framework for .NET.
Stars: ✭ 3,445 (+993.65%)
Mutual labels:  algorithms
Dart
Stars: ✭ 278 (-11.75%)
Mutual labels:  algorithms
Leetcode Go
✅ Solutions to LeetCode by Go, 100% test coverage, runtime beats 100% / LeetCode 题解
Stars: ✭ 22,440 (+7023.81%)
Mutual labels:  algorithms
Neo4j Nlp
NLP Capabilities in Neo4j
Stars: ✭ 299 (-5.08%)
Mutual labels:  algorithms
Grokkingalgorithms
java samples of Grokking Algorithm book by Aditya Y. Bhargava
Stars: ✭ 278 (-11.75%)
Mutual labels:  algorithms
Play With Algorithms
Codes of my MOOC Course <Play with Algorithms>, Both in C++ and Java language. Updated contents and practices are also included. 我在慕课网上的课程《算法与数据结构》示例代码,包括C++和Java版本。课程的更多更新内容及辅助练习也将逐步添加进这个代码仓。
Stars: ✭ 3,309 (+950.48%)
Mutual labels:  algorithms
Codinginterviews
This repository contains coding interviews that I have encountered in company interviews
Stars: ✭ 2,881 (+814.6%)
Mutual labels:  algorithms
Algorithms Visualiser
Algorithms Visualiser is an opensource project made using ReactJS. Visualise Algorithms on Sorting, Pathfinding, Searching, Word Search, Backtracking.
Stars: ✭ 290 (-7.94%)
Mutual labels:  algorithms
Baekjoon
코딩테스트 대비 문제집(Baekjoon Online Judge)
Stars: ✭ 295 (-6.35%)
Mutual labels:  algorithms
Swift Algorithm Club Cn
swift-algorithm-club的翻译。使用Swift学习算法和数据结构。
Stars: ✭ 304 (-3.49%)
Mutual labels:  algorithms
Daily Coding Problem
Solutions for Daily Coding Problem.
Stars: ✭ 300 (-4.76%)
Mutual labels:  algorithms

Important: Zeroth, consider if a Cuckoo filter could be right for your use-case.

GoDoc travis

Face-meltingly fast, thread-safe, marshalable, unionable, probability- and optimal-size-calculating Bloom filter in go

Copyright © 2014-2016,2018 Barry Allard

MIT license

WTF is a bloom filter

**TL;DR: **Probabilistic, extra lookup table to track a set of elements kept elsewhere to reduce expensive, unnecessary set element retrieval and/or iterator operations when an element is not present in the set. It's a classic time-storage tradeoff algoritm.

Properties

See wikipedia for algorithm details

Impact What Description
Good No false negatives know for certain if a given element is definitely NOT in the set
Bad False positives uncertain if a given element is in the set
Bad Theoretical potential for hash collisions in very large systems and/or badly hash.Hash64-conforming implementations
Bad Add only Cannot remove an element, it would destroy information about other elements
Good Constant storage uses only a fixed amount of memory

Naming conventions

(Similar to algorithm)

Variable/function Description Range
m/M() number of bits in the bloom filter (memory representation is about m/8 bytes in size) >=2
n/N() number of elements present >=0
k/K() number of keys to use (keys are kept private to user code but are de/serialized to Marshal and file I/O) >=0
maxN maximum capacity of intended structure >0
p maximum allowed probability of collision (for computing m and k for optimal sizing) >0..<1
  • Memory representation should be exactly 24 + 8*(k + (m+63)/64) + unsafe.Sizeof(RWMutex) bytes.
  • Serialized (BinaryMarshaler) representation should be exactly 72 + 8*(k + (m+63)/64) bytes. (Disk format is less due to compression.)

Binary serialization format

All values in Little-endian format

Offset Offset (Hex) Length (bytes) Name Type
0 00 8 k uint64
8 08 8 n uint64
16 10 8 m uint64
24 18 k (keys) [k]uint64
24+8*k ... (m+63)/64 (bloom filter) [(m+63)/64]uint64
24+8*k+8*((m+63)/64) ... 48 (SHA384 of all previous fields, hashed in order) [48]byte
  • bloomfilter.Filter conforms to encoding.BinaryMarshaler and `encoding.BinaryUnmarshaler'

Usage


import "github.com/steakknife/bloomfilter"

const (
  maxElements = 100000
  probCollide = 0.0000001
)

bf, err := bloomfilter.NewOptimal(maxElements, probCollide)
if err != nil {
  panic(err)
}

someValue := ... // must conform to hash.Hash64

bf.Add(someValue)
if bf.Contains(someValue) { // probably true, could be false
  // whatever
}

anotherValue := ... // must also conform to hash.Hash64

if bf.Contains(anotherValue) {
  panic("This should never happen")
}

err := bf.WriteFile("1.bf.gz")  // saves this BF to a file
if err != nil {
  panic(err)
}

bf2, err := bloomfilter.ReadFile("1.bf.gz") // read the BF to another var
if err != nil {
  panic(err)
}

Design

Where possible, branch-free operations are used to avoid deep pipeline / execution unit stalls on branch-misses.

Get

go get -u github.com/steakknife/bloomfilter  # master is always stable

Source

Contact

License

MIT license

Copyright © 2014-2016 Barry Allard

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].