Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → mavam → Libbf

mavam / Libbf

Licence: bsd-3-clause

🎯 Bloom filters for C++11

Labels

bloom-filter

Projects that are alternatives of or similar to Libbf

ganon

ganon classifies short DNA sequences against large sets of genomic sequences efficiently, with download and update of references (RefSeq/Genbank), taxonomic (NCBI/GTDB) and hierarchical classification, customized reporting and more

Stars: ✭ 57 (-80.87%)

Mutual labels: bloom-filter

json-bloomfilter

🗜 A bloom filter implementation in Ruby and Javascript that is serialisable to JSON and compatible between both languages.

Stars: ✭ 15 (-94.97%)

Mutual labels: bloom-filter

fastbloom

A simple but fast bloomfilter written in Python

Stars: ✭ 21 (-92.95%)

Mutual labels: bloom-filter

bloomclj

A Bloom Filter implementation in Clojure

Stars: ✭ 20 (-93.29%)

Mutual labels: bloom-filter

bloomfilter

Simplistic (but fast) java implementation of a bloom filter.

Stars: ✭ 35 (-88.26%)

Mutual labels: bloom-filter

guava-probably

Probabilistic data structures for Guava.

Stars: ✭ 51 (-82.89%)

Mutual labels: bloom-filter

PharoPDS

Probabilistic data structures in Pharo Smalltalk.

Stars: ✭ 28 (-90.6%)

Mutual labels: bloom-filter

Redis Lua Scaling Bloom Filter

LUA Redis scripts for a scaling bloom filter

Stars: ✭ 268 (-10.07%)

Mutual labels: bloom-filter

komihash

Very fast, high-quality hash function (non-cryptographic, C) + PRNG

Stars: ✭ 68 (-77.18%)

Mutual labels: bloom-filter

bloomfilter

Bloomfilter written in Golang, includes rotation and RPC

Stars: ✭ 61 (-79.53%)

Mutual labels: bloom-filter

bloom

An in-memory bloom filter with persistence and HTTP interface

Stars: ✭ 31 (-89.6%)

Mutual labels: bloom-filter

xorf

Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters.

Stars: ✭ 64 (-78.52%)

Mutual labels: bloom-filter

ntHash

Fast hash function for DNA sequences

Stars: ✭ 66 (-77.85%)

Mutual labels: bloom-filter

rust-bloomfilter

🦀 Bloom filter implementation in Rust 🦀

Stars: ✭ 18 (-93.96%)

Mutual labels: bloom-filter

algorithm coding

推荐算法、相似度算法、布隆过滤器、均值算法、一致性Hash、数据结构、leetcode练习

Stars: ✭ 30 (-89.93%)

Mutual labels: bloom-filter

pybloomfiltermmap3

Fast Python Bloom Filter using Mmap

Stars: ✭ 87 (-70.81%)

Mutual labels: bloom-filter

golomb-set

A Golomb Coded Set implementation in Rust

Stars: ✭ 33 (-88.93%)

Mutual labels: bloom-filter

Hash Table

Fast, reliable cuckoo hash table for Node.js.

Stars: ✭ 272 (-8.72%)

Mutual labels: bloom-filter

algorithm-structure

2021年最新总结 500个常用数据结构，算法，算法导论，面试常用，大厂高级工程师整理总结

Stars: ✭ 590 (+97.99%)

Mutual labels: bloom-filter

hashlookup-forensic-analyser

Analyse a forensic target (such as a directory) to find and report files found and not found from CIRCL hashlookup public service - https://circl.lu/services/hashlookup/

Stars: ✭ 43 (-85.57%)

Mutual labels: bloom-filter

View All Similar Projects ➔

libbf is a C++11 library which implements various Bloom filters, including:

Basic
Counting
Spectral MI
Spectral RM
Bitwise
A^2
Stable

Synopsis

#include <iostream>
#include <bf.h>

int main()
{
  bf::basic_bloom_filter b(0.8, 100);

  // Add two elements.
  b.add("foo");
  b.add(42);

  // Test set membership
  std::cout << b.lookup("foo") << std::endl;  // 1
  std::cout << b.lookup("bar") << std::endl;  // 0
  std::cout << b.lookup(42) << std::endl;     // 1

  // Remove all elements.
  b.clear();
  std::cout << b.lookup("foo") << std::endl;  // 0
  std::cout << b.lookup(42) << std::endl;     // 0

  return 0;
}

Requirements

A C++11 compiler (GCC >= 4.7 or Clang >= 3.2)
CMake (>= 2.8)

Installation

The build process uses CMake, wrapped in autotools-like scripts. The configure script honors the CXX environment variable to select a specific C++compiler. For example, the following steps compile libbf with Clang and install it under PREFIX:

CXX=clang++ ./configure --prefix=PREFIX
make
make test
make install

Documentation

The most recent version of the Doxygen API documentation exists at http://mavam.github.io/libbf/api. Alternatively, you can build the documentation locally via make doc and then browse to doc/gh-pages/api/index.html.

Usage

After having installed libbf, you can use it in your application by including the header file bf.h and linking against the library. All data structures reside in the namespace bf and the following examples assume:

using namespace bf;

Each Bloom filter inherits from the abstract base class bloom_filter, which provides addition and lookup via the virtual functions add and lookup. These functions take an object as argument, which serves a light-weight view over sequential data for hashing.

For example, if you can create a basic Bloom filter with a desired false-positive probability and capacity as follows:

// Construction.
bloom_filter* bf = new basic_bloom_filter(0.8, 100);

// Addition.
bf->add("foo");
bf->add(42);

// Lookup.
assert(bf->lookup("foo") == 1);
assert(bf->lookup(42) == 1);

// Remove all elements from the Bloom filter.
bf->clear();

In this case, libbf computes the optimal number of hash functions needed to achieve the desired false-positive rate which holds until the capacity has been reached (80% and 100 distinct elements, in the above example). Alternatively, you can construct a basic Bloom filter by specifying the number of hash functions and the number of cells in the underlying bit vector:

bloom_filter* bf = new basic_bloom_filter(make_hasher(3), 1024);

Since not all Bloom filter implementations come with closed-form solutions based on false-positive probabilities, most constructors use this latter form of explicit resource provisioning.

In the above example, the free function make_hasher constructs a hasher-an abstraction for hashing objects k times. There exist currently two different hasher, a default_hasher and a double_hasher. The latter uses a linear combination of two pairwise-independent, universal hash functions to produce the k digests, whereas the former merely hashes the object k times.

Evaluation

libbf also ships with a small Bloom filter tool bf in the test directory. This program supports evaluation the accuracy of the different Bloom filter flavors with respect to their false-positive and false-negative rates. Have a look at the console help (-h or --help) for detailed usage instructions.

The tool operates in two phases:

Read input from a file and insert it into a Bloom filter
Query the Bloom filter and compare the result to the ground truth

For example, consider the following input file:

foo
bar
baz
baz
foo

From this input file, you can generate the real ground truth file as follows:

sort input.txt | uniq -c | tee query.txt
   1 bar
   2 baz
   2 foo

The tool bf will compute false-positive and false-negative counts for each element, based on the ground truth given. In the case of a simple counting Bloom filter, an invocation may look like this:

bf -t counting -m 2 -k 3 -i input.txt -q query.txt | column -t

Yielding the following output:

TN  TP  FP  FN  G  C  E
0   1   0   0   1  1  bar
0   1   0   1   2  1  baz
0   1   0   2   2  1  foo

The column headings denote true negatives (TN), true positives (TP), false positives (FP), false negatives (FN), ground truth count (G), actual count (C), and the queried element. The counts are cumulative to support incremental evaluation.

Versioning

We follow Semantic Versioning. The version X.Y.Z indicates:

X is the major version (backward-incompatible),
Y is the minor version (backward-compatible), and
Z is the patch version (backward-compatible bug fix).

License

libbf comes with a BSD-style license (see COPYING for details).

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 298

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (15) 🔗