All Projects → lovasoa → bloomfilter

lovasoa / bloomfilter

Licence: MIT license
Simplistic (but fast) java implementation of a bloom filter.

Programming Languages

java
68154 projects - #9 most used programming language

Projects that are alternatives of or similar to bloomfilter

Bloom Filter Scala
Bloom filter for Scala, the fastest for JVM
Stars: ✭ 333 (+851.43%)
Mutual labels:  datastructures, bloom-filter
fastbloom
A simple but fast bloomfilter written in Python
Stars: ✭ 21 (-40%)
Mutual labels:  bloom-filter, hashes
treebitmap
Fast IP lookup table for IPv4/IPv6 prefixes
Stars: ✭ 81 (+131.43%)
Mutual labels:  datastructures
xorf
Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters.
Stars: ✭ 64 (+82.86%)
Mutual labels:  bloom-filter
Algorithms
Data Structures & Algorithms. Includes solutions for Cracking the Coding Interview 6th Edition
Stars: ✭ 89 (+154.29%)
Mutual labels:  datastructures
ganon
ganon classifies short DNA sequences against large sets of genomic sequences efficiently, with download and update of references (RefSeq/Genbank), taxonomic (NCBI/GTDB) and hierarchical classification, customized reporting and more
Stars: ✭ 57 (+62.86%)
Mutual labels:  bloom-filter
bloom
An in-memory bloom filter with persistence and HTTP interface
Stars: ✭ 31 (-11.43%)
Mutual labels:  bloom-filter
pybloomfiltermmap3
Fast Python Bloom Filter using Mmap
Stars: ✭ 87 (+148.57%)
Mutual labels:  bloom-filter
borax
📓 Python3工具集合库——中国农历/中文数字/设计模式/树形结构
Stars: ✭ 57 (+62.86%)
Mutual labels:  datastructures
bloomclj
A Bloom Filter implementation in Clojure
Stars: ✭ 20 (-42.86%)
Mutual labels:  bloom-filter
Python Scripts
It contains all the Python Programs, whether it's a GUI, basic, Data Structures, etc. It's a collection of some great Python scripts from basic to advance levels for automating some monotonous tasks.
Stars: ✭ 23 (-34.29%)
Mutual labels:  datastructures
rust-bloomfilter
🦀 Bloom filter implementation in Rust 🦀
Stars: ✭ 18 (-48.57%)
Mutual labels:  bloom-filter
Programacion-Competitiva
Material de programación competitiva elaborado por el Grupo de Programación Competitiva UNI (GPC-UNI)
Stars: ✭ 23 (-34.29%)
Mutual labels:  datastructures
leaked-password
Leaked password check library with bloom filter
Stars: ✭ 41 (+17.14%)
Mutual labels:  bloom-filter
Data-Structures-and-Algorithms--A-Comprehensive-Guide
Data Structures & Algorithms - A Comprehensive Guide
Stars: ✭ 15 (-57.14%)
Mutual labels:  datastructures
AnonCracker
A single tool to bruteforce pdf , zip and hashes very super fast tool developed with python3
Stars: ✭ 36 (+2.86%)
Mutual labels:  hashes
icpc
Resources for Competitive Programming
Stars: ✭ 14 (-60%)
Mutual labels:  datastructures
cracken
a fast password wordlist generator, Smartlist creation and password hybrid-mask analysis tool written in pure safe Rust
Stars: ✭ 192 (+448.57%)
Mutual labels:  hashes
AlgorithmSet
LeetCode 算法练习集合 ~ 每天一道算法题
Stars: ✭ 19 (-45.71%)
Mutual labels:  datastructures
datastructures-and-algorithms
Repository for studying/practicing Data Structures, Algorithms and Code Interview Problems.
Stars: ✭ 30 (-14.29%)
Mutual labels:  datastructures

BloomFilter

Build Status Jitpack repository

A Bloom filter is a hashing based data structure for maintaining a set of items in limited memory, allowing false positives but no false negatives.

This repository contains a simple but performant implementation of bloom filters in java.

See the Wikipedia page for Bloom filters.

How to use

Add the project to your dependencies

You can add this project to your dependencies very easily using jitpack.io. See instructions.

Usage example

import com.github.lovasoa.bloomfilter.BloomFilter;

// Create a new bloom filter optimized for containing 100 elements and using 1024 bits of memory
BloomFilter f = new BloomFilter(100, 1024);

// Add elements to the filter
// it uses Object.hashCode() internally, so you can add objects of any type
f.add("hello");

// Check if an element is in the filter
f.contains("hello"); // true
f.contains("hello, world!"); // false

Documentation

Read the full javadoc of the BloomFilter class.

Implementation details

Hashes

It doesn't use any fancy hash function. It uses object.hashCode() instead. You can override your objects' .hashCodemethod if you want better hashes.

No allocation

It doesn't do any allocation when adding new elements or checking if an element is present. It should thus be faster than many other implementations.

Performance

A class is provided that helps making performance measurements: Test.java. It tests the implementation with a Bloom filter containing randomly generated integers.

Here are the results it gives on my laptop (Core i7-4600M CPU @ 2.90GHz) with a set of 10 million integers added to a 10 megabyte Bloom filter:

Testing a bloom filter containing n=10000000 elements in a bit array of m=80000000 bits (=9.5Mib) 

Testing correctness.
Creating a filter, a set, and filling them...
Elements incorrectly found to be inside:   215013/10000000 (2.15%)
done.

Testing insertion speed...
Inserted 10000000 elements in 3445388006 ns.
Insertion speed: 2.90243e+06 elements/second

Testing query speed...
Queried 10000000 elements in 1537504033 ns.
Query speed: 6.50405e+06 elements/second

We see that:

  • The implementation is correct: the error rate is p=exp(-ln(2)^2 * m/n)
  • It is quite fast
    • It can insert around 2 million elements per second.
    • It can query around 6 million elements per second.
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].