All Projects → gyson → blex

gyson / blex

Licence: MIT license
Fast Bloom filter with concurrent accessibility, powered by :atomics module.

Programming Languages

elixir
2628 projects

Projects that are alternatives of or similar to blex

bloom
An in-memory bloom filter with persistence and HTTP interface
Stars: ✭ 31 (-8.82%)
Mutual labels:  bloom, bloom-filter, probabilistic-data-structures
bloomfilter
Bloomfilter written in Golang, includes rotation and RPC
Stars: ✭ 61 (+79.41%)
Mutual labels:  bloom-filter, probabilistic-data-structures
exor filter
Erlang nif for xor_filter. 'Faster and Smaller Than Bloom and Cuckoo Filters'.
Stars: ✭ 29 (-14.71%)
Mutual labels:  bloom-filter, probabilistic-data-structures
rust-bloomfilter
🦀 Bloom filter implementation in Rust 🦀
Stars: ✭ 18 (-47.06%)
Mutual labels:  bloom-filter, probabilistic-data-structures
Cfilter
Cuckoo Filter implementation in Go, better than Bloom Filters (unmaintained)
Stars: ✭ 772 (+2170.59%)
Mutual labels:  filter, bloom-filter
Boomfilters
Probabilistic data structures for processing continuous, unbounded streams.
Stars: ✭ 1,333 (+3820.59%)
Mutual labels:  filter, bloom-filter
raptor
A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences.
Stars: ✭ 37 (+8.82%)
Mutual labels:  filter, bloom-filter
Elements-of-Functional-Programming-in-Python
Learn how to how to use the lambda, map, filter and reduce functions in Python to transform data structures.
Stars: ✭ 14 (-58.82%)
Mutual labels:  filter
scotty
java proxy
Stars: ✭ 44 (+29.41%)
Mutual labels:  filter
modjpeg-nginx
NGINX filter module for adding overlays and logos to JPEGs on-the-fly without degrading the quality of the image.
Stars: ✭ 18 (-47.06%)
Mutual labels:  filter
rtss
Relative TimeStamps for Stuff
Stars: ✭ 42 (+23.53%)
Mutual labels:  filter
porn-domains
A collection of domains used for explicit adult content like porn websites.
Stars: ✭ 97 (+185.29%)
Mutual labels:  filter
react-search-autocomplete
A search box that filters the provided array of objects
Stars: ✭ 152 (+347.06%)
Mutual labels:  filter
MacOSX-VFS-redirector
Mac OS X file system filter to redirect file operations
Stars: ✭ 38 (+11.76%)
Mutual labels:  filter
bloom
Probabilistic set data structure
Stars: ✭ 77 (+126.47%)
Mutual labels:  bloom-filter
amino-bot
Bot for aminoapps.com
Stars: ✭ 21 (-38.24%)
Mutual labels:  filter
animethemes-dl
THIS PROJECT HAS BEEN ABANDONED. Downloads anime themes from animethemes.moe. Supports Batch download and MAL/AniList connecting.
Stars: ✭ 21 (-38.24%)
Mutual labels:  filter
ntEdit
✏️ultra fast and scalable genome assembly polishing
Stars: ✭ 56 (+64.71%)
Mutual labels:  bloom-filter
esa-httpclient
An asynchronous event-driven HTTP client based on netty.
Stars: ✭ 82 (+141.18%)
Mutual labels:  filter
django-admin-search
Modal filter for django admin
Stars: ✭ 60 (+76.47%)
Mutual labels:  filter

Blex

Blex is a fast Bloom filter with concurrent accessibility, powered by :atomics module.

Features

  • Fixed size Bloom filter
  • Concurrent reads & writes
  • Serialization
  • Merge multiple Bloom filters into one
  • Only one copy of data because data is saved in either :atomics or binary (if > 64 bytes)
  • Custom hash functions

Example

iex> b = Blex.new(1000, 0.01)
iex> Task.async(fn -> Blex.put(b, "hello") end) |> Task.await()
iex> Task.async(fn -> Blex.put(b, "world") end) |> Task.await()
iex> Blex.member?(b, "hello")
true
iex> Blex.member?(b, "world")
true
iex> Blex.member?(b, "others")
false

Installation

Note: it requires OTP-21.2.1 or later. OTP-21.2 is not good due to a issue.

It can be installed by adding blex to your list of dependencies in mix.exs:

def deps do
  [
    {:blex, "~> 0.2"}
  ]
end

Documentation

Documentation can be found at hexdocs.pm/blex/Blex.html.

Benchmarking

Compare to alternative Bloom filter powered by :array module,

Blex is faster with read operation:

Operating System: macOS"
CPU Information: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.7.4
Erlang 21.2.2

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 μs
parallel: 1
inputs: none specified
Estimated total run time: 21 s


Benchmarking Blex.members?...
Benchmarking Blex.members? with binary format...
Benchmarking Bloomex.members?...

Name                                       ips        average  deviation         median         99th %
Blex.members? with binary format          0.69         1.44 s     ±0.23%         1.44 s         1.44 s
Blex.members?                             0.63         1.58 s     ±0.61%         1.58 s         1.58 s
Bloomex.members?                          0.40         2.51 s     ±0.00%         2.51 s         2.51 s

Comparison:
Blex.members? with binary format          0.69
Blex.members?                             0.63 - 1.09x slower
Bloomex.members?                          0.40 - 1.74x slower

Blex is much faster with write operation:

Operating System: macOS"
CPU Information: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.7.4
Erlang 21.2.2

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 10 s
memory time: 0 μs
parallel: 1
inputs: none specified
Estimated total run time: 24 s


Benchmarking Blex.put...
Benchmarking Bloomex.add...

Name                  ips        average  deviation         median         99th %
Blex.put             0.44         2.25 s     ±3.98%         2.30 s         2.33 s
Bloomex.add         0.126         7.91 s     ±0.22%         7.91 s         7.92 s

Comparison:
Blex.put             0.44
Bloomex.add         0.126 - 3.51x slower

Above benchmarking script is available at bench/comparison.exs.

Implementation

Instead of traditional Bloom filter, partitioned Bloom filter (a variant Bloom filter described in section 3 of the paper) is used for performance benefits. The partitioned Bloom filter would partition bits array into k parts where k is number of hash functions. Each hash functions would only read & write bits from its own partitioned space. This would bring following benefits:

  • Reduce hash function (:erlang.phash2) calls for some cases.
  • Speed up Blex.estimate_size by scanning only part of bits.

License

MIT

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].