osoco / PharoPDS

Licence: MIT license
Probabilistic data structures in Pharo Smalltalk.

PharoPDS

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

The purpose of PharoPDS is to provide some probabilistic data structures and algorithms implemented in Pharo.

"Probabilistic data structures" is a common name for data structures based mostly on different hashing techniques. Unlike regular, deterministic data structures, they provide approximate answers, but with reliable ways to estimate the possible error.

The potential losses and errors are offset by extremely low memory requirements, constant query time, and good scalability. These properties make the structures relevant in "Big Data" applications.

We've written some posts about the library and about the historical and intellectual background of some of the ideas behind the approach we have followed.

Install PharoPDS

To install PharoPDS in your Pharo image, find it in the Pharo Project Catalog (World menu > Tools > Catalog Browser) and click the green check mark icon in the upper right corner to install the latest stable version:

Pharo Project Catalog with the project selected

Or, you can also execute the following script:

    Metacello new
        baseline: #ProbabilisticDataStructures;
        repository: 'github://osoco/PharoPDS:master/src';
        load

Optionally, you can install all the custom extensions and interactive tutorials included with the project by loading the group 'All':

    Metacello new
        baseline: #ProbabilisticDataStructures;
        repository: 'github://osoco/PharoPDS:master/src';
        load: 'All'

To add PharoPDS to your own project's baseline just add this:

    spec
        baseline: #ProbabilisticDataStructures
        with: [ spec repository: 'github://osoco/PharoPDS:master/src' ]

Note that you can replace master in the repository URL with another branch or a tag.
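
In context, the baseline snippet above would typically sit inside the baseline method of your project's BaselineOf class. The following is a minimal sketch, assuming a hypothetical project called MyProject with a single package of the same name; only the PharoPDS baseline name and repository URL come from this README.

```smalltalk
baseline: spec
    "Method of the hypothetical class BaselineOfMyProject (a subclass of BaselineOf)."
    <baseline>
    spec for: #common do: [
        "Declare PharoPDS as an external dependency.
         'master' may be replaced by another branch or a tag."
        spec
            baseline: #ProbabilisticDataStructures
            with: [ spec repository: 'github://osoco/PharoPDS:master/src' ].
        "MyProject is a placeholder package name; it is loaded after PharoPDS."
        spec
            package: 'MyProject'
            with: [ spec requires: #('ProbabilisticDataStructures') ] ]
```

Declaring the dependency with requires: asks Metacello to load PharoPDS before the packages that use it.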

Data Structures

Currently, PharoPDS provides probabilistic data structures for the following categories of problems:

Membership

The membership problem for a dataset is the task of deciding whether a given element belongs to the dataset or not.

The data structures provided to solve the membership problem are the following:

  • Bloom Filter.
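
As a rough illustration of the memory/accuracy trade-off behind a Bloom filter, the standard textbook sizing formulas can be evaluated in a Playground. This sketch uses only plain Pharo arithmetic and does not call any PharoPDS class; the variable names and the example values (1000 elements, 1% false positive probability) are assumptions chosen for illustration.

```smalltalk
"Textbook Bloom filter sizing: m = ceiling(-n ln p / (ln 2)^2), k = round((m / n) ln 2).
 Plain arithmetic only; no PharoPDS API is used here."
| n p m k |
n := 1000.    "expected number of elements (example value)"
p := 0.01.    "target false positive probability (example value)"

"Number of bits needed in the filter."
m := (n * p ln / (2 ln squared)) negated ceiling.

"Number of hash functions that minimizes the false positive rate for that m and n."
k := (m / n * 2 ln) rounded.

"Inspect or print both values."
{ m. k }
```

With these example values the formulas give roughly 9.6 bits per element, regardless of how large each element is. Lowering p increases both m and k; a standard Bloom filter may report false positives but never false negatives.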

Cardinality

This is still a work in progress.

  • HyperLogLog

Moldable development

This library has been developed following the ideas behind the moldable development approach, so each data structure provides its own custom, domain-specific extensions to make it easier for developers to understand and learn.

For instance, the following screenshots show some of the extensions provided for the Bloom filter:

Inspector on Bloom Filter - Parameters tab

Inspector on Bloom Filter - FPP tab

Inspector on Bloom Filter - Bits tab

Inspector on Bloom Filter - Analysis

Algorithms Browser

To make the inner workings and trade-offs easier to understand, we provide a specific Playground tool for each data structure that allows developers to explore it and gain deeper insight.

You can browse the available algorithm playgrounds through the PharoPDS Algorithms Browser. You can open it with the following expression:

    PDSAlgorithmsBrowser open

PDS Algorithms Browser

License

PharoPDS is written and supported by developers at OSOCO and published as free and open source software under an MIT license.

Project dependencies

Hashing plays a central role in probabilistic data structures. Indeed, choosing appropriate hash functions is crucial to avoid bias and achieve good performance. In particular, the structures require non-cryptographic hash functions, which are provided by the dependency module NonCryptographicHashes.

Other dependencies like Roassal or GToolkit are optional for production use. Nevertheless, we recommend that you install them in the development image if you want to get some useful tools like Inspector custom extensions, the algorithm browser or interactive tutorials.
