searchivarius / PyFastPFor

Licence: other
Python bindings for the fast integer compression library FastPFor.

Programming Languages: C++, C

PyFastPFor

Python bindings for FastPFor, a fast, lightweight research library of integer compression schemes. FastPFor is broadly applicable to the compression of arrays of 32-bit integers where most integers are small. The library seeks to exploit SIMD instructions (SSE) whenever possible. It can decode at least 4 billion compressed integers per second on most desktop or laptop processors. That is, it can decompress data at a rate of 15 GB/s. This is significantly faster than generic codecs such as gzip, LZO, Snappy, or LZ4.
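The throughput figure follows from simple arithmetic. The sketch below assumes that each decoded integer occupies 4 bytes (32 bits) and that "15 GB/s" is a rounded figure in binary gigabytes; the second point is an interpretation, not something the original text states.

```python
# Back-of-the-envelope check of the decoding rate quoted above.
INTS_PER_SECOND = 4_000_000_000   # "at least 4 billion" integers per second
BYTES_PER_INT = 4                 # 32-bit integers

bytes_per_second = INTS_PER_SECOND * BYTES_PER_INT

decimal_gb = bytes_per_second / 10**9             # decimal gigabytes (10^9 bytes)
binary_gib = round(bytes_per_second / 2**30, 1)   # binary gigabytes (2^30 bytes)

print(decimal_gb)   # 16.0
print(binary_gib)   # 14.9
```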

Authors

Daniel Lemire, Leonid Boytsov, Owen Kaser, Maxime Caron, Louis Dionne, Michel Lemay, Erik Kruus, Andrea Bedini, Matthias Petri, Robson Braga Araujo, Patrick Damme. The Python bindings were created by Leonid Boytsov.

Installation

Bindings can be installed locally:

cd python_bindings
pip install -r requirements.txt
sudo python setup.py build install

or via pip:

pip install pyfastpfor

Due to some compilation quirks, this currently seems to work with GCC only. I plan to fix this in the not-so-distant future. You may also need to install the Python development files. On Ubuntu, for Python 3, you can do it as follows:

sudo apt-get install python3-dev

Documentation

The library supports all the codecs that were implemented in the original FastPFor library as of February 2018. To get a list of codecs, use the function getCodecList.
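A minimal sketch of listing the codecs follows. It assumes that the module is importable as pyfastpfor (the name used for the pip package) and that getCodecList takes no arguments and returns an iterable of codec names. The import is guarded so the snippet still runs where the bindings are not installed.

```python
# Assumption: the bindings are importable as "pyfastpfor" and
# getCodecList() needs no arguments.
try:
    from pyfastpfor import getCodecList
except ImportError:
    getCodecList = None  # bindings not installed

# Fall back to an empty list when the bindings are unavailable.
codec_names = list(getCodecList()) if getCodecList is not None else []

for name in codec_names:
    print(name)
```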

Typical light-weight compression does not take context into account and, consequently, works well only for small integers. When integers are large, data differencing is a common trick to make integers small. In particular, we often deal with sorted lists of integers, which can be represented by differences between neighboring numbers.

The smallest differences (fine deltas) are those between adjacent numbers. The corresponding differencing and difference-inverting functions are delta1 and prefixSum1.
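To illustrate why fine deltas help, here is a pure-Python sketch of the idea. It does not call the library; the helper names fine_deltas, invert_fine_deltas, and bits_needed are hypothetical and only mimic what delta1 and prefixSum1 are described as doing.

```python
from itertools import accumulate

def fine_deltas(values):
    """Keep the first value; replace every other value with its gap to the previous one."""
    return values[:1] + [cur - prev for prev, cur in zip(values, values[1:])]

def invert_fine_deltas(deltas):
    """A running (prefix) sum restores the original values."""
    return list(accumulate(deltas))

def bits_needed(values):
    """Width in bits of the largest value."""
    return max(v.bit_length() for v in values)

sorted_ids = [1000000, 1000003, 1000004, 1000010, 1000025]
deltas = fine_deltas(sorted_ids)

print(deltas)                    # [1000000, 3, 1, 6, 15]
print(bits_needed(sorted_ids))   # 20
print(bits_needed(deltas[1:]))   # 4

# The transformation is lossless.
assert invert_fine_deltas(deltas) == sorted_ids
```

Apart from the first element, every value now fits in 4 bits instead of 20, which is the kind of input that light-weight codecs handle well.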

However, we can do reasonably well if we instead compute differences between numbers that are four positions apart (coarse deltas). Such differences can be computed and inverted more efficiently. The corresponding differencing and difference-inverting functions are delta4 and prefixSum4.
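Coarse deltas can be sketched in pure Python as well. The helpers coarse_deltas and invert_coarse_deltas are hypothetical stand-ins, not the library's implementation, and they assume that the first four elements are stored unchanged. A stride of four matches the four 32-bit lanes of a 128-bit SSE register, which is presumably why this variant is cheaper to compute.

```python
def coarse_deltas(values, stride=4):
    """Subtract from each element the one `stride` positions earlier.

    The first `stride` elements have no such predecessor and are kept as is.
    """
    return [v - values[i - stride] if i >= stride else v
            for i, v in enumerate(values)]

def invert_coarse_deltas(deltas, stride=4):
    """Undo coarse_deltas by adding back the already restored element."""
    restored = []
    for i, d in enumerate(deltas):
        restored.append(d + restored[i - stride] if i >= stride else d)
    return restored

sorted_ids = [10, 12, 15, 19, 24, 30, 37, 45]
coarse = coarse_deltas(sorted_ids)

print(coarse)  # [10, 12, 15, 19, 14, 18, 22, 26]

# The transformation is lossless.
assert invert_coarse_deltas(coarse) == sorted_ids
```

The coarse deltas (14, 18, 22, 26) are larger than the adjacent gaps in the same list (5, 6, 7, 8), so this variant trades some compression ratio for speed.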

Examples of three common usage scenarios (no differencing, coarse deltas, and fine deltas) are outlined in this Python notebook.
