
amit1rrr / Numcompress

License: MIT

Python package to compress numerical series & numpy arrays into strings



numcompress

Simple way to compress and decompress numerical series & numpy arrays.

Easily gets you above 80% compression ratio
You can specify the precision you need for floats (up to 10 decimal places)
Useful for storing or transmitting stock prices, monitoring data & other time series data in a compressed string format

The compression algorithm is based on Google's encoded polyline format. I modified it to preserve arbitrary precision and apply it to any numerical series. The work is motivated by the usefulness of the time aware polyline built by Arjun Attam at HyperTrack. After building this I came across Python arrays, which are much more efficient than lists in terms of memory footprint. You might consider using those over numcompress if you don't care about conversion to a string for transmission or storage.
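The delta-plus-polyline idea described above can be sketched in a few lines. This is an illustrative reconstruction, not the package's source: the names encode_sketch and decode_sketch are hypothetical, and the one-character precision header is an assumption inferred from the example strings in the Usage section. Under those assumptions the sketch reproduces the precision=4 example shown there.

```python
def encode_sketch(series, precision=3):
    """Delta-encode scaled integers, then emit 5-bit chunks as printable ASCII."""
    text = chr(precision + 63)  # assumed header: precision stored as one character
    last = 0
    for num in series:
        scaled = int(round(num * 10 ** precision))
        diff = scaled - last  # keep only the change from the previous value
        last = scaled
        # Fold the sign into the lowest bit so small negative deltas stay small.
        value = ~(diff << 1) if diff < 0 else diff << 1
        while value >= 0x20:  # more than 5 bits left: set the continuation bit
            text += chr((0x20 | (value & 0x1F)) + 63)
            value >>= 5
        text += chr(value + 63)  # final chunk, continuation bit clear
    return text


def decode_sketch(text):
    """Inverse of encode_sketch: rebuild the series from the string."""
    scale = 10.0 ** (ord(text[0]) - 63)
    result, last, acc, shift = [], 0, 0, 0
    for ch in text[1:]:
        chunk = ord(ch) - 63
        acc |= (chunk & 0x1F) << shift
        shift += 5
        if chunk < 0x20:  # continuation bit clear: one delta is complete
            diff = ~(acc >> 1) if acc & 1 else acc >> 1
            last += diff
            result.append(last / scale)
            acc, shift = 0, 0
    return result


text = encode_sketch([145.7834, 127.5989, 135.2569], precision=4)
print(text)                 # Csi~wAhdbJgqtC
print(decode_sketch(text))  # [145.7834, 127.5989, 135.2569]
```

Because only differences are stored, a series whose neighbours are close together needs fewer 5-bit chunks per number, and that is where the compression comes from.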

Installation

pip install numcompress

Usage

>>> from numcompress import compress, decompress

# Integers
>>> compress([14578, 12759, 13525])
'B_twxZnv_nB_bwm@'

>>> decompress('B_twxZnv_nB_bwm@')
[14578.0, 12759.0, 13525.0]

# Floats - lossless compression
# precision argument specifies how many decimal places to preserve, defaults to 3
>>> compress([145.7834, 127.5989, 135.2569], precision=4)
'Csi~wAhdbJgqtC'

>>> decompress('Csi~wAhdbJgqtC')
[145.7834, 127.5989, 135.2569]

# Floats - lossy compression
>>> compress([145.7834, 127.5989, 135.2569], precision=2)
'Acn[rpB{n@'

>>> decompress('Acn[rpB{n@')
[145.78, 127.6, 135.26]

# compressing and decompressing numpy arrays
>>> from numcompress import compress_ndarray, decompress_ndarray
>>> import numpy as np

>>> series = np.random.randint(1, 100, 25).reshape(5, 5)
>>> compressed_series = compress_ndarray(series)
>>> decompressed_series = decompress_ndarray(compressed_series)

>>> series
array([[29, 95, 10, 48, 20],
       [60, 98, 73, 96, 71],
       [95, 59,  8,  6, 17],
       [ 5, 12, 69, 65, 52],
       [84,  6, 83, 20, 50]])

>>> compressed_series
'5*5,[email protected]_|[email protected][email protected]|[email protected]@_{[email protected]~heAnrbB~{BonT~lVotLoinB~xFnkX_o}@~iwCokuCn`[email protected]'

>>> decompressed_series
array([[29., 95., 10., 48., 20.],
       [60., 98., 73., 96., 71.],
       [95., 59.,  8.,  6., 17.],
       [ 5., 12., 69., 65., 52.],
       [84.,  6., 83., 20., 50.]])

>>> (series == decompressed_series).all()
True
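The lossless and lossy float examples above differ only in how many decimal places survive. A minimal sketch of that effect, assuming the precision argument amounts to rounding each value to a scaled integer (quantize is a hypothetical helper, not part of numcompress):

```python
def quantize(x, precision):
    """Round x to `precision` decimal places via a scaled integer."""
    scale = 10.0 ** precision
    return int(round(x * scale)) / scale


values = [145.7834, 127.5989, 135.2569]
for p in (4, 2):
    stored = [quantize(v, p) for v in values]
    worst = max(abs(s - v) for s, v in zip(stored, values))
    # The rounding error can never exceed half a unit in the last kept place.
    print(p, stored, worst <= 0.5 * 10 ** -p)
```

With precision=4 the inputs come back unchanged. With precision=2 they come back as [145.78, 127.6, 135.26], matching the lossy example above.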

Compression Ratio

Test	# of Numbers	Compression ratio
Integers	10k	91.14%
Floats	10k	81.35%

You can run the test suite with the -s switch to see the compression ratio. You can also modify the tests to see what compression ratio you get for your own input.

pytest -s

Here's a quick example showing compression ratio:

>>> import random, sys
>>> from numcompress import compress
>>> series = random.sample(range(1, 100000), 50000)  # generate 50k random numbers between 1 and 100k
>>> text = compress(series)  # apply compression

>>> original_size = sum(sys.getsizeof(i) for i in series)
>>> original_size
1200000

>>> compressed_size = sys.getsizeof(text)
>>> compressed_size
284092

>>> compression_ratio = ((original_size - compressed_size) * 100.0) / original_size
>>> compression_ratio
76.32566666666666

We get ~76% compression for 50k random numbers between 1 and 100k. The ratio is higher for real-world numerical series, where the difference between consecutive numbers tends to be smaller. Think of stock prices, monitoring data & other time series data.
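To illustrate why smaller consecutive differences help, here is a rough, self-contained comparison. It uses a hypothetical stand-in encoder (encode_sketch, a polyline-style delta encoder assumed to behave like compress) rather than numcompress itself, and it counts output characters per number instead of using sys.getsizeof, so the figures are not directly comparable to the ratio above.

```python
import random


def encode_sketch(series, precision=3):
    """Hypothetical stand-in: delta + 5-bit chunk encoding, polyline style."""
    text = chr(precision + 63)
    last = 0
    for num in series:
        scaled = int(round(num * 10 ** precision))
        diff, last = scaled - last, scaled
        value = ~(diff << 1) if diff < 0 else diff << 1
        while value >= 0x20:
            text += chr((0x20 | (value & 0x1F)) + 63)
            value >>= 5
        text += chr(value + 63)
    return text


random.seed(0)
n = 10000

# Scattered values: consecutive differences are large.
scattered = random.sample(range(1, 100000), n)

# Slowly drifting values, loosely like a price or a monitoring metric.
smooth = [50000]
for _ in range(n - 1):
    smooth.append(smooth[-1] + random.randint(-5, 5))

scattered_chars = len(encode_sketch(scattered)) / float(n)
smooth_chars = len(encode_sketch(smooth)) / float(n)
print("chars per number, scattered:", scattered_chars)
print("chars per number, smooth:   ", smooth_chars)
```

The drifting series needs noticeably fewer characters per number than the scattered one, because each small delta fits in fewer 5-bit chunks.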

Contribute

If you see any problem, open an issue or send a pull request. You can write to [email protected]
