All Projects → lemire → Simdcomp

lemire / Simdcomp

Licence: bsd-3-clause
A simple C library for compressing lists of integers using binary packing

Programming Languages

c
50402 projects - #5 most used programming language

Projects that are alternatives of or similar to Simdcomp

Turbopfor Integer Compression
Fastest Integer Compression
Stars: ✭ 520 (+57.1%)
Mutual labels:  simd, compression
Simdcompressionandintersection
A C++ library to compress and intersect sorted lists of integers using SIMD instructions
Stars: ✭ 289 (-12.69%)
Mutual labels:  simd, compression
Mango
mango fun framework
Stars: ✭ 343 (+3.63%)
Mutual labels:  simd, compression
Compressed Vec
SIMD Floating point and integer compressed vector library
Stars: ✭ 25 (-92.45%)
Mutual labels:  simd, compression
Maskedvbyte
Fast decoder for VByte-compressed integers
Stars: ✭ 91 (-72.51%)
Mutual labels:  simd, compression
Turbo-Transpose
Transpose: SIMD Integer+Floating Point Compression Filter
Stars: ✭ 50 (-84.89%)
Mutual labels:  compression, simd
Streamvbyte
Fast integer compression in C using the StreamVByte codec
Stars: ✭ 195 (-41.09%)
Mutual labels:  simd, compression
ndzip
A High-Throughput Parallel Lossless Compressor for Scientific Data
Stars: ✭ 19 (-94.26%)
Mutual labels:  compression, simd
Std Simd
std::experimental::simd for GCC [ISO/IEC TS 19570:2018]
Stars: ✭ 275 (-16.92%)
Mutual labels:  simd
Highway
Performance-portable, length-agnostic SIMD with runtime dispatch
Stars: ✭ 301 (-9.06%)
Mutual labels:  simd
Zipper
C++ wrapper around minizip compression library
Stars: ✭ 272 (-17.82%)
Mutual labels:  compression
Variational Dropout Sparsifies Dnn
Sparse Variational Dropout, ICML 2017
Stars: ✭ 278 (-16.01%)
Mutual labels:  compression
Fastbase64
SIMD-accelerated base64 codecs
Stars: ✭ 309 (-6.65%)
Mutual labels:  simd
Gozstd
go wrapper for zstd
Stars: ✭ 275 (-16.92%)
Mutual labels:  compression
Zoonavigator
Web-based ZooKeeper UI / editor / browser
Stars: ✭ 326 (-1.51%)
Mutual labels:  compression
Graphene
A thin layer of graphic data types
Stars: ✭ 268 (-19.03%)
Mutual labels:  simd
Compress
Collection of compression related Go packages.
Stars: ✭ 319 (-3.63%)
Mutual labels:  compression
Tinify Nodejs
Node.js client for the Tinify API.
Stars: ✭ 299 (-9.67%)
Mutual labels:  compression
Smidge
A lightweight runtime CSS/JavaScript file minification, combination, compression & management library for ASP.Net Core
Stars: ✭ 267 (-19.34%)
Mutual labels:  compression
Tsimd
Fundamental C++ SIMD types for Intel CPUs (sse, avx, avx2, avx512)
Stars: ✭ 290 (-12.39%)
Mutual labels:  simd

The SIMDComp library

Build Status Build Status Code Quality: Cpp

A simple C library for compressing lists of integers using binary packing and SIMD instructions. The assumption is either that you have a list of 32-bit integers where most of them are small, or a list of 32-bit integers where differences between successive integers are small. No software is able to reliably compress an array of 32-bit random numbers.

This library can decode at least 4 billions of compressed integers per second on most desktop or laptop processors. That is, it can decompress data at a rate of 15 GB/s. This is significantly faster than generic codecs like gzip, LZO, Snappy or LZ4.

On a Skylake Intel processor, it can decode integers at a rate 0.3 cycles per integer, which can easily translate into more than 8 decoded billions integers per second.

This library is part of the Awesome C list of C ressources.

Contributors: Daniel Lemire, Nathan Kurz, Christoph Rupp, Anatol Belski, Nick White and others

What is it for?

This is a low-level library for fast integer compression. By design it does not define a compressed format. It is up to the (sophisticated) user to create a compressed format.

It is used by:

Requirements

  • Your processor should support SSE4.1 (It is supported by most Intel and AMD processors released since 2008.)
  • It is possible to build the core part of the code if your processor support SSE2 (Pentium4 or better)
  • C99 compliant compiler (GCC is assumed)
  • A Linux-like distribution is assumed by the makefile

For a plain C version that does not use SIMD instructions, see https://github.com/lemire/LittleIntPacker

Usage

Compression works over blocks of 128 integers.

For a complete working example, see example.c (you can build it and run it with "make example; ./example").

  1. Lists of integers in random order.
const uint32_t b = maxbits(datain);// computes bit width
simdpackwithoutmask(datain, buffer, b);//compressed to buffer, compressing 128 32-bit integers down to b*32 bytes
simdunpack(buffer, backbuffer, b);//uncompressed to backbuffer

While 128 32-bit integers are read, only b 128-bit words are written. Thus, the compression ratio is 32/b.

  1. Sorted lists of integers.

We used differential coding: we store the difference between successive integers. For this purpose, we need an initial value (called offset).

uint32_t offset = 0;
uint32_t b1 = simdmaxbitsd1(offset,datain); // bit width
simdpackwithoutmaskd1(offset, datain, buffer, b1);//compressing 128 32-bit integers down to b1*32 bytes
simdunpackd1(offset, buffer, backbuffer, b1);//uncompressed

General example for arrays of arbitrary length:

int compress_decompress_demo() {
  size_t k, N = 9999;
  __m128i * endofbuf;
  uint32_t * datain = malloc(N * sizeof(uint32_t));
  uint8_t * buffer;
  uint32_t * backbuffer = malloc(N * sizeof(uint32_t));
  uint32_t b;

  for (k = 0; k < N; ++k){        /* start with k=0, not k=1! */
    datain[k] = k;
  }

  b = maxbits_length(datain, N);
  buffer = malloc(simdpack_compressedbytes(N,b)); // allocate just enough memory
  endofbuf = simdpack_length(datain, N, (__m128i *)buffer, b);
  /* compressed data is stored between buffer and endofbuf using (endofbuf-buffer)*sizeof(__m128i) bytes */
  /* would be safe to do : buffer = realloc(buffer,(endofbuf-(__m128i *)buffer)*sizeof(__m128i)); */
  simdunpack_length((const __m128i *)buffer, N, backbuffer, b);

  for (k = 0; k < N; ++k){
    if(datain[k] != backbuffer[k]) {
      printf("bug\n");
      return -1;
    }
  }
  return 0;
}
  1. Frame-of-Reference

We also have frame-of-reference (FOR) functions (see simdfor.h header). They work like the bit packing routines, but do not use differential coding so they allow faster search in some cases, at the expense of compression.

Setup

make make test

and if you are daring:

make install

Go

If you are a go user, there is a "go" folder where you will find a simple demo.

Other libraries

Other programming languages

References

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].