fknorr / ndzip

Licence: MIT license
A High-Throughput Parallel Lossless Compressor for Scientific Data

Programming languages: C++, CUDA, C, CMake, Python

ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data

ndzip compresses and decompresses multidimensional univariate grids of single- and double-precision IEEE 754 floating-point data. We implement

  • a single-threaded CPU compressor
  • an OpenMP-backed multi-threaded compressor
  • a SYCL-based GPU compressor (currently hipSYCL + NVIDIA only)
  • a CUDA-based GPU compressor (experimental)

All variants generate and decode bit-identical compressed streams.

ndzip is currently a research project with the primary use case of speeding up distributed HPC applications by increasing effective interconnect bandwidth.

Prerequisites

  • CMake >= 3.15
  • Clang >= 10.0.0
  • Linux (tested on x86_64 and POWER9)
  • Boost >= 1.66
  • Catch2 >= 2.13.3 (optional, for unit tests and microbenchmarks)

Additionally, for GPU support

  • CUDA >= 11.0 (not officially compatible with Clang 10/11, but older CUDA versions will optimize insufficiently)
  • An NVIDIA GPU with Compute Capability >= 6.1
  • For the SYCL version: A recent version of hipSYCL with profiling functionality

Building

Make sure to set the right build type and enable the full instruction set of the target CPU architecture:

-DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=native"

If unit tests and microbenchmarks should also be built, add

-DNDZIP_BUILD_TEST=YES
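
As a minimal sketch, the flags above can be combined into a single CPU-only configure and build step. The function name build_ndzip_cpu is a hypothetical helper, and the build directory name follows the GPU build commands further down; it is assumed to be run from the root of an ndzip checkout.

```shell
# Hypothetical helper (not part of ndzip): configure and build a CPU-only
# ndzip, including unit tests and microbenchmarks.
# Assumes the current directory is the root of an ndzip checkout.
build_ndzip_cpu() {
    cmake -B build \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_CXX_FLAGS="-march=native" \
        -DNDZIP_BUILD_TEST=YES &&
    cmake --build build -j
}
```

Calling build_ndzip_cpu then leaves the binaries in build/. Drop -DNDZIP_BUILD_TEST=YES if Catch2 is not installed.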

For GPU support with SYCL

  1. Build and install hipSYCL
git clone https://github.com/illuhad/hipSYCL
cd hipSYCL
cmake -B build -DCMAKE_INSTALL_PREFIX=../hipSYCL-install -DWITH_CUDA_BACKEND=YES -DCMAKE_BUILD_TYPE=Release
cmake --build build --target install -j
  2. Build ndzip with SYCL
cmake -B build -DCMAKE_PREFIX_PATH='../hipSYCL-install/lib/cmake' -DHIPSYCL_PLATFORM=cuda -DCMAKE_CUDA_ARCHITECTURES=75 -DHIPSYCL_GPU_ARCH=sm_75 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-U__FLOAT128__ -U__SIZEOF_FLOAT128__ -march=native"
cmake --build build -j

Replace sm_75 and 75 with the values matching your GPU's Compute Capability. The -U__FLOAT128__ and -U__SIZEOF_FLOAT128__ flags undefine these macros and are required due to an open bug in Clang.
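
The mapping from a Compute Capability to the two architecture values is mechanical: drop the dot. The sketch below illustrates this; cc_to_arch is a hypothetical helper name, not something ndzip or hipSYCL provides.

```shell
# Hypothetical helper: turn a Compute Capability such as "7.5" into the
# number used by CMAKE_CUDA_ARCHITECTURES ("75").
cc_to_arch() {
    printf '%s\n' "$1" | tr -d '.'
}

arch=$(cc_to_arch 7.5)
# Prints: -DCMAKE_CUDA_ARCHITECTURES=75 -DHIPSYCL_GPU_ARCH=sm_75
echo "-DCMAKE_CUDA_ARCHITECTURES=${arch} -DHIPSYCL_GPU_ARCH=sm_${arch}"
```

On recent NVIDIA drivers, nvidia-smi --query-gpu=compute_cap --format=csv,noheader should report the Compute Capability of the installed GPU. Older drivers may not support this query field, in which case the value has to be looked up in NVIDIA's documentation.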

For GPU support with CUDA (experimental)

a) Either build ndzip with CUDA + NVCC ...

cmake -B build -DCMAKE_CUDA_ARCHITECTURES=75 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=native"
cmake --build build -j

Replace 75 with the value matching your GPU's Compute Capability.

b) ... or with CUDA + Clang

cmake -B build -DCMAKE_CUDA_COMPILER="$(which clang++)" -DCMAKE_CUDA_ARCHITECTURES=75 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-U__FLOAT128__ -U__SIZEOF_FLOAT128__ -march=native"
cmake --build build -j

The -U__FLOAT128__ and -U__SIZEOF_FLOAT128__ flags undefine these macros and are required due to an open bug in Clang.

Compressing and decompressing files

build/compress -n <size> -i <uncompressed-file> -o <compressed-file> [-t float|double]
build/compress -d -n <size> -i <compressed-file> -o <decompressed-file> [-t float|double]

<size> consists of one to three numbers, one per dimension of the input grid. In the multi-dimensional case, the first number specifies the extent of the slowest-iterating dimension.
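
To make the size arguments concrete, the following sketch computes the byte size a grid should have and the position of a single element. It assumes the uncompressed file is a raw, headerless array of values and that the last listed dimension iterates fastest (row-major order); only the "first dimension is slowest" part is stated above. Both function names are hypothetical.

```shell
# Hypothetical helper: expected size in bytes of a raw, headerless grid file.
# Usage: expected_bytes float|double EXTENT...
expected_bytes() {
    type_name=$1; shift
    case "$type_name" in
        float)  bytes=4 ;;
        double) bytes=8 ;;
        *) echo "unknown type: $type_name" >&2; return 1 ;;
    esac
    # Multiply the element size by every extent.
    for extent in "$@"; do
        bytes=$((bytes * extent))
    done
    echo "$bytes"
}

# Hypothetical helper: element offset of (i, j, k) in an n0 x n1 x n2 grid,
# assuming n0 is the slowest- and n2 the fastest-iterating dimension.
# Usage: linear_index_3d N0 N1 N2 I J K   (N0 only bounds I)
linear_index_3d() {
    echo $(( ($4 * $2 + $5) * $3 + $6 ))
}

expected_bytes float 256 512 512        # prints 268435456
linear_index_3d 256 512 512 1 0 0       # prints 262144
```

If the file size on disk does not match expected_bytes, the extents or the data type passed to compress are probably wrong.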

By default, compress uses the single-threaded CPU compressor. Passing -e cpu-mt selects the multi-threaded CPU compressor; passing -e sycl or -e cuda selects the respective GPU compressor, if available.
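
A simple way to check a build is a lossless round trip: compress, decompress, and compare against the input. The sketch below uses only the options documented above. The function name, the NDZIP_COMPRESS and NDZIP_EXECUTOR variables, and the .ndzip and .roundtrip file suffixes are assumptions made for illustration.

```shell
# Hypothetical helper: compress a file, decompress it again and verify that
# the result is byte-identical to the input.
# Usage: ndzip_roundtrip INPUT float|double EXTENT...
# NDZIP_COMPRESS overrides the binary path (default: build/compress).
# NDZIP_EXECUTOR, if set, is passed as -e (e.g. cpu-mt, sycl or cuda).
ndzip_roundtrip() {
    input=$1; type_name=$2; shift 2
    compress=${NDZIP_COMPRESS:-build/compress}
    "$compress" -n "$@" -i "$input" -o "$input.ndzip" -t "$type_name" \
        ${NDZIP_EXECUTOR:+-e "$NDZIP_EXECUTOR"} &&
    "$compress" -d -n "$@" -i "$input.ndzip" -o "$input.roundtrip" -t "$type_name" \
        ${NDZIP_EXECUTOR:+-e "$NDZIP_EXECUTOR"} &&
    cmp "$input" "$input.roundtrip"
}
```

For example, NDZIP_EXECUTOR=cpu-mt ndzip_roundtrip data.raw float 256 512 512 would exercise the multi-threaded compressor on a hypothetical file data.raw. Since all variants produce bit-identical streams, the .ndzip files written by different executors can also be compared with cmp.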

Running unit tests

The test binaries are only available if tests were enabled during the build (-DNDZIP_BUILD_TEST=YES).

build/encoder_test
build/sycl_bits_test  # only if built with SYCL support
build/sycl_ubench     # GPU microbenchmarks, only if built with SYCL support
build/cuda_bits_test  # only if built with CUDA support
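
Because the set of test binaries depends on the enabled backends, a small wrapper can run whichever ones exist. This is a sketch only; run_ndzip_tests is a hypothetical name, and the microbenchmark binary sycl_ubench is deliberately left out.

```shell
# Hypothetical helper: run every unit test binary that was built and report
# failure if any of them fails.
# Usage: run_ndzip_tests [BUILD_DIR]   (default: build)
run_ndzip_tests() {
    build_dir=${1:-build}
    status=0
    for test_name in encoder_test sycl_bits_test cuda_bits_test; do
        if [ -x "$build_dir/$test_name" ]; then
            "$build_dir/$test_name" || status=1
        else
            echo "skipping $test_name (not built)"
        fi
    done
    return $status
}
```

Calling run_ndzip_tests from the repository root returns a non-zero status as soon as one of the available tests fails, which makes it usable in scripts.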

References

If you are using ndzip as part of your research, we kindly ask you to cite

  • Fabian Knorr, Peter Thoman, and Thomas Fahringer. "ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data". In 2021 Data Compression Conference (DCC), IEEE, 2021. [DOI] [Preprint PDF]

  • Fabian Knorr, Peter Thoman, and Thomas Fahringer. "ndzip-gpu: Efficient Lossless Compression of Scientific Floating-Point Data on GPUs". In SC'21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2021. [DOI] [Preprint PDF]
