nanoporetech / vbz_compression

Licence: MPL-2.0
VBZ compression plugin for nanopore signal data

VBZ Compression

VBZ Compression uses variable byte integer encoding to compress nanopore signal data and is built using the following libraries:

  • streamvbyte
  • zstd

VBZ achieves its performance by exploiting the properties of the raw signal, so it is most effective when applied to the signal dataset. Other datasets in your Fast5 files will not benefit from the default VBZ compression settings. VBZ will be used as the default compression scheme in a future release of MinKNOW.
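To illustrate why variable byte encoding suits raw signal, here is a minimal Python sketch, not the plugin's implementation. It assumes the samples are first turned into differences between neighbours (an assumption for illustration), maps those signed differences to unsigned values with zig zag encoding, and then writes each value with a simple 7-bits-per-byte variable byte scheme. The plugin's actual byte layout differs and is not reproduced here, and every function name below is hypothetical.

```python
from typing import Iterable, List


def zigzag_encode(value: int) -> int:
    """Map a signed integer to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (value << 1) if value >= 0 else ((-value << 1) - 1)


def zigzag_decode(encoded: int) -> int:
    """Inverse of zigzag_encode."""
    return (encoded >> 1) if (encoded & 1) == 0 else -((encoded + 1) >> 1)


def varbyte_encode(values: Iterable[int]) -> bytes:
    """Write each unsigned integer as 7-bit groups, low bits first.

    The high bit of a byte is set when more bytes follow for the same value.
    This is an illustrative layout only, not the one used by the plugin.
    """
    out = bytearray()
    for value in values:
        if value < 0:
            raise ValueError("varbyte_encode expects unsigned integers")
        while value >= 0x80:
            out.append((value & 0x7F) | 0x80)
            value >>= 7
        out.append(value)
    return bytes(out)


def varbyte_decode(data: bytes) -> List[int]:
    """Inverse of varbyte_encode."""
    values: List[int] = []
    current, shift = 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            values.append(current)
            current, shift = 0, 0
    return values


def compress_signal(samples: Iterable[int]) -> bytes:
    """Delta, zig zag, then variable byte encode a sequence of samples."""
    previous = 0
    encoded = []
    for sample in samples:
        encoded.append(zigzag_encode(sample - previous))
        previous = sample
    return varbyte_encode(encoded)


def decompress_signal(data: bytes) -> List[int]:
    """Reverse compress_signal and return the original samples."""
    samples: List[int] = []
    previous = 0
    for encoded in varbyte_decode(data):
        previous += zigzag_decode(encoded)
        samples.append(previous)
    return samples
```

Because neighbouring signal samples tend to be close in value, most differences fit in a single byte. In this sketch the five samples `[500, 502, 499, 499, 510]` encode to 6 bytes, against 10 bytes when stored as fixed 2-byte integers.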

Installation

See the release section to find the installers for the hdf5 plugin.

After installation you can use HDFView, h5repack or h5py as you normally would:

# Invoke h5repack to pack input.fast5 into output.fast5
#
# The integer values specify how the data is packed:
#   - 32020: The id of the filter to apply (vbz in this case)
#   - 5: The number of following arguments
#   - 0: Filter flag for configuring filter version
#   - 0: Padding value for configuring filter version
#   - 2: Packing integers of size 2 bytes
#   - 1: Use zig zag encoding
#   - 1: Use zstd compression level 1
> h5repack -f UD=32020,5,0,0,2,1,1 input.fast5 output.fast5

# To compress 4 byte unsigned integers (no zig zag) with level 3 zstd you could use:
> h5repack -f UD=32020,5,0,0,4,0,3 input.h5 output.h5

# Invoke h5repack recursively on all reads using 10 processes
> find . -name "*.fast5" | xargs -P 10 -I % h5repack -f UD=32020,5,0,0,2,1,1 % %.vbz

# Invoke h5repack recursively on all reads storing the results inplace using 10 processes
> find . -name "*.fast5" | xargs -P 10 -I % sh -c "h5repack -f UD=32020,5,0,0,2,1,1 % %.vbz && mv %.vbz %"
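The filter string in the commands above is easy to mistype, so a small helper can assemble it from named options. The following Python sketch only builds the `UD=...` argument and the h5repack argument list from the values documented above. The function names and defaults are hypothetical, and it does not check that h5repack or the plugin is installed.

```python
from typing import List

VBZ_FILTER_ID = 32020  # id of the vbz filter, as used in the commands above


def vbz_filter_arg(integer_size: int = 2, zig_zag: bool = True, zstd_level: int = 1) -> str:
    """Build the h5repack -f argument for the vbz filter.

    The two leading zeros are the filter version flag and its padding value,
    mirroring the documented examples.
    """
    values = [0, 0, integer_size, int(zig_zag), zstd_level]
    joined = ",".join(str(v) for v in values)
    return "UD={},{},{}".format(VBZ_FILTER_ID, len(values), joined)


def h5repack_command(src: str, dst: str, **options) -> List[str]:
    """Return the argument list for repacking src into dst with vbz."""
    return ["h5repack", "-f", vbz_filter_arg(**options), src, dst]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with file names that contain spaces.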

Benchmarks

VBZ outperforms GZIP in both CPU time (>10X compression, >5X decompression) and compression (>30%).

[Figures: Compression Ratio, Compression Performance, Decompression Performance]

Development

To develop the plugin without conan you need cmake installed, along with the following C++ dependencies:

  • zstd development libraries available to cmake
  • hdf5 development libraries available to cmake (required for testing)

The following Ubuntu packages provide these libraries:

  • libhdf5-dev
  • libzstd-dev

Then configure and build the project using:

> git submodule update --init
> mkdir build
> cd build
> cmake -D CMAKE_BUILD_TYPE=Release -D ENABLE_CONAN=OFF -D ENABLE_PERF_TESTING=OFF -D ENABLE_PYTHON=OFF ..
> make -j