All Projects → go-compression → raisin

go-compression / raisin

Licence: MIT license
A simple lightweight set of implementations and bindings for compression algorithms written in Go.

Programming Languages

go
31211 projects - #10 most used programming language
python
139335 projects - #7 most used programming language
HTML
75241 projects

Projects that are alternatives of or similar to raisin

Huffman-Coding
A C++ compression program based on Huffman's lossless compression algorithm and decoder.
Stars: ✭ 81 (+376.47%)
Mutual labels:  compression, huffman, decompression
decompress
Pure OCaml implementation of Zlib.
Stars: ✭ 103 (+505.88%)
Mutual labels:  compression, huffman, decompression
Georaptor
Python Geohash Compression Tool
Stars: ✭ 143 (+741.18%)
Mutual labels:  compression, command-line-tool
Util
A collection of useful utility functions
Stars: ✭ 201 (+1082.35%)
Mutual labels:  compression, decompression
em inflate
Fast, small, in-memory inflate (zlib, deflate and gzip decompression)
Stars: ✭ 59 (+247.06%)
Mutual labels:  compression, huffman
Wavefile
Create, read and write wav files according to the specs. ⭐️ 🎶 ❤️
Stars: ✭ 100 (+488.24%)
Mutual labels:  compression, command-line-tool
Libmspack
A library for some loosely related Microsoft compression formats, CAB, CHM, HLP, LIT, KWAJ and SZDD.
Stars: ✭ 104 (+511.76%)
Mutual labels:  compression, decompression
Precomp Cpp
Precomp, C++ version - further compress already compressed files
Stars: ✭ 250 (+1370.59%)
Mutual labels:  compression, command-line-tool
Lepton
Lepton is a tool and file format for losslessly compressing JPEGs by an average of 22%.
Stars: ✭ 4,918 (+28829.41%)
Mutual labels:  compression, decompression
kanzi-cpp
Lossless data compression in C++
Stars: ✭ 60 (+252.94%)
Mutual labels:  compression, huffman
gorilla
An effective time-series data compression/decompression method based on Facebook's Gorilla.
Stars: ✭ 51 (+200%)
Mutual labels:  compression, decompression
zstd-rs
zstd-decoder in pure rust
Stars: ✭ 148 (+770.59%)
Mutual labels:  compression, decompression
Numcompress
Python package to compress numerical series & numpy arrays into strings
Stars: ✭ 68 (+300%)
Mutual labels:  compression, decompression
Compressonator
Tool suite for Texture and 3D Model Compression, Optimization and Analysis using CPUs, GPUs and APUs
Stars: ✭ 785 (+4517.65%)
Mutual labels:  compression, decompression
Sdefl
Small inflate/deflate implementation in ~300 LoC of ANSI C
Stars: ✭ 120 (+605.88%)
Mutual labels:  compression, decompression
Fflate
High performance (de)compression in an 8kB package
Stars: ✭ 547 (+3117.65%)
Mutual labels:  compression, decompression
Minlzma
The Minimal LZMA (minlzma) project aims to provide a minimalistic, cross-platform, highly commented, standards-compliant C library (minlzlib) for decompressing LZMA2-encapsulated compressed data in LZMA format within an XZ container, as can be generated with Python 3.6, 7-zip, and xzutils
Stars: ✭ 236 (+1288.24%)
Mutual labels:  compression, decompression
Xz
Pure golang package for reading and writing xz-compressed files
Stars: ✭ 330 (+1841.18%)
Mutual labels:  compression, decompression
Rust Brotli
Brotli compressor and decompressor written in rust that optionally avoids the stdlib
Stars: ✭ 504 (+2864.71%)
Mutual labels:  compression, decompression
gpuhd
Massively Parallel Huffman Decoding on GPUs
Stars: ✭ 30 (+76.47%)
Mutual labels:  huffman, decompression

Raisin

Build Status Go Report Card Coverage Status Documentation

View benchmarks for latest deployment

A simple lightweight set of implementations and bindings for compression algorithms written in Go.

This project contains the source code for a summer mentorship about learning how to implement and create different compression algorithms. This includes common algorithms such as huffman, lempel-ziv, and arithmetic along with bindings for builtin Go compression algorithms.

Usage from the CLI

To start using this package from the command line, install it with go get and go install

$ go get -u github.com/go-compression/raisin
$ go install github.com/go-compression/raisin/cmd/...

Once done, you should be able to start using it

$ echo "Hello world!" > test.txt
$ raisin test.txt
Compressing...
Original bytes: 13
Compressed bytes: 14
Compression ratio: 107.69%
$ cat test.txt.rsn
��ӷ     �?��KD+

$ rm test.txt
$ raisin -decompress test.txt.rsn
Decompressing...
$ cat test.txt
Hello world!

The possible commands include:

  • -compress - Compress a given file and output the compressed contents to a file with ".rsn" at the end
  • -decompress - Decompress a given file and output the decompressed contents to a file without ".rsn" at the end
  • -benchmark - Benchmark a given file and measure the compression ratio, outputs a .rsn and a .decompressed file

The most important flag is the -algorithm flag which allows you to specify which algorithm to use during compression, decompression, or benchmarking. By default for compress and decompress this is lzss,arithmetic. The possible algorithms include:

  • lzss
  • dmc
  • huffman
  • mcc
  • arithmetic
  • flate
  • gzip
  • lzw
  • zlib

Here's an example of usage:

$ raisin -algorithm=arithmetic test.txt

You can also combine algorithms together in "layers", this will essentially compress the file with the first algorithm, then the second, etc. This stacking of algorithms is what powers virtually all modern compression, gzip and zip is powered by the FLATE algorithm which is essentially lempel-ziv (similar to lzss) and huffman coding stacked on toip of each other.

$ raisin -algorithm=lzss,huffman test.txt
Compressing...
Compression ratio: 307.69%
$ raisin -decompress -algorithm=lzss,huffman test.txt.rsn
Decompressing...

On top of this, you can easily compress or decompress multiple files by chaining them together with commas.

$ raisin test1.txt,test2.txt
Compressing...
Compression ratio: 68.53%
$ ls
test1.txt  test1.txt.rsn  test2.txt  test2.txt.rsn

When using compress and decompress a few more options become available to make it easy to use from the command line:

  • delete - Delete original file after compression/decompressed (defaults to true for decompression)
  • out - File name to be outputted (defaults to original file + .rsn for compression and file - .rsn for decompression, only available with a single file being compressed/decompressed)
  • outext - File extension to be outputted when compressing multiple files (unavailable with a single file being compressed/decompressed)

Let's take at the usage of delete, keep in mind that delete is on by default for decompressing.

$ echo "Hello world!" > test.txt
$ raisin -delete test.txt
Compressing...
Compression ratio: 107.69%
$ ls
test.txt.rsn
$ raisin -decompress -delete test.txt.rsn
Decompressing...
$ ls
test.txt
$ raisin -delete=false test.txt
Compressing...
Compression ratio: 107.69%
$ ls
test.txt  test.txt.rsn

The out command simply lets you change what file is outputted when compressing a single file:

$ echo "Hello world!" > test.txt
$ raisin -out=compressed.txt test.txt
Compressing...
Compression ratio: 107.69%
$ ls
test.txt  compressed.txt
$ raisin -decompress -out=decompressed.txt compressed.txt
Decompressing...
$ ls
test.txt  decompressed.txt

outext is similar to out but exists for when we compress/decompress multiple files. If outext is provided, it will be used as the output extension for the files. Note that the default for compression for outext is .rsn and for decompression it's an empty string (outext=) which tells the program to remove the last extension.

$ ls
test1.txt  test2.txt  test3.txt
$ raisin -delete -outext=.testing test1.txt,test2.txt,test3.txt
Compressing...
Compression ratio: 107.69%
$ ls
test1.txt.testing  test2.txt.testing  test3.txt.testing
$ raisin -decompress -outext=.decompressed test1.txt.testing,test2.txt.testing,test3.txt.testing
Decompressing...
$ ls
test1.txt  test2.txt  test3.txt

Benchmarking

You can use the benchmark command to generate benchmarked results for a set of algorithms, layers, and files. This is helpful for generating results in a table, website, or in bindings for other languages such as python (see the ai folder).

Usage is relatively similar to the compress and decompress commands.

$ echo "Hello world!" > test.txt
$ echo "abcabcabcabcabcabcabcabc" > test2.txt
$ raisin -benchmark -algorithm=lzss,huffman,arithmetic,gzip,[lzss,arithmetic] test.txt,test2.txt
┌─────────────────┬────────────┬───────────────────┬────────────────┬─────────────────────┬──────────┐
│ ENGINE          │ TIME TAKEN │ COMPRESSION RATIO │ ACTUAL ENTROPY │ THEORETICAL ENTROPY │ LOSSLESS │
├─────────────────┼────────────┼───────────────────┼────────────────┼─────────────────────┼──────────┤
│ lzss            │ 350µs      │ 100.00%           │ 2.20           │ 2.20                │ true     │
│ arithmetic      │ 50µs       │ 107.69%           │ 2.12           │ 2.20                │ true     │
│ lzss,arithmetic │ 210µs      │ 107.69%           │ 2.12           │ 2.20                │ true     │
│ gzip            │ 280µs      │ 284.62%           │ 1.14           │ 2.20                │ true     │
│ huffman         │ 190µs      │ 307.69%           │ 1.08           │ 2.20                │ true     │
├─────────────────┼────────────┼───────────────────┼────────────────┼─────────────────────┼──────────┤
│ File            │ test.txt   │ Size              │ 13 B           │                     │          │
└─────────────────┴────────────┴───────────────────┴────────────────┴─────────────────────┴──────────┘
┌─────────────────┬────────────┬───────────────────┬────────────────┬─────────────────────┬──────────┐
│ ENGINE          │ TIME TAKEN │ COMPRESSION RATIO │ ACTUAL ENTROPY │ THEORETICAL ENTROPY │ LOSSLESS │
├─────────────────┼────────────┼───────────────────┼────────────────┼─────────────────────┼──────────┤
│ lzss,arithmetic │ 160µs      │ 84.00%            │ 1.25           │ 1.22                │ true     │
│ lzss            │ 310µs      │ 84.00%            │ 1.25           │ 1.22                │ true     │
│ arithmetic      │ 160µs      │ 84.00%            │ 1.25           │ 1.22                │ true     │
│ huffman         │ 170µs      │ 92.00%            │ 1.24           │ 1.22                │ true     │
│ gzip            │ 430µs      │ 120.00%           │ 1.17           │ 1.22                │ true     │
├─────────────────┼────────────┼───────────────────┼────────────────┼─────────────────────┼──────────┤
│ File            │ test2.txt  │ Size              │ 25 B           │                     │          │
└─────────────────┴────────────┴───────────────────┴────────────────┴─────────────────────┴──────────┘

A larger example, taken from the .travis.yml file to generate the benchmark page. Notice the -generate flag, this tells it to generate an html file and output it as index.html, which is then used and uploaded to the GitHub Pages branch. Keep in mind the program expects a template file to be at templates/benchmark.html relative to your working directory. The command is as follows:

$ raisin -benchmark -generate -algorithm=lzss,dmc,huffman,flate,gzip,lzw,zlib,arithmetic,[lzss,huffman],[lzss,arithmetic],[arithmetic,huffman] alice29.txt,asyoulik.txt,cp.html,fields.c,grammar.lsp,kennedy.xls,lcet10.txt,plrabn12.txt,ptt5,sum,xargs.1

Shout-out to jedib0t for his wonderful go-pretty module for generating these tables and the HTML tables used in the GitHub Pages site.

Building

To build the binary from source, simply go get the package:

$ go get -u github.com/go-compression/raisin

Install the dependencies:

$ go get

And build:

$ go build

Usage as a module

To use this package as a module, simply import the engine package and use the io.Reader and io.Writer interfaces.

import (
	"fmt"
	"github.com/go-compression/raisin/engine"
)

func main() {
	text := []byte("Hello world!")

	file := engine.CompressedFile{}
	file.CompressionEngine = "arithmetic"
	file.Write(text)
	fmt.Println("Compressed:", string(file.Compressed))
}

Documentation

Documentation is available at godoc, please note that most of the code is currently undocumented as it is still a work in progress.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].