pixz
Pixz (pronounced pixie) is a parallel, indexing version of xz.
Repository: https://github.com/vasi/pixz
Downloads: https://github.com/vasi/pixz/releases
pixz vs xz
The existing XZ Utils provide great compression in the .xz file format,
but they produce just one big block of compressed data. Pixz instead produces a collection of
smaller blocks, which makes random access to the original data possible. This is especially useful
for large tarballs.
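The effect of independent blocks can be illustrated with a small Python sketch. It is not pixz's implementation or on-disk layout: it assumes Python's standard lzma module, writes each chunk as its own .xz stream rather than as a block inside a single stream, and keeps the index in memory. The names compress_indexed and read_range are hypothetical.

```python
import lzma

CHUNK_SIZE = 64 * 1024  # uncompressed bytes per chunk (arbitrary choice for this sketch)


def compress_indexed(data, chunk_size=CHUNK_SIZE):
    """Compress data chunk by chunk and return (blob, index).

    index holds one (uncompressed_offset, compressed_offset, compressed_length)
    tuple per chunk.
    """
    parts = []
    index = []
    compressed_offset = 0
    for offset in range(0, len(data), chunk_size):
        part = lzma.compress(data[offset:offset + chunk_size])
        index.append((offset, compressed_offset, len(part)))
        parts.append(part)
        compressed_offset += len(part)
    return b"".join(parts), index


def read_range(blob, index, start, length, chunk_size=CHUNK_SIZE):
    """Return data[start:start + length], decompressing only the overlapping chunks."""
    end = start + length
    out = bytearray()
    for offset, c_off, c_len in index:
        if offset + chunk_size <= start or offset >= end:
            continue  # chunk lies entirely outside the requested range
        chunk = lzma.decompress(blob[c_off:c_off + c_len])
        out += chunk[max(start - offset, 0):end - offset]
    return bytes(out)
```

With a single large block, reading a range near the end would require decompressing everything before it. Here only the chunks that overlap the requested range are touched.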
Differences to xz
- pixz automatically indexes tarballs during compression
- pixz supports parallel decompression, which xz does not
- pixz defaults to using all available CPU cores, while xz defaults to using only one core
- pixz provides -i and -o command line options to specify input and output file
- pixz does not support the command line option -z or --compress
- pixz does not support the command line option -c or --stdout
- -f command line option is incompatible
- -l command line option output differs
- -q command line option is incompatible
- -t command line option is incompatible
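The multi-core behaviour in the list above depends on splitting the input into pieces that can be compressed independently. A rough Python sketch of that idea follows. It assumes the standard lzma and concurrent.futures modules and that CPython's lzma releases the interpreter lock while compressing. It is not pixz's pthreads implementation, the output is a concatenation of .xz streams rather than blocks in one stream, and parallel_compress is a hypothetical name.

```python
import lzma
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_compress(data, chunk_size=1 << 20, workers=None):
    """Compress fixed-size chunks on a thread pool and concatenate the results."""
    # Default to every available core, mirroring the default described above.
    workers = workers or os.cpu_count() or 1
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the parts line up with the original data
        parts = list(pool.map(lzma.compress, chunks))
    return b"".join(parts)
```

Because each chunk is a complete stream, the result can be decompressed with lzma.decompress, which accepts concatenated streams.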
Building pixz
For help with the configuration step of the build, run:
./configure --help
Dependencies
- pthreads
- liblzma 4.999.9-beta-212 or later (from the xz distribution)
- libarchive 2.8 or later
- AsciiDoc to generate the man page
Build from Release Tarball
./configure
make
make install
You may need sudo permissions to run make install.
Build from GitHub
git clone https://github.com/vasi/pixz.git
cd pixz
./autogen.sh
./configure
make
make install
You may need sudo permissions to run make install.
Usage
Single Files
Compress a single file (no tarball, just compression), multi-core:
pixz bar bar.xz
Decompress it, multi-core:
pixz -d bar.xz bar
Tarballs
Compress and index a tarball, multi-core:
pixz foo.tar foo.tpxz
Very quickly list the contents of the compressed tarball:
pixz -l foo.tpxz
Decompress the tarball, multi-core:
pixz -d foo.tpxz foo.tar
Very quickly extract a single file, multi-core, also verifies that contents match index:
pixz -x dir/file < foo.tpxz | tar x
Create a tarball using pixz for multi-core compression:
tar -Ipixz -cf foo.tpxz foo/
Specifying Input and Output
These are the same (the three forms also work with -x, -d and -l):
pixz foo.tar foo.tpxz
pixz < foo.tar > foo.tpxz
pixz -i foo.tar -o foo.tpxz
Extract the listed files from foo.tpxz into foo.tar:
pixz -x -i foo.tpxz -o foo.tar file1 file2 ...
Compress to foo.tpxz, removing the original:
pixz foo.tar
Extract to foo.tar, removing the original:
pixz -d foo.tpxz
Other Flags
Faster, worse compression:
pixz -1 foo.tar
Better, slower compression:
pixz -9 foo.tar
Use exactly 2 threads:
pixz -p 2 foo.tar
Compress, but do not treat it as a tarball, i.e. do not index it:
pixz -t foo.tar
Decompress, but do not check that contents match index:
pixz -d -t foo.tpxz
List the xz blocks instead of files:
pixz -l -t foo.tpxz
For even more tuning flags, check the manual page:
man pixz
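When pixz is driven from a script, the flags shown in this section can be assembled programmatically. The following Python sketch uses only options that appear above (-d, -i, -o, -p and a numeric level). The helpers pixz_argv and run_pixz are hypothetical and not part of pixz, and the accepted level range of 1 to 9 is an assumption based on the -1 and -9 examples.

```python
import subprocess


def pixz_argv(src, dst, decompress=False, threads=None, level=None):
    """Build a pixz command line from the options described in this README."""
    argv = ["pixz"]
    if decompress:
        argv.append("-d")
    if level is not None:
        # Assumed range: the README only shows -1 and -9 explicitly.
        if not 1 <= level <= 9:
            raise ValueError("level must be between 1 and 9")
        argv.append(f"-{level}")
    if threads is not None:
        argv += ["-p", str(threads)]
    argv += ["-i", src, "-o", dst]
    return argv


def run_pixz(*args, **kwargs):
    """Run pixz and raise CalledProcessError on a non-zero exit status."""
    subprocess.run(pixz_argv(*args, **kwargs), check=True)
```

For example, pixz_argv("foo.tar", "foo.tpxz", threads=2) builds the argument list for compressing foo.tar with exactly two threads. Passing a list to subprocess.run avoids shell quoting problems with unusual file names.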
Comparison to other Tools
plzip
- about equally complex and efficient
- lzip format seems less-used
- version 1 is theoretically indexable, I think
ChopZip
- written in Python, much simpler
- more flexible, supports arbitrary compression programs
- uses streams instead of blocks, not indexable
- splits input and then combines output, much higher disk usage
pxz
- simpler code
- uses OpenMP instead of pthreads
- uses streams instead of blocks, not indexable
- uses temporary files and does not combine them until the whole file is compressed, high disk and memory usage
pbzip2
- not indexable
- appears slow
- bzip2 algorithm is non-ideal
pigz
- not indexable
dictzip, idzip
- not parallel