All Projects → geohot → twitchctw

geohot / twitchctw

Licence: other
compression = AI

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects
CTW compressor. How long before we beat gzip

gzip: 3728 bytes <-- not easy to beat actually
xz:   3656 bytes


## comparing to existing CTW implementations

# https://github.com/cberzan/ctw.git ctw_baseline
This is the authors imp?
gets 3484, so CTW can do better than gzip

Though this code has tons of hacks, notably MAXCOUNTS
If you remove MAXCOUNTS it's crap, but this could be other things
That's the annoying thing about this problem, little bug = slightly worse

And I don't understand it's tree depth

# https://github.com/fumin/ctw.git
gets 4616, err, not good

# https://github.com/gkassel/pyaixi
CTW implementation seems numerically unstable...

# https://github.com/mgbellemare/SkipCTS/tree/master/python
logprob of data is 2988 bytes with 16 context, haven't tried in compressor yet

Maybe this isn't as easy as I thought...

Update
===

It is a working compressor (run ./test.sh) but it's performance is quite bad.

Maybe it's bugs? One day we'll work through the proofs in the paper.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].