
spencermountain / Efrt

Licence: mit
neato compression for key-value data

Projects that are alternatives of or similar to Efrt

xcdat
Fast compressed trie dictionary library
Stars: ✭ 51 (-12.07%)
Mutual labels:  compression, trie
Thmap
Concurrent trie-hash map library
Stars: ✭ 51 (-12.07%)
Mutual labels:  trie
Drv3 Tools
(Not actively maintained, use DRV3-Sharp) Tools for extracting and re-injecting files for Danganronpa V3 for PC.
Stars: ✭ 13 (-77.59%)
Mutual labels:  compression
Model Optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Stars: ✭ 992 (+1610.34%)
Mutual labels:  compression
Bcnencoder.net
Cross-platform texture encoding library for .NET. With support for BC1-3/DXT, BC4-5/RGTC and BC7/BPTC compression. Outputs files in ktx or dds formats.
Stars: ✭ 28 (-51.72%)
Mutual labels:  compression
Libgenerics
libgenerics is a minimalistic and generic library for C basic data structures.
Stars: ✭ 42 (-27.59%)
Mutual labels:  trie
Lib
single header libraries for C/C++
Stars: ✭ 866 (+1393.1%)
Mutual labels:  compression
Jsoncrush
Compress JSON into URL friendly strings
Stars: ✭ 1,071 (+1746.55%)
Mutual labels:  compression
Sevenzipsharp
Fork of SevenZipSharp on CodePlex
Stars: ✭ 50 (-13.79%)
Mutual labels:  compression
Scarab
A system to patch your content files.
Stars: ✭ 38 (-34.48%)
Mutual labels:  compression
Zipper
🗳A library to create, read and modify ZIP archive files, written in Swift.
Stars: ✭ 38 (-34.48%)
Mutual labels:  compression
Imgsquash
Simple image compression full website code written in node, react and next.js framework. Easy to deploy as a microservice.
Stars: ✭ 948 (+1534.48%)
Mutual labels:  compression
Tris Webpack Boilerplate
A Webpack boilerplate for static websites that has all the necessary modern tools and optimizations built-in. Score a perfect 10/10 on performance.
Stars: ✭ 1,016 (+1651.72%)
Mutual labels:  compression
Iscompress
Inno Setup zlib, bzlib and lzma compression source code - see issrc repository for lzma2 compression source code.
Stars: ✭ 21 (-63.79%)
Mutual labels:  compression
Image Optimizer
Simple lossless compression for Elementary OS
Stars: ✭ 52 (-10.34%)
Mutual labels:  compression
Srec
PyTorch Implementation of "Lossless Image Compression through Super-Resolution"
Stars: ✭ 868 (+1396.55%)
Mutual labels:  compression
Sevenz4s
SevenZip library for Scala, easy to use.
Stars: ✭ 38 (-34.48%)
Mutual labels:  compression
Deno brotli
🗜 Brotli wasm module for deno
Stars: ✭ 40 (-31.03%)
Mutual labels:  compression
Goofy
Goofy - Realtime DXT1/ETC1 encoder
Stars: ✭ 58 (+0%)
Mutual labels:  compression
Genozip
Compressor for genomic files (FASTQ, SAM/BAM, VCF, FASTA, GVF, 23andMe...), up to 5x better than gzip and faster too
Stars: ✭ 53 (-8.62%)
Mutual labels:  compression
compression of key-value data
npm install efrt

if your data looks like this:

var data = {
  bedfordshire: 'England',
  aberdeenshire: 'Scotland',
  buckinghamshire: 'England',
  argyllshire: 'Scotland',
  bambridgeshire: 'England',
  cheshire: 'England',
  ayrshire: 'Scotland',
  banffshire: 'Scotland'
};

you can compress it like this:

var str = efrt.pack(data);
//'England:b0che1;ambridge0edford0uckingham0;shire|Scotland:a0banff1;berdeen0rgyll0yr0;shire'

then _very!_ quickly flip it back into:

var obj = efrt.unpack(str);
obj['bedfordshire'];//'England'

Yep,

efrt packs category-type data into a very compressed prefix trie format, so that redundancies in the data are shared, and nothing is repeated.

By doing this clever-stuff ahead-of-time, efrt lets you ship much more data to the client-side, without hassle or overhead.

The whole library is 8kb, the unpack half is barely 2kb.
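
To make the prefix-sharing idea concrete, here is a minimal sketch of a plain, uncompressed prefix trie. It is not efrt's implementation or its packed format; `buildTrie` and `countNodes` are hypothetical helpers, and the `$` end-of-word marker is an arbitrary choice for this sketch.

```javascript
// Hypothetical helper: insert each word into a nested-object trie.
// '$' marks the end of a word (an assumption made for this sketch).
function buildTrie(words) {
  const root = {};
  for (const word of words) {
    let node = root;
    for (const ch of word) {
      if (!node[ch]) node[ch] = {};
      node = node[ch];
    }
    node.$ = true;
  }
  return root;
}

// Count the character nodes stored in the trie.
function countNodes(node) {
  let total = 0;
  for (const key of Object.keys(node)) {
    if (key === '$') continue;
    total += 1 + countNodes(node[key]);
  }
  return total;
}

const words = ['blackberry', 'blueberry', 'strawberry'];
const trie = buildTrie(words);
const rawChars = words.join('').length; // 29 characters written out in full
const trieNodes = countNodes(trie); // 27 nodes: the shared 'bl' is stored once
console.log(rawChars, trieNodes);
```

A plain prefix trie like this one only shares the leading `bl`. The packed strings shown in this README also share endings such as `berry`, which is where most of the saving on English words comes from.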

it is based on:

Benchmarks!

Demo!

Basically,
  • get a js object into very compact form
  • reduce filesize/bandwidth a bunch
  • ensure the unpacking time is negligible
  • keep word-lookups on critical-path

var efrt = require('efrt')

var foods = {
  strawberry: 'fruit',
  blueberry: 'fruit',
  blackberry: 'fruit',
  tomato: ['fruit', 'vegetable'],
  cucumber: 'vegetable',
  pepper: 'vegetable'
};
var str = efrt.pack(foods);
//'{"fruit":"bl0straw1tomato;ack0ue0;berry","vegetable":"cucumb0pepp0tomato;er"}'

var obj=efrt.unpack(str)
console.log(obj.tomato)
//['fruit', 'vegetable']

or, an Array:

if you pass it an array of strings, it just creates an object with true values:

const data = [
  'january',
  'february',
  'march',
  'april',
  'may',
  'june',
  'july',
  'august',
  'september',
  'october',
  'november',
  'december'
]
const packd = efrt.pack(data)
// true¦a6dec4febr3j1ma0nov4octo5sept4;rch,y;an1u0;ly,ne;uary;em0;ber;pril,ugust
const sameArray = Object.keys(efrt.unpack(packd))
// same thing !
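
The shape of the unpacked array case can be pictured with plain JavaScript. The sketch below does not call efrt; `toTrueObject` is a hypothetical helper that only illustrates the "object with true values" described above.

```javascript
// Hypothetical helper: turn a list of strings into { word: true, ... }.
function toTrueObject(list) {
  const obj = {};
  for (const word of list) obj[word] = true;
  return obj;
}

const months = ['april', 'june', 'july'];
const asObject = toTrueObject(months);

console.log(asObject.june); // true
console.log(Object.keys(asObject)); // [ 'april', 'june', 'july' ]
```

Key order after a real pack/unpack round-trip may not match the input order, so it is safer to compare sorted lists when checking that nothing was lost.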

Reserved characters

The keys of the object are normalized. Spaces and unicode are fine, but digits, uppercase letters, and some punctuation (comma, semicolon, exclamation mark, colon, pipe, broken bar) are reserved and not (yet) supported in keys:

specialChars = new RegExp('[0-9A-Z,;!:|¦]')
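
Before packing, it can help to check your keys against that character class. The sketch below reuses the regex from above; `findUnsupportedKeys` is a hypothetical helper, not part of efrt.

```javascript
// Same character class as the regex shown above.
const specialChars = new RegExp('[0-9A-Z,;!:|¦]');

// Hypothetical helper: return the keys that contain a reserved character.
function findUnsupportedKeys(data) {
  return Object.keys(data).filter((key) => specialChars.test(key));
}

const sample = {
  'tony hawk': 'skater', // lowercase and spaces are fine
  'Tony Hawk': 'skater', // uppercase letters are reserved
  'route 66': 'road', // digits are reserved
  'café': 'place' // non-ASCII letters are fine
};

console.log(findUnsupportedKeys(sample));
// [ 'Tony Hawk', 'route 66' ]
```

Lowercasing keys and spelling out numbers before packing is one way to get past the check, at the cost of doing the same normalization at lookup time.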

efrt is built for, and used heavily in, compromise, to expand the amount of data it can ship to the client side. If you find another use for efrt, please drop us a line 🎈

Performance

efrt is tuned to be very quick to unpack. Lookups on the unpacked object are O(1). Packing the data is the slowest part, which is usually fine because it happens ahead of time:

var compressed = efrt.pack(skateboarders);//1k words (on a macbook)
var trie = efrt.unpack(compressed)
// unpacking-step: 5.1ms

trie.hasOwnProperty('tony hawk')
// cached-lookup: 0.02ms
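
If you want to take similar measurements yourself, a rough timing harness looks something like this. It is only a sketch: a plain object stands in for the unpacked result (an assumption based on the examples above), `timeIt` and `letters` are hypothetical helpers, and the numbers will vary by machine.

```javascript
// Hypothetical helper: encode a number using letters only, so the
// stand-in keys avoid the reserved digits.
function letters(n) {
  return n.toString(26).replace(/[0-9]/g, (d) => 'qrstuvwxyz'[Number(d)]);
}

// Stand-in for an unpacked result: a plain object with about 1,000 keys.
const trie = {};
for (let i = 0; i < 1000; i++) trie['skater ' + letters(i)] = true;
trie['tony hawk'] = true;

// Hypothetical helper: average time per call, in milliseconds.
function timeIt(fn, runs) {
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  return (performance.now() - start) / runs;
}

const has = (key) => Object.prototype.hasOwnProperty.call(trie, key);
const perLookup = timeIt(() => has('tony hawk'), 10000);

console.log('found:', has('tony hawk'));
console.log('ms per lookup:', perLookup);
```

To time the real thing, swap the stand-in object for the result of `efrt.unpack(compressed)` and wrap the `unpack` call itself in `timeIt`.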

Size

efrt will pack filesize down as much as possible, depending upon the redundancy of the prefixes/suffixes in the words, and the size of the list.

  • list of countries - 1.5k -> 0.8k (46% compressed)
  • all adverbs in wordnet - 58k -> 24k (58% compressed)
  • all adjectives in wordnet - 265k -> 99k (62% compressed)
  • all nouns in wordnet - 1,775k -> 692k (61% compressed)

but there are some things to consider:

  • bigger files compress further (see 🎈 birthday problem)
  • using efrt will reduce gains from gzip compression, which most webservers quietly use
  • english is more suffix-redundant than prefix-redundant, so non-english words may benefit from other styles

Assuming your data has a low category-to-data ratio, you will hit break-even at about 250 keys. If your data is in the thousands, you can be very confident about saving your users some considerable bandwidth.
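
To see how the gzip caveat plays out for your own data, you can compare raw and gzipped sizes in Node. The sketch below uses Node's built-in `zlib` and the data and packed string from the first example in this README; `byteSizes` and `percentSaved` are hypothetical helpers. Rounding down happens to reproduce the percentages listed above, but how those figures were originally computed is an assumption.

```javascript
const zlib = require('zlib');

// Data and packed string copied from the first example in this README.
const data = {
  bedfordshire: 'England',
  aberdeenshire: 'Scotland',
  buckinghamshire: 'England',
  argyllshire: 'Scotland',
  bambridgeshire: 'England',
  cheshire: 'England',
  ayrshire: 'Scotland',
  banffshire: 'Scotland'
};
const packed =
  'England:b0che1;ambridge0edford0uckingham0;shire|Scotland:a0banff1;berdeen0rgyll0yr0;shire';

// Hypothetical helper: byte size of a string, raw and after gzip.
function byteSizes(text) {
  return {
    raw: Buffer.byteLength(text, 'utf8'),
    gzipped: zlib.gzipSync(text).length
  };
}

// Hypothetical helper: percentage saved, rounded down.
function percentSaved(before, after) {
  return Math.floor((1 - after / before) * 100);
}

const jsonSizes = byteSizes(JSON.stringify(data));
const packedSizes = byteSizes(packed);

console.log('json  :', jsonSizes);
console.log('packed:', packedSizes);
console.log('raw saving    :', percentSaved(jsonSizes.raw, packedSizes.raw) + '%');
console.log('gzipped saving:', percentSaved(jsonSizes.gzipped, packedSizes.gzipped) + '%');
console.log(percentSaved(1.5, 0.8)); // 46, as in the "list of countries" line
```

On an input this small, gzip's fixed header overhead dominates, so run the comparison on your real data before drawing conclusions.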

Use

IE9+

<script src="https://unpkg.com/[email protected]/builds/efrt.min.js"></script>
<script>
  var smaller=efrt.pack(['larry','curly','moe'])
  var trie=efrt.unpack(smaller)
  console.log(trie['moe'])
</script>

If you're only doing the second step (unpacking) in the client, you can load just the unpack half of the library (~3k):

npm install efrt-unpack

<script src="https://unpkg.com/[email protected]/builds/efrt-unpack.min.js"></script>
<script>
  var trie=unpack(compressedStuff);
  trie.hasOwnProperty('miles davis');
</script>

Thanks to John Resig for his fun trie-compression post on his blog, and to Wiktor Jakubczyc for his performance analysis work.

MIT
