All Projects → KvanTTT → BaseNcoding

KvanTTT / BaseNcoding

Licence: Apache-2.0 license
Library for encoding of binary data into strings using base32, base85, base128 and other algorithms.

Programming Languages

C#
18002 projects
javascript
184084 projects - #8 most used programming language
HTML
75241 projects

Projects that are alternatives of or similar to BaseNcoding

multibase
multi base encoding/decoding utility
Stars: ✭ 15 (-64.29%)
Mutual labels:  encoding, base64, base32, base85
data-encoding
Efficient and customizable data-encoding functions in Rust
Stars: ✭ 92 (+119.05%)
Mutual labels:  base64, base32
scure-base
Secure, audited & 0-deps implementation of bech32, base64, base32, base16 & base58
Stars: ✭ 27 (-35.71%)
Mutual labels:  encoding, base64
tjson.js
JavaScript-compatible implementation of Tagged JSON (TJSON), written in TypeScript.
Stars: ✭ 53 (+26.19%)
Mutual labels:  base64, base32
universal-base64
Small universal base64 functions for node.js and browsers
Stars: ✭ 25 (-40.48%)
Mutual labels:  encoding, base64
zbase32
Human-oriented encoding for binary data
Stars: ✭ 31 (-26.19%)
Mutual labels:  encoding, zbase32
Base62x
Base62x is an alternative approach to Base 64 without symbols in output.
Stars: ✭ 38 (-9.52%)
Mutual labels:  base64, base32
Constant time encoding
Constant-Time Character Encoding in PHP Projects
Stars: ✭ 403 (+859.52%)
Mutual labels:  encoding, base64
Decodify
Detect and decode encoded strings, recursively.
Stars: ✭ 670 (+1495.24%)
Mutual labels:  encoding, base64
BeFoR64
BeFoR64, Base64 encoding/decoding library for FoRtran poor men
Stars: ✭ 17 (-59.52%)
Mutual labels:  encoding, base64
ffmpegd
FFmpeg websocket server for ffmpeg-commander.
Stars: ✭ 37 (-11.9%)
Mutual labels:  encoding
Powershell-Obfuscator
Powerful script for logical obfuscation of powershell scripts
Stars: ✭ 27 (-35.71%)
Mutual labels:  base64
cryptouri.rs
Rust implementation of CryptoURI: URN-like namespace for cryptographic objects with Bech32-based encoding
Stars: ✭ 33 (-21.43%)
Mutual labels:  encoding
convey
CSV processing and web related data types mutual conversion
Stars: ✭ 16 (-61.9%)
Mutual labels:  base64
RLPSwift
Recursive Length Prefix encoding written in Swift
Stars: ✭ 22 (-47.62%)
Mutual labels:  encoding
ffcvt
ffmpeg convert wrapper tool
Stars: ✭ 32 (-23.81%)
Mutual labels:  encoding
base64.c
Base64 Library in C
Stars: ✭ 60 (+42.86%)
Mutual labels:  base64
ImHex-Patterns
Hex patterns, include patterns and magic files for the use with the ImHex Hex Editor
Stars: ✭ 192 (+357.14%)
Mutual labels:  encoding
xquery-intellij-plugin
XQuery, XPath and XSLT Language Support for the IntelliJ IDE
Stars: ✭ 24 (-42.86%)
Mutual labels:  basex
fast-text-encoding
Fast polyfill for TextEncoder and TextDecoder, only supports UTF-8
Stars: ✭ 78 (+85.71%)
Mutual labels:  encoding

BaseNcoding

There are well-known algorithms for binary data to string encoding exist, such as algorithms with alphabet length with radix of power 2 (base32, base64) and not of power 2 (base85 and base91).

This library contains an implementation of algorithm for general case, that is a custom-length alphabet can be used. Here is online BaseNcoding DEMO.

The idea of the developed algorithm is based on base85 encoding, except block size is not constant, but it's calculation depends on an alphabet length.

Steps of algorithm

  1. Calculation of block size in bits and chars.
  2. Conversation of an input string to the byte array (using UTF8 Encoding).
  3. Splitting byte array on n-bit groups.
  4. Conversation of every group to radix n.
  5. Tail bits processing.

Mathematical justification

For optimal block size calculation the following considerations has been used:

System for an optimal block size calculation

In this system:

  • a — Length of alphabet A.
  • k — Count of encoding chars.
  • b — Digit radix (2 in most cases).
  • n — Number of bits in radix b for representing k chars of alphabet A.
  • r_values — Compression ratio (less is better).
  • r_bits — Bits per char count (greater is better).
  • mbc — Block max bits count.
  • ⌊x⌋ — The largest integer not greater than x (floor).
  • ⌈x⌉ — The smallest integer not less than x (ceiling).

See implementation in GetOptimalBitsCount method.

Tail bits processing

The most difficult problem in this project which nevertheless has been solved with only integer calculations.

Calculation of total and tail number of bits and chars during encoding

int mainBitsLength = (data.Length * 8 / BlockBitsCount) * BlockBitsCount;
int tailBitsLength = data.Length * 8 - mainBitsLength;
int totalBitsLength = mainBitsLength + tailBitsLength;
int mainCharsCount = mainBitsLength * BlockCharsCount / BlockBitsCount;
int tailCharsCount = (tailBitsLength * BlockCharsCount + BlockBitsCount - 1) / BlockBitsCount;
int totalCharsCount = mainCharsCount + tailCharsCount;
int iterationCount = mainCharsCount / BlockCharsCount;

Calculation of total and tail number of bits and chars during decoding

int totalBitsLength = ((data.Length - 1) * BlockBitsCount / BlockCharsCount + 8) / 8 * 8;
int mainBitsLength = totalBitsLength / BlockBitsCount * BlockBitsCount;
int tailBitsLength = totalBitsLength - mainBitsLength;
int mainCharsCount = mainBitsLength * BlockCharsCount / BlockBitsCount;
int tailCharsCount = (tailBitsLength * BlockCharsCount + BlockBitsCount - 1) / BlockBitsCount;
BigInteger tailBits = CharsToBits(data, mainCharsCount, tailCharsCount);
if (tailBits >> tailBitsLength != 0)
{
	totalBitsLength += 8;
	mainBitsLength = totalBitsLength / BlockBitsCount * BlockBitsCount;
	tailBitsLength = totalBitsLength - mainBitsLength;
	mainCharsCount = mainBitsLength * BlockCharsCount / BlockBitsCount;
	tailCharsCount = (tailBitsLength * BlockCharsCount + BlockBitsCount - 1) / BlockBitsCount;
}
int iterationCount = mainCharsCount / BlockCharsCount;

Experiment

Diagram of optimal block size and alphabet length dependence has been calculated with help of system above for mbc = 64:

Diagram values
a n k r_bits r_values
2 1 1 1 1
3 19 12 1.5833 1.0136
4 2 1 2 1
5 58 25 2.32 1.034
6 31 12 2.5833 1.0136
7 14 5 2.8 1.0258
8 3 1 3 1
9 19 6 3.1667 1.0136
10 63 19 3.3158 1.0842
11 38 11 3.4545 1.038
12 43 12 3.5833 1.0136
13 37 10 3.7 1.0031
14 19 5 3.8 1.0258
15 39 10 3.9 1.0489
16 4 1 4 1
17 49 12 4.0833 1.0349
18 25 6 4.1667 1.0136
19 55 13 4.2308 1.1672
20 56 13 4.3077 1.1369
21 57 13 4.3846 1.0719
22 49 11 4.4545 1.038
23 9 2 4.5 1.0332
24 55 12 4.5833 1.0136
25 51 11 4.6364 1.0588
26 47 10 4.7 1.0031
27 19 4 4.75 1.0136
28 24 5 4.8 1.0258
29 34 7 4.8571 1.0041
30 49 10 4.9 1.0489
31 64 13 4.9231 1.3237
32 5 1 5 1
33 5 1 5 1.0313
34 61 12 5.0833 1.0349
35 41 8 5.125 1.024
36 31 6 5.1667 1.0136
37 26 5 5.2 1.0333
38 47 9 5.2222 1.1739
39 58 11 5.2727 1.1015
40 53 10 5.3 1.1642
41 16 3 5.3333 1.0517
42 43 8 5.375 1.1008
43 27 5 5.4 1.0953
44 60 11 5.4545 1.038
45 60 11 5.4545 1.329
46 11 2 5.5 1.0332
47 61 11 5.5455 1.0721
48 39 7 5.5714 1.0679
49 28 5 5.6 1.0523
50 62 11 5.6364 1.0588
51 17 3 5.6667 1.012
52 57 10 5.7 1.0031
53 63 11 5.7273 1.005
54 23 4 5.75 1.0136
55 52 9 5.7778 1.0226
56 29 5 5.8 1.0258
57 64 11 5.8182 1.1187
58 41 7 5.8571 1.0041
59 47 8 5.875 1.0433
60 59 10 5.9 1.0489
61 59 10 5.9 1.2375
62 59 10 5.9 1.456
63 59 10 5.9 1.7086
64 6 1 6 1
65 6 1 6 1.0156
66 6 1 6 1.0313
67 6 1 6 1.0469
68 6 1 6 1.0625
69 61 10 6.1 1.0609
70 49 8 6.125 1.024
71 43 7 6.1429 1.034
72 37 6 6.1667 1.0136
73 37 6 6.1667 1.1011
74 31 5 6.2 1.0333
75 56 9 6.2222 1.042
76 56 9 6.2222 1.1739
77 25 4 6.25 1.0476
78 25 4 6.25 1.1031
79 63 10 6.3 1.0266
80 63 10 6.3 1.1642
81 19 3 6.3333 1.0136
82 19 3 6.3333 1.0517
83 51 8 6.375 1.0002
84 51 8 6.375 1.1008
85 32 5 6.4 1.0331
86 32 5 6.4 1.0953
87 45 7 6.4286 1.0722
88 58 9 6.4444 1.098
89 58 9 6.4444 1.2155
90 58 9 6.4444 1.3441
91 13 2 6.5 1.0109
92 13 2 6.5 1.0332
93 13 2 6.5 1.0558
94 13 2 6.5 1.0786

One can see that known base64, base85, base91 encodings have been developed in good points (minimal block size with good compression ratio).

Big numbers (C# BitInteger and JavaScript BigInt) are used for some calculations.

The algorithm also has a parallel version.

License

BaseN algorithm is licensed under the Apache 2.0 License.

More detail explanation available on Russian.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].