CROWDs
Conflict-free Reinterpretable Ordered Washed Data (Secure) - Delta based CRDT with additional abilities.
Key Properties
Conflict-free
- Any states can be merged without conflicts.
- Convergence (Strong Eventual Consistency).
- Merge result is independent of merge order.
- Merge is semilattice.
Reinterpretable
- Same state can be reinterpreted as any Type.
- Type of data can be changed dynamicaly without data migration.
- Cross-merge between different types is available.
Ordered
- Every data have a stable place in the document.
- Wiped data inside some Head stays tombstone to hold place.
- Interleaving-free.
Washed
- Wiped data comptely removes from state.
- Past state can't be reproduced. Snapshots/layers/changelog should be used for this.
- Small footprint. Metadata size ~= 4x-8x user data size.
- Garbage collection isn't required.
Data
- All deltas are idempotent.
- Closest to user data as more as possible.
- Every word is just one chunk.
- Delta is simply slice of full state.
- Deltas can be merged together to reduce transmit size.
Secure
- Every chunk can be crypto signed separately.
- Every peer checks signs and rejects incorrect chunks.
- Every chunk can be encrypted.
- Conflict-free merge avaailable without decrypt.
- Merging doesn't invalidate signs or decrypt data.
Vocabulary
- Doc - Full CROWD document (direct graph) which consists of real Chunks and virtual Nodes over them.
- Node - A single subtree which represented by few chunks with same Self in different Heads.
- Chunk - Minimal atomic chunk of data with metadata. Actually it's edge between Nodes. And it's extended CvRDT LWW-Register.
- Self - Node id
- Head - Parent Node id.
- Prev - Previous Node id in the siblings list.
- Next - Next Node id in the siblings list.
- Peer - Global unique identifier of independent actor.
- Time - Monotonic version.
- Data - Any JSON data.
- Sign - Crypto sign of whole Chunk data.
- Delta - Difference of two Doc state as list of Chunks.
- Clock - Vector clock. Dictionary which maps Peer to Time.
- Token - Minimal meaningfull part of text (single word + punctuation + one space).
- Point - Place inside Chunk. Usefull for caret.
- Range - Range between two Points. Usefull for selection.
- Offset - Count of letters from beginning.
- Channel - Geter/Setter method.
foo()
- read.foo(123)
- write. Write returns written.
Internals
State/Delta Format
type Chunk = {
head: number
self: number
prev: number
next: number
peer: number
time: number
data: unknown
sign: null | Uint8Array & { length: 64 }
}
type State = Chunk[]
type Delta = readonly Chunk[]
Internally Chunks may be stored in RDBMS. Example:
CREATE TABLE chunks (
head uint(6),
self uint(6),
prev uint(6),
next uint(6),
peer uint(6),
time uint(4),
data json,
sign byte(64),
)
Single Chunk structure
Primary key for Chunks: [ Head, Self ]
Data Types Representation
Atomic JSON Registry
Single value store. Just CvRDT LWW-Register.
$hyoo_crowd_reg
value( next?: unknown )
Channel for raw value. Returnsnull
by default.bool( next?: boolean )
Channel forboolean
value. Returnsfalse
by default.numb( next?: number )
Channel fornumber
value. Returns0
by default.str( next?: string )
Channel forstring
value. Returns""
by default.
Mergeable Struct
Struct is completely virtual thing. No one Chunk is stored for it. Only for field values (except it's structs too etc).
Lookup agorithm
- Make derived Head by formula:
field_head = hash_48bit( field_name, struct_self )
So all Peers writes to the same Node when uses the same key.
$hyoo_crowd_struct
sub( key: string )
Returns inner Node for field name.
Mergeable Ordered List
Properties
- New Chunk is created for every item.
- Left precedence. Position of item relies on left item, then right.
- No interleaving. Sequence of left-to-right inserted items will stay together after merge.
- Removed item is remain as tombstone for ordering purposes.
Ordering Algorithm
- Input: Head value.
- Select all Chunks with given Head.
- Make queue as sorted found Chunks by Time asc, Peer asc.
- Make empty list for result.
- Iterate over all queue while it isn't empty.
- If Prev == 0, then place it at the begin.
- If Prev != 0, then locate existen Prev in the result list.
- If Prev is located, place after that.
- if Prev isn't located, then check Next:
- If Next == 0, then place it at the end.
- If Next != 0, then locate existen Prev in the result list.
- If Next is located, place before that.
- if Next isn't located, then skip chunk and proceed followed.
- If chunk is placed remove it from queue and start from begin of queue.
$hyoo_crowd_list
list( next?: unknown[] )
Channel for list of raw values. Usesinsert
to replace content.insert( next?: unknown[], from?, to? )
Replaces range of items with reconciliation. Appends to the end when range isn't defined.
Mergeable Ordered Dictionary
sub( key: string )
Returns inner Node for key.list()
Returns list of keys.
It's both Struct and List:
- As list it contains keys.
- As struct it stores every key by derived Head.
So, every key is Node for value.
Mergeable Plain Text
Under the hood, text is just List of Tokens. So, entering word letter by letter changes same Chunk instead of creating new.
Properties
- Can be simply bound to native
<textarea>
. - Merge never produces unreadable token value. Only one of valid (LWW).
- No interleaving. The typed text will not be interrupted after merging.
- For
3.2MB
text (320k words) of "War and Peace" in CROWD Doc takes up40MB
(12x
) in JSON serialization and25MB
(8x
) in binary with signing (obsoleted data, sign was 32B but now 64B).
Online sandbox
Write Algorithm
- Input: new text and range of existen text.
- Locate Tokens which relate to the range.
- Before and after new text appen substrings of first and last tokens which should be untouched.
- Split new text using universal tokinizer.
- Reconciliate list of tokens unsing list insertion algorithm.
$hyoo_crowd_text
text( next?: string )
Channel for text representation of List. Useswrite
to replace content.write( next?: string, from?, to? )
Replaces range of text with reconciliation. Writes to the end when range isn't defined.
Mergeable Rich Text
Under the hood, tokens are stored in the same form as in plain text. There may be elements between them in form { tag: 'div' }
, which can contain the same content. Every token is represented as SPAN. Every DOM element has id
equal to Chunk Self. This is
is using to reuse existing Chunks and track Nodes moving.
$hyoo_crowd_dom
dom( next?: Element | DocumentFragment )
Channel for DOM representation of subtree.
$hyoo_crowd_html
html( next?: string )
Channel for XHTML serialization of DOM.
Mergeable Document
root
Returns root Node with Head = 0.delta( clock )
Returns delta between past clock and now.apply( delta )
Merges delta to current state.toJSON()
Returns full state dump.fork( peer: number )
Makes independent clone with another Peer for testing purposes.
Delta Algorithm
- Input: Clock, received from Peer.
- Iterate over all Chunk in Doc.
- Skip Chunks which Time less then Clock Time for same Peer.
- Return all remainig Chunks ordered by Time.
Example with SQL:
SELECT *
FROM chunks
WHERE
NOT( peer = 1 AND time <= 123 )
AND NOT( peer = 2 AND time <= 456 )
AND NOT( peer = 3 AND time <= 789 )
...
ORDER BY
time ASC,
peer ASC
Apply Algorithm
- Input: list of Chunks.
- Iterate over Chunks from Delta.
- Locate Chunk from Doc with same Head and Self.
- If Chunk doesn't exists, add Chunk to Doc.
- If Chunk exists and Time of new Chunk is greater, replace old by new.
- If Chunk exists and Time of new Chunk is same, but Peer is greater, replace old by new.
- Otherwise skip this Chunk.
Reinterpretations
✅ Expected behaviour.⭕ Unexpected but acceptable behaviour.❌ Unacceptable behaviour in most cases.
What\As | Atom | Struct | List | Dictionary | Text | DOM |
---|---|---|---|---|---|---|
Atom | ||||||
Struct | ||||||
List | ||||||
Dictionary | ||||||
Text | ||||||
DOM |
Binary Serialization
$hyoo_crowd_chunk_pack( chunk )
- Pack Chunk to binary.$hyoo_crowd_chunk_unpack( binary )
- Unpack Chunk from binary.
Use $mol_crypto to generate key-pair, sign packed Chunk and verify it.
Usage Example
// // Usage from NPM. Isn't required in MAM.
// import {
// $hyoo_crowd_doc,
// $hyoo_crowd_reg,
// $hyoo_crowd_list,
// $hyoo_crowd_text,
// } from 'hyoo_crowd_lib'
// Create document
const base = new $hyoo_crowd_doc();
// Make independent forks for testng
const alice = base.fork(1);
const bob = base.fork(2);
const carol = base.fork(3);
// Twice change register named "foo"
alice.root.sub("foo", $hyoo_crowd_reg).str("A1");
alice.root.sub("foo", $hyoo_crowd_reg).str("A2");
// Change register named "foo"
// Then converts it to sequence and insert some values
bob.root.sub("foo", $hyoo_crowd_reg).str("B1");
bob.root.sub("foo", $hyoo_crowd_list).insert(["B2", "B3"]);
// Replace text named "foo"
carol.root.sub("foo", $hyoo_crowd_text).text("C1 C2");
// Make deltas
const alice_delta = alice.delta(base.clock);
const bob_delta = bob.delta(base.clock);
const carol_delta = carol.delta(base.clock);
// Cross merge all of them
alice.apply(bob_delta).apply(carol_delta);
bob.apply(alice_delta).apply(carol_delta);
carol.apply(bob_delta).apply(alice_delta);
console.log(
["C1 ", "C2", "B1", "B2", "B3", "A2"],
alice.root.sub("foo", $hyoo_crowd_list).list(),
bob.root.sub("foo", $hyoo_crowd_list).list(),
carol.root.sub("foo", $hyoo_crowd_list).list()
);
Comparison of CRDT Libraries
$hyoo_crowd | Automerge | YJS | delta-crdt | |
---|---|---|---|---|
Approach | delta-state | op-log | delta-state | delta-state |
Garbage Collection | Doesn't required | Stores full history | Enabled by default | |
Changes signing | ||||
Merge without decrypt | ||||
Gzipped Bundle Size | 9 KB | 60 KB | 23 KB | 43 KB |
Sequence: 500 Push + 500 Shift Perf | 17 ms | 280 ms | 36 ms | |
Sequence: 500 Push + 500 Shift Mem | 80 KB | 2_100 KB | 12 KB | |
Text: 500 Append + 500 Crop Perf | 22 ms | 370 ms | 31 ms | |
Text: 500 Append + 500 Crop Mem | 80 KB | 3_300 KB | 13 KB |
Benchmarks
Sequence: Push + Shift
Chrome 92
FireFox 91
Text: Append + Crop
Chrome 89
FireFox 91
crdt-benchmarks
Support the Project
- Leave us a feedback in duscussions section.
- Fund $hyoo_guild to drive open source to the future.
- Hire nin-jin as a contractor for your collaborative app.