
pikkr / Pikkr

Licence: mit
JSON parser which picks up values directly without performing tokenization in Rust

Programming Languages

rust
11053 projects

Projects that are alternatives of or similar to Pikkr

Simdjson
Parsing gigabytes of JSON per second
Stars: ✭ 15,115 (+2506.03%)
Mutual labels:  json, json-parser, simd
Jsonparser
One of the fastest alternative JSON parser for Go that does not require schema
Stars: ✭ 4,323 (+645.34%)
Mutual labels:  json, json-parser
Jsoncons
A C++, header-only library for constructing JSON and JSON-like data formats, with JSON Pointer, JSON Patch, JSON Schema, JSONPath, JMESPath, CSV, MessagePack, CBOR, BSON, UBJSON
Stars: ✭ 400 (-31.03%)
Mutual labels:  json, json-parser
Pysimdjson
Python bindings for the simdjson project.
Stars: ✭ 432 (-25.52%)
Mutual labels:  json, simd
Mojojson
A simple and fast JSON parser.
Stars: ✭ 271 (-53.28%)
Mutual labels:  json, json-parser
Easyjson
Fast JSON serializer for golang.
Stars: ✭ 3,512 (+505.52%)
Mutual labels:  json, json-parser
Jstream
Streaming JSON parser for Go
Stars: ✭ 427 (-26.38%)
Mutual labels:  json, json-parser
Thorsserializer
C++ Serialization library for JSON
Stars: ✭ 241 (-58.45%)
Mutual labels:  json, json-parser
Swiftyjson
The better way to deal with JSON data in Swift.
Stars: ✭ 21,042 (+3527.93%)
Mutual labels:  json, json-parser
Simd Json
Rust port of simdjson
Stars: ✭ 499 (-13.97%)
Mutual labels:  json, simd
Argonaut
Purely functional JSON parser and library in scala.
Stars: ✭ 501 (-13.62%)
Mutual labels:  json, json-parser
Fastjson
A fast JSON parser/generator for Java.
Stars: ✭ 23,997 (+4037.41%)
Mutual labels:  json, json-parser
Oj
Optimized JSON
Stars: ✭ 2,824 (+386.9%)
Mutual labels:  json, json-parser
Bad json parsers
Exposing problems in json parsers of several programming languages.
Stars: ✭ 351 (-39.48%)
Mutual labels:  json, json-parser
Whc model
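High-efficiency conversion engine for iOS: json->model, model->json, model->Dictionary. Supports model classes that inherit from other model classes, supports conversion at a specified path, matches JSON keys to model property names case-insensitively, and automatically handles null values in JSON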
Stars: ✭ 244 (-57.93%)
Mutual labels:  json, json-parser
Jtc
JSON processing utility
Stars: ✭ 425 (-26.72%)
Mutual labels:  json, json-parser
Simdjsonsharp
C# bindings for lemire/simdjson (and full C# port)
Stars: ✭ 506 (-12.76%)
Mutual labels:  json, simd
Json Dry
🌞 JSON-dry allows you to serialize & revive objects containing circular references, dates, regexes, class instances,...
Stars: ✭ 214 (-63.1%)
Mutual labels:  json, json-parser
Flatcc
FlatBuffers Compiler and Library in C for C
Stars: ✭ 434 (-25.17%)
Mutual labels:  json, json-parser
Coolie
Coolie(苦力) helps you to create models (& their constructors) from a JSON file.
Stars: ✭ 508 (-12.41%)
Mutual labels:  json, json-parser

Pikkr


JSON parser which picks up values directly without performing tokenization in Rust

Abstract

Pikkr is a JSON parser written in Rust that picks up values directly without performing tokenization. It is implemented based on the following paper: Y. Li, N. R. Katsipoulakis, B. Chandramouli, J. Goldstein, and D. Kossmann. Mison: a fast JSON parser for data analytics. In VLDB, 2017.

This JSON parser extracts values from a JSON record without using finite state machines (FSMs) or performing tokenization. It parses JSON records in the following steps:

  1. [Indexing] Creates an index which maps logical locations of queried fields to their physical locations by using SIMD instructions and bit manipulation.
  2. [Basic parsing] Finds the values of queried fields by scanning a JSON record with the index created in the previous step and, in the early stages, learns their logical locations (i.e. the pattern of the JSON structure).
  3. [Speculative parsing] In the later stages, speculates the logical locations of queried fields from the learned patterns, jumps directly to their physical locations, and extracts the values. Falls back to basic parsing if the speculation fails.
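
The indexing step can be illustrated with a small scalar sketch. It is not Pikkr's actual code: the function names are hypothetical, the record is assumed to fit in one 64-bit word (at most 64 bytes), strings are assumed to contain no escaped quotes, and a plain loop stands in for the SIMD byte comparison. The string mask is computed with a prefix XOR, a common bit trick used here for brevity and not necessarily the method used by Pikkr or the paper.

```rust
/// Bitmap with bit i set when chunk[i] == target.
/// A scalar stand-in for a SIMD byte comparison; assumes chunk.len() <= 64.
fn char_bitmap(chunk: &[u8], target: u8) -> u64 {
    assert!(chunk.len() <= 64);
    let mut bits = 0u64;
    for (i, &b) in chunk.iter().enumerate() {
        if b == target {
            bits |= 1u64 << i;
        }
    }
    bits
}

/// Bit i of the result is the XOR of bits 0..=i of x.
/// Applied to the quote bitmap, it marks bytes that lie inside a string.
fn prefix_xor(mut x: u64) -> u64 {
    x ^= x << 1;
    x ^= x << 2;
    x ^= x << 4;
    x ^= x << 8;
    x ^= x << 16;
    x ^= x << 32;
    x
}

/// Bitmap of `target` characters that are outside of string literals.
fn structural(chunk: &[u8], target: u8) -> u64 {
    let in_string = prefix_xor(char_bitmap(chunk, b'"'));
    char_bitmap(chunk, target) & !in_string
}

/// Physical positions of the colons that belong to the outermost object.
fn level1_colons(chunk: &[u8]) -> Vec<usize> {
    let colons = structural(chunk, b':');
    let open = structural(chunk, b'{');
    let close = structural(chunk, b'}');
    let mut remaining = colons | open | close;
    let mut depth = 0i32;
    let mut out = Vec::new();
    while remaining != 0 {
        let i = remaining.trailing_zeros() as usize;
        let bit = 1u64 << i;
        if open & bit != 0 {
            depth += 1;
        } else if close & bit != 0 {
            depth -= 1;
        } else if depth == 1 {
            out.push(i);
        }
        remaining &= remaining - 1; // clear the lowest set bit
    }
    out
}

fn main() {
    // The colon inside "a:b" and the colon of the nested object are skipped.
    let rec = br#"{"f1": "a:b", "f2": {"f1": 1}}"#;
    println!("{:?}", level1_colons(rec)); // prints [5, 18]
}
```

The resulting positions are the "physical locations" of the top-level fields; the ordinal of a colon in that list (first, second, ...) plays the role of the field's "logical location".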

This JSON parser performs well when a JSON data stream or JSON collection contains only a limited number of structural variants, which is a common case in the data analytics field.
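
To see why few structural variants help, here is a minimal sketch of speculation with fallback for a single top-level field. It is an illustration under stated assumptions, not Pikkr's implementation: `Speculator`, `top_level_colons` and `key_matches` are hypothetical names, a scalar scan replaces the SIMD-built index, strings are assumed to contain no escaped quotes, and only the most recently seen position of the field is remembered.

```rust
/// Positions of the colons of the outermost object
/// (scalar stand-in for the index; assumes no escaped quotes).
fn top_level_colons(rec: &[u8]) -> Vec<usize> {
    let (mut in_string, mut depth, mut out) = (false, 0i32, Vec::new());
    for (i, &b) in rec.iter().enumerate() {
        match b {
            b'"' => in_string = !in_string,
            b'{' if !in_string => depth += 1,
            b'}' if !in_string => depth -= 1,
            b':' if !in_string && depth == 1 => out.push(i),
            _ => {}
        }
    }
    out
}

/// True when the quoted key that ends just before `colon` equals `field`.
fn key_matches(rec: &[u8], colon: usize, field: &[u8]) -> bool {
    let mut end = colon;
    while end > 0 && rec[end - 1] == b' ' {
        end -= 1; // skip spaces between the key and the colon
    }
    let quoted_len = field.len() + 2;
    if end < quoted_len {
        return false;
    }
    let key = &rec[end - quoted_len..end];
    key[0] == b'"' && key[quoted_len - 1] == b'"' && &key[1..quoted_len - 1] == field
}

struct Speculator {
    field: Vec<u8>,
    learned: Option<usize>, // ordinal of the field among top-level colons
}

impl Speculator {
    /// Returns the colon position of the field and whether speculation hit.
    fn locate(&mut self, rec: &[u8]) -> (Option<usize>, bool) {
        let colons = top_level_colons(rec);
        // Speculative path: jump straight to the learned ordinal and verify the key.
        if let Some(ord) = self.learned {
            if let Some(&c) = colons.get(ord) {
                if key_matches(rec, c, &self.field) {
                    return (Some(c), true);
                }
            }
        }
        // Fallback: check every top-level key and remember where the field was.
        let found = colons.iter().position(|&c| key_matches(rec, c, &self.field));
        if let Some(ord) = found {
            self.learned = Some(ord);
        }
        (found.map(|ord| colons[ord]), false)
    }
}

fn main() {
    let mut s = Speculator { field: b"f2".to_vec(), learned: None };
    for rec in [
        r#"{"f1": "a", "f2": 1}"#,
        r#"{"f1": "b", "f2": 2}"#,
        r#"{"f2": 3, "f1": "c"}"#,
    ] {
        println!("{:?}", s.locate(rec.as_bytes()));
    }
}
```

When most records share one layout, the speculative path succeeds with a single key comparison; a record with a different field order only costs the fallback scan.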

Please read the paper mentioned in the opening paragraph for the details of the JSON parsing algorithm.

Performance

Benchmark Result

Hardware

Model Name: MacBook Pro
Processor Name: Intel Core i7
Processor Speed: 3.3 GHz
Number of Processors: 1
Total Number of Cores: 2
L2 Cache (per Core): 256 KB
L3 Cache: 4 MB
Memory: 16 GB

Rust

$ cargo --version
cargo 0.23.0-nightly (34c0674a2 2017-09-01)

$ rustc --version
rustc 1.22.0-nightly (d93036a04 2017-09-07)

Crates

JSON Data

Benchmark Code

Example

Code

extern crate pikkr;

fn main() {
    let queries = vec![
        "$.f1".as_bytes(),
        "$.f2.f1".as_bytes(),
    ];
    let train_num = 2; // Number of records used as training data
                       // before Pikkr starts speculative parsing.
    let mut p = match pikkr::Pikkr::new(&queries, train_num) {
        Ok(p) => p,
        Err(err) => panic!("There was a problem creating a JSON parser: {:?}", err.kind()),
    };
    let recs = vec![
        r#"{"f1": "a", "f2": {"f1": 1, "f2": true}}"#,
        r#"{"f1": "b", "f2": {"f1": 2, "f2": true}}"#,
        r#"{"f1": "c", "f2": {"f1": 3, "f2": true}}"#, // Speculative parsing starts from this record.
        r#"{"f2": {"f2": true, "f1": 4}, "f1": "d"}"#,
        r#"{"f2": {"f2": true, "f1": 5}}"#,
        r#"{"f1": "e"}"#
    ];
    for rec in recs {
        match p.parse(rec.as_bytes()) {
            Ok(results) => {
                for result in results {
                    print!("{} ", match result {
                        Some(result) => String::from_utf8(result.to_vec()).unwrap(),
                        None => String::from("None"),
                    });
                }
                println!();
            },
            Err(err) => println!("There was a problem parsing a record: {:?}", err.kind()),
        }
    }
    /*
    Output:
        "a" 1
        "b" 2
        "c" 3
        "d" 4
        None 5
        "e" None
    */
}

Build

$ cargo --version
cargo 0.23.0-nightly (34c0674a2 2017-09-01) # Make sure that nightly release is being used.
$ RUSTFLAGS="-C target-cpu=native" cargo build --release

Run

$ ./target/release/[package name]
"a" 1
"b" 2
"c" 3
"d" 4
None 5
"e" None

Documentation

Restrictions

  • Because Pikkr uses AVX2 instructions, the Rust nightly channel and a CPU with AVX2 support are required both to build Rust code that depends on Pikkr and to run the resulting executable.
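
A program can check for AVX2 at run time before relying on an AVX2-only code path. The sketch below is not part of Pikkr's API: `has_avx2` and `status_message` are hypothetical helpers, and `is_x86_feature_detected!` is a standard-library macro from current stable Rust, which is newer than the nightly toolchain listed in this README.

```rust
/// Reports whether the running CPU supports AVX2 (x86 and x86_64 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_avx2() -> bool {
    is_x86_feature_detected!("avx2")
}

/// On other architectures AVX2 does not exist.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_avx2() -> bool {
    false
}

/// Human-readable summary of the detection result.
fn status_message(avx2: bool) -> &'static str {
    if avx2 {
        "AVX2 detected"
    } else {
        "AVX2 not detected: a Pikkr-based binary is not expected to run on this CPU"
    }
}

fn main() {
    println!("{}", status_message(has_avx2()));
}
```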

Contributing

Any kind of contribution (e.g. comments, suggestions, questions, bug reports and pull requests) is welcome.
