EgorBo / Simdjsonsharp

Licence: apache-2.0
C# bindings for lemire/simdjson (and full C# port)

SimdJsonSharp: Parsing gigabytes of JSON per second

A C# version of lemire/simdjson (by Daniel Lemire and Geoff Langdale, https://arxiv.org/abs/1902.08318), fully ported from C++ to C#. I tried to keep the same format and API. The library accelerates JSON parsing and minification using SIMD instructions (AVX2). The C# version uses the System.Runtime.Intrinsics API.
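
To give a feel for what the System.Runtime.Intrinsics API looks like, here is a minimal sketch that counts double-quote bytes 32 at a time with AVX2 and falls back to a scalar loop otherwise. It is not taken from SimdJsonSharp; the QuoteCounter class and CountQuotes method are hypothetical names used only for illustration, and the code needs AllowUnsafeBlocks enabled.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper, not part of SimdJsonSharp.
public static class QuoteCounter
{
    // Counts '"' bytes in a UTF-8 buffer, 32 bytes per step when AVX2 is available.
    public static unsafe int CountQuotes(ReadOnlySpan<byte> data)
    {
        int count = 0;
        int i = 0;

        if (Avx2.IsSupported)
        {
            // Broadcast the quote byte into all 32 lanes.
            Vector256<byte> quote = Vector256.Create((byte)'"');
            fixed (byte* ptr = data)
            {
                for (; i + 32 <= data.Length; i += 32)
                {
                    Vector256<byte> chunk = Avx.LoadVector256(ptr + i);
                    // Each matching lane becomes 0xFF, every other lane 0x00.
                    Vector256<byte> eq = Avx2.CompareEqual(chunk, quote);
                    // Collapse the 32 lanes into a 32-bit mask, one bit per byte.
                    uint mask = (uint)Avx2.MoveMask(eq);
                    count += BitOperations.PopCount(mask);
                }
            }
        }

        // Scalar tail (and full fallback when AVX2 is not supported).
        for (; i < data.Length; i++)
        {
            if (data[i] == (byte)'"') count++;
        }

        return count;
    }
}
```

The compare-then-movemask pattern turns 32 byte comparisons into a single bitmask, which is the general kind of trick SIMD JSON parsers rely on to locate structural characters quickly.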

UPD: It is now also available as a set of P/Invoke bindings on top of the native library, packaged as a .NETStandard 2.0 library, so there are two implementations:

  1. Fully managed netcoreapp3.0 library (100% port from C++ to C#), version 1.5.0
  2. netstandard2.0 library with the native lib (bindings are generated via xoofx/CppAst), version 1.7.0

Benchmarks

The following benchmarks compare SimdJsonSharp with the .NET Core 3.0 Utf8JsonReader, Json.NET, and SpanJson. Test JSON files can be found here.

1. Parse doubles

Open canada.json and parse all coordinates as System.Double:

|          Method |     fileName |    fileSize |      Mean | Ratio |
|---------------- |------------- |-------------|----------:|------:|
|        SimdJson |  canada.json | 2,251.05 Kb |  4.733 ms |  1.00 |
|  Utf8JsonReader |  canada.json | 2,251.05 Kb | 56.692 ms | 11.98 |
|         JsonNet |  canada.json | 2,251.05 Kb | 70.078 ms | 14.81 |
|    SpanJsonUtf8 |  canada.json | 2,251.05 Kb | 54.878 ms | 11.60 |

2. Count all tokens

|            Method |           fileName |    fileSize |         Mean | Ratio |
|------------------ |------------------- |------------ |-------------:|------:|
|          SimdJson | apache_builds.json |   127.28 Kb |     99.28 us |  1.00 |
|    Utf8JsonReader | apache_builds.json |   127.28 Kb |    226.42 us |  2.28 |
|           JsonNet | apache_builds.json |   127.28 Kb |    461.30 us |  4.64 |
|      SpanJsonUtf8 | apache_builds.json |   127.28 Kb |    168.08 us |  1.69 |
|                   |                    |             |              |       |
|          SimdJson |        canada.json | 2,251.05 Kb |  4,494.44 us |  1.00 |
|    Utf8JsonReader |        canada.json | 2,251.05 Kb |  6,308.01 us |  1.40 |
|           JsonNet |        canada.json | 2,251.05 Kb | 67,718.12 us | 15.06 |
|      SpanJsonUtf8 |        canada.json | 2,251.05 Kb |  6,679.82 us |  1.49 |
|                   |                    |             |              |       |
|          SimdJson |  citm_catalog.json | 1,727.20 Kb |  1,572.78 us |  1.00 |
|    Utf8JsonReader |  citm_catalog.json | 1,727.20 Kb |  3,786.10 us |  2.41 |
|           JsonNet |  citm_catalog.json | 1,727.20 Kb |  5,903.38 us |  3.75 |
|      SpanJsonUtf8 |  citm_catalog.json | 1,727.20 Kb |  3,021.13 us |  1.92 |
|                   |                    |             |              |       |
|          SimdJson | github_events.json |    65.13 Kb |     46.01 us |  1.00 |
|    Utf8JsonReader | github_events.json |    65.13 Kb |    113.80 us |  2.47 |
|           JsonNet | github_events.json |    65.13 Kb |    214.01 us |  4.65 |
|      SpanJsonUtf8 | github_events.json |    65.13 Kb |     89.09 us |  1.94 |
|                   |                    |             |              |       |
|          SimdJson |     gsoc-2018.json | 3,327.83 Kb |  2,209.42 us |  1.00 |
|    Utf8JsonReader |     gsoc-2018.json | 3,327.83 Kb |  4,010.10 us |  1.82 |
|           JsonNet |     gsoc-2018.json | 3,327.83 Kb |  6,729.44 us |  3.05 |
|      SpanJsonUtf8 |     gsoc-2018.json | 3,327.83 Kb |  2,759.59 us |  1.25 |
|                   |                    |             |              |       |
|          SimdJson |   instruments.json |   220.35 Kb |    257.78 us |  1.00 |
|    Utf8JsonReader |   instruments.json |   220.35 Kb |    594.22 us |  2.31 |
|           JsonNet |   instruments.json |   220.35 Kb |    980.42 us |  3.80 |
|      SpanJsonUtf8 |   instruments.json |   220.35 Kb |    409.47 us |  1.59 |
|                   |                    |             |              |       |
|          SimdJson |      truenull.json |    12.00 Kb |  16,032.6 ns |  1.00 |
|    Utf8JsonReader |      truenull.json |    12.00 Kb |  58,365.2 ns |  3.64 |
|           JsonNet |      truenull.json |    12.00 Kb |  60,977.3 ns |  3.80 |
|      SpanJsonUtf8 |      truenull.json |    12.00 Kb |  24,069.2 ns |  1.50 |
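
To relate these timings to the "gigabytes per second" claim, the rows can be converted into throughput. The helper below is a small sketch with hypothetical names; it assumes the fileSize column uses 1 Kb = 1024 bytes, which the table itself does not state, and reports decimal gigabytes (10^9 bytes) per second.

```csharp
using System;

// Hypothetical helper for reading the benchmark tables; not part of SimdJsonSharp.
public static class Throughput
{
    // Converts a file size in Kb (assumed to mean 1024 bytes) and a mean time in
    // microseconds into decimal gigabytes per second.
    public static double GigabytesPerSecond(double sizeKb, double meanMicroseconds)
    {
        double bytes = sizeKb * 1024.0;
        double seconds = meanMicroseconds / 1_000_000.0;
        return bytes / seconds / 1e9;
    }
}
```

For example, Throughput.GigabytesPerSecond(3327.83, 2209.42) for the SimdJson row on gsoc-2018.json comes out at roughly 1.5 GB/s, and the canada.json row (2,251.05 Kb in 4,494.44 us) at roughly 0.5 GB/s.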

3. JSON minification

|                Method |           fileName |    fileSize |         Mean | Ratio |
|---------------------- |------------------- |------------ |-------------:|------:|
|  SimdJsonNoValidation | apache_builds.json |   127.28 Kb |     186.8 us |  1.00 |
|              SimdJson | apache_builds.json |   127.28 Kb |     262.5 us |  1.41 |
|               JsonNet | apache_builds.json |   127.28 Kb |   1,802.6 us |  9.65 |
|                       |                    |             |              |       |
|  SimdJsonNoValidation |        canada.json | 2,251.05 Kb |   4,130.7 us |  1.00 |
|              SimdJson |        canada.json | 2,251.05 Kb |   7,940.7 us |  1.92 |
|               JsonNet |        canada.json | 2,251.05 Kb | 181,884.0 us | 44.06 |
|                       |                    |             |              |       |
|  SimdJsonNoValidation |  citm_catalog.json | 1,727.20 Kb |   2,346.9 us |  1.00 |
|              SimdJson |  citm_catalog.json | 1,727.20 Kb |   4,064.0 us |  1.75 |
|               JsonNet |  citm_catalog.json | 1,727.20 Kb |  34,831.0 us | 14.84 |

Usage

The C# API is not stable yet. It currently mirrors the original C-style API, so it involves some unsafe code, including pointers.

Add the NuGet package SimdJsonSharp.Managed (for .NET Core 3.0), or SimdJsonSharp.Bindings for a .NETStandard 2.0 package (.NET 4.x, .NET Core 2.x, etc.):

dotnet add package SimdJsonSharp.Bindings
or
dotnet add package SimdJsonSharp.Managed

The following sample parses a file and iterates over its numeric tokens:

// Requires an unsafe context (AllowUnsafeBlocks) because of the fixed statement.
byte[] bytes = File.ReadAllBytes(somefile);
fixed (byte* ptr = bytes) // pin bytes while we are working on them
using (ParsedJson doc = SimdJson.ParseJson(ptr, bytes.Length))
using (var iterator = doc.CreateIterator())
{
    // Walk every token in document order and print the numeric ones.
    while (iterator.MoveForward())
    {
        if (iterator.GetTokenType() == JsonTokenType.Number)
            Console.WriteLine("integer: " + iterator.GetInteger());
    }
}

UPD: in SimdJsonSharp.Bindings, type names carry an 'N' suffix, e.g. ParsedJsonN.

As you can see, the API looks similar to Utf8JsonReader, which was recently introduced in .NET Core 3.0.

It is also possible to just validate JSON or to minify it (remove whitespace, etc.):

string someJson = ...;
string minifiedJson = SimdJson.MinifyJson(someJson);
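
For readers unsure what minification does to a document, here is a plain scalar sketch of the transformation: it drops whitespace that sits outside string literals and leaves everything else untouched. This is an illustration only, not the library's SIMD implementation; ScalarMinifier is a hypothetical name, and the sketch does not validate its input.

```csharp
using System.Text;

// Hypothetical reference implementation; not part of SimdJsonSharp.
public static class ScalarMinifier
{
    // Removes JSON whitespace (space, tab, CR, LF) that lies outside string literals.
    public static string Minify(string json)
    {
        var sb = new StringBuilder(json.Length);
        bool inString = false; // currently between an opening and closing quote
        bool escaped = false;  // previous character was a backslash inside a string

        foreach (char c in json)
        {
            if (inString)
            {
                // Everything inside a string is kept, including whitespace.
                sb.Append(c);
                if (escaped) escaped = false;
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
            }
            else if (c == '"')
            {
                inString = true;
                sb.Append(c);
            }
            else if (c != ' ' && c != '\t' && c != '\r' && c != '\n')
            {
                sb.Append(c);
            }
        }

        return sb.ToString();
    }
}
```

Tracking the escape state matters because an escaped quote inside a string must not be treated as the end of that string.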

Requirements

  • AVX2-enabled CPU
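
Whether the current machine meets this requirement can be checked at run time through the standard Avx2.IsSupported property on .NET Core 3.0 or later. The sketch below wraps it in a hypothetical CpuCheck class; what an application does when AVX2 is missing (here, a message suggesting another parser) is an assumption for illustration, not behaviour documented by SimdJsonSharp.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper; not part of SimdJsonSharp.
public static class CpuCheck
{
    // True when the runtime can use AVX2 instructions on the current machine.
    public static bool HasAvx2() => Avx2.IsSupported;

    // Human-readable summary, e.g. for logging at application startup.
    public static string Describe() =>
        HasAvx2()
            ? "AVX2 available: the SIMD code path can be used."
            : "AVX2 not available: fall back to another JSON parser.";
}
```

Calling CpuCheck.HasAvx2() once at startup lets an application choose between a SIMD-based parser and, for instance, Utf8JsonReader before any JSON is processed.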