dgraham / Json Stream

License: MIT
A streaming JSON parser that generates SAX-like events.

JSON::Stream

JSON::Stream is a JSON parser, based on a finite state machine, that generates events for each state change. This allows streaming both the JSON document into memory and the parsed object graph out of memory to some other process.

This is much like an XML SAX parser that generates events during parsing. There is no requirement for the document, or the object graph, to be fully buffered in memory. This is best suited for huge JSON documents that won't fit in memory, such as large map/reduce views streamed from Apache CouchDB.

Usage

The simplest way to parse is to read the full JSON document into memory and then parse it into a full object graph. This is fine for small documents because we have room for both the document and parsed object in memory.

require 'json/stream'
json = File.read('/tmp/test.json')
obj = JSON::Stream::Parser.parse(json)

While it's possible to do this with JSON::Stream, we really want to use the json gem for documents like this. JSON.parse() is much faster than this parser, because it can rely on having the entire document in memory to analyze.

For larger documents we can use an IO object to stream it into the parser. We still need room for the parsed object, but the document itself is never fully read into memory.

require 'json/stream'
# The block form of File.open closes the file once parsing finishes.
obj = File.open('/tmp/test.json') { |stream| JSON::Stream::Parser.parse(stream) }

Again, while JSON::Stream can be used this way, if we just need to stream the document from disk or the network, we're better off using the yajl-ruby gem.

JSON::Stream is really useful when huge documents arrive over the network in small chunks to an EventMachine receive_data loop. Inside an EventMachine::Connection subclass we might have:

def post_init
  @parser = JSON::Stream::Parser.new do
    start_document { puts "start document" }
    end_document   { puts "end document" }
    start_object   { puts "start object" }
    end_object     { puts "end object" }
    start_array    { puts "start array" }
    end_array      { puts "end array" }
    key            { |k| puts "key: #{k}" }
    value          { |v| puts "value: #{v}" }
  end
end

def receive_data(data)
  @parser << data
rescue JSON::Stream::ParserError
  # Malformed JSON: give up on this connection.
  close_connection
end

The parser accepts chunks of the JSON document and parses up to the end of the available buffer. Passing in more data resumes the parse from the prior state. When an interesting state change happens, the parser notifies all registered callback procs of the event.
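
To make the idea of resuming from the prior state concrete, here is a toy scanner that keeps its state between chunks. It is only a sketch: BracketScanner is a hypothetical class, not JSON::Stream's implementation, and it recognizes nothing but object and array brackets while skipping over string contents.

```ruby
# Toy resumable scanner (hypothetical, NOT part of JSON::Stream).
# It only illustrates keeping parser state between chunks.
class BracketScanner
  EVENTS = {
    '{' => :start_object, '}' => :end_object,
    '[' => :start_array,  ']' => :end_array
  }.freeze

  attr_reader :events

  def initialize
    @in_string = false # currently inside a JSON string literal?
    @escaped   = false # was the previous character a backslash in a string?
    @events    = []
  end

  # Consume one chunk. State left over from earlier chunks is reused,
  # so a string or escape sequence may span a chunk boundary.
  def <<(chunk)
    chunk.each_char do |ch|
      if @in_string
        if @escaped
          @escaped = false
        elsif ch == '\\'
          @escaped = true
        elsif ch == '"'
          @in_string = false
        end
      elsif ch == '"'
        @in_string = true
      elsif EVENTS.key?(ch)
        @events << EVENTS[ch]
      end
    end
    self
  end
end

scanner = BracketScanner.new
# The document {"rows":[{"k":"a]\""}]} arrives in three pieces; the first
# chunk ends in the middle of an escape sequence inside a string.
['{"rows":[{"k":"a]\\', '""}', ']}'].each { |chunk| scanner << chunk }
p scanner.events
```

Because the in-string and escape flags live in instance variables, the bracket inside the string is ignored even though the string is cut in two by a chunk boundary. A real parser tracks far more state, but the principle is the same.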

The event callbacks are where we can filter data and pass it on to other processes. The above example simply prints state changes, but imagine the callbacks looking for an array named rows and processing sets of these row objects in small batches. Millions of rows, streaming over the network, can be processed in constant memory space this way.
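
A minimal sketch of that batching idea follows. RowBatcher is a hypothetical helper, not part of JSON::Stream. It assumes the rows array sits directly under a "rows" key of the top-level object, and it exposes one method per parser event so the callbacks can forward to it. The events are driven by hand here so the sketch runs without a parser.

```ruby
# Hypothetical helper (not part of JSON::Stream): rebuilds each element of
# a top-level "rows" array and hands rows to a block in fixed-size batches.
class RowBatcher
  def initialize(batch_size, &on_batch)
    @batch_size = batch_size
    @on_batch   = on_batch
    @batch      = []
    @depth      = 0   # nesting depth within the whole document
    @last_key   = nil # most recent key seen in the top-level object
    @rows_depth = nil # depth of the "rows" array while we are inside it
    @stack      = []  # partially built containers for the current row
    @keys       = []  # pending hash key for each container on the stack
  end

  def start_object; open_container({}); end
  def start_array;  open_container([]); end
  def end_object;   close_container;    end
  def end_array;    close_container;    end

  def key(k)
    if @stack.empty?
      @last_key = k if @depth == 1
    else
      @keys[-1] = k
    end
  end

  def value(v)
    attach(v) unless @stack.empty?
  end

  # Hand over whatever is buffered, even if the batch is not full.
  def flush
    return if @batch.empty?
    @on_batch.call(@batch)
    @batch = []
  end

  private

  def open_container(container)
    @depth += 1
    if @rows_depth.nil?
      in_rows = container.is_a?(Array) && @depth == 2 && @last_key == 'rows'
      @rows_depth = @depth if in_rows
    else
      @stack.push(container)
      @keys.push(nil)
    end
  end

  def close_container
    if @rows_depth && @depth == @rows_depth && @stack.empty?
      @rows_depth = nil # the rows array itself has ended
      flush
    elsif !@stack.empty?
      finished = @stack.pop
      @keys.pop
      if @stack.empty?
        @batch << finished # a complete row
        flush if @batch.size >= @batch_size
      else
        attach(finished)   # a nested value inside a row
      end
    end
    @depth -= 1
  end

  def attach(v)
    parent = @stack.last
    parent.is_a?(Hash) ? parent[@keys.last] = v : parent << v
  end
end

batcher = RowBatcher.new(2) { |rows| puts "batch: #{rows.inspect}" }

# Events in the order a parser would emit them for
# {"total_rows":3,"rows":[{"id":"a"},{"id":"b"},{"id":"c"}]}
batcher.start_object
batcher.key('total_rows'); batcher.value(3)
batcher.key('rows')
batcher.start_array
%w[a b c].each do |id|
  batcher.start_object
  batcher.key('id'); batcher.value(id)
  batcher.end_object
end
batcher.end_array
batcher.end_object
```

Only one batch plus the row under construction is held at any time, which is what keeps memory use constant. To connect this to the parser, each callback would forward to the same-named method, for example key { |k| batcher.key(k) }, assuming batcher is a local variable that the callback blocks close over.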

Alternatives

Development

$ bin/setup
$ bin/rake test

License

JSON::Stream is released under the MIT license. Check the LICENSE file for details.
