shellbound / Jwalk

License: MIT
Streaming JSON parser for Unix


jwalk

jwalk is a streaming JSON parser for Unix: streaming, in that individual JSON tokens are parsed as soon as they are read from the input stream, and for Unix, in that its tab-delimited output is designed to be used and manipulated by the standard Unix toolset.

jwalk…

  • parses large documents slowly, but steadily, in memory space proportional to the key depth of the document
  • runs from source on any contemporary POSIX system
  • is written in standard awk, sed, and sh, and does not require a C compiler or precompiled binaries
  • can easily be embedded in another project

jwalk is useful for working with data from JSON APIs in shell scripts, especially in bootstrap environments, but can be applied to a variety of other situations. It is a powerful command-line tool in its own right, with built-in pattern filtering and support for awk scripts called examiners.

How It Works

The jwalk command reads a JSON document from standard input or from a file specified as an argument.

A pipeline inside jwalk transforms the document stream into a series of tokens, and then parses the tokens into records, one record per line, on standard output.

Each record is a sequence of tab-separated fields:

  • zero or more fields, collectively the path, containing the string keys used to access the value, followed by
  • one field specifying the value's type, followed by
  • one field representing the value itself.

The type is one of number, string, boolean, null, array, or object. String values are encoded as UTF-8, and are unescaped with the exception of \n, \t, and \\.
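
As a rough illustration of this layout, the sketch below labels the path, type, and value of each record using plain awk. The sample records are hand-written to mimic the format described above, and the function name describe_records is hypothetical. Joining the path keys with a dot is only a display choice made for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: label the path, type, and value of each record.
# The last field is the value, the field before it is the type, and any
# fields before that form the path.
describe_records() {
  awk -F '\t' '{
    path = ""
    # Join the path keys with "." for display only.
    for (i = 1; i <= NF - 2; i++) path = path (i > 1 ? "." : "") $i
    printf "path=%s type=%s value=%s\n", path, $(NF - 1), $NF
  }'
}

# Hand-written sample records (tab-separated), mimicking the format above.
printf 'object\t\nversion\tstring\t1.0.0\n' | describe_records
```

The explicit tab separator matters here: an object or array record ends with an empty value field, which awk's default whitespace splitting would drop.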

Examples

(In this documentation, ▷ represents a tab character.)

Basic JSON values produce one record each:

$ echo '123.45' | jwalk
number ▷ 123.45

$ echo '"acab"' | jwalk
string ▷ acab

$ echo 'true' | jwalk
boolean ▷ true

$ echo 'null' | jwalk
null ▷

Arrays and objects produce one record representing the type, followed by zero or more records representing their key-value pairs:

$ echo '[80,"http"]' | jwalk
array ▷
0 ▷ number ▷ 80
1 ▷ string ▷ http

$ echo '{"version":"1.0.0"}' | jwalk
object ▷
version ▷ string ▷ 1.0.0

You can use the -l (or --leaf-only) command-line option to omit the type record:

$ echo '[80,"http"]' | jwalk -l
0 ▷ number ▷ 80
1 ▷ string ▷ http

$ echo '{"version":"1.0.0"}' | jwalk -l
version ▷ string ▷ 1.0.0

An array of objects looks like:

$ echo '[{"lat":45.1,"lng":13.6,"name":"Rovinj"},
>        {"lat":44.9,"lng":13.8,"name":"Pula"}]' | jwalk
array ▷
0 ▷ object ▷
0 ▷ lat ▷ number ▷ 45.1
0 ▷ lng ▷ number ▷ 13.6
0 ▷ name ▷ string ▷ Rovinj
1 ▷ object ▷
1 ▷ lat ▷ number ▷ 44.9
1 ▷ lng ▷ number ▷ 13.8
1 ▷ name ▷ string ▷ Pula

With -l, the same array looks like:

$ echo '[{"lat":45.1,"lng":13.6,"name":"Rovinj"},
>        {"lat":44.9,"lng":13.8,"name":"Pula"}]' | jwalk -l
0 ▷ lat ▷ number ▷ 45.1
0 ▷ lng ▷ number ▷ 13.6
0 ▷ name ▷ string ▷ Rovinj
1 ▷ lat ▷ number ▷ 44.9
1 ▷ lng ▷ number ▷ 13.8
1 ▷ name ▷ string ▷ Pula
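
Leaf-only records like these can be regrouped into one line per array element with a few lines of awk. The sketch below is one possible approach, not part of jwalk: the function names rows_by_index and sample_leaf_records are hypothetical, and the sample input is typed in by hand from the output shown above.

```shell
#!/bin/sh
# Hypothetical helper: turn leaf records of the form
#   <index> TAB <key> TAB <type> TAB <value>
# into one tab-separated line per array element: name, lat, lng.
rows_by_index() {
  awk -F '\t' '
    NF == 4 {
      # Remember each index in order of first appearance.
      if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
      field[$1, $2] = $4
    }
    END {
      for (i = 1; i <= n; i++) {
        k = order[i]
        printf "%s\t%s\t%s\n", field[k, "name"], field[k, "lat"], field[k, "lng"]
      }
    }'
}

# Sample input typed in from the leaf-only output shown above.
sample_leaf_records() {
  printf '0\tlat\tnumber\t45.1\n'
  printf '0\tlng\tnumber\t13.6\n'
  printf '0\tname\tstring\tRovinj\n'
  printf '1\tlat\tnumber\t44.9\n'
  printf '1\tlng\tnumber\t13.8\n'
  printf '1\tname\tstring\tPula\n'
}

sample_leaf_records | rows_by_index
```

Note that this helper buffers every record until the end of input, so it gives up the streaming behavior of the records it consumes. For very large arrays, printing each row as soon as its last key arrives would be a better fit.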

Filtering Records By Path

You can use the -p <pattern> (or --pattern <pattern>) command-line option to instruct jwalk to print only the records whose keys match the given pattern.

A pattern describes a key or sequence of keys present anywhere in a record's path. For example:

  • name matches records whose path contains a key "name"
  • person.name matches records whose path contains the key "person" immediately followed by the key "name"

Patterns may contain any of the following special strings:

String    Matches
^         the beginning of the path
$         the end of the path
.         the boundary between two adjacent keys
*         wildcard; zero or more occurrences of any character in a key
.**       zero or more keys

To match these strings literally, escape them by placing a \ character in front. To match a literal backslash, use \\.

Example Patterns

Pattern    Matches records
^a         starting with the key "a"
*.*        with at least two keys
a          with the key "a"
(empty)    with the key ""
a.b.c.     with the keys "a", "b", and "c", followed by the key ""
a*c        having any key which starts with a and ends with c
a.*.c      with the key "a", followed by one key, followed by the key "c"
a.**.c     with the key "a", followed by zero or more keys, followed by the key "c"
c$         ending with the key "c"
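
To make the meaning of an unanchored two-key pattern such as person.name concrete, the sketch below emulates it in plain awk over hand-written records. It is only an approximation for illustration: it is not how jwalk implements patterns, it handles no wildcards or anchors, and the function name match_adjacent_keys is hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for the pattern "A.B": print records whose path
# contains key A immediately followed by key B. The last two fields of a
# record are the type and value, so only fields 1..NF-2 are path keys.
# Keys containing backslashes would need extra care, because awk -v
# interprets escape sequences in assigned values.
match_adjacent_keys() {
  awk -F '\t' -v a="$1" -v b="$2" '{
    for (i = 1; i + 1 <= NF - 2; i++)
      # Appending "" forces string (not numeric) comparison.
      if (($i "") == (a "") && ($(i + 1) "") == (b "")) { print; next }
  }'
}

# Hand-written records: only the first has "person" directly before "name".
printf 'person\tname\tstring\tAda\nperson\tpet\tname\tstring\tRex\nname\tstring\ttop\n' |
  match_adjacent_keys person name
```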

Specifying Multiple Patterns

If you specify multiple patterns on the command line, jwalk will print records which match any of those patterns. In other words, jwalk matches the union, or logical OR, of its pattern arguments.

Examining Records With awk

jwalk's tab-delimited, line-separated output is designed to be consumed by standard Unix tools such as awk, cut, grep, and sed.

In particular, awk's default field and record separators handle jwalk's output without any additional configuration, as long as keys and values contain no spaces (otherwise, pass -F '\t' to awk), such that each record's fields are accessible as $1, $2, and so on:

$ echo '["awk","cut","grep","sed"]' \
>      | jwalk -l | awk '{print $3}'
awk
cut
grep
sed
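
The difference between awk's default separator and an explicit tab separator shows up as soon as a value contains a space. The sketch below uses a hand-written record (the helper name record is hypothetical) to show both behaviors side by side.

```shell
#!/bin/sh
# A hand-written leaf record whose value contains a space.
record() { printf '0\tstring\thello world\n'; }

# Default separator: fields split on any run of blanks, so $3 stops
# at the first space.
record | awk '{ print $3 }'

# Tab separator: $3 is the whole value.
record | awk -F '\t' '{ print $3 }'
```

The first command prints only hello, while the second prints hello world.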

A jwalk examiner is an awk script with a runtime environment tailored for parsing jwalk output. Specifically, examiners have access to special variables with details about the record.

Specify examiners on the command line by passing one or more -e <script> options to jwalk:

$ echo '["awk","cut","grep","sed"]' \
>      | jwalk -l -e '{print value}'
awk
cut
grep
sed

You can also store examiners in files and load them with the -f <scriptfile> command-line option.

Note that jwalk will not display any output unless you call awk's print built-in command, such as with -e '{print}'. Most examiners will want to print records conditionally or display them in a different format.

You can use pattern filtering in conjunction with examiners. The filtering phase happens before the examining phase, so examiners are only aware of matched records.

Special Variables

In addition to the full set of special variables available to all awk programs, examiners have access to the following additional variables:

Variable name    Description
keys             an array of zero or more strings, representing the key path, indexed forward starting at 1 and backward at -1
path             the key path as a string, with each key separated by a tab (or FS)
key              the rightmost or last key of the key path; equivalent to keys[-1]
type             the type of the JSON value
leaf             false when the type is array or object; true otherwise
value or _       the string representation of the JSON value
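
To show how these variables relate to the underlying fields, the sketch below derives a subset of them (keys, key, type, value, and leaf) in plain awk and then runs a caller-supplied rule. It only approximates the table above for illustration, it is not jwalk's implementation, and the function name with_examiner_vars is hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in: run an awk rule with examiner-like variables
# derived from each tab-separated record. This approximates the variables
# described above; it is not how jwalk itself sets them up.
with_examiner_vars() {
  awk -F '\t' '
    {
      for (k in keys) delete keys[k]   # reset the key array for each record
      depth = NF - 2                   # number of keys in the path
      for (i = 1; i <= depth; i++) {
        keys[i] = $i                   # forward index, starting at 1
        keys[i - depth - 1] = $i       # backward index, ending at -1
      }
      key = depth > 0 ? keys[-1] : ""
      type = $(NF - 1)
      value = $NF
      leaf = (type != "array" && type != "object")
    }
    '"$1"
}

# Hand-written records; print "key=value" for leaf records only.
printf 'array\t\n0\tstring\tawk\n1\tstring\tcut\n' |
  with_examiner_vars 'leaf { print key "=" value }'
```

The array record is skipped because leaf is false for it, so only the two string records are printed.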

Unescaping String Values

The characters \n, \t, and \\ remain escaped in special variables. Pass these variables through the unescape() function to replace the escaped characters with unescaped values.
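
For readers curious about what this kind of unescaping involves, here is a minimal sketch in plain awk. It is a stand-in written for this documentation, not the unescape() function that jwalk provides, and the names unescape_lines and unescape_sketch are hypothetical. The string is scanned one character at a time so that an escaped backslash followed by n is not mistaken for a newline escape.

```shell
#!/bin/sh
# Hypothetical stand-in for unescape(): replace the escape sequences
# \n, \t, and \\ on each input line with the characters they denote.
# Any other backslash sequence is passed through unchanged.
unescape_lines() {
  awk '
    function unescape_sketch(s,    out, i, n, c) {
      out = ""
      n = length(s)
      for (i = 1; i <= n; i++) {
        c = substr(s, i, 1)
        if (c == "\\" && i < n) {
          c = substr(s, ++i, 1)          # look at the escaped character
          if (c == "n") c = "\n"
          else if (c == "t") c = "\t"
          else if (c != "\\") c = "\\" c  # unknown escape: keep as-is
        }
        out = out c
      }
      return out
    }
    { print unescape_sketch($0) }'
}

# Hand-written escaped strings, one per line.
printf '%s\n' 'one\ttwo' 'back\\nslash' | unescape_lines
```

Because an unescaped \n becomes a real newline, unescaping is best left as the last step of a line-oriented pipeline.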

Configuring jwalk

By default, jwalk uses the awk and sed commands found in your PATH. You can tell it to use specific commands by setting the JWALK_AWK or JWALK_SED environment variables, such as with JWALK_AWK=gawk or JWALK_SED=/usr/local/bin/gsed.

You can log the shell commands issued by jwalk to standard error by setting the JWALK_DEBUG environment variable to 1.
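
A script that calls jwalk might set these variables before invoking it. The sketch below picks an awk implementation and exports it as JWALK_AWK. The helper name pick_awk and the preference for gawk are arbitrary choices for this sketch, not recommendations from jwalk.

```shell
#!/bin/sh
# Hypothetical helper: print the first awk implementation found in PATH,
# preferring gawk. The preference order is an arbitrary choice.
pick_awk() {
  for candidate in gawk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Export the choice so that a later jwalk invocation in this shell sees it.
if JWALK_AWK=$(pick_awk); then
  export JWALK_AWK
  echo "JWALK_AWK=$JWALK_AWK" >&2
fi
```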

Installing and Embedding jwalk

To install jwalk, run bin/jwalk --install with the path to the directory where jwalk should be installed. The directory must already exist. For example:

$ sudo bin/jwalk --install /usr/local

Once you have a jwalk command installed in your path, you can run jwalk --install to embed jwalk into another project:

$ mkdir -p vendor/jwalk
$ jwalk --install vendor/jwalk
$ vendor/jwalk/bin/jwalk -l ...
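
A project that embeds jwalk this way may want to prefer its vendored copy and fall back to a system-wide installation. The sketch below shows one way to do that, assuming the vendor/jwalk layout from the example above. The function name find_jwalk is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the jwalk executable to use, preferring a copy
# embedded under <project root>/vendor/jwalk and falling back to PATH.
# Returns non-zero if neither is available.
find_jwalk() {
  root=${1:-.}
  if [ -x "$root/vendor/jwalk/bin/jwalk" ]; then
    printf '%s\n' "$root/vendor/jwalk/bin/jwalk"
  else
    command -v jwalk
  fi
}

# Typical use inside a project script:
#   JWALK=$(find_jwalk "$PWD") || { echo "jwalk not found" >&2; exit 1; }
#   "$JWALK" -l < data.json
```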

To install a git checkout of jwalk for development, either place a symlink to bin/jwalk somewhere in your PATH, or place jwalk's bin directory in your PATH.

Testing jwalk

Run test/check to start the jwalk test harness. This script runs each test case in test/cases/ and logs the results in TAP format to standard output. If any test case fails, the harness exits with a non-zero status.
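
Since TAP is line-oriented, the harness output can be summarized with standard tools. The sketch below counts passing and failing lines in a hand-written TAP sample. The function name tap_summary is hypothetical, and the sample is not real output from test/check.

```shell
#!/bin/sh
# Hypothetical helper: count passing and failing test lines in TAP output.
# Directives such as "# TODO" and "# SKIP" are ignored in this sketch.
tap_summary() {
  awk '
    /^ok( |$)/     { pass++ }
    /^not ok( |$)/ { fail++ }
    END            { printf "passed=%d failed=%d\n", pass, fail }'
}

# Hand-written TAP sample; real input would come from test/check.
printf '1..3\nok 1 - numbers\nnot ok 2 - strings\nok 3 - arrays\n' | tap_summary
```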

Input data lives in test/corpus/ and expected output lives in test/fixtures/. When writing new test cases, use the existing test cases and file hierarchy as a guide.

Contributing Back

jwalk is open-source software, freely distributable under the terms of an MIT-style license. The source code is hosted on GitHub.

We welcome contributions in the form of bug reports, pull requests, or thoughtful discussions in the GitHub issue tracker.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.


© Sam Stephenson • Part of the Shellbound Project
