All Projects → mhkeller → Joiner

mhkeller / Joiner

Licence: mit
A simple utility for SQL-like joins with Json, GeoJson or dbf data in Node, the browser and on the command line. Also creates join reports so you can know how successful a given join was. Try it in the browser -->

Programming Languages

javascript
184084 projects - #8 most used programming language

Projects that are alternatives of or similar to Joiner

World countries
Constantly updated lists of world countries and their associated alpha-2, alpha-3 and numeric country codes as defined by the ISO 3166 standard, available in CSV, JSON , PHP and SQL formats, in multiple languages and with national flags included
Stars: ✭ 598 (+1172.34%)
Mutual labels:  csv, sql
Csv To Geojson
Convert a CSV to GeoJSON
Stars: ✭ 46 (-2.13%)
Mutual labels:  csv, geojson
Countries
World countries in JSON, CSV, XML and Yaml. Any help is welcome!
Stars: ✭ 5,379 (+11344.68%)
Mutual labels:  csv, geojson
Ether sql
A python library to push ethereum blockchain data into an sql database.
Stars: ✭ 41 (-12.77%)
Mutual labels:  csv, sql
Ethereum Etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 956 (+1934.04%)
Mutual labels:  csv, sql
Datasette
An open source multi-tool for exploring and publishing data
Stars: ✭ 5,640 (+11900%)
Mutual labels:  csv, sql
Countries
Countries, Languages & Continents data (capital and currency, native name, calling codes).
Stars: ✭ 656 (+1295.74%)
Mutual labels:  csv, sql
Sqawk
Like Awk but with SQL and table joins
Stars: ✭ 263 (+459.57%)
Mutual labels:  csv, sql
Diff Table
Stars: ✭ 21 (-55.32%)
Mutual labels:  csv, sql
Csvq
SQL-like query language for csv
Stars: ✭ 804 (+1610.64%)
Mutual labels:  csv, sql
Finviz
Unofficial API for finviz.com
Stars: ✭ 493 (+948.94%)
Mutual labels:  csv, sql
Province City China
🇨🇳最全最新中国【省、市、区县、乡镇街道】json,csv,sql数据
Stars: ✭ 993 (+2012.77%)
Mutual labels:  csv, sql
Rainbow csv
🌈Rainbow CSV - Vim plugin: Highlight columns in CSV and TSV files and run queries in SQL-like language
Stars: ✭ 337 (+617.02%)
Mutual labels:  csv, sql
Trdsql
CLI tool that can execute SQL queries on CSV, LTSV, JSON and TBLN. Can output to various formats.
Stars: ✭ 593 (+1161.7%)
Mutual labels:  csv, sql
Sq
swiss-army knife for data
Stars: ✭ 275 (+485.11%)
Mutual labels:  csv, sql
Municipios Brasileiros
🏡 Código IBGE, nome do município, capital, código UF, UF, estado, latitude e longitude das cidades brasileiras
Stars: ✭ 638 (+1257.45%)
Mutual labels:  csv, sql
Mapshaper
Tools for editing Shapefile, GeoJSON, TopoJSON and CSV files
Stars: ✭ 2,813 (+5885.11%)
Mutual labels:  csv, geojson
Sqlparser
Simple SQL parser meant for querying CSV files
Stars: ✭ 249 (+429.79%)
Mutual labels:  csv, sql
Nano Sql
Universal database layer for the client, server & mobile devices. It's like Lego for databases.
Stars: ✭ 717 (+1425.53%)
Mutual labels:  csv, sql
Csvquery
A handy SQL runner to work with CSV files
Stars: ✭ 32 (-31.91%)
Mutual labels:  csv, sql

Joiner

Build Status NPM version npm js-standard-style

A simple utility for SQL-like joins with Json or geoJson data in Node, the browser and on the command line. Also creates join reports so you can know how successful a given join is.

Try it in the browser --> https://mhkeller.github.io/join.report/

var data = [
  { "id": "1", "name": "UT" },
  { "id": "4", "name": "NM" }
]

var newData = [
  { "state_name": "NM", "avg_temp": 45 }
]

var joinedData = joiner({
  leftData: data,
  leftDataKey: 'name',
  rightData: newData,
  rightDataKey: 'state_name'
})

console.log(joinedData)
/*
{ data:
  [ { id: '1', name: 'UT', avg_temp: null },
    { id: '4', name: 'NM', avg_temp: 45 }
  ],
  report:
    { diff:
      { a: [ 'NM', 'UT' ],
        b: [ 'NM' ],
        a_and_b: [ 'NM' ],
        a_not_in_b: [ 'UT' ],,
        b_not_in_a: []
      },
     prose:
      { summary: '1 row matched in A and B. 1 row in A not in B. All 1 row in B in A.',
        full: 'Matches in A and B: NM. A not in B: UT.' } } }
*/

Examples

See the examples folder for different file formats and options. Joiner is useful to verify whether all of your joins were successful and to spot any patterns among fields that didn't join properly. For example, you can see that the county_01 row in dataset A didn't match with the county_1 in dataset B and that you have a zero-padding issue going on.

Installation

To install as a Node.js module:

npm install --save joiner

Or to install as a command-line utility:

npm install joiner -g

To use as both, run both commands.

Methods

All joins return an object with the following structure:

data: [data object],
report: {
	diff: {
		a: [data in A],
		b: [data in B],
		a_and_b: [data in A and B],
		a_not_in_b: [data in A not in B],
		b_not_in_a: [data in B not in A]
	}:
	prose: {
		summary: [summary description of join result, number of matches in A and B, A not in B, B not in A.]
		full:    [full list of which rows were joined in each of the above categories]
	}
}

joiner(config)

Perform a left join on the two array of object json datasets. It performs a deep clone using lodash.clonedeep of the objects you pass in and returns the new object.

Optionally, you can pass in a key name under nestKey in case the left data's attribute dictionary is nested under another key, such as in geoJson when it's under the properties object. More on that below.

parameter type description
leftData Array existing data
leftDataKey [String] key to join on, defaults to "id" if not set and geoJson: true
rightData Array new data
rightDataKey String key to join on
geoJson [Boolean] default=false optional, key name holding attribute
nestKey [String] optional, key name holding attribute, feaults to "properties" if not set and geoJson: true

Joining to geoJson

If geoJson is true, performs a left join onto the properties object of each feature in a geoJson array.

If you want to join on the "id" property, omit leftDataKey. If you want to join on a value in the properties object, set leftDataKey to 'properties.<desired-key-name>' and set nestKey to 'properties'. See examples for more.

Command line interface

Usage: joiner -a DATASET_A_PATH -k DATASET_A_KEY -b DATASET_B_PATH -j DATASET_B_KEY -o OUT_FILE_PATH [-r (summary|full) -n NEST_KEY --geojson]

Options:
  -h, --help     Display help           [default: false]
  -a, --apath    Dataset A path
  -k, --akey     Dataset A key
  -b, --bpath    Dataset B path
  -j, --bkey     Dataset B key
  -g, --geojson  Is dataset A geojson?  [default: false]
  -n, --nestkey  Nested key name
  -o, --out      Out path
  -r, --report   Report format          [default: "summary"]

In most cases, the first four parameters (--apath, --akey, --bpath and --bkey) are required. --akey is not required if you have set geojson to true by using -g or --geojson since it will join on the "id" field. If you want to join on a property field in geojson, then set that using --akey.

If you specify an output file, it will write the join result to the specified file and the report to the same directory. Intermediate directories will be created if they don't already exist. For example, -o path/to/output.csv will also write -o path/to/output-report.json and create the to/ folder if it isn't already there. If you don't specify an output file, it will print the results to the console.

If you don't specify an output file with -o, Joiner will print the join report to the console. By default, it will just specify the summary report. To print the full report, specify -d full.

Setting -g or --geojson is the equivalent of setting geojson: true above.

It converts the specified input file into json and writes the joined dataset to file using indian ocean, which currently supports the following formats: json, geojson, csv, psv, tsv and dbf. The format is inferred from the file extension of the input and output file paths. For example, -a path/to/input/file.csv will read in a csv and -o path/to/output/file.csv will write a csv.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].