Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → chop-dbhi → Diff Table

chop-dbhi / Diff Table

Programming Languages

31211 projects - #10 most used programming language

Labels

sql csv table events diff eventsourcing

Projects that are alternatives of or similar to Diff Table

Monitor Table Change With Sqltabledependency

Get SQL Server notification on record table change

Stars: ✭ 459 (+2085.71%)

Mutual labels: sql, table

Daff

align and compare tables

Stars: ✭ 598 (+2747.62%)

Mutual labels: csv, diff

Finviz

Unofficial API for finviz.com

Stars: ✭ 493 (+2247.62%)

Mutual labels: csv, sql

Rainbow csv

🌈Rainbow CSV - Vim plugin: Highlight columns in CSV and TSV files and run queries in SQL-like language

Stars: ✭ 337 (+1504.76%)

Mutual labels: csv, sql

Sheetjs

📗 SheetJS Community Edition -- Spreadsheet Data Toolkit

Stars: ✭ 28,479 (+135514.29%)

Mutual labels: csv, table

Pytablewriter

pytablewriter is a Python library to write a table in various formats: CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.

Stars: ✭ 422 (+1909.52%)

Mutual labels: csv, table

Trdsql

CLI tool that can execute SQL queries on CSV, LTSV, JSON and TBLN. Can output to various formats.

Stars: ✭ 593 (+2723.81%)

Mutual labels: csv, sql

fastapi-csv

🏗️ Create APIs from CSV files within seconds, using fastapi

Stars: ✭ 46 (+119.05%)

Mutual labels: csv, table

Countries

Countries, Languages & Continents data (capital and currency, native name, calling codes).

Stars: ✭ 656 (+3023.81%)

Mutual labels: csv, sql

Municipios Brasileiros

🏡 Código IBGE, nome do município, capital, código UF, UF, estado, latitude e longitude das cidades brasileiras

Stars: ✭ 638 (+2938.1%)

Mutual labels: csv, sql

swiss-army knife for data

Stars: ✭ 275 (+1209.52%)

Mutual labels: csv, sql

Rows

A common, beautiful interface to tabular data, no matter the format

Stars: ✭ 739 (+3419.05%)

Mutual labels: csv, table

Sqawk

Like Awk but with SQL and table joins

Stars: ✭ 263 (+1152.38%)

Mutual labels: csv, sql

Excelmerge

GUI Diff Tool for Excel

Stars: ✭ 425 (+1923.81%)

Mutual labels: csv, diff

csv2latex

🔧 Simple script in python to convert CSV files to LaTeX table

Stars: ✭ 54 (+157.14%)

Mutual labels: csv, table

Datasette

An open source multi-tool for exploring and publishing data

Stars: ✭ 5,640 (+26757.14%)

Mutual labels: csv, sql

Eventflow

Async/await first CQRS+ES and DDD framework for .NET

Stars: ✭ 1,932 (+9100%)

Mutual labels: events, eventsourcing

tardis

Event sourcing toolkit

Stars: ✭ 35 (+66.67%)

Mutual labels: events, eventsourcing

World countries

Constantly updated lists of world countries and their associated alpha-2, alpha-3 and numeric country codes as defined by the ISO 3166 standard, available in CSV, JSON , PHP and SQL formats, in multiple languages and with national flags included

Stars: ✭ 598 (+2747.62%)

Mutual labels: csv, sql

Nano Sql

Universal database layer for the client, server & mobile devices. It's like Lego for databases.

Stars: ✭ 717 (+3314.29%)

Mutual labels: csv, sql

View All Similar Projects ➔

diff-table

A tool to compare two tables of data. Currently the tool supported tables from Postgres and CSV files.

The primary use case is to compare the output of a query executed at different points in time. For example, in a batch ETL process that runs every night, you can compare the previous batch with the new batch.

Install

Download a release or install from source:

go get -u github.com/chop-dbhi/diff-table/cmd/diff-table

Usage

The minimum requirement is to specify two tables (CSV or relational) and specify the key columns. For example, using the provided example data in the repository, the command would look like this.

diff-table \
  -csv1 example/file1.csv \
  -csv2 example/file2.csv \
  -key id

The default output is a JSON encoded object with a summary of the differences, including column and row changes.

{
  "total_rows": 4,
  "columns_added": [
    "city"
  ],
  "columns_dropped": [],
  "type_changes": {},
  "rows_added": 1,
  "rows_deleted": 1,
  "rows_changed": 1
}

Adding the -diff option in the command will result in row-level changes in the output.

diff-table \
  -csv1 example/file1.csv \
  -csv2 example/file2.csv \
  -key id \
  -diff

{
  "total_rows": 4,
  "columns_added": [
    "city"
  ],
  "columns_dropped": [],
  "type_changes": {},
  "rows_added": 1,
  "rows_deleted": 1,
  "rows_changed": 2,
  "row_diffs": [
    {
      "key": {
        "id": "1"
      },
      "changes": {
        "city": {
          "old": null,
          "new": "Trenton"
        },
        "color": {
          "old": "Blue",
          "new": "Teal"
        }
      }
    },
    {
      "key": {
        "id": "3"
      },
      "changes": {
        "city": {
          "old": null,
          "new": "Philadelphia"
        }
      }
    }
  ],
  "new_rows": [
    {
      "city": "Allentown",
      "color": "Black",
      "gender": "Male",
      "id": "4",
      "name": "Neal"
    }
  ],
  "deleted_rows": [
    {
      "id": "2"
    }
  ]
}

Events

The above type of output is convenient for summary usage, however in some use cases a set of events may be more useful. Using the -events option will result in the changes being streamed as they are discovered rather than aggregating everything up into a single output object.

diff-table \
  -csv1 example/file1.csv \
  -csv2 example/file2.csv \
  -key id \
  -events

The output is a newline-delimited set of JSON-encoded events. Column-based changes will always come first. The type field denotes the type of event which can be switched on during consumption. The base event structured looks as follows:

{
  // Event name/type.
  "type": "row-added",

  // Unix epoch timestamp in seconds.
  "time": 1520114848,

  // Offset of the row in the batch, if applicable. This can also be used as an
  // ordering mechanism.
  "offset": 1,

  // Key of the row this event pertains to, if applicable.
  "key": {...},

  // Full snapshot of the row. This will be present in row-added events, but
  // can be optionally included for row-changed and row-removed events using
  // the -data option.
  "data": {...},

  // Set of changes found relative to the old row. Each key is the column
  // name and the value is an object with `old` and `new` keys with the
  // respective values.
  "changes": {...},

  // If a column -added, -changed, or -removed event, this is the name of the
  // column affected.
  "column": "city"

  // If a column-changed event, this is the old type.
  "old_type": "int32",

  // If a column-changed event, this is the new type.
  "new_type": "int64"
}

{
  "type": "column-added",
  "column": "city"
}
{
  "type": "row-changed",
  "offset": 1,
  "key": {
    "id": "1"
  },
  "changes": {
    "city": {
      "old": null,
      "new": "Trenton"
    },
    "color": {
      "old": "Blue",
      "new": "Teal"
    }
  }
}
{
  "type": "row-removed",
  "offset": 2,
  "key": {
    "id": "2"
  }
}
{
  "type": "row-changed",
  "offset": 3,
  "key": {
    "id": "3"
  },
  "changes": {
    "city": {
      "old": null,
      "new": "Philadelphia"
    }
  }
}
{
  "type": "row-added",
  "offset": 4,
  "key": {
    "id": "4"
  },
  "data": {
    "city": "Allentown",
    "color": "Black",
    "gender": "Male",
    "id": "4",
    "name": "Neal"
  }
}

Snapshots

In addition to change events, snapshots are supported which scans a table and emits a row-stored event including the current state of each row. These events are structurally equivalent to row-added events which include the full data.

The use case for them is to initialize a stream from a stateful starting point, but can be used over time to make new starting points for consumers. This prevents consumers from needing to know what the starting state was when the stream was initiated as well as not needing to read from the very beginning of the stream in order to build up the current state. With snapshots, the consumer can start from the last known snapshot and consume from there.

To create a snapshot, use the -snapshot option with the newest table or CSV file specified.

diff-table \
  -csv1 example/file1.csv \
  -key id \
  -snapshot

Examples

Below are examples of how tables can be specified including SQL-based tables and CSV files (with sorted or unsorted rows) and how columns can be renamed (not in the data source, just in the diff-table runtime) before the tables are compared.

Tables in the same database

diff-table \
  -db postgres://localhost:5432/postgres \
  -table1 data_v1 \
  -table2 data_v2 \
  -key id

Tables from different servers/databases

diff-table \
  -db postgres://localhost:5432/postgres \
  -db2 postgres://localhost:5435/other \
  -table1 data_v1 \
  -table2 data_v2 \
  -key id

CSV files

Note: this assumes the CSV files are pre-sorted by the specified key columns.

diff-table \
  -csv1 data_v1.csv \
  -csv2 data_v2.csv  \
  -key id

Unsorted CSV files

diff-table \
  -csv1 data_v1.csv \
  -csv1.sort \
  -csv2 data_v2.csv  \
  -csv2.sort \
  -key id

CSV file and database table (o.O)

Note: this assumes the CSV file is pre-sorted by the specified key columns.

diff-table \
  -csv1 data_v1.csv \
  -db2 postgres://localhost:5432/postgres \
  -table2 data_v2 \
  -key id

Tables with renamed columns

The data_v1.foo column will be renamed to bar (just in the diff-table runtime, not in the source) and compared to the data_v2.bar column. The same will happen for the baz:buz column rename.

diff-table \
  -db postgres://localhost:5432/postgres \
  -table1 data_v1 \
  -rename1 "foo:bar,baz:buz" \
  -table2 data_v2 \
  -key id

SQL statements

In addition to tables, arbitrary SQL statements are supported as well. The basic requirement is that the columns specified in -key must existing the statement and the data must be ordered by the key columns.

diff-table \
  -db postgres://localhost:5432/postgres \
  -kye id \
  -sql1 "select id, col1, col2 from table1 order by id" \
  -sql1 "select id, col1, col2 from table2 order by id"

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 21

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (1) 🔗