
intenthq / Anon

Licence: mit
A UNIX Command To Anonymise Data

Programming Languages

go
31211 projects - #10 most used programming language
golang
3204 projects

Projects that are alternatives of or similar to Anon

J
❌ Multi-format spreadsheet CLI (now merged in http://github.com/sheetjs/js-xlsx )
Stars: ✭ 343 (+0.59%)
Mutual labels:  cli, csv, data
Csv2ofx
A Python library and command line tool for converting csv to ofx and qif files
Stars: ✭ 133 (-61%)
Mutual labels:  cli, csv, data
Csv2db
The CSV to database command line loader
Stars: ✭ 102 (-70.09%)
Mutual labels:  cli, csv
Pipedream
Connect APIs, remarkably fast. Free for developers.
Stars: ✭ 2,068 (+506.45%)
Mutual labels:  cli, data
Psql2csv
Run a query in psql and output the result as CSV.
Stars: ✭ 153 (-55.13%)
Mutual labels:  cli, csv
Q
q - Run SQL directly on CSV or TSV files
Stars: ✭ 8,809 (+2483.28%)
Mutual labels:  cli, csv
Tsv Utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Stars: ✭ 1,215 (+256.3%)
Mutual labels:  cli, csv
Riko
A Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (+360.7%)
Mutual labels:  cli, data
Dataproofer
A proofreader for your data
Stars: ✭ 628 (+84.16%)
Mutual labels:  cli, csv
Jsonexport
{} → 📄 it's easy to convert JSON to CSV
Stars: ✭ 208 (-39%)
Mutual labels:  cli, csv
Gopherlabs
Go - Beginners | Intermediate | Advanced
Stars: ✭ 205 (-39.88%)
Mutual labels:  cli, data
Json 2 Csv
Convert JSON to CSV *or* CSV to JSON!
Stars: ✭ 210 (-38.42%)
Mutual labels:  cli, csv
Xsv
A fast CSV command line toolkit written in Rust.
Stars: ✭ 7,831 (+2196.48%)
Mutual labels:  cli, csv
Samples Viewer Generator
🎉 A CLI utility tool to generate web app of data visualization samples for presentation purpose
Stars: ✭ 13 (-96.19%)
Mutual labels:  cli, data
Glom
☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
Stars: ✭ 1,341 (+293.26%)
Mutual labels:  cli, data
Structured Text Tools
A list of command line tools for manipulating structured text data
Stars: ✭ 6,180 (+1712.32%)
Mutual labels:  cli, csv
Csview
📠 A high performance csv viewer with cjk/emoji support.
Stars: ✭ 208 (-39%)
Mutual labels:  cli, csv
Things.sh
Simple read-only command-line interface to your Things 3 database
Stars: ✭ 492 (+44.28%)
Mutual labels:  cli, csv
Trdsql
CLI tool that can execute SQL queries on CSV, LTSV, JSON and TBLN. Can output to various formats.
Stars: ✭ 593 (+73.9%)
Mutual labels:  cli, csv
Tably
Python command-line script for converting .csv data to LaTeX tables
Stars: ✭ 173 (-49.27%)
Mutual labels:  cli, csv

Anon — A UNIX Command To Anonymise Data


Anon is a tool for taking delimited files and anonymising or transforming their columns so that the output can be used in applications where sensitive information cannot be exposed.

Installation

Releases of Anon are available as pre-compiled static binaries on the corresponding GitHub release. Simply download the appropriate build for your machine and make sure it's in your PATH (or use it directly).

Usage

anon [--config <path to config file, default is ./config.json>]
     [--output <path to output to, default is STDOUT>]

Anon is designed to take input from STDIN and by default will output the anonymised file to STDOUT:

anon < some_file.csv > some_file_anonymised.csv

Configuration

In order to be useful, Anon needs to be told what you want to do to each column of the CSV. The config is defined as a JSON file (defaults to a file called config.json in the current directory):

{
  "csv": {
    "delimiter": ","
  },
  // Optionally define a number of rows to randomly sample down to.
  // To do it, it will hash (using FNV-1 32 bits) the column with the ID
  // in it and will mod the result by the value specified to decide if the
  // row is included or not -> include = hash(idColumn) % mod == 0
  "sampling": {
    // Number used to mod the hash of the id and determine if the row
    // has to be included in the sample or not
    "mod": 30000,
    // Specify which column contains a unique ID on which the sampling can
    // be performed. Indices are 0 based, so this would sample on the first
    // column.
    "idColumn": 0
  },
  // An array of actions to take on each column - indices are 0 based, so index
  // 0 in this array corresponds to column 1, and so on.
  //
  // There must be an action for every column in the CSV.
  "actions": [
    {
      // The no-op, leaves the input unchanged.
      "name": "nothing"
    },
    {
      // Takes a UK format postcode (eg. W1W 8BE) and just keeps the outcode
      // (eg. W1W).
      "name": "outcode"
    },
    {
      // Hash (SHA1) the input.
      "name": "hash",
      // Optional salt that will be appended to the input.
      // If not defined, a random salt will be generated.
      "salt": "salt"
    },
    {
      // Given a date, just keep the year.
      "name": "year",
      "dateConfig": {
        // Define the format of the input date here.
        "format": "YYYYmmmdd"
      }
    },
    {
      // Summarise a range of values.
      "name": "range",
      "rangeConfig": {
        "ranges": [
          // For example, this will take values between 0 and 100, and convert
          // them to the string "0-100".
          // Use at most one lower bound (gt or gte) and at most one upper
          // bound (lt or lte), never both of the same kind at the same time.
          // At least one of (gt, gte, lt, lte) must be defined.
          {
            "gte": 0,
            "lt": 100,
            "output": "0-100"
          }
        ]
      }
    }
  ]
}
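
The comments in the configuration above describe three of the rules in words: the sampling test `hash(idColumn) % mod == 0`, the salted SHA-1 hash, and range summarising. The following Go sketch shows one way these could be written. It is an illustration under stated assumptions, not Anon's source: the names `included`, `saltedSHA1`, `rangeRule` and `summarise` are hypothetical, and the hex encoding of the digest, the handling of a zero `mod`, and the behaviour when no range matches are choices made for this example only.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"hash/fnv"
)

// included applies the sampling rule from the config comments:
// include = hash(idColumn) % mod == 0, using 32-bit FNV-1.
func included(id string, mod uint32) bool {
	if mod == 0 {
		return true // assumption: a zero mod means "no sampling"
	}
	h := fnv.New32() // New32 is FNV-1; New32a would be FNV-1a
	h.Write([]byte(id))
	return h.Sum32()%mod == 0
}

// saltedSHA1 appends the salt to the input and returns the SHA-1 digest.
// Hex encoding of the digest is an assumption of this sketch.
func saltedSHA1(input, salt string) string {
	sum := sha1.Sum([]byte(input + salt))
	return hex.EncodeToString(sum[:])
}

// rangeRule mirrors one entry of "ranges"; a nil bound is unset.
type rangeRule struct {
	Gt, Gte, Lt, Lte *float64
	Output           string
}

// matches reports whether v satisfies every bound that is set.
func (r rangeRule) matches(v float64) bool {
	if r.Gt != nil && v <= *r.Gt {
		return false
	}
	if r.Gte != nil && v < *r.Gte {
		return false
	}
	if r.Lt != nil && v >= *r.Lt {
		return false
	}
	if r.Lte != nil && v > *r.Lte {
		return false
	}
	return true
}

// summarise returns the output of the first matching rule. Returning
// false when nothing matches is an assumption of this sketch.
func summarise(v float64, rules []rangeRule) (string, bool) {
	for _, r := range rules {
		if r.matches(v) {
			return r.Output, true
		}
	}
	return "", false
}

func main() {
	lo, hi := 0.0, 100.0
	rules := []rangeRule{{Gte: &lo, Lt: &hi, Output: "0-100"}}
	out, ok := summarise(42, rules)
	fmt.Println(out, ok)
	fmt.Println(saltedSHA1("W1W 8BE", "salt"))
	fmt.Println(included("user-1", 30000))
}
```

Because the sampling decision depends only on the ID value and `mod`, the same ID is always either kept or dropped, so repeated runs over the same data select the same rows. With `"mod": 30000`, roughly one row in 30000 would be expected to pass, assuming the hash values are evenly spread.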

Contributing

Contributions are welcome. Please refer to our contributing guidelines for more information.

License

This project is licensed under the MIT license.

The icon is by Pixel Perfect from Flaticon, and is licensed under a Creative Commons 3.0 BY license.
