Klepto


Klepto is a tool that copies and anonymises data from other sources.

Intro

Klepto helps you to keep the data in your environment as consistent as possible by copying it from another environment's database.

You can use Klepto to copy production data, stripped of sensitive customer information, for testing or local debugging.

Features

  • Copy data to your local database, or to stdout or stderr
  • Filter the source data
  • Anonymise the source data

Supported Databases

  • PostgreSQL
  • MySQL

If you need to get data from a database type that you don't see here, build it yourself and add it to this list. Contributions are welcome :)

Requirements

  • Active connection to the IT VPN
  • Latest version of pg_dump installed (Only required when working with PostgreSQL databases)

Installation

Klepto is written in Go with support for multiple platforms. Pre-built binaries are provided for the following:

  • macOS (Darwin) for x64, i386, and ARM architectures
  • Windows
  • Linux

You can download the binary for your platform of choice from the releases page.

Once downloaded, the binary can be run from anywhere. For easy use, we recommend moving it to a directory on your $PATH, usually /usr/local/bin.

Usage

Klepto uses a configuration file called .klepto.toml to define your table structure. If your table is normalized, the structure can be detected automatically.

For dumping the last 10 created active users, your file will look like this:

[[Tables]]
  Name = "users"
  [Tables.Anonymise]
    email = "EmailAddress"
    username = "FirstName"
    password = "SimplePassword"
  [Tables.Filter]
    Match = "users.status = 'active'"
    Limit = 10
    [Tables.Filter.Sorts]
      created_at = "desc"

After you have created the file, run:

Postgres:

klepto steal \
--from="postgres://user:pass@localhost/fromDB?sslmode=disable" \
--to="postgres://user:pass@localhost/toDB?sslmode=disable"

MySQL:

klepto steal \
--from="user:pass@tcp(localhost:3306)/fromDB?sslmode=disable" \
--to="user:pass@tcp(localhost:3306)/toDB?sslmode=disable"

Behind the scenes, Klepto connects to the source and target databases using the given parameters and dumps the tables.

Steal Options

Available options can be seen by running klepto steal --help

❯ klepto steal --help
Steals and anonymises databases

Usage:
  klepto steal [flags]

Flags:
      --concurrency int                Sets the amount of dumps to be performed concurrently (default 12)
  -c, --config string                  Path to config file (default ".klepto.toml")
  -f, --from string                    Database dsn to steal from (default "mysql://root:[email protected](localhost:3306)/klepto")
  -h, --help                           help for steal
      --read-conn-lifetime duration    Sets the maximum amount of time a connection may be reused on the read database
      --read-max-conns int             Sets the maximum number of open connections to the read database (default 5)
      --read-max-idle-conns int        Sets the maximum number of connections in the idle connection pool for the read database
      --read-timeout duration          Sets the timeout for read operations (default 5m0s)
  -t, --to string                      Database to output to (default writes to stdOut) (default "os://stdout/")
      --to-rds                         If the output server is an AWS RDS server
      --write-conn-lifetime duration   Sets the maximum amount of time a connection may be reused on the write database
      --write-max-conns int            Sets the maximum number of open connections to the write database (default 5)
      --write-max-idle-conns int       Sets the maximum number of connections in the idle connection pool for the write database
      --write-timeout duration         Sets the timeout for write operations (default 30s)

Global Flags:
  -v, --verbose   Make the operation more talkative

We recommend always setting the following parameters:

  • concurrency, to relieve pressure on both the source and target databases.
  • read-max-conns, to limit the number of open connections so that the source database does not get overloaded.

Configuration File Options

You can set a number of keys in the configuration file. Below is a list of all configuration options, followed by some examples of specific keys.

  • Matchers - Variables to store filter data. You can declare a filter once and reuse it among tables.
  • Tables - A Klepto table definition.
    • Name - The table name.
    • IgnoreData - A flag to indicate whether data should be imported or not. If set to true, it will dump the table structure without importing data.
    • Filter - A Klepto definition to filter results.
      • Match - A condition that restricts which rows are dumped. The value may be either an expression or the name of an existing Matchers definition.
      • Limit - The number of results to be fetched.
      • Sorts - Defines how the table is sorted.
    • Anonymise - Indicates which columns to anonymise.
    • Relationships - Represents a relationship between the table and a referenced table.
      • Table - The table name.
      • ForeignKey - The table's foreign key.
      • ReferencedTable - The referenced table name.
      • ReferencedKey - The referenced table primary key.

IgnoreData

You can dump the database structure without importing data by setting the IgnoreData value to true.

[[Tables]]
 Name = "logs"
 IgnoreData = true

Matchers

Matchers are variables to store filter data. You can declare a filter once and reuse it among tables:

[[Matchers]]
  Latest100Users = "ORDER BY users.created_at DESC LIMIT 100"

[[Tables]]
  Name = "users"
  [Tables.Filter]
    Match = "Latest100Users"

[[Tables]]
  Name = "orders"
  [[Tables.Relationships]]
    ForeignKey = "user_id"
    ReferencedTable = "users"
    ReferencedKey = "id"
  [Tables.Filter]
    Match = "Latest100Users"

See examples for more.
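
As a rough illustration of how a Match value could be resolved, the Go sketch below treats the value as a matcher name when one is declared and as a raw expression otherwise. The function name resolveMatch and the map representation of Matchers are assumptions made for this example, not Klepto's actual implementation.

```go
package main

import "fmt"

// resolveMatch returns the filter text for a table's Match value.
// If match names a declared matcher, the stored expression is returned;
// otherwise match is assumed to be an expression and is returned unchanged.
// Hypothetical helper, for illustration only.
func resolveMatch(match string, matchers map[string]string) string {
	if expr, ok := matchers[match]; ok {
		return expr
	}
	return match
}

func main() {
	matchers := map[string]string{
		"Latest100Users": "ORDER BY users.created_at DESC LIMIT 100",
	}
	// A declared matcher name is swapped for its expression.
	fmt.Println(resolveMatch("Latest100Users", matchers))
	// Anything else is passed through as an expression.
	fmt.Println(resolveMatch("users.status = 'active'", matchers))
}
```

Under this reading, a matcher name that is misspelled would silently be treated as an expression, so it is worth double-checking the names used in Match.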

Anonymise

You can anonymise specific columns in your table using the Anonymise key. Anonymisation is performed by running a Faker against the specified column.

[[Tables]]
  Name = "customers"
  [Tables.Anonymise]
    email = "EmailAddress"
    firstName = "FirstName"

[[Tables]]
  Name = "users"
  [Tables.Anonymise]
    email = "EmailAddress"
    password = "literal:1234"

This replaces four columns across the customers and users tables: fake.EmailAddress runs against both email columns, fake.FirstName runs against firstName, and password is set to a constant. Use literal:[some-constant-value] to write a constant to a column. Here, password = "literal:1234" writes 1234 to every row in the password column of the users table.
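
The difference between a Faker name and a literal: value can be sketched in Go as below. This is a minimal illustration, not Klepto's code: the anonymise function, the fakers map, and the stub generator are all hypothetical stand-ins for the real fake.* functions.

```go
package main

import (
	"fmt"
	"strings"
)

// literalPrefix marks a value that should be written as a constant.
const literalPrefix = "literal:"

// anonymise returns the replacement for one column value.
// spec is the right-hand side of an Anonymise entry, for example
// "EmailAddress" or "literal:1234". fakers maps a data type name to a
// generator function. Both are hypothetical stand-ins for illustration.
func anonymise(spec string, fakers map[string]func() string) (string, error) {
	if strings.HasPrefix(spec, literalPrefix) {
		// Everything after the prefix is written unchanged.
		return strings.TrimPrefix(spec, literalPrefix), nil
	}
	gen, ok := fakers[spec]
	if !ok {
		return "", fmt.Errorf("unknown anonymisation type %q", spec)
	}
	return gen(), nil
}

func main() {
	// Stub generator; a real one would return random data.
	fakers := map[string]func() string{
		"EmailAddress": func() string { return "someone@example.com" },
	}
	for _, spec := range []string{"EmailAddress", "literal:1234", "Nope"} {
		value, err := anonymise(spec, fakers)
		fmt.Println(value, err)
	}
}
```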

Available data types for anonymisation

Available data types can be found in fake.go. This file is generated from https://github.com/icrowley/fake (it must be generated because it is written in such a way that Go cannot reflect upon it).

We generate the file with the following:

$ go get github.com/ungerik/pkgreflect
$ pkgreflect -notypes -novars -norecurs vendor/github.com/icrowley/fake/

Relationships

The Relationships key represents a relationship between the table and a referenced table.

To dump the latest 100 users with their orders:

[[Tables]]
  Name = "users"
  [Tables.Filter]
    Limit = 100
    [Tables.Filter.Sorts]
      created_at = "desc"

[[Tables]]
  Name = "orders"
  [[Tables.Relationships]]
    # behind the scenes klepto will create an inner join between orders and users
    ForeignKey = "user_id"
    ReferencedTable = "users"
    ReferencedKey = "id"
  [Tables.Filter]
    Limit = 100
    [Tables.Filter.Sorts]
      created_at = "desc"
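
To make the inner join mentioned in the comment concrete, here is a simplified Go sketch of the kind of query a relationship implies. The relationship type and joinQuery function are hypothetical; the sketch ignores Filter, Limit, and Sorts, does no identifier quoting, and is not taken from Klepto's source.

```go
package main

import "fmt"

// relationship mirrors the Relationships keys from the configuration file.
type relationship struct {
	ForeignKey      string
	ReferencedTable string
	ReferencedKey   string
}

// joinQuery builds a SELECT that keeps only the rows of table whose
// foreign key matches a row in the referenced table.
// Hypothetical and simplified: identifiers are not quoted or escaped.
func joinQuery(table string, r relationship) string {
	return fmt.Sprintf(
		"SELECT %s.* FROM %s INNER JOIN %s ON %s.%s = %s.%s",
		table, table, r.ReferencedTable,
		table, r.ForeignKey, r.ReferencedTable, r.ReferencedKey,
	)
}

func main() {
	r := relationship{
		ForeignKey:      "user_id",
		ReferencedTable: "users",
		ReferencedKey:   "id",
	}
	fmt.Println(joinQuery("orders", r))
}
```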

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

License

This project is licensed under the MIT License. See the LICENSE file for details.
