
andersinno / Python Database Sanitizer

Licence: MIT
Python-based database sanitizer for removing sensitive data from your database dumps



Database sanitation tool


database-sanitizer is a tool which retrieves a database dump from a relational database and sanitizes the retrieved data according to rules defined in a configuration file. The tool currently supports both PostgreSQL and MySQL databases.

Installation

database-sanitizer can be installed from PyPI with pip like this:

$ pip install database-sanitizer

If you are using MySQL, you need to install the package like this instead, so that additional requirements are included:

$ pip install database-sanitizer[MySQL]

Usage

Once the package has been installed, database-sanitizer can be used like this:

$ database-sanitizer <DATABASE-URL>

The DATABASE-URL command line argument must be provided so that the tool knows how to retrieve the dump from the database. With PostgreSQL, it would be something like this:

$ database-sanitizer postgres://user:password@host/database

However, unless a configuration file is provided, no sanitation is performed on the retrieved database dump, which leads us to the next section.

Configuration

Sanitation rules are given in a configuration file written in YAML. The path to the configuration file is passed to the command line utility with the --config argument (-c for short), like this:

$ database-sanitizer -c config.yml postgres://user:password@host/database

The configuration file uses the following syntax:

config:
  addons:
    - some.other.package
    - yet.another.package
  extra_parameters: # These parameters will be passed to the dump tool CLI
    mysqldump:
      - "--single-transaction" # Included by default
    pg_dump:
      - "--exclude-table=something"
strategy:
  user:
    first_name: name.first_name
    last_name: name.last_name
    secret_key: string.empty
  access_log: skip_rows

The example configuration above first lists two "addon packages": names of Python packages in which the sanitizer looks for sanitizer functions. They are optional and can be omitted, in which case only the built-in sanitizers and the sanitizer functions defined in a package called sanitizers are used.

It's also possible to define extra parameters to pass to the dump tool (mysqldump or pg_dump). By default, mysqldump is run with the --single-transaction extra parameter. You can disable this by defining the extra parameters explicitly in the config file, e.g. with an empty array [].
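
The override rule described above can be sketched in a few lines of Python. This is only an illustration, not database-sanitizer's own code: the function name resolve_extra_parameters and the DEFAULT_EXTRA_PARAMETERS table are hypothetical, the configuration is assumed to be already parsed into a dictionary, and the empty default for pg_dump is an assumption.

```python
# Illustrative sketch only; these names are hypothetical and are not
# part of database-sanitizer's API.
DEFAULT_EXTRA_PARAMETERS = {
    "mysqldump": ["--single-transaction"],  # default stated in this README
    "pg_dump": [],                          # assumed: no default parameters
}


def resolve_extra_parameters(config, tool):
    """Return the extra CLI parameters to use for `tool`.

    `config` is the parsed configuration file as a dictionary and
    `tool` is either "mysqldump" or "pg_dump".
    """
    configured = (config.get("config") or {}).get("extra_parameters") or {}
    if tool in configured:
        # An explicit entry, even an empty list, replaces the default.
        return list(configured[tool] or [])
    return list(DEFAULT_EXTRA_PARAMETERS.get(tool, []))
```

With this sketch, a configuration that sets mysqldump to [] yields no extra parameters, while a configuration that does not mention mysqldump at all falls back to --single-transaction.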

The strategy portion of the configuration contains the actual sanitation rules. First comes the name of the database table (user in the example), followed by the column names in that table, each mapped to a sanitation function name. A sanitation function name consists of two parts separated by a dot: the Python module name and the name of the function, which is prefixed with sanitize_. So name.first_name refers to a function called sanitize_first_name in a file called name.py.
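
As a rough illustration of this naming convention, the sketch below splits a rule such as name.first_name into its module name and its sanitize_-prefixed function name, and shows what a matching function in name.py might look like. The helper split_sanitizer_name is hypothetical, and the sanitizer signature (one column value in, one replacement value out) is an assumption that this README does not confirm.

```python
def split_sanitizer_name(name):
    """Split a rule like "name.first_name" into its two parts.

    Returns a (module_name, function_name) tuple, where the function
    name carries the "sanitize_" prefix described above.
    """
    module_name, separator, function_suffix = name.partition(".")
    if not separator or not module_name or not function_suffix:
        raise ValueError("expected '<module>.<function>', got %r" % (name,))
    return module_name, "sanitize_" + function_suffix


def sanitize_first_name(value):
    """Hypothetical sanitizer that would live in a file called name.py.

    Assumed signature: receives the original column value and returns
    the value to write into the sanitized dump.
    """
    if value is None:
        # Keep NULLs as they are so the column stays consistent.
        return None
    return "Anonymous"
```

For the example configuration, string.empty would resolve to a function called sanitize_empty in a module called string.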

Table content can be left out of the sanitized dump completely by setting the table's strategy to skip_rows (see the access_log table in the example config). This leaves out all INSERT INTO (MySQL) or COPY (PostgreSQL) statements for that table from the sanitized dump file. CREATE TABLE statements are not removed.
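
The effect of skip_rows on a MySQL dump can be pictured with the simplified filter below. It is a sketch, not the tool's actual implementation: the function drop_insert_statements is hypothetical, and it assumes that every INSERT INTO statement sits on a single line, which a real dump does not guarantee.

```python
import re

# Matches the table name at the start of an INSERT statement, with or
# without MySQL-style backtick quoting.
INSERT_PATTERN = re.compile(r"^INSERT INTO `?([^`\s(]+)`?", re.IGNORECASE)


def drop_insert_statements(dump_lines, skipped_tables):
    """Yield dump lines, omitting INSERT INTO lines for skipped tables.

    Simplification: assumes each INSERT statement fits on one line.
    Other statements, including CREATE TABLE, pass through unchanged.
    """
    for line in dump_lines:
        match = INSERT_PATTERN.match(line)
        if match and match.group(1) in skipped_tables:
            continue
        yield line
```

Passing {"access_log"} as skipped_tables would keep the CREATE TABLE statement for access_log while dropping its rows.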
