rheinwerk-verlag / pganonymize

Licence: other
A command-line tool for anonymizing PostgreSQL databases

Programming Languages

python
139335 projects - #7 most used programming language
Makefile
30231 projects
Dockerfile
14818 projects

Projects that are alternatives of or similar to pganonymize

database-anonymizer
CLI tool and PHP library to anonymize data in various databases
Stars: ✭ 23 (+15%)
Mutual labels:  gdpr, anonymization, dsgvo
Marathon
[DEPRECATED] Marathon makes it easy to write, run and manage your Swift scripts 🏃
Stars: ✭ 1,889 (+9345%)
Mutual labels:  developer-tools, command-line-tool
Artisan Menu
📝 Artisan Menu - Use Artisan via an elegant console GUI
Stars: ✭ 141 (+605%)
Mutual labels:  developer-tools, command-line-tool
data-migrator
A declarative data-migration package
Stars: ✭ 15 (-25%)
Mutual labels:  gdpr, anonymization
Suitcase
A flexible command line tool for instantly deploying user interfaces for simple commands and scripts.
Stars: ✭ 1,287 (+6335%)
Mutual labels:  developer-tools, command-line-tool
Check It Out
A command line interface for Git Checkout. See branches available for checkout.
Stars: ✭ 127 (+535%)
Mutual labels:  developer-tools, command-line-tool
hugo-component-matomo
Matomo user tracking and optout scripts for Hugo
Stars: ✭ 38 (+90%)
Mutual labels:  gdpr, dsgvo
Circleci Cli
Use CircleCI from the command line
Stars: ✭ 297 (+1385%)
Mutual labels:  developer-tools, command-line-tool
pynonymizer
A universal tool for translating sensitive production database dumps into anonymized copies.
Stars: ✭ 58 (+190%)
Mutual labels:  gdpr, anonymization
cookie-consent-js
A simple dialog and framework to handle the German and EU law about cookies in a website (December 2021)
Stars: ✭ 55 (+175%)
Mutual labels:  gdpr, dsgvo
kodex
A privacy and security engineering toolkit: Discover, understand, pseudonymize, anonymize, encrypt and securely share sensitive and personal data: Privacy and security as code.
Stars: ✭ 70 (+250%)
Mutual labels:  gdpr, anonymization
Swiff
💁 Command line tools for common local ↔ remote server tasks.
Stars: ✭ 87 (+335%)
Mutual labels:  developer-tools, command-line-tool
Gita
Manage many git repos with sanity 从容管理多个git库
Stars: ✭ 865 (+4225%)
Mutual labels:  developer-tools, command-line-tool
Poodle
🔥 A fast and beautiful command line tool to build API requests.
Stars: ✭ 129 (+545%)
Mutual labels:  developer-tools, command-line-tool
Stylesync
A command line tool to extract shared styles from a Sketch document, and generate native code for any platform.
Stars: ✭ 382 (+1810%)
Mutual labels:  developer-tools, command-line-tool
Dnote
A simple command line notebook for programmers
Stars: ✭ 2,192 (+10860%)
Mutual labels:  developer-tools, command-line-tool
bootstrap-cookie-consent-settings
A modal dialog (cookie banner) and framework to handle the German and EU law about cookies in a website. Needs Bootstrap.
Stars: ✭ 25 (+25%)
Mutual labels:  gdpr, dsgvo
AppIconSetGen
Tool to generate App Icon set for iOS, macOS, watchOS apps
Stars: ✭ 20 (+0%)
Mutual labels:  developer-tools, command-line-tool
droxy
Create commands on your machine that proxy docker run / exec calls
Stars: ✭ 12 (-40%)
Mutual labels:  developer-tools, command-line-tool
pgantomizer
Anonymize data in your PostgreSQL database with ease
Stars: ✭ 95 (+375%)
Mutual labels:  gdpr, anonymization

pganonymize

A command-line tool to anonymize PostgreSQL databases for DSGVO/GDPR purposes.

It uses a YAML file to define which tables and fields should be anonymized and provides various methods of anonymization. The tool requires a direct PostgreSQL connection to perform the anonymization.


Features

  • Intentionally compatible with Python 2.7 (for old production platforms)
  • Anonymize PostgreSQL tables at the level of individual data entries using various providers (some examples in the table below)
  • Exclude data from anonymization based on regular expressions or SQL WHERE clauses
  • Truncate entire tables that contain unwanted data

Field       Value                 Provider             Output
----------  --------------------  -------------------  --------------------------------
first_name  John                  choice               (Bob|Larry|Lisa)
title       Dr.                   clear
street      Irving St             faker.street_name    Miller Station
password    dsf82hFxcM            mask                 XXXXXXXXXX
email       [email protected]     md5                  0cba00ca3da1b283a57287bcceb17e35
email       [email protected]     faker.unique.email   [email protected]
phone_num   65923473              md5 as_number: True  3948293448
ip          157.50.1.20           set                  127.0.0.1
uuid_col    00010203-0405-......  uuid4                f7c1bd87-4d....

  • Note: faker.unique.[provider] is only supported on Python 3.6+ (the minimum Python version supported by the Faker library)
  • Note: uuid4 can only be used for native UUID columns

See the documentation for a more detailed description of the provided anonymization methods.
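
To make the provider table more concrete, here is a minimal Python sketch of what a few of the listed providers (md5, mask, set, clear, choice) conceptually do to a single value. It is not pganonymize's actual implementation: the function names and the `sign` parameter are hypothetical and chosen for illustration only.

```python
import hashlib
import random


def md5_provider(value):
    """Replace the value with the hex MD5 digest of its text form."""
    return hashlib.md5(str(value).encode("utf-8")).hexdigest()


def mask_provider(value, sign="X"):
    """Replace every character of the value with the mask sign."""
    return sign * len(str(value))


def set_provider(_value, new_value):
    """Ignore the original value and return a fixed replacement."""
    return new_value


def clear_provider(_value):
    """Drop the value entirely (an empty/NULL field)."""
    return None


def choice_provider(_value, values, rng=random):
    """Pick one replacement at random from a list of allowed values."""
    return rng.choice(values)
```

For example, `mask_provider("dsf82hFxcM")` returns `XXXXXXXXXX`, which matches the password row of the table, and `md5_provider` always yields a 32-character hexadecimal string.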

Installation

The default installation method is to use pip:

$ pip install pganonymize

Usage

usage: pganonymize [-h] [-v] [-l] [--schema SCHEMA] [--dbname DBNAME]
               [--user USER] [--password PASSWORD] [--host HOST]
               [--port PORT] [--dry-run] [--dump-file DUMP_FILE]
               [--init-sql INIT_SQL]

Anonymize data of a PostgreSQL database

optional arguments:
-h, --help            show this help message and exit
-v, --verbose         Increase verbosity
-l, --list-providers  Show a list of all available providers
--schema SCHEMA       A YAML schema file that contains the anonymization
                        rules
--dbname DBNAME       Name of the database
--user USER           Name of the database user
--password PASSWORD   Password for the database user
--host HOST           Database hostname
--port PORT           Port of the database
--dry-run             Don't commit changes made on the database
--dump-file DUMP_FILE
                        Create a database dump file with the given name
--init-sql INIT_SQL   SQL to run before starting anonymization

Besides the database connection values, you have to define a YAML schema file that includes all anonymization rules for that database. Take a look at the schema documentation or the YAML sample schema.

Example calls:

$ pganonymize --schema=myschema.yml \
    --dbname=test_database \
    --user=username \
    --password=mysecret \
    --host=db.host.example.com \
    -v

$ pganonymize --schema=myschema.yml \
    --dbname=test_database \
    --user=username \
    --password=mysecret \
    --host=db.host.example.com \
    --init-sql "set search_path to non_public_search_path; set work_mem to '1GB';" \
    -v
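
When the tool is called from a script rather than by hand, the example calls above can be assembled programmatically. The following sketch builds the argument list from the options documented in the usage output. The helper names `build_pganonymize_command` and `run_pganonymize` and their parameters are hypothetical, and the sketch only assumes that `pganonymize` is available on the PATH.

```python
import subprocess


def build_pganonymize_command(schema, dbname, user, password, host,
                              port=None, init_sql=None, dry_run=False,
                              verbose=False):
    """Return the pganonymize argument list for the given settings."""
    cmd = [
        "pganonymize",
        "--schema={}".format(schema),
        "--dbname={}".format(dbname),
        "--user={}".format(user),
        "--password={}".format(password),
        "--host={}".format(host),
    ]
    if port is not None:
        cmd.append("--port={}".format(port))
    if init_sql:
        # Passed as a separate list item, so no shell quoting is needed.
        cmd.extend(["--init-sql", init_sql])
    if dry_run:
        cmd.append("--dry-run")
    if verbose:
        cmd.append("-v")
    return cmd


def run_pganonymize(**settings):
    """Run pganonymize and raise CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_pganonymize_command(**settings))
```

Passing the command as a list avoids shell quoting problems with values such as the `--init-sql` statement. Setting `dry_run=True` adds the `--dry-run` flag, which according to the usage text does not commit changes made on the database.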

Database dump

With the --dump-file argument it is possible to create a dump file after anonymizing the database. Note that the pg_dump command from the postgresql-client-common package is required to create the dump file, e.g. on Linux:

$ sudo apt-get install postgresql-client-common

Example call:

$ pganonymize --schema=myschema.yml \
    --dbname=test_database \
    --user=username \
    --password=mysecret \
    --host=db.host.example.com \
    --dump-file=/tmp/dump.gz \
    -v
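
Because the dump step depends on an external pg_dump binary, a wrapper script may want to check for it before starting a long anonymization run. This is a minimal sketch that assumes Python 3.3+ for `shutil.which`, so it does not cover Python 2.7. The function name `require_pg_dump` is hypothetical and not part of pganonymize.

```python
import shutil


def require_pg_dump(which=shutil.which):
    """Return the path to pg_dump or raise if it is not on the PATH."""
    path = which("pg_dump")
    if path is None:
        raise RuntimeError(
            "pg_dump not found; install the PostgreSQL client tools "
            "before using --dump-file"
        )
    return path
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use, `require_pg_dump()` is called without arguments.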

Docker

If you want to run the anonymizer within a Docker container, you first have to build the image:

$ docker build -t pganonymize .

After that, you can pass a schema file to the container using a Docker volume and call the anonymizer:

$ docker run \
    -v <path to your schema>:/schema.yml \
    -it pganonymize \
    /usr/local/bin/pganonymize \
    --schema=/schema.yml \
    --dbname=<database> \
    --user=<user> \
    --password=<password> \
    --host=<host> \
    -v