kibitan / masking

Licence: MIT license
Command line tool for generating an anonymized database from an existing database

Programming Languages

ruby
36898 projects - #4 most used programming language
shell
77523 projects
Dockerfile
14818 projects

Projects that are alternatives of or similar to masking

enhanced-privacy-m1
Magento 1 Enhanced Privacy extension for easier compliance with GDPR. Allows customers to delete, anonymize, or export their personal data.
Stars: ✭ 34 (-49.25%)
Mutual labels:  gdpr
permissionsql
🔏 Middleware for keeping track of users, login states and permissions
Stars: ✭ 58 (-13.43%)
Mutual labels:  mariadb
avatar-privacy
GDPR-conformant avatar handling for WordPress
Stars: ✭ 15 (-77.61%)
Mutual labels:  gdpr
mailbox
📨 Simple newsletter sending system implemented in #Golang; sends campaign mail with open and click tracking.
Stars: ✭ 26 (-61.19%)
Mutual labels:  mariadb
risorse-gdpr
Raccolta di risorse sul GDPR
Stars: ✭ 20 (-70.15%)
Mutual labels:  gdpr
iabtcf-es
Official compliant tool suite for implementing the Transparency and Consent Framework (TCF) v2.0. The essential toolkit for CMPs.
Stars: ✭ 102 (+52.24%)
Mutual labels:  gdpr
laravel-adminer
Adminer database management tool for your Laravel application.
Stars: ✭ 45 (-32.84%)
Mutual labels:  mariadb
authelia
Instructions and configuration files to deploy Authelia in Unraid OS using Docker + FreeIPA LDAP.
Stars: ✭ 116 (+73.13%)
Mutual labels:  mariadb
upscheme
Database migrations and schema updates made easy
Stars: ✭ 737 (+1000%)
Mutual labels:  rdbms
newsql nosql library
A collection of reference material on 12 databases: mysql, mariaDB, Percona Server, MongoDB, Redis, RocksDB, TiDB, CouchDB, Cassandra, TokuDB, MemDB, Oceanbase
Stars: ✭ 270 (+302.99%)
Mutual labels:  mariadb
CVE-2021-27928
CVE-2021-27928 MariaDB/MySQL 'wsrep provider' command injection vulnerability
Stars: ✭ 53 (-20.9%)
Mutual labels:  mariadb
DBTestCompare
Application to compare results of two SQL queries
Stars: ✭ 15 (-77.61%)
Mutual labels:  mariadb
Docker-Stack
This repo contains a simple Docker setup with minimal configuration and only few files you can drop into many PHP-based projects.
Stars: ✭ 31 (-53.73%)
Mutual labels:  mariadb
proca
Widget to transform your website into a cutting-edge campaign in 10 min. multi-lingual, privacy first.
Stars: ✭ 29 (-56.72%)
Mutual labels:  gdpr
docker-mariadb
A docker image to run MariaDB with XtraBackup 🐳
Stars: ✭ 12 (-82.09%)
Mutual labels:  mariadb
SteamTracking-GDPR
📜 Tracking Valve's GDPR related pages
Stars: ✭ 21 (-68.66%)
Mutual labels:  gdpr
havengrc
☁️Haven GRC - easier governance, risk, and compliance 👨‍⚕️👮‍♀️🦸‍♀️🕵️‍♀️👩‍🔬
Stars: ✭ 83 (+23.88%)
Mutual labels:  gdpr
Hemmelig.app
Keep your sensitive information out of chat logs, emails, and more with encrypted secrets.
Stars: ✭ 183 (+173.13%)
Mutual labels:  gdpr
Hermes-Secure-Email-Gateway
Hermes Secure Email Gateway is a Free Open Source Ubuntu 18.04 or 20.04 Server based Email Gateway that provides Spam, Virus and Malware protection, full in-transit and at-rest email encryption as well as email archiving. It features the latest email authentication techniques such as SPF, DKIM and DMARC.
Stars: ✭ 35 (-47.76%)
Mutual labels:  mariadb
concrete
Concrete ecosystem is a set of crates that implements Zama's variant of TFHE. In a nutshell, fully homomorphic encryption (FHE), allows you to perform computations over encrypted data, allowing you to implement Zero Trust services.
Stars: ✭ 575 (+758.21%)
Mutual labels:  gdpr

MasKING🤴

A command line tool for anonymizing database records: it parses a SQL dump file and builds a new SQL dump file in which sensitive/credential data is masked.

Installation

gem install masking

Requirement

  • Ruby 2.5/2.6/2.7/3.0(preview)

Supported RDBMS

  • MySQL: 5.5 (note 1), 5.6, 5.7, 8.0
  • MariaDB: 5.5, 10.0 (note 2), 10.1, 10.2, 10.3, 10.4

Usage

  1. Set up the configuration for the tables/columns to anonymize in masking.yml

      # table_name:
      #   column_name: masked_value
    
      users:
        string: anonymized string
        email: anonymized+%{n}@example.com # %{n} will be replaced with sequential number
        integer: 12345
        float: 123.45
        boolean: true
        null_column: null
        date: 2018-08-24
        time: 2018-08-24 15:54:06
        binary_or_blob: !binary | # Binary Data Language-Independent Type for YAML™ Version 1.1: http://yaml.org/type/binary.html
          R0lGODlhDAAMAIQAAP//9/X17unp5WZmZgAAAOfn515eXvPz7Y6OjuDg4J+fn5
          OTk6enp56enmlpaWNjY6Ojo4SEhP/++f/++f/++f/++f/++f/++f/++f/++f/+
          +f/++f/++f/++f/++f/++SH+Dk1hZGUgd2l0aCBHSU1QACwAAAAADAAMAAAFLC
          AgjoEwnuNAFOhpEMTRiggcz4BNJHrv/zCFcLiwMWYNG84BwwEeECcgggoBADs=

    A value is implicitly converted to a compatible type. To convert explicitly, use a tag as defined in YAML Version 1.1:

      not-date: !!str 2002-04-28

    String should be matched with MySQL String Type. Integer/Float should be matched with MySQL Numeric Type. Date/Time should be matched with MySQL Date and Time Type.

    NOTE: MasKING doesn't check the actual schema types in the dump. If you specify an incompatible value, it will cause an error when restoring to the database.

  2. Dump the database and anonymize it

    MasKING works on dumps produced by mysqldump --complete-insert, which writes the column names into every INSERT statement.

      mysqldump --complete-insert -u USERNAME DATABASE_NAME | masking > anonymized_dump.sql

  3. Restore from the anonymized dump file

      mysql -u USERNAME ANONYMIZED_DATABASE_NAME < anonymized_dump.sql

    Tip: If you don't need to have an anonymized dump file, you can directly insert it from the stream. It can be faster because it has less IO interaction.

      mysqldump --complete-insert -u USERNAME DATABASE_NAME | masking | mysql -u USERNAME ANONYMIZED_DATABASE_NAME
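
Returning to the masking.yml configuration in step 1, the following is a minimal sketch of how those values are typed when the file is read with Ruby's standard YAML library, and of how a %{n} placeholder can be filled with a sequence number. It illustrates the configuration format only and is not MasKING's own code: the inlined config is trimmed down, and expand_sequence is a hypothetical helper.

```ruby
require 'yaml'
require 'date'

# A trimmed-down masking.yml, inlined so the sketch is self-contained.
CONFIG_TEXT = <<~YAML
  users:
    string: anonymized string
    email: anonymized+%{n}@example.com
    integer: 12345
    float: 123.45
    boolean: true
    null_column: null
    date: 2018-08-24
    not-date: !!str 2002-04-28
YAML

# safe_load only builds Date objects when the class is explicitly permitted.
config = YAML.safe_load(CONFIG_TEXT, permitted_classes: [Date])
users = config.fetch('users')

# Print each column with the Ruby type that YAML inferred for its value.
users.each { |column, value| puts "#{column}: #{value.inspect} (#{value.class})" }

# Hypothetical helper: fill the %{n} placeholder with a row number.
def expand_sequence(template, row_number)
  format(template, n: row_number)
end

puts expand_sequence(users['email'], 1)
```

The date entry is loaded as a Date, while the not-date entry stays a String because of the !!str tag.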

options

$ masking -h
Usage: masking [options]
    -c, --config=FILE_PATH           specify config file. default: masking.yml
    -v, --version                    version
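
For readers curious how a command line surface like this can be declared, here is a small sketch using OptionParser from Ruby's standard library. It only mirrors the two options listed above; parse_options and the option keys are hypothetical names, and this is not taken from MasKING's source.

```ruby
require 'optparse'

# Hypothetical re-creation of the two documented options.
def parse_options(argv)
  options = { config: 'masking.yml', show_version: false }

  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: masking [options]'
    opts.on('-c', '--config=FILE_PATH', 'specify config file. default: masking.yml') do |path|
      options[:config] = path
    end
    opts.on('-v', '--version', 'version') do
      options[:show_version] = true
    end
  end

  # parse leaves argv untouched and raises on unknown options.
  parser.parse(argv)
  options
end

p parse_options([])
p parse_options(['-c', 'config/staging_masking.yml'])
p parse_options(['--config=other.yml', '-v'])
```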

Use case of anonymized (production) database

  • Analyzing production databases for BI, machine learning, and troubleshooting while respecting GDPR

  • Stress test / Integration test

  • Performance optimization for slow query

    Analyzing a slow query often needs a number of records and a cardinality similar to production; the anonymized database helps to analyze and tune the slow query.

  • Simulating database migration

    Some schema migrations lock tables and cause trouble during execution. With a small dataset, the migration finishes quickly and the problem is easy to overlook. With the anonymized production database, you can simulate the migration as it would run in the real release, which makes the problem easy to find.

  • Better feature development flow

    Using data similar to the production database makes for a better development experience. It makes it easy to find things that should be changed or fixed. Also, some bugs are related to unexpected data in production, and realistic data makes them easier to find too.

  • And… your idea here!

Development

git clone https://github.com/kibitan/masking.git
bin/setup

You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install.

boot

  bundle exec exe/masking

Run test & rubocop & notes

  bundle exec rake

acceptance test

./acceptance/run_test.sh

Available options via environment variables:

  • MYSQL_HOST: database host (default: localhost)
  • MYSQL_USER: MySQL user name (default: mysqluser)
  • MYSQL_PASSWORD: password for the user (default: password)
  • MYSQL_DBNAME: database name (default: mydb)

with docker-compose

docker-compose -f docker-compose.yml -f docker-compose/mysql80.yml run -e MYSQL_HOST=mysql80 app acceptance/run_test.sh

or

docker-compose/acceptance_test.sh mysql80

To test against another database version, specify the docker-compose file for that version.

Markdown lint

bundle exec mdl *.md

Development with Docker

docker build . -t masking
echo "sample stdout" | docker run -i masking
docker run masking -v

Profiling

use bin/masking_profile

 $ cat your_sample.sql | bin/masking_profile
flat result is saved at /your/repo/profile/flat.txt
graph result is saved at /your/repo/profile/graph.txt
graph html is saved at /your/repo/profile/graph.html

 $ open profile/flat.txt

see also: ruby-prof/ruby-prof: ruby-prof: a code profiler for MRI rubies

Benchmark

use benchmark/run.rb

$ benchmark/run.rb
       user     system      total        real
   1.103012   0.009460   1.112472 (  1.123093)
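
The user/system/total/real header above is the format printed by Ruby's standard Benchmark module. As a rough sketch of how such a measurement can be taken, the snippet below times a stand-in workload; the contents of benchmark/run.rb are not shown here, so the workload method is purely an assumption for illustration.

```ruby
require 'benchmark'

# Stand-in workload (an assumption): expand a mask template many times.
def workload(iterations = 50_000)
  iterations.times { |n| format('anonymized+%{n}@example.com', n: n) }
end

# Benchmark.bm prints the user/system/total/real header and one row per report.
Benchmark.bm do |bench|
  bench.report { workload }
end
```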

Design Concept

KISS ~ keep it simple, stupid ~

No connection to the database, No handling files, Only dealing with stdin/stdout. ~ Do One Thing and Do It Well ~
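
To make the stdin/stdout idea concrete, here is a deliberately naive sketch of a line filter that rewrites INSERT statements according to table/column rules. It is not MasKING's implementation: RULES, INSERT_PATTERN, mask_line and run_filter are hypothetical names, and the parsing assumes that values contain no commas or parentheses and that every masked value is a string.

```ruby
require 'stringio'

# Hypothetical rules in the shape of masking.yml: table => { column => masked value }.
RULES = { 'users' => { 'email' => 'anonymized+%{n}@example.com' } }.freeze

# Matches the one-line INSERT statements that mysqldump --complete-insert emits.
INSERT_PATTERN = /\AINSERT INTO `(?<table>[^`]+)` \((?<columns>[^)]*)\) VALUES (?<rows>.*);\s*\z/m

# Returns the line with configured columns replaced; other lines pass through.
# Naive on purpose: assumes values contain no commas or parentheses and
# that mask templates contain no % sequences other than %{n}.
def mask_line(line, rules, counters)
  match = INSERT_PATTERN.match(line)
  return line unless match && rules.key?(match[:table])

  table = match[:table]
  columns = match[:columns].scan(/`([^`]+)`/).flatten
  rows = match[:rows].scan(/\(([^)]*)\)/).flatten.map do |row|
    counters[table] += 1 # sequence number shared by all rows of the table
    values = row.split(',')
    columns.each_with_index do |column, index|
      mask = rules[table][column]
      values[index] = "'#{format(mask, n: counters[table])}'" if mask
    end
    "(#{values.join(',')})"
  end

  "INSERT INTO `#{table}` (#{match[:columns]}) VALUES #{rows.join(',')};\n"
end

# The whole tool surface: read lines from one IO, write lines to another.
def run_filter(input, output, rules)
  counters = Hash.new(0)
  input.each_line { |line| output.write(mask_line(line, rules, counters)) }
end

# Demo on an in-memory dump so the sketch runs without a database.
dump = StringIO.new(<<~SQL)
  DROP TABLE IF EXISTS `users`;
  INSERT INTO `users` (`id`, `email`) VALUES (1,'alice@real.test'),(2,'bob@real.test');
SQL
masked = StringIO.new
run_filter(dump, masked, RULES)
puts masked.string
```

Passing $stdin and $stdout to run_filter instead of the StringIO objects would turn the same function into a pipeline stage, which is the shape the design concept describes.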

No External Dependency

Depends only on the language's standard libraries, with no external libraries

Future Todo

  • Pluggable/customizable masking methods, e.g. integration with Faker
  • Compatible with other RDBMS e.g. PostgreSQL, Oracle, SQL Server
  • Parse the schema type information and validate target columns value
  • Performance optimization
    • Write in the streaming process
    • rewrite by another language?
  • Better documentation

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/kibitan/masking. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the Masking project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

1: MySQL 5.5 is no longer officially supported

2: MariaDB 10.0 is no longer officially supported
