
microsoft / CosmicClone

License: MIT
Cosmic Clone is a utility that can back up, clone, and restore an Azure Cosmos DB collection. It can also anonymize Cosmos documents to help hide personally identifiable data.

Cosmic Clone

  1. Overview
  2. Deployment Steps
  3. Create backup copy of a collection
  4. Anonymize data of a cosmos collection
  5. Todos
  6. References
  7. Contributing

Overview

screen91

Cosmic Clone is a tool to clone, back up, restore, and anonymize data in an Azure Cosmos DB collection. As more applications adopt Cosmos DB, self-service capabilities such as backing up and restoring a collection have become essential. Cosmic Clone is an attempt to create a simple utility for cloning a Cosmos collection. The utility helps with the following:

  • Clone collections for QA, testing, and other non-production environments.
  • Back up the data of a collection.
  • Create collections with similar settings (indexes, partition keys, TTL, etc.).
  • Anonymize data by scrubbing or shuffling sensitive data in documents.

Disclaimer: Please note that this is not an official tool from the Azure Cosmos DB team, but a utility developed by an independent developer within Microsoft IT and offered on GitHub as a sample.

Deployment Steps

  1. Compile and run the code.
  2. Or download a precompiled binary from the releases section and run the “CosmicCloneUI.exe” file.
  3. For best performance, run the compiled code on an Azure VM in the same region as the source and destination Cosmos collections.

The tool has the following prerequisites:

  • Microsoft .NET Framework 4.6.1 or higher installed
  • A source Cosmos collection and the read-only keys to its account
  • A destination Cosmos account and its read-write keys
  • If a firewall is enabled for the Cosmos account, the IP address of the machine running the tool must be allowed.

Create backup copy of a collection

Initial screen

screen1

Enter Source and Target connection details

screen2

If validation of the entered details fails, an appropriate message is displayed below the Test Connection button.

If the access validation succeeds, the next screen shows the options for cloning a collection.

Set migration options

screen3

All the options are checked by default, but you can opt out of any of them.

For example, to retain all the partition keys and indexes, keep the Indexing policies and Partition keys check boxes checked. Uncheck these boxes if you do not want them to be copied.

If you do not want any documents to be copied, but only a shell of the collection with similar settings, uncheck the Documents check box.

The remaining check boxes, for Stored procedures, User defined functions, and Triggers, control whether those code artifacts are copied from the source collection to the destination.
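The effect of these check boxes can be pictured with a small Python sketch. It is only an illustration of the opt-out model described above, not Cosmic Clone's own code (the tool is a C# desktop application); the option names and the `plan_clone` helper are hypothetical.

```python
# Hypothetical option names mirroring the check boxes on the options screen.
# Everything is enabled by default, as in the UI.
DEFAULT_OPTIONS = {
    "documents": True,
    "indexing_policy": True,
    "partition_key": True,
    "stored_procedures": True,
    "user_defined_functions": True,
    "triggers": True,
}


def plan_clone(**overrides):
    """Return the names of the items that would be copied.

    Pass option_name=False to opt out of copying that item.
    """
    unknown = set(overrides) - set(DEFAULT_OPTIONS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    options = {**DEFAULT_OPTIONS, **overrides}
    return [name for name, enabled in options.items() if enabled]


# A "shell" clone: keep the settings and code artifacts, skip the documents.
shell_plan = plan_clone(documents=False)
```

With `documents=False`, the resulting plan contains every item except the documents, which corresponds to creating an empty collection with similar settings.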

The next page covers the anonymization process, which is discussed in the next section. For now, click Next and start cloning the collection.

screen7

screen8

Browse the Cosmos account in the Azure portal to see the new collection, created with the selected settings.

Anonymize data of a cosmos collection

After you select the cloning options described in the previous section, the following page is displayed.

screen4

Here you can enter the rules and attribute details for the data that needs to be masked or sanitized.

To add a rule, click the “Add Rule” button. A mini form for entering the rule details is displayed.

A rule is an encapsulation of an attribute and the anonymization to be performed on it. A rule tells the Cosmic Clone tool what attribute to scrub and how.

The ‘Attribute to scrub’ represents the field that needs to be scrubbed or anonymized.

The ‘Filter Query’ represents the where condition that determines which documents this rule is applied to. If the rule must be applied to all documents, leave this field blank.

The ‘Scrub Type’ field provides the following options:

  • Single value: Replaces the attribute value with a fixed value.
  • Null Value: Empties the attribute content.
  • Shuffle: Randomly shuffles the attribute values across all documents of the collection.
  • PartialMaskFromLeft: Masks the attribute value partially, starting from the left, with the given value.
  • PartialMaskFromRight: Masks the attribute value partially, starting from the right, with the given value.
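The scrub types above can be illustrated with a minimal Python sketch. It is not Cosmic Clone's own code (the tool is written in C#), and the function names are hypothetical. The exact masking behaviour is an assumption: here the mask overwrites as many characters as it is long, and the result keeps the original length.

```python
import random


def single_value(values, replacement):
    """Single value: replace every attribute value with one fixed value."""
    return [replacement for _ in values]


def null_value(values):
    """Null Value: empty the attribute content (modelled here as None)."""
    return [None for _ in values]


def shuffle(values, seed=None):
    """Shuffle: redistribute the existing values randomly across documents."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def partial_mask_from_left(value, mask):
    """PartialMaskFromLeft: overwrite the leading characters with the mask (assumed semantics)."""
    text = str(value)
    return (mask + text[len(mask):])[:len(text)]


def partial_mask_from_right(value, mask):
    """PartialMaskFromRight: overwrite the trailing characters with the mask (assumed semantics)."""
    text = str(value)
    keep = max(len(text) - len(mask), 0)
    return (text[:keep] + mask)[:len(text)]


card = "4111222233334444"
left_masked = partial_mask_from_left(card, "XXXX")    # "XXXX222233334444"
right_masked = partial_mask_from_right(card, "XXXX")  # "411122223333XXXX"
```

Note that shuffling keeps every original value in the collection and only changes which document holds it, so it preserves the value distribution while breaking the link between a value and its document.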

Sample rule1

screen5

This rule shuffles the Full name attribute values across all documents.

Sample rule2

screen6

To update key values of nested entities, you can configure an anonymization rule as shown above. Note the Filter Query, which tells the tool to perform this operation only if the EntityType attribute of the document is “Individual”.
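How a filtered rule on a nested attribute behaves can be sketched in Python as follows. This is an assumption-laden illustration rather than the tool's implementation: the dotted attribute path, the `apply_rule` helper, and the `Contact.Phone` field are hypothetical, and the filter is modelled as a Python predicate instead of a query string. Only the EntityType value “Individual” comes from the sample rule.

```python
import copy


def apply_rule(doc, attribute_path, scrub, matches=None):
    """Return a scrubbed copy of doc.

    attribute_path: dotted path to the attribute (hypothetical notation).
    scrub: function mapping the old value to the anonymized value.
    matches: optional predicate playing the role of the Filter Query;
             None means the rule applies to every document.
    """
    if matches is not None and not matches(doc):
        return doc  # filter not satisfied: leave the document untouched
    result = copy.deepcopy(doc)
    *parents, leaf = attribute_path.split(".")
    node = result
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return doc  # path does not exist in this document
    if leaf in node:
        node[leaf] = scrub(node[leaf])
    return result


docs = [
    {"EntityType": "Individual", "Contact": {"Phone": "555-0100"}},
    {"EntityType": "Organization", "Contact": {"Phone": "555-0199"}},
]

scrubbed = [
    apply_rule(
        d,
        "Contact.Phone",
        scrub=lambda _old: "000-0000",
        matches=lambda d: d.get("EntityType") == "Individual",
    )
    for d in docs
]
```

Only the first document has its nested phone value replaced; the second is returned unchanged because it does not satisfy the filter.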

Sample rule3

screen10

To mask an attribute value partially with some text, use the Scrub Type option "PartialMaskFromLeft" or "PartialMaskFromRight".

Note that the anonymization screen also provides options to validate, save, and load these rules.

Migration screen

screen7

Completion notification

screen8

Before and After anonymization

screen9

As the comparison above shows, documents are sanitized according to the configured rules.

Todos

  • Adapt to other Cosmos APIs, such as Graph and Cassandra, in addition to the SQL API
  • Parallelize reads and writes to improve efficiency
  • Add anonymization option to mask with random values (predefined patterns and regular expressions)
  • Refactor some of the UI and utility code to improve maintainability
  • Write more tests

References

Static data masking https://docs.microsoft.com/en-us/sql/relational-databases/security/static-data-masking?view=sql-server-2017

Cosmos Data Import tool https://docs.microsoft.com/en-us/azure/cosmos-db/import-data

Cosmos Bulk executor tool https://docs.microsoft.com/en-us/azure/cosmos-db/bulk-executor-overview

Contributing

Contribution guidelines for this project

License

MIT
