catalyst / Moodle Local_datacleaner

Reduce, filter, and anonymize Moodle data for non-prod environments

DataCleaner Moodle Module

Moodle DataCleaner is an anonymiser of your Moodle data.

Supported versions of Moodle: 2.6 to 3.7 inclusive

How it works

Standard practice when hosting most applications, Moodle included, is to have several environments in a 'pipeline' that ends in production. A typical flow might be dev > stage > prod, but there can be as many environments as you need, for example for load testing or penetration testing.

To test properly it's often useful to have real production data in these other environments, but there are downsides:

  • Production data is usually large. We don't need or want all of it, and multiple copies consume a lot of disk space.
  • There may be sensitive data we don't want to expose to developers or testers, e.g. personal data, grades, or uploaded assignments.
  • Moodle is integrated with third-party systems, and we don't want test systems interacting with real ones, e.g. sending emails or touching assignments in Turnitin. In other words, we want to remove any API keys and related config.

So we need a way to 'clean' the database after a refresh: to reduce the size of the data, to remove anything sensitive, and to ensure it will not touch any other real system. This also needs to be configurable, because every Moodle instance has different needs and there is no one-size-fits-all approach. This could be configured outside Moodle in the deployment tools, but over time we have found that the most flexible and easiest approach is to keep this configuration inside Moodle itself. That way our clients can make these decisions directly, without being exposed to the complexity of our internal continuous integration and deployment processes.

In practice, this means the cleaning configuration is added to the production system (which initially sounds scary but isn't), and the database is then refreshed to another environment where it can be washed. There are multiple levels of safeguards in place to ensure DataCleaner never runs in production, which would of course be catastrophic:

  • It can only be run from the CLI. There is no GUI.
  • We store the hostname in the cleaning configuration data. If the hostname matches production, DataCleaner will not run. If this data is missing then it will not run.
  • Typically a refreshed database comes from a nightly snapshot, so the data should be slightly stale. If a non-admin user has logged in recently, that's a sign this Moodle is in use, and DataCleaner will not run.
  • If cron has run recently, DataCleaner will not run. DataCleaner should only be run on a data-washing instance, where cron should not be needed.
  • It can only be run if '$CFG->local_datacleaner_allowexecution = true;' has been added to config.php.
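
The safeguards above amount to a chain of independent checks, any one of which blocks the run. The following is a minimal Python sketch of that idea, not the plugin's actual PHP code. The SiteState fields, the refusal_reasons function, and the 24-hour meaning of "recently" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class SiteState:
    """Hypothetical snapshot of the values the safeguards inspect."""
    is_cli: bool                              # running from the command line?
    allow_execution: bool                     # the config.php flag
    saved_prod_wwwroot: Optional[str]         # wwwroot saved with the cleaning config
    current_wwwroot: str                      # wwwroot of the instance being cleaned
    last_non_admin_login: Optional[datetime]  # None if nobody has logged in
    last_cron_run: Optional[datetime]         # None if cron has never run


def refusal_reasons(state: SiteState, now: datetime,
                    recent: timedelta = timedelta(hours=24)) -> List[str]:
    """Return every reason to refuse; an empty list means it is safe to run.

    The 'recent' window is an assumed value, not taken from the plugin.
    """
    reasons = []
    if not state.is_cli:
        reasons.append("not running from the CLI")
    if not state.allow_execution:
        reasons.append("allowexecution flag missing from config.php")
    if not state.saved_prod_wwwroot:
        # Missing data is treated as unsafe, never as permission to run.
        reasons.append("no production wwwroot saved")
    else:
        saved_host = urlparse(state.saved_prod_wwwroot).hostname
        current_host = urlparse(state.current_wwwroot).hostname
        if saved_host == current_host:
            reasons.append("hostname matches production")
    if state.last_non_admin_login and now - state.last_non_admin_login < recent:
        reasons.append("a non-admin user logged in recently")
    if state.last_cron_run and now - state.last_cron_run < recent:
        reasons.append("cron ran recently")
    return reasons
```

Collecting every failing check, rather than stopping at the first, lets an operator see the full picture in one run. Note that every check fails closed: missing configuration is a reason to refuse.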

Installation

The simplest way to install the plugin is to choose "Download ZIP" on the right-hand side of the GitHub page. Once you've done this, unzip the DataCleaner code and copy it to the local/datacleaner directory within your Moodle codebase. On most modern Linux systems, this can be accomplished with:

unzip ./mdl-local_datacleaner-master.zip
cp -r ./mdl-local_datacleaner-master <your_moodle_directory>/local/datacleaner

Once you've copied the plugin, you can finish the installation process by logging into your Moodle site as an administrator and visiting the "notifications" page:

<your.moodle.url>/admin/index.php

Your site should prompt you to upgrade.

Configuration

Once the installation process is complete, you'll be prompted to fill in some configuration details. Note that you MUST visit the DataCleaner config page to save the current wwwroot, or the cleaner will not run later in the other environments.

$CFG->local_datacleaner_allowexecution = true;

You have to add the config item above to config.php in each environment where you want the cleaner to run. DO NOT add that config setting to a Production environment!

There are multiple 'cleaners' which process different types of data in Moodle. Each one can be enabled individually and may have additional config settings.

You can find the DataCleaner configuration via the Moodle administration block:

Site Administration > Plugins > Local plugins > Data cleaner

Sub-plugin options

Enable the sub-plugin options to clean the corresponding data area.

Cleanup core:

Enable this sub-plugin to clean core configuration settings.

Remove config:

Enable this sub-plugin to clean configuration settings. This has its own Settings page.

Remove standard logs:

Enable to truncate the standard log table.

Remove users:

This will remove users who have not logged in for a specific number of days. This has its own Settings page.
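
As a rough illustration of the "not logged in for N days" rule, the Python sketch below selects users whose last login falls before a cutoff. It is not the plugin's implementation. The stale_user_ids function and its input mapping are hypothetical, and it assumes last-login times are Unix timestamps with 0 meaning "never logged in" (so such users are also selected).

```python
from datetime import datetime, timedelta, timezone


def stale_user_ids(last_login_by_id, min_age_days, now):
    """Return sorted ids of users whose last login is older than the cutoff.

    last_login_by_id maps a user id to a Unix timestamp (assumed format).
    A timestamp of 0 is always older than the cutoff, so users who never
    logged in are included; that is an assumption of this sketch.
    """
    cutoff = int((now - timedelta(days=min_age_days)).timestamp())
    return sorted(uid for uid, ts in last_login_by_id.items() if ts < cutoff)


# Example: with a 30-day threshold, only the user seen 5 days ago is kept.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
day = 86400
t = int(now.timestamp())
users = {1: t - 5 * day, 2: t - 40 * day, 3: 0}
print(stale_user_ids(users, 30, now))
```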

Remove courses:

Remove courses older than a specific number of days and/or in specific categories. This has its own Settings page.

Scramble user data:

Enable this sub-plugin to anonymise user data. This has its own Settings page.

Clean grades:

Enable to delete grade history or replace with fake data. This has its own Settings page.

Replace URLs:

Enable to replace all occurrences of the production URL with another URL. This has its own Settings page.

Cleanup sitedata:

Clean orphaned files or replace with a generic file for the specific file type.

Cleanup email:

When a suffix has been configured in the settings, this will append that value to all emails. There is also a regular expression field that will ignore users when appending the suffix.

This sub-plugin also lets you configure the following Moodle settings:

  • noemailever
  • divertallemailsto
  • divertallemailsexcept
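
The suffix behaviour described above can be sketched in a few lines of Python. This is an illustration only, not the plugin's code. The append_suffix function is hypothetical, and the sketch assumes the regular expression is matched against the email address itself, which the text above does not specify.

```python
import re


def append_suffix(emails, suffix, ignore_pattern=None):
    """Append suffix to each address unless it matches ignore_pattern.

    ignore_pattern is an optional regular expression; matching addresses
    are left untouched (assumed to be matched against the address).
    """
    ignore = re.compile(ignore_pattern) if ignore_pattern else None
    cleaned = []
    for email in emails:
        if ignore and ignore.search(email):
            cleaned.append(email)           # ignored user: keep as-is
        else:
            cleaned.append(email + suffix)  # make the address undeliverable
    return cleaned


# Example: staff at corp.example keep working addresses, others do not.
print(append_suffix(
    ["student@example.com", "admin@corp.example"],
    ".invalid",
    r"@corp\.example$",
))
```

Appending a suffix such as ".invalid" keeps the original address recognisable for debugging while ensuring that mail to it cannot be delivered.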

Environment matrix:

Notice: This cleaner has a soft dependency on local_envbar, which is needed to populate the list of environments that can be configured.

This cleaner lets you search for values in the {config} and {config_plugins} tables and set them per environment. This is useful for scrubbing API keys to prevent them calling home from a development environment.

A CLI script exists to run the Environment matrix cleaner as a standalone operation.

sudo -u apache /usr/bin/php /<your_moodle_directory>/local/datacleaner/environment_matrix/cli/matrix_replace.php --run

An additional CLI flag, --reset, is also available.

This flag purges all other saved environment configuration so that the new instance has only one set of environment data.

Running

After installing and configuring DataCleaner, copy your database and optionally your site data to another Moodle instance.

From there, run the CLI script. On most modern Linux systems, this can be accomplished with:

sudo -u apache /usr/bin/php /<your_moodle_directory>/local/datacleaner/cli/clean.php --run

There are protections in place which prevent DataCleaner from being run accidentally on your production system, which would of course be catastrophic!

More options

Run the CLI script with --help for more options:

sudo -u apache /usr/bin/php /<your_moodle_directory>/local/datacleaner/cli/clean.php --help