Track changes in data
The lumberjack R package allows you to:
- track changes in multiple datasets as they are processed;
- use multiple loggers for each dataset;
- fully customize each logger.
You can get started by just adding one line of code to your existing data analysis script.
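As a minimal sketch of what this can look like, the example below attaches the package's `simple` logger to a pipeline built with the lumberjack pipe operator `%L>%`. The `women` dataset ships with R; the two transformation steps are made up for illustration and are not part of the package. By default the simple logger writes its log to a CSV file in the working directory when `dump_log()` is called.

```r
library(lumberjack)

# 'women' is a small example dataset that ships with R.
# The transformation steps are illustrative only.
out <- women %L>%
  start_log(simple$new()) %L>%             # attach a logger to the data
  transform(height = height * 2.54) %L>%   # convert inches to centimetres
  subset(weight > 120) %L>%                # keep a subset of the records
  dump_log()                               # write the log and stop logging
```

Each step between `start_log()` and `dump_log()` is passed to the logger, so the transformation code itself stays free of logging statements.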
Citing lumberjack
Please cite the Journal of Statistical Software (JSS) paper:
@article{loo2020monitoring,
title = {Monitoring Data in {R} with the {lumberjack} Package},
author = {Mark P. J. {van der Loo}},
journal = {Journal of Statistical Software},
year = {2021},
volume = {98},
number = {1},
pages = {1--13},
doi = {10.18637/jss.v098.i01},
url = {https://www.jstatsoft.org/article/view/v098i01}
}
lumberjack philosophy
Production scripts may contain many data transformations that clean, select, model, or augment data with new variables. Analyzing the effect of each step is cumbersome because it requires adding a lot of code that does not serve the primary goal of the script, namely analyzing and processing data.
In the lumberjack philosophy, a programmer (analyst) should be concerned only with the primary process of data analysis.
Installation
Published version from CRAN
install.packages('lumberjack')
Development version
git clone https://github.com/markvanderloo/lumberjack
cd lumberjack
make install
Copyright (2016) Mark van der Loo. Licensed under EUPL 1.2.