markvanderloo / lumberjack

Track changes in data with ease

Track changes in data

The lumberjack R package allows you to:

  • track changes in multiple data sets as they get processed;
  • use multiple loggers for each data set;
  • fully customize the loggers.

You can get started by just adding one line of code to your existing data analysis script.
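
As a minimal sketch of what this can look like, the example below attaches a logger to a data set, applies two steps, and writes the log to a file. It assumes lumberjack is installed and uses the package's `%L>%` pipe operator together with `start_log()`, the `simple` logger, and `dump_log()`. The built-in `women` data set, the `height_cm` column, and the temporary log file are illustrative choices, not part of the original text.

```r
library(lumberjack)

# Write the log to a temporary file so the example leaves
# nothing behind in the working directory (illustrative choice).
logfile <- tempfile(fileext = ".csv")

out <- women %L>%
  start_log(logger = simple$new()) %L>%      # the added line: attach a logger
  transform(height_cm = height * 2.54) %L>%  # a step that changes the data
  identity() %L>%                            # a step that leaves it unchanged
  dump_log(file = logfile)                   # write the log and stop logging

# Inspect what the logger recorded for each step.
logdata <- read.csv(logfile)
print(logdata)
```

The data flows through the pipeline as usual; the logger only records, for each step, whether the data changed. Other loggers can be passed to `start_log()` in place of `simple$new()`.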

Citing lumberjack

Please cite the JSS paper.

@article{loo2020monitoring,
  title = {Monitoring Data in {R} with the {lumberjack} Package},
  author = {Mark P. J. {van der Loo}},
  journal = {Journal of Statistical Software},
  year = {2021},
  volume = {98},
  number = {1},
  pages = {1--13},
  doi = {10.18637/jss.v098.i01},
  url = {https://www.jstatsoft.org/article/view/v098i01}
}

lumberjack philosophy

Production scripts may contain many data transformations aimed at cleaning, selecting, modeling, or augmenting data with new variables. Analyzing the effect of each step is cumbersome because it requires adding a lot of code that has nothing to do with the primary goal of the script, namely analyzing and processing data.

In the lumberjack philosophy, a programmer (analyst) should only be concerned with the primary process of data analysis.

Installation

Published version from CRAN:

install.packages('lumberjack')

Development version:

git clone https://github.com/markvanderloo/lumberjack
cd lumberjack
make install

Copyright (2016) Mark van der Loo. Licensed under EUPL 1.2.