kinto-b / makepipe

Licence: GPL-3.0
Tools for constructing simple make-like pipelines in R.

makepipe

The goal of makepipe is to allow for the construction of make-like pipelines in R with minimal overhead. In contrast to targets (and its predecessor drake), which offer an opinionated pipeline framework that demands highly functionalised code, makepipe is easy-going and adapts to a wide range of data science workflows.

A minimal example can be found here: https://github.com/kinto-b/makepipe_example

Installation

You can install the released version of makepipe from CRAN with:

install.packages("makepipe")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("kinto-b/makepipe")

Building a pipeline

To construct a pipeline, chain together a number of make_with_*() statements. When the pipeline is run, each make_with_*() block is evaluated if and only if its targets are out-of-date with respect to its dependencies (and, where applicable, its source file). Whether or not the block is evaluated, a segment is added to the Pipeline object behind the scenes. At the end of the script, once the entire pipeline has been run, you can display the accumulated Pipeline object to produce a flow-chart visualisation of the pipeline. For example:

make_with_source(
  note = "Clean raw survey data and do derivations",
  source = "one.R",
  targets = "data/1 data.Rds",
  dependencies = c("data/raw.Rds", "lookup/concordance.csv")
)

make_with_recipe(
  label = "Merge it!",
  note = "Merge demographic variables from population data into survey data",
  recipe = {
    dat <- readRDS("data/1 data.Rds")
    pop <- readRDS("data/pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/2 data.Rds")
  },
  targets = c("data/2 data.Rds"),
  dependencies = c("data/1 data.Rds", "data/pop.Rds")
)

make_with_source(
  note = "Convert data from 'wide' to 'long' format",
  source = "three.R",
  targets = "data/3 data.Rds",
  dependencies = "data/2 data.Rds"
)

show_pipeline()

We can also get an interactive visNetwork widget:

show_pipeline(as = "visnetwork")

Or a text summary (which can be saved to a .md file):

show_pipeline(as = "text")

#> # Pipeline
#> 
#> ## one.R
#> 
#> Clean raw survey data and do derivations
#> 
#> * Source: 'one.R'
#> * Targets: 'data/1 data.Rds'
#> * File dependencies: 'data/raw.Rds', 'lookup/concordance.csv'
#> * Executed: FALSE
#> * Environment: 0x0000015399acfeb8
#> 
#> ## Merge it!
#> 
#> Merge demographic variables from population data into survey data
#> 
#> * Recipe: 
#> 
#> {
#>     dat <- readRDS("data/1 data.Rds")
#>     pop <- readRDS("data/pop.Rds")
#>     saveRDS(dat, "data/2_data.Rds")
#> }
#> 
#> * Targets: 'data/2 data.Rds'
#> * File dependencies: 'data/1 data.Rds', 'data/pop.Rds'
#> * Executed: TRUE
#> * Execution time: 0.00103879 secs
#> * Result: 0 object(s)
#> * Environment: 0x0000015390c6c568
#> 
#> ## three.R
#> 
#> Convert data from 'wide' to 'long' format
#> 
#> * Source: 'three.R'
#> * Targets: 'data/3 data.Rds'
#> * File dependencies: 'data/2 data.Rds'
#> * Executed: FALSE
#> * Environment: 0x00000153928570f8
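
The Executed flags in the summary above follow from the out-of-date rule: a block runs only when a target is missing or older than one of its inputs. As a rough illustration, that rule can be sketched in base R by comparing file modification times. This is a minimal sketch under that assumption, not makepipe's actual implementation; is_out_of_date() and the file names are hypothetical.

```r
# Sketch only: approximates the out-of-date rule using file modification
# times. This is an assumption about the behaviour, not makepipe's own code.
is_out_of_date <- function(targets, dependencies, source = NULL) {
  inputs <- c(dependencies, source)

  # A missing target always needs to be (re)built
  if (!all(file.exists(targets))) {
    return(TRUE)
  }

  # Only inputs that exist can be newer than the targets
  inputs <- inputs[file.exists(inputs)]
  if (length(inputs) == 0) {
    return(FALSE)
  }

  # Rebuild if the oldest target predates the newest input
  min(file.mtime(targets)) < max(file.mtime(inputs))
}

# Demonstration in a temporary directory with hypothetical file names
tmp <- tempfile("pipeline_demo_")
dir.create(tmp)
dep <- file.path(tmp, "raw.Rds")
tgt <- file.path(tmp, "1 data.Rds")

saveRDS(mtcars, dep)
missing_target <- is_out_of_date(tgt, dep)  # target not created yet

saveRDS(mtcars, tgt)
# Set timestamps explicitly so the result does not depend on clock resolution
Sys.setFileTime(dep, as.POSIXct("2024-01-01 00:00:00", tz = "UTC"))
Sys.setFileTime(tgt, as.POSIXct("2024-01-02 00:00:00", tz = "UTC"))
fresh_target <- is_out_of_date(tgt, dep)    # target newer than dependency

# Simulate an edit to the dependency after the target was built
Sys.setFileTime(dep, as.POSIXct("2024-01-03 00:00:00", tz = "UTC"))
stale_target <- is_out_of_date(tgt, dep)    # dependency now newer than target
```

Here missing_target and stale_target correspond to blocks that would be executed, while fresh_target corresponds to a block that would be skipped.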

Once you’ve constructed a pipeline, you can ‘clean’ it (i.e. delete all registered targets):

p <- get_pipeline()
p$clean()

Then, when you look at the visualisation again, the target nodes will be red rather than green, since they are now out-of-date:

show_pipeline()

And then you can ‘rebuild’ to re-execute the entire pipeline and re-create the cleaned targets:

p <- get_pipeline()
p$build()

Another way to build a pipeline is to add a roxygen header to your .R scripts containing a special @makepipe tag along with tags such as @targets and @dependencies. For example, at the top of script one.R you might have:

#'@title Load
#'@description Clean raw survey data and do derivations
#'@dependencies "data/raw.Rds", "lookup/concordance.csv"
#'@targets "data/1 data.Rds"
#'@makepipe
NULL

You can then call make_with_dir(), which will construct a pipeline from every script in the provided directory that contains the @makepipe tag.
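
To make the discovery step concrete, the following base R sketch scans a directory for scripts carrying the @makepipe tag and reads the quoted paths from a tag line. It is only an illustration of the idea, not how make_with_dir() is implemented; find_tagged_scripts() and tag_value() are hypothetical helpers, and the demo scripts are written to a temporary directory.

```r
# Sketch only: a base R approximation of discovering tagged scripts.
# find_tagged_scripts() and tag_value() are hypothetical, not makepipe API.
find_tagged_scripts <- function(dir) {
  scripts <- list.files(dir, pattern = "\\.R$", full.names = TRUE)
  has_tag <- vapply(scripts, function(path) {
    lines <- readLines(path, warn = FALSE)
    any(grepl("^#'[[:space:]]*@makepipe[[:space:]]*$", lines))
  }, logical(1))
  unname(scripts[has_tag])
}

# Return the double-quoted values on the first line carrying the given tag
tag_value <- function(path, tag) {
  lines <- readLines(path, warn = FALSE)
  pattern <- paste0("^#'[[:space:]]*@", tag, "[[:space:]]+")
  hit <- grep(pattern, lines, value = TRUE)
  if (length(hit) == 0) {
    return(character(0))
  }
  quoted <- unlist(regmatches(hit[1], gregexpr('"[^"]*"', hit[1])))
  gsub('"', "", quoted, fixed = TRUE)
}

# Demo: one tagged script and one untagged helper script
demo_dir <- tempfile("scripts_")
dir.create(demo_dir)
writeLines(c(
  "#'@title Load",
  "#'@description Clean raw survey data and do derivations",
  "#'@dependencies \"data/raw.Rds\", \"lookup/concordance.csv\"",
  "#'@targets \"data/1 data.Rds\"",
  "#'@makepipe",
  "NULL"
), file.path(demo_dir, "one.R"))
writeLines("x <- 1  # no header, so this script is skipped",
           file.path(demo_dir, "helper.R"))

tagged <- find_tagged_scripts(demo_dir)
deps <- tag_value(tagged[1], "dependencies")
tgts <- tag_value(tagged[1], "targets")
```

Only one.R is picked up here, because helper.R has no @makepipe line.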
