wlandau / targets-minimal

Licence: unknown (licence files found: LICENSE, LICENSE.md)
A minimal example data analysis project with the targets R package



targets package minimal example

Launch RStudio Cloud

This repository is an example data analysis workflow with targets. The pipeline reads the data from a file, preprocesses it, visualizes it, and fits a regression model.
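The four steps can be sketched as plain R functions. This is only an illustration in base R, not the code in this repository: the function names, the model formula Ozone ~ Wind + Temp, and the temporary CSV that stands in for data/raw_data.csv are all assumptions.

```r
# Hypothetical helpers for the four steps, written in base R so the sketch
# has no dependencies beyond a standard R installation.
read_raw_data <- function(path) read.csv(path)

# Preprocess: drop the rows where the ozone measurement is missing.
preprocess <- function(raw_data) raw_data[!is.na(raw_data$Ozone), ]

# Visualize: compute a histogram object without opening a graphics device.
make_histogram <- function(data) hist(data$Ozone, plot = FALSE)

# Model: an assumed linear regression of ozone on wind and temperature.
fit_model <- function(data) lm(Ozone ~ Wind + Temp, data = data)

# Stand-in for data/raw_data.csv: the built-in airquality dataset.
path <- tempfile(fileext = ".csv")
write.csv(datasets::airquality, path, row.names = FALSE)

raw_data <- read_raw_data(path)
clean_data <- preprocess(raw_data)
ozone_hist <- make_histogram(clean_data)
fit <- fit_model(clean_data)
print(coef(fit))
```

Run this way, every step executes on every call. The point of targets is to declare the same steps as targets so that only out-of-date steps rerun.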

How to access

You can try this example project with nothing more than a browser and an internet connection: use the "Launch RStudio Cloud" link above to open an RStudio Cloud instance. Alternatively, clone or download this code repository and install the R packages it requires.

How to run

  1. Open the R console and call renv::restore() to install the required R packages.
  2. Call the tar_make() function to run the pipeline.
  3. Then, call tar_read(hist) to retrieve the histogram.
  4. Experiment with other functions such as tar_visnetwork() to learn how they work.
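
As a console session, the steps above might look like the following sketch. It assumes the working directory is the project root and that the renv and targets packages are available. The in_project guard is a hypothetical addition so that the snippet does nothing when run elsewhere.

```r
# Only act when run from a targets project with the needed packages available.
in_project <- file.exists("_targets.R") &&
  requireNamespace("renv", quietly = TRUE) &&
  requireNamespace("targets", quietly = TRUE)

if (in_project) {
  renv::restore(prompt = FALSE)     # 1. install the packages recorded in renv.lock
  targets::tar_make()               # 2. run the pipeline
  print(targets::tar_read(hist))    # 3. retrieve the histogram target
  print(targets::tar_visnetwork())  # 4. view the dependency graph (needs visNetwork)
} else {
  message("Run this from the project root with renv and targets installed.")
}
```

Calling tar_make() a second time without changing any code or data should skip the targets that are already up to date.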

File structure

The most important files are:

├── _targets.R
├── R/
├──── functions.R
├── data/
├──── raw_data.csv
└── index.Rmd

| File | Purpose |
| --- | --- |
| _targets.R | The special R script that declares the targets pipeline. See tar_script() for details. |
| R/functions.R | An R script with user-defined functions. Unlike _targets.R, there is nothing special about the name or location of this script. In fact, for larger projects, it is good practice to partition functions into multiple files. |
| data/raw_data.csv | The raw airquality dataset. |
| index.Rmd | An R Markdown report that reruns in the pipeline whenever the histogram of ozone changes. |
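
To show how a _targets.R script ties these files together, here is a minimal sketch that builds a throwaway project in a temporary directory with tar_dir() and tar_script(), then runs it. It is not the _targets.R in this repository: the target names, the base R commands, and the model formula are assumptions, and the histogram and report targets are left out to keep the sketch free of extra packages. The code is skipped when targets is not installed.

```r
pipeline_coefficients <- NULL  # filled in only if the sketch actually runs

if (requireNamespace("targets", quietly = TRUE)) {
  library(targets)
  tar_dir({  # evaluate everything below inside a temporary directory
    # Stand-in for data/raw_data.csv.
    dir.create("data")
    write.csv(datasets::airquality, "data/raw_data.csv", row.names = FALSE)

    # Write a _targets.R file that declares the pipeline as a list of targets.
    tar_script({
      list(
        # Track the file itself, so that edits to it invalidate downstream targets.
        tar_target(raw_data_file, "data/raw_data.csv", format = "file"),
        tar_target(raw_data, read.csv(raw_data_file)),
        tar_target(data, raw_data[!is.na(raw_data$Ozone), ]),
        tar_target(fit, lm(Ozone ~ Wind + Temp, data = data))
      )
    }, ask = FALSE)

    tar_make()  # build every target that is out of date
    pipeline_coefficients <- coef(tar_read(fit))
  })
}

print(pipeline_coefficients)
```

In a real project, the commands would typically call functions loaded with source("R/functions.R") at the top of _targets.R rather than inline expressions.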

Continuous deployment

Minimal pipelines with low resource requirements are appropriate for continuous deployment. For example, when this particular GitHub repository is updated, its targets pipeline runs in a GitHub Actions workflow. The workflow pushes the results to the targets-runs branch, and GitHub Pages hosts the latest version of the rendered R Markdown report at https://wlandau.github.io/targets-minimal/. Subsequent runs restore the output files from the previous run so that up-to-date targets do not rebuild. Follow these steps to set up continuous deployment for your own minimal pipeline:

  1. Ensure your project stays within the storage and compute limitations of GitHub (i.e. your pipeline is minimal). For storage, you may choose the AWS-backed storage formats (e.g. tar_target(..., format = "aws_qs")) for large outputs to reduce the burden on GitHub storage.
  2. Ensure GitHub Actions are enabled in the Settings tab of your GitHub repository’s website.
  3. Set up your project with renv:
    • Call targets::tar_renv(extras = character(0)) to write a _packages.R file to expose hidden dependencies.
    • Call renv::init() to initialize the renv lockfile renv.lock or renv::snapshot() to update it.
    • Commit renv.lock to your Git repository.
  4. Write the .github/workflows/targets.yaml workflow file using targets::tar_github_actions() and commit this file to Git.
  5. Push to GitHub. A GitHub Actions workflow should run the pipeline and upload the results to the targets-runs branch of your repository. Subsequent runs should add new commits but not necessarily rerun targets.
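
Steps 3 and 4 can be collected into one helper, sketched below. The wrapper name setup_continuous_deployment is hypothetical, and the function is defined but not called here because each step writes files into the current project. It assumes the renv and targets packages are installed and that it is run from the project root.

```r
# Hypothetical wrapper around the setup calls named in the steps above.
setup_continuous_deployment <- function() {
  # Step 3: write _packages.R so renv can detect the pipeline's packages.
  targets::tar_renv(extras = character(0))

  # Create renv.lock for a new project, or update an existing lockfile.
  if (file.exists("renv.lock")) {
    renv::snapshot()
  } else {
    renv::init()
  }

  # Step 4: write the workflow file .github/workflows/targets.yaml.
  targets::tar_github_actions()

  # Files to commit to Git before pushing (step 5).
  invisible(c(
    "_packages.R",
    "renv.lock",
    file.path(".github", "workflows", "targets.yaml")
  ))
}
```

After calling the helper, commit the listed files and push to GitHub to trigger the first workflow run.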