Drake: An R-focused pipeline toolkit for reproducibility and high-performance computing
Stars: ✭ 1,301 (+2182.46%)
Targets: Function-oriented Make-like declarative workflows for R
Stars: ✭ 293 (+414.04%)
Steppy: Lightweight Python library for fast and reproducible experimentation 🔬
Stars: ✭ 119 (+108.77%)
targets-minimal: A minimal example data analysis project with the targets R package
Stars: ✭ 50 (-12.28%)
Steppy Toolkit: Curated set of transformers that make your work with steppy faster and more effective 🔭
Stars: ✭ 21 (-63.16%)
Metaflow: 🚀 Build and manage real-life data science projects with ease!
Stars: ✭ 5,108 (+8861.4%)
Accelerator: The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (+140.35%)
Batchflow: BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
Stars: ✭ 156 (+173.68%)
Gtsummary: Presentation-Ready Data Summary and Analytic Result Tables
Stars: ✭ 450 (+689.47%)
Sarek: Detect germline or somatic variants from normal or tumour/normal whole-genome or targeted sequencing
Stars: ✭ 124 (+117.54%)
Vistrails: VisTrails is an open-source data analysis and visualization tool. It provides a comprehensive provenance infrastructure that maintains detailed history information about the steps followed and data derived in the course of an exploratory task: VisTrails maintains provenance of data products, of the computational processes that derive these products and their executions.
Stars: ✭ 94 (+64.91%)
Plynx: PLynx is a domain-agnostic platform for managing reproducible experiments and data-oriented workflows.
Stars: ✭ 192 (+236.84%)
DNAscan: DNAscan is a fast and efficient bioinformatics pipeline for analyzing DNA next-generation sequencing data, requiring very little computational effort and memory usage.
Stars: ✭ 36 (-36.84%)
papers-as-modules: Software Papers as Software Modules: Towards a Culture of Reusable Results
Stars: ✭ 18 (-68.42%)
Polyaxon: Machine Learning Platform for Kubernetes (MLOps tools for experimentation and automation)
Stars: ✭ 2,966 (+5103.51%)
Rnaseq: RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
Stars: ✭ 305 (+435.09%)
Funflow: Functional workflows
Stars: ✭ 318 (+457.89%)
Datmo: Open source production model management tool for data scientists
Stars: ✭ 334 (+485.96%)
Pipeline: Pipeline is a package to build multi-staged concurrent workflows with a centralized logging output.
Stars: ✭ 433 (+659.65%)
Wdl: Workflow Description Language - Specification and Implementations
Stars: ✭ 438 (+668.42%)
Labnotebook: LabNotebook is a tool that allows you to flexibly monitor, record, save, and query all your machine learning experiments.
Stars: ✭ 526 (+822.81%)
Dataexplorer: Automate Data Exploration and Treatment
Stars: ✭ 362 (+535.09%)
Workflowr: Organize your project into a research website
Stars: ✭ 551 (+866.67%)
Moderndive book: Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
Stars: ✭ 527 (+824.56%)
Pdpipe: Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+935.09%)
r10e-ds-py: Reproducible Data Science in Python (SciPy 2019 Tutorial)
Stars: ✭ 12 (-78.95%)
cli-property-manager: Use this Property Manager CLI to automate Akamai property changes and deployments across many environments.
Stars: ✭ 22 (-61.4%)
reproducible: A set of tools for R that enhance reproducibility beyond package management
Stars: ✭ 33 (-42.11%)
Dagster: An orchestration platform for the development, production, and observation of data assets.
Stars: ✭ 4,099 (+7091.23%)
Sacred: Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.
Stars: ✭ 3,678 (+6352.63%)
bistro: A library to build and execute typed scientific workflows
Stars: ✭ 43 (-24.56%)
Ck: Collective Knowledge framework (CK) helps to organize black-box research software as a database of reusable components and micro-services with common APIs, automation actions and extensible meta descriptions. See real-world use cases from Arm, General Motors, ACM, Raspberry Pi foundation and others:
Stars: ✭ 395 (+592.98%)
Production Data Science: A workflow for collaborative data science aimed at production
Stars: ✭ 388 (+580.7%)
Rrtools: Tools for Writing Reproducible Research in R
Stars: ✭ 508 (+791.23%)
Awesome R: A curated list of awesome R packages, frameworks and software.
Stars: ✭ 4,858 (+8422.81%)
Toil: A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
Stars: ✭ 733 (+1185.96%)
Prefect: The easiest way to automate your data
Stars: ✭ 7,956 (+13857.89%)
Recsys2019 deeplearning evaluation: This is the repository of our article published in RecSys 2019, "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches", and of several follow-up studies.
Stars: ✭ 780 (+1268.42%)
Scipipe: Robust, flexible and resource-efficient pipelines using Go and the command line
Stars: ✭ 826 (+1349.12%)
Galaxy: Data-intensive science for everyone.
Stars: ✭ 812 (+1324.56%)
Errormoji: ®️ errors, in emoji
Stars: ✭ 16 (-71.93%)
Gitpr: Quick reference guide on the fork and pull request workflow
Stars: ✭ 902 (+1482.46%)
Cookiecutter: DEPRECATED! Please use nf-core/tools instead
Stars: ✭ 18 (-68.42%)
Datofutbol: Dato Fútbol repository
Stars: ✭ 23 (-59.65%)
Blogr: Scripts + data to recreate analyses published on http://benjaminlmoore.wordpress.com and http://blm.io
Stars: ✭ 23 (-59.65%)
Nlppln: NLP pipeline software using Common Workflow Language
Stars: ✭ 31 (-45.61%)
Mlj.jl: A Julia machine learning framework
Stars: ✭ 982 (+1622.81%)