pytask-dev / pytask

Licence: other
pytask is a workflow management system which facilitates reproducible data analyses.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to pytask

wrench
WRENCH: Cyberinfrastructure Simulation Workbench
Stars: ✭ 25 (-56.14%)
Mutual labels:  reproducible-research, scientific-workflows
showyourwork
Fully reproducible, open source scientific articles in LaTeX.
Stars: ✭ 361 (+533.33%)
Mutual labels:  reproducible-research, scientific-workflows
ck-crowd-scenarios
Public scenarios to crowdsource experiments (such as DNN crowd-benchmarking and crowd-tuning) using Collective Knowledge Framework across diverse mobile devices provided by volunteers. Results are continuously aggregated at the open repository of knowledge:
Stars: ✭ 22 (-61.4%)
Mutual labels:  reproducible-research
ITKSphinxExamples
Cookbook examples for the Insight Toolkit documented with Sphinx
Stars: ✭ 48 (-15.79%)
Mutual labels:  reproducible-research
DUN
Code for "Depth Uncertainty in Neural Networks" (https://arxiv.org/abs/2006.08437)
Stars: ✭ 65 (+14.04%)
Mutual labels:  reproducible-research
ukbrest
ukbREST: efficient and streamlined data access for reproducible research of large biobanks
Stars: ✭ 32 (-43.86%)
Mutual labels:  reproducible-research
reskit
A library for creating and curating reproducible pipelines for scientific and industrial machine learning
Stars: ✭ 27 (-52.63%)
Mutual labels:  reproducible-research
openscience
Empirical Software Engineering journal (EMSE) open science and reproducible research initiative
Stars: ✭ 28 (-50.88%)
Mutual labels:  reproducible-research
open-solution-googleai-object-detection
Open solution to the Google AI Object Detection Challenge 🍁
Stars: ✭ 46 (-19.3%)
Mutual labels:  reproducible-research
us-rawdata-sda
A Deep Learning Approach to Ultrasound Image Recovery
Stars: ✭ 39 (-31.58%)
Mutual labels:  reproducible-research
sunbeam
A robust, extensible metagenomics pipeline
Stars: ✭ 143 (+150.88%)
Mutual labels:  reproducible-research
microbiomeHD
Cross-disease comparison of case-control gut microbiome studies
Stars: ✭ 58 (+1.75%)
Mutual labels:  reproducible-research
steep
⤴️ Steep Workflow Management System – Run scientific workflows in the Cloud
Stars: ✭ 30 (-47.37%)
Mutual labels:  scientific-workflows
awflow
Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your personal computer!
Stars: ✭ 15 (-73.68%)
Mutual labels:  reproducible-research
OpenPlantPathology
Open Plant Pathology website
Stars: ✭ 18 (-68.42%)
Mutual labels:  reproducible-research
targets-minimal
A minimal example data analysis project with the targets R package
Stars: ✭ 50 (-12.28%)
Mutual labels:  reproducible-research
genepattern-notebook
Platform for integrating genomic analysis with Jupyter Notebooks.
Stars: ✭ 37 (-35.09%)
Mutual labels:  reproducible-research
nowplaying-RS-Music-Reco-FM
#nowplaying-RS: Music Recommendation using Factorization Machines
Stars: ✭ 23 (-59.65%)
Mutual labels:  reproducible-research
GeneTonic
Enjoy your transcriptomic data and analysis responsibly - like sipping a cocktail
Stars: ✭ 66 (+15.79%)
Mutual labels:  reproducible-research
Topcuoglu ML mBio 2020
Best practices for applying machine learning to bacterial 16S rRNA gene sequencing data
Stars: ✭ 21 (-63.16%)
Mutual labels:  reproducible-research

pytask


pytask is a workflow management system that facilitates reproducible data analyses. Its features include:

  • Automatic discovery of tasks.
  • Lazy evaluation. If a task, its dependencies, and its products have not changed, do not execute it.
  • Debug mode. Jump into the debugger if a task fails, get feedback quickly, and be more productive.
  • Repeat a task with different inputs. Loop over task functions to run the same task with different inputs.
  • Select tasks via expressions. Run only a subset of tasks with expressions and marker expressions.
  • Easily extensible with plugins. pytask is built on top of pluggy, a plugin management framework, which allows you to adjust pytask to your needs. Plugins are available for parallelization, LaTeX, R, and Stata, among others. A tutorial on plugins is also available.
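
The lazy-evaluation idea from the feature list can be illustrated with a small, standard-library-only sketch. It is not pytask's implementation: the helper names (`file_hash`, `needs_run`, `run_if_changed`) and the JSON state file are assumptions made for illustration. The sketch records a hash for every tracked file and skips the task when nothing is missing or changed. The task's own source file could be tracked in the same way by adding its path to the list.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_hash(path):
    """Return the SHA-256 digest of a file, or None if the file is missing."""
    if not path.exists():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_run(paths, state_file):
    """Return True if a tracked file is missing or differs from the recorded state."""
    recorded = json.loads(state_file.read_text()) if state_file.exists() else {}
    current = {str(p): file_hash(p) for p in paths}
    return any(h is None or recorded.get(k) != h for k, h in current.items())


def run_if_changed(task, paths, state_file):
    """Run the task only when needed, then record the new hashes (hypothetical helper)."""
    if not needs_run(paths, state_file):
        return False
    task()
    state_file.write_text(json.dumps({str(p): file_hash(p) for p in paths}))
    return True


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "input.txt")
    product = Path(tmp, "output.txt")
    state = Path(tmp, "state.json")
    source.write_text("Hello")

    def task():
        product.write_text(source.read_text().upper())

    tracked = [source, product]
    first = run_if_changed(task, tracked, state)   # product is missing, so the task runs
    second = run_if_changed(task, tracked, state)  # nothing changed, so the task is skipped
    source.write_text("Hello again")
    third = run_if_changed(task, tracked, state)   # dependency changed, so the task runs again
```

Hashing file contents rather than comparing timestamps is one possible design choice here; it avoids re-running a task when a file is merely touched.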

Installation

pytask is available on PyPI and on Anaconda.org. Install the package with

$ pip install pytask

or

$ conda install -c conda-forge pytask

Color support is automatically available on non-Windows platforms. On Windows, use Windows Terminal, which can be installed, for example, via the Microsoft Store.

To quickly set up a new project, use the cookiecutter-pytask-project template or start from other templates or example projects.

Usage

A task is a function that is detected if the module and the function name are prefixed with task_. Here is an example.

# Content of task_hello.py.

import pytask


@pytask.mark.produces("hello_earth.txt")
def task_hello_earth(produces):
    produces.write_text("Hello, earth!")

Here are some details:

  • Dependencies and products of a task are tracked via markers. Use @pytask.mark.depends_on for dependencies and @pytask.mark.produces for products. Values are strings or pathlib.Path and point to files on the disk.
  • Use produces (and depends_on) as function arguments to access the paths inside the function. pytask converts all paths to pathlib.Path objects. Here, produces holds the path to "hello_earth.txt".
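
The naming rule for task detection described above (module and function names prefixed with task_) can be sketched with the standard library alone. This is only an illustration of the rule, not how pytask collects tasks internally; `is_task_module` and `collect_tasks` are hypothetical helpers.

```python
import types
from pathlib import Path


def is_task_module(filename):
    """Return True for Python files whose name starts with 'task_'."""
    path = Path(filename)
    return path.suffix == ".py" and path.name.startswith("task_")


def collect_tasks(module):
    """Return {name: function} for callables in the module named 'task_*'."""
    return {
        name: obj
        for name, obj in vars(module).items()
        if name.startswith("task_") and callable(obj)
    }


# Build a throwaway module in memory instead of reading task_hello.py from disk.
module_source = '''
def task_hello_earth():
    return "Hello, earth!"


def helper():
    return "not a task"
'''
module = types.ModuleType("task_hello")
exec(module_source, module.__dict__)

tasks = collect_tasks(module)
```

Only `task_hello_earth` is collected; `helper` is ignored because its name lacks the prefix.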

To execute the task, enter pytask on the command line.

Documentation

You can find the documentation at https://pytask-dev.readthedocs.io/en/stable, with tutorials and guides for best practices.

Changes

Consult the release notes to find out about what is new.

License

pytask is distributed under the terms of the MIT license.

Acknowledgment

The license also includes a copyright and permission notice from pytest, since some modules, classes, and functions are copied from pytest. Beyond that, pytest has inspired the development of pytask in general. Without the excellent work of Holger Krekel and pytest's many contributors, this project would not have been possible. Thank you!

pytask owes its beautiful appearance on the command line to rich, written by Will McGugan.

Repeating tasks in loops is inspired by ward, written by Darren Burns.

Citation

If you rely on pytask to manage your research project, please cite it with the following key to help others discover the tool.

@Unpublished{Raabe2020,
    Title  = {A Python tool for managing scientific workflows.},
    Author = {Tobias Raabe},
    Year   = {2020},
    Url    = {https://github.com/pytask-dev/pytask}
}