stjudecloud / workflows

Licence: MIT license
Bioinformatics workflows developed for and used on the St. Jude Cloud project.

Programming Languages

wdl
Dockerfile
python
shell

This repository contains all bioinformatics workflows used on the St. Jude Cloud project. Officially, the repository is in beta: workflows are added as they are developed and put into production.

🏠 Homepage

Getting Started

At the time of writing, all workflows are written in WDL and are tested using Cromwell. We use Oliver to easily interact with the Cromwell server to perform various tasks. Although we do not test outside of Cromwell, we expect that the workflows will work just as well using other runners.

The easiest way to get started is to install bioconda and then run the following commands:

conda create -n workflows-dev -c conda-forge cromwell -y
conda activate workflows-dev
git clone git@github.com:stjudecloud/workflows.git
cd workflows
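
Before going further, it can help to confirm that the tools used above are actually on the PATH of the active environment. The following is a minimal sketch, not part of the repository; the helper name `missing_tools` is made up for this example.

```shell
# Print the name of each given tool that is not found on PATH, one per line.
# (missing_tools is a hypothetical helper, not something shipped in this repo.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The tool list mirrors the commands used in the setup steps above.
missing="$(missing_tools conda git cromwell)"
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing" >&2
else
    echo "conda, git and cromwell are all available."
fi
```

If `cromwell` is reported as missing, the most likely cause is that the `workflows-dev` environment has not been activated in the current shell.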

Any of the workflows in the workflows folder is a good place to start, e.g.

cromwell run workflows/reference/bootstrap-reference.wdl --inputs workflows/reference/inputs.json
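
When trying several workflows, a small wrapper that checks the files first and prints the command before running it may save some typing. This is only a sketch under our own assumptions: `run_workflow` and the `DRY_RUN` variable are names invented here, and the only Cromwell invocation used is the `cromwell run ... --inputs ...` form shown above.

```shell
# run_workflow WDL INPUTS
# Check that both files exist, then print the Cromwell command.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to execute the command.
# Both names are hypothetical and exist only in this sketch.
run_workflow() {
    wdl="$1"
    inputs="$2"
    for f in "$wdl" "$inputs"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "cromwell run $wdl --inputs $inputs"
    else
        cromwell run "$wdl" --inputs "$inputs"
    fi
}
```

From the repository root, `run_workflow workflows/reference/bootstrap-reference.wdl workflows/reference/inputs.json` prints the command from the example above, and `DRY_RUN=0 run_workflow ...` runs it.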

Repository Structure

The repository is laid out as follows:

  • bin - Scripts used by Cromwell configuration settings. Add this to $PATH prior to using configurations in conf with Cromwell.
  • conf - Cromwell configuration files created for various environments that we use across our team. Feel free to use/fork/suggest improvements.
  • docker - Dockerfiles used in our workflows. All docker images are published to Docker Hub as a part of our CI and are versioned.
  • tools - All tools we have wrapped as individual WDL tasks.
  • workflows - Directory containing all end-to-end bioinformatics workflows.
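
The note on bin above asks for that directory to be on $PATH before the configurations in conf are used. One way to do this without adding duplicate entries is sketched below; `prepend_path` is a hypothetical helper, and the snippet assumes it is run from the repository root.

```shell
# prepend_path DIR: put DIR at the front of PATH unless it is already listed.
# (prepend_path is a made-up helper for this sketch.)
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Assumes the current directory is the root of the cloned repository.
prepend_path "$PWD/bin"
```

Cromwell's own documentation describes selecting a configuration file through the Java system property `config.file`; how that property is passed depends on how Cromwell was installed, so check it against your setup before relying on a file from conf.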

Workflows Available

The current workflows exist in this repo with the following statuses:

| Name | Version | Description | Specification | Workflow | Status |
| --- | --- | --- | --- | --- | --- |
| RNA-Seq Standard | v2.0.0 | Standard RNA-Seq harmonization pipeline. | Specification | Realign BAM Workflow, FastQ Workflow | In Production |
| Build STAR References | N/A | Build STAR aligner reference files used in RNA-Seq Standard harmonization pipelines. | None | Workflow | In Production |
| Quality Check Standard | v1.0.0 | Perform ~10 different QC analyses on a BAM file and compile the results using MultiQC. | Specification | Workflow | In Production |
| Build FastQ Screen References | N/A | Build references used in WGS/WES Quality Check pipeline for running FastQ Screen. | None | Workflow | In Production |
| ESTIMATE | v1.0.0 (beta) | Runs the ESTIMATE software package on a feature counts file. | None | Workflow | In Development |
| Calculate Gene Lengths | N/A | Produces a gene length file from a GTF. | None | Workflow | In Production |
| Build BWA References | N/A | Builds reference files used by the BWA aligner. | None | Workflow | In Production |
| BAM to FastQs | v1.0.0 | Split a BAM file into read groups, then read 1 FastQs and read 2 FastQs. | None | Workflow | In Production |

Author

👤 St. Jude Cloud Team

Tests

Given that this repo is still new, there are no tests. When we add tests, we will update the README.

🤝 Contributing

Contributions, issues and feature requests are welcome!
Feel free to check the issues page. You can also take a look at the contributing guide.

📝 License

Copyright © 2020-Present St. Jude Cloud Team.
This project is MIT licensed.
