IARCbioinfo / IARC-nf

Licence: GPL-3.0 license
List of IARC bioinformatics nextflow pipelines

IARC bioinformatics nextflow pipelines (updated on 03/05/2022)

This page lists all the pipelines developed at IARC (mostly Nextflow pipelines, which are suffixed with -nf) and explains how to use them (see the bottom of the page).

IARC pipelines list

Raw NGS data processing

| Name | Latest version | Maintained | Description | Tools used |
|---|---|---|---|---|
| alignment-nf | v1.3 - March 2021 | ✔️ Yes | Performs BAM realignment or fastq alignment, with/without local indel realignment and base quality score recalibration | bwa, samblaster, sambamba, samtools, AdapterRemoval, GATK, k8 javascript execution shell, bwa-postalt.js |
| BQSR-nf | v1.1 - Apr 2020 | ✔️ Yes | Performs base quality score recalibration of bam files using GATK | samtools, samblaster, sambamba, GATK |
| abra-nf | v3.0 - Apr 2020 | ✔️ Yes | Runs ABRA (Assembly Based ReAligner) | ABRA, bedtools, bwa, sambamba, samtools |
| gatk4-DataPreProcessing-nf | Nov 2018 | ? | Performs bwa alignment and pre-processing (mark duplicates and recalibration) following GATK4 best practices - compatible with hg38 | bwa, picard, GATK4, sambamba, qualimap |
| PostAlignment-nf | Aug 2018 | ? | Perform post alignment on bam files | samtools, sambamba, bwa-postalt.js |
| ****************** | *********** | *********** | ************************* | ************************ |
| marathon-wgs | June 2018 | ? | Studies intratumor heterogeneity with Canopy | bwa, platypus, strelka2, vt, annovar, R, Falcon, Canopy |
| ITH-nf | Sept 2018 | ? | Perform intra-tumoral heterogeneity (ITH) analysis | Strelka2, Platypus, Bcftools, Tabix, Falcon, Canopy |

RNA Seq

| Name | Latest version | Maintained | Description | Tools used |
|---|---|---|---|---|
| RNAseq-nf | v2.4 - Dec 2020 | ✔️ Yes | Performs RNAseq mapping, quality control, and reads counting - See also RNAseq_analysis_scripts for post-processing | fastqc, RSeQC, multiQC, STAR, htseq, cutadapt, Python version > 2.7, trim_galore, hisat2, GATK, samtools |
| RNAseq-transcript-nf | v2.2 - June 2020 | ✔️ Yes | Performs transcript identification and quantification from a series of BAM files | StringTie |
| RNAseq-fusion-nf | v1.1 - Aug 2020 | ✔️ Yes | Perform fusion-genes discovery from RNAseq data using STAR-Fusion | STAR-Fusion |
| gene-fusions-nf | v1 - Oct 2020 - updated Nov 2021 | ✔️ Yes | Perform fusion-genes discovery from RNAseq data using Arriba | Arriba |
| quantiseq-nf | v1.1 - July 2020 | ✔️ Yes | Quantify immune cell content from RNA-seq data | quanTIseq |

QC

| Name | Latest version | Maintained | Description | Tools used |
|---|---|---|---|---|
| NGSCheckMate | v1.1a - July 2021 | ✔️ Yes | Runs NGSCheckMate on BAM files to identify data files from the same individual (i.e. check N/T pairs) | NGSCheckMate |
| conpair-nf | June 2018 | ? | Runs conpair (concordance and contamination estimator) | conpair, Python 2.7, numpy 1.7.0 or higher, scipy 0.14.0 or higher, GATK 2.3 or higher |
| damage-estimator-nf | June 2017 | ? | Runs "Damage Estimator" | Damage Estimator, samtools, R with GGPLOT2 package |
| QC3 | May 2016 | No | Runs QC on DNA seq data (raw data, aligned data and variant calls) - forked from slzhao | samtools |
| fastqc-nf | v1.1 - July 2020 | ✔️ Yes | Runs fastqc and multiqc on DNA seq data (fastq data) | FastQC, Multiqc |
| qualimap-nf | v1.1 - Nov 2019 | ✔️ Yes | Performs quality control on bam files (WES, WGS and target alignment data) | samtools, Qualimap, Multiqc |
| mpileup-nf | Jan 2018 | ? | Computes bam coverage with samtools mpileup (bed parallelization) | samtools, annovar |
| bamsurgeon-nf | Mar 2019 | ? | Runs bamsurgeon (tool to add mutations to bam files) with step of variant simulation | Python 2.7, bamsurgeon, R software (tested with R version 3.2.3) |

Variant calling

| Name | Latest version | Maintained | Description | Tools used |
|---|---|---|---|---|
| needlestack | v1.1 - May 2019 | ✔️ Yes | Performs multi-sample somatic variant calling | perl, bedtools, samtools and R software |
| target-seq | Aug 2019 | ? | Whole pipeline to perform multi-sample somatic variant calling using Needlestack on targeted sequencing data | abra2, QC3, needlestack, annovar and R software |
| strelka2-nf | v1.2a - Dec 2020 | ✔️ Yes | Runs Strelka 2 (germline and somatic variant caller) | Strelka2 |
| strelka-nf | Jun 2017 | No | Runs Strelka (germline and somatic variant caller) | Strelka |
| mutect-nf | v2.3 - July 2021 | ✔️ Yes | Runs Mutect on tumor-matched normal bam pairs | Mutect and its dependencies (Java 1.7 and Maven 3.0+), bedtools |
| gatk4-HaplotypeCaller-nf | Dec 2019 | ? | Runs variant calling in GVCF mode on bam files following GATK best practices | GATK |
| gatk4-GenotypeGVCFs-nf | Apr 2019 | ? | Runs joint genotyping on gvcf files following GATK best practices | GATK |
| GVCF_pipeline-nf | Nov 2016 | ? | Performs bam realignment and recalibration + variant calling in GVCF mode following GATK best practices | bwa, samblaster, sambamba, GATK |
| platypus-nf | v1.0 - Apr 2018 | ? | Runs Platypus (germline variant caller) | Platypus |
| TCGA_platypus-nf | Aug 2018 | ? | Converts TCGA Platypus vcf in format for annotation with annovar | vt, VCFTools |
| vcf_normalization-nf | v1.1 - May 2020 | ✔️ Yes | Decomposes and normalizes variant calls (vcf files) | bcftools, samtools/htslib |
| TCGA_germline-nf | May 2017 | ? | Extract germline variants from TCGA data for annotation with annovar (vcf files) | R software |
| gama_annot-nf | Aug 2020 | ✔️ Yes | Filter and annotate batch of vcf files (annovar + strand + context) | annovar, R |
| table_annovar-nf | v1.1.1 - Feb 2021 | ✔️ Yes | Annotate variants with annovar (vcf files) | annovar |
| RF-mut-f | Nov 2021 | ✔️ Yes | 🔴 NEW pipeline: Random forest implementation to filter germline mutations from tumor-only samples | annovar |
| ****************** | *********** | *********** | ************************* | ************************ |
| MutSig | Oct 2021 | ✔️ Yes | 🔴 NEW pipeline: Performs mutational signatures analysis of WGS data using SigProfilerExtractor | SigProfilerExtractor |
| MutSpec | v2.0 - May 2017 | ? | Suite of tools for analyzing and interpreting mutational signatures | annovar |
| ****************** | *********** | *********** | ************************* | ************************ |
| purple-nf | v1.1 - Nov 2021 | ✔️ Yes | 🔴 NEW pipeline: Performs copy number calling from tumor/normal or tumor-only sequencing data using PURPLE | PURPLE |
| facets-nf | v2.0 - Oct 2020 | ✔️ Yes | Performs fraction and copy number estimate from tumor/normal sequencing data using facets | facets, R |
| CODEX-nf | Mar 2017 | ? | Performs copy number variant calling from whole exome sequencing data using CODEX | R with package Codex, Rscript |
| svaba-nf | v1.0 - August 2020 | ✔️ Yes | Performs structural variant calling using SvABA | SvABA, R |
| sv_somatic_cns-nf | v1.0 - Nov 2021 | ✔️ Yes | Pipeline using multiple SV callers for consensus structural variant calling from tumor/normal sequencing data | Delly, SvABA, Manta, SURVIVOR, bcftools, Samtools |

Other tools/pipelines

| Name | Latest version | Maintained | Description | Tools used |
|---|---|---|---|---|
| template-nf | May 2020 | ✔️ Yes | Empty template for nextflow pipelines | NA |
| data_test | Aug 2020 | ✔️ Yes | Small data files to test IARC nextflow pipelines | NA |
| bam2cram-nf | v1.0 - Nov 2020 | ✔️ Yes | 🔴 NEW pipeline: convert bam files to cram files | samtools |
| hla-neo-nf | v1.1 - June 2021 | ✔️ Yes | 🔴 NEW pipeline: predict neoantigens from WGS of T/N pairs | xHLA, VEP, pVACtools |
| PRSice | Nov 2020 | | Pipeline to compute polygenic risk scores | PRSice-2 |
| methylkey | May 2021 | ✔️ Yes | Pipeline for 450k and 850k array analysis (bisulfite data analysis using Minfi, Methylumi, Comet, Bumphunter and DMRcate packages) | R software |
| AmpliconArchitect-nf | v1.0 - Oct 2021 | ✔️ Yes | Discovers ecDNA in cancer genomes using AmpliconArchitect | AmpliconArchitect |
| addreplacerg-nf | Jan 2017 | ? | Adds and replaces read group tags in BAM files | samtools |
| bametrics-nf | Mar 2017 | ? | Computes average metrics from reads that overlap a given set of positions | NA |
| Gviz_multiAlignments | Aug 2017 | ? | Generates multiple BAM alignments views using Gviz bioconductor package | Gviz |
| nf_coverage_demo | v2.3 - July 2020 | ✔️ Yes | Plots mean coverage over a series of BAM files | bedtools, R software |
| LiftOver-nf | Nov 2017 | ? | Converts BED/VCF between hg19 and hg38 | picard |
| MinION_pipes | Jan 2020 | ? | Analyze MinION sequencing data for the reconstruction of viral genomes | Guppy V3.1.5+, Porechop V0.2.4, Nanofilt V2.2.0, Filtlong V0.2.0, SPAdes V3.10.1, CAP3 02/10/15, BLAST V2.9.0+, MUSCLE V3.8.1551, Nanopolish V0.11.0, Minimap2 V2.15, Samtools version 1.9 |
| DraftPolisher | Jan 2020 | ? | Fast polishing of draft sequences (draft genome assembly) | MUSCLE, Python3 |
| Imputation-nf | v1.1 - July 2021 | ✔️ Yes | Pipeline to perform dataset genotyping imputation | LiftOver, Plink, Admixture, Perl, Term::ReadKey, Bcftools, Eagle, Minimac4 and samtools |
| PVAmpliconFinder | Aug 2020 | ✔️ Yes | Identify and classify known and potentially new Papillomaviridae sequences from amplicon deep-sequencing with degenerated papillomavirus primers | Python and Perl + FastQC, MultiQC, Trim Galore, VSEARCH, Blast, RaxML-EPA, PaPaRa, CAP3, KRONA |
| integration_analysis_scripts | Mar 2020 | ✔️ Yes | Performs unsupervised analyses (clustering) from transformed expression data (e.g., log fpkm) and methylation beta values | R software with iClusterPlus, gplots and lattice R packages |
| mpileup2readcounts | Apr 2018 | ? | Get the readcounts at a locus by piping samtools mpileup output - forked from gatoravi | samtools |
| Methylation_analysis_scripts | v1.0 - June 2020 - updated Nov 2021 | ✔️ Yes | Perform Illumina EPIC 850K array pre-processing and QC from idat files | R software |
| DRMetrics | Oct 2020 | ✔️ Yes | Evaluate the quality of projections obtained after using dimensionality reduction techniques | R software |
| acnviewer-singularity | Jul 2019 | ? | Build a singularity image of aCNViewer (tool for visualization of absolute copy number and copy neutral variations) | Singularity |
| polysolver-singularity | Dec 2019 | ? | Build a singularity image of Polysolver (tool for HLA typing based on whole exome seq) | Singularity |
| scanMyWorkDir | May 2018 | ? | Non-destructive and informative scan of a nextflow work folder | NA |

Courses and data notes

| Name | Description | Tools used |
|---|---|---|
| nextflow-course-2018 | Nextflow course | NA |
| SBG-CGC_course2018 | Analyzing TCGA data in SBG-CGC | NA |
| Medical Genomics Course | Medical Genomics course held at the INSA Lyon - updated Fall 2021 | NA |
| intro-cancer-genomics | Introduction to cancer genomics | NA |
| mesomics_data_note | Repository with code and datasets used in the mesomics data note manuscript | NA |

Tricks

| Name | Latest version | Maintained | Description | Tools used |
|---|---|---|---|---|
| BAM-tricks | | | Tips and tricks for BAM files | samtools, freebayes, bedtools, biobambam2, Picard, rbamtools |
| VCF-tricks | | | Tips and tricks for VCF files | samtools, bcftools, vcflib, vcftools, R scripts |
| R-tricks | | | Tips and tricks for R | NA |
| EGA-tricks | | | Tips and tricks to use the European Genome-Phenome Archive from the European Bioinformatics Institute | EGA client |
| GDC-tricks | | | Tips and tricks to use the GDC data portal | NA |
| awesomeTCGA | | | Curated list of resources to access TCGA data | NA |
| LSF-Tricks | | | Tips and tricks for LSF HPC scheduler | NA |

Coming soon... (only dev branches yet)

| Name | Description | Tools used |
|---|---|---|
| ITH_pipeline | Study intra-tumoral heterogeneity (ITH) through subclonality reconstruction | HATCHet, DeCiFer, ClonEvol |
| Nextflow_DSL2 | Repository with modules for nextflow DSL2 | NA |
| variantflag | Merge and annotate variants from different callers | |
| EPIDRIVER2020 | Scripts for EPIDRIVER Project | |

Outdated

| Name | Latest version | Maintained | Description | Tools used |
|---|---|---|---|---|
| GATK-Alignment-nf | June 2017 | No | Performs bwa alignment and pre-processing (realignment and recalibration) following first version of GATK best practices (less performant than alignment-nf) | bwa, picard, GATK |

Installation

Nextflow

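  1. Install a Java runtime (JRE) if you don't already have one. This page states version 7 or higher, but recent Nextflow releases require a newer Java version, so check the Nextflow documentation for the current minimum.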

  2. Install nextflow.

    curl -fsSL get.nextflow.io | bash

    Then move it to a location in your $PATH (/usr/local/bin in this example):

    sudo mv nextflow /usr/local/bin
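
A quick way to confirm that both steps above worked is to check that the executables are visible on your $PATH. This is only a minimal sketch: `require_tool` is a hypothetical helper name introduced here for illustration, not something provided by Nextflow.

```shell
# require_tool is a hypothetical helper (not part of Nextflow):
# it reports whether a command can be found on the current $PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the two prerequisites from the steps above.
require_tool java || echo "install a Java runtime first" >&2
require_tool nextflow || echo "nextflow is not on your PATH yet" >&2
```

If either tool is reported as missing, repeat the corresponding installation step before running a pipeline.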

Docker

To avoid having to install all dependencies each time you use a pipeline, you can instead install Docker and let Nextflow handle them. Installing Docker is system-specific (but quite easy in most cases); follow the Docker documentation (Docker CE is sufficient). Also follow the post-installation step to manage Docker as a non-root user (on Linux); otherwise you will need to change the sudo option in the Nextflow docker config scope, as described in the Nextflow documentation.

To run a Nextflow pipeline with Docker, simply add the -with-docker option to the nextflow run command.
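
For instance, the generic command from the Usage section could be adapted for Docker as sketched below. The values pipeline_name, X and xxx are the placeholders used on this page, not real names, and the command is only printed so that you can review it before running it yourself.

```shell
# Placeholders taken from the Usage section of this page; replace them
# with a real pipeline name, revision and folders before running.
PIPELINE="iarcbioinfo/pipeline_name"
REVISION="X"

# Compose the command line; -with-docker is the only addition
# compared with the plain usage command.
DOCKER_CMD="nextflow run $PIPELINE -r $REVISION -with-docker --input_folder xxx --output_folder xxx"

# Print the command for review instead of executing it.
echo "$DOCKER_CMD"
```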

Singularity

To avoid having to install all dependencies each time you use a pipeline, you can also install Singularity and let Nextflow handle them.

See the Singularity documentation.

If you want to use the same Singularity container, with exactly the same versions of the pipeline and tools, on several datasets over time, you may want to pull the container and archive it somewhere:

singularity pull shub://IARCbioinfo/pipeline-nf:v2.2

where "pipeline-nf" should be replaced by the name of the pipeline you want to use (example: RNAseq-nf) and 2.2 by the version of the pipeline you want to use (example: 2.4). This will create a Singularity container file, pipeline-nf_v2.2.sif (example: RNAseq-nf_v2.4.sif), that you can then use by specifying it in the nextflow command (see Usage).

=> example:

singularity pull shub://IARCbioinfo/RNAseq-nf:v2.4
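
The naming rule described above can be captured in a small helper, which is handy when archiving several pipeline versions. This is a sketch: `sif_name` is a hypothetical function introduced here, and it relies only on the `<pipeline>_v<version>.sif` pattern stated on this page.

```shell
# sif_name is a hypothetical helper: it applies the naming pattern
# described above (<pipeline>_v<version>.sif) to predict the file
# that "singularity pull" is expected to create.
sif_name() {
    printf '%s_v%s.sif\n' "$1" "$2"
}

PIPELINE="RNAseq-nf"
VERSION="2.4"

# The pull command from the example above, printed rather than executed.
echo "singularity pull shub://IARCbioinfo/${PIPELINE}:v${VERSION}"

# The file name to pass later to -with-singularity.
sif_name "$PIPELINE" "$VERSION"
```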

Configuration file

Usage

nextflow run iarcbioinfo/pipeline_name -r X --input_folder xxx --output_folder xxx -params-file xxx.yml -w /scratch/work

OR USING SINGULARITY

nextflow run iarcbioinfo/pipeline_name -r X -profile singularity --input_folder xxx --output_folder xxx -params-file xxx.yml -w /scratch/work

OR USING SINGULARITY WITH SPECIFIC CONTAINER

nextflow run iarcbioinfo/pipeline_name -r X -with-singularity XXX.sif --input_folder xxx --output_folder xxx -params-file xxx.yml -w /scratch/work
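
The -params-file option reads pipeline parameters from a YAML (or JSON) file, so options such as --input_folder can be stored there instead of being typed on the command line. Below is a minimal sketch, assuming the pipeline accepts the input_folder and output_folder parameters shown above; the file name params.yml and the folder paths are illustrative only.

```shell
# Write an illustrative params file. The keys mirror the --input_folder
# and --output_folder options above; the paths are made up.
cat > params.yml <<'EOF'
input_folder: "/data/bam"
output_folder: "/data/results"
EOF

# Show the file, then the command that would use it (printed, not run).
cat params.yml
echo "nextflow run iarcbioinfo/pipeline_name -r X -params-file params.yml -w /scratch/work"
```

Keeping parameters in a file makes a run easier to repeat, since the same file can be reused or archived alongside the results.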

Pipelines updates

You can update the Nextflow software and the pipeline itself simply using:

nextflow -self-update
nextflow pull iarcbioinfo/pipeline_name

You can also automatically update the pipeline when you run it by adding the -latest option to the nextflow run command. Doing so, you will always run the latest version from GitHub.

Display help

nextflow run iarcbioinfo/pipeline_name --help