


Strelka2 Small Variant Caller

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs. The germline caller employs an efficient tiered haplotype model to improve accuracy and provide read-backed phasing, adaptively selecting between assembly and a faster alignment-based haplotyping approach at each variant locus. The germline caller also analyzes input sequencing data using a mixture-model indel error estimation method to improve robustness to indel noise. The somatic calling model improves on the original Strelka method for liquid and late-stage tumor analysis by accounting for possible tumor cell contamination in the normal sample. A final empirical variant re-scoring step using random forest models trained on various call quality features has been added to both callers to further improve precision.

Compared with submissions to the recent PrecisionFDA Consistency and Truth challenges, the average indel F-score for Strelka2 running in its default configuration is 3.1% and 0.08% higher, respectively, than the best challenge submissions. Runtime on a 28-core server is ~40 minutes for 40x WGS germline analysis and ~3 hours for a 110x/40x WGS tumor-normal somatic analysis. More details on Strelka2 methods and benchmarking for both germline and somatic calling are described in:
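For readers unfamiliar with the metric, the F-score quoted above is conventionally the harmonic mean of precision and recall. The sketch below shows that standard definition only; the function name and the counts are illustrative assumptions, not values from the Strelka2 benchmarks, and the challenge evaluation may differ in how calls are matched to the truth set.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    if true_pos == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only, NOT Strelka2 benchmark results.
example = f_score(true_pos=90, false_pos=10, false_neg=10)
print(f"F-score: {example:.3f}")
```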

Kim, S., Scheffler, K. et al. (2018) Strelka2: fast and accurate calling of germline and somatic variants. Nature Methods, 15, 591-594. doi:10.1038/s41592-018-0051-x

...and the corresponding open-access pre-print

Strelka accepts input read mappings from BAM or CRAM files, and optionally candidate and/or forced-call alleles from VCF. It reports all small variant predictions in VCF 4.1 format. Germline variant reporting uses the gVCF conventions to represent both variant and reference call confidence. For best somatic indel performance, Strelka is designed to be run with the Manta structural variant and indel caller, which provides additional indel candidates up to a given maximum indel size (49 by default). By design, Manta and Strelka run together with default settings provide complete coverage over all indel sizes (in addition to SVs and SNVs). See the user guide for a full description of capabilities and limitations.
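As a rough illustration of the Manta-then-Strelka somatic workflow described above, the following sketch only assembles the command lines and does not execute anything. The script and flag names (`configureStrelkaSomaticWorkflow.py`, `--normalBam`, `--tumorBam`, `--referenceFasta`, `--indelCandidates`, `--runDir`, `runWorkflow.py -m local -j`) are given as recalled from the quick start guide and should be checked against the installed version. All file paths and the helper function names are hypothetical.

```python
import shlex
from typing import List, Optional


def somatic_configure_command(
    normal_bam: str,
    tumor_bam: str,
    reference_fasta: str,
    run_dir: str,
    manta_candidates: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) the somatic workflow configuration command."""
    cmd = [
        "configureStrelkaSomaticWorkflow.py",
        "--normalBam", normal_bam,
        "--tumorBam", tumor_bam,
        "--referenceFasta", reference_fasta,
        "--runDir", run_dir,
    ]
    if manta_candidates is not None:
        # Indel candidates produced by a prior Manta run on the same samples.
        cmd += ["--indelCandidates", manta_candidates]
    return cmd


def run_command(run_dir: str, jobs: int = 8) -> List[str]:
    """Build the command that executes the configured workflow locally."""
    return [f"{run_dir}/runWorkflow.py", "-m", "local", "-j", str(jobs)]


# Hypothetical paths, for illustration only.
configure = somatic_configure_command(
    normal_bam="normal.bam",
    tumor_bam="tumor.bam",
    reference_fasta="reference.fa",
    run_dir="strelka_somatic",
    manta_candidates="manta/results/variants/candidateSmallIndels.vcf.gz",
)
print(shlex.join(configure))
print(shlex.join(run_command("strelka_somatic", jobs=8)))
```

Building the commands as argument lists keeps them easy to inspect or log before handing them to `subprocess.run`.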

Getting Started

To get started installing and using Strelka, please consult the quick start guide.

Data Analysis and Interpretation

After completing installation and reviewing the quick start guide, see the Strelka user guide for full instructions on how to run Strelka, interpret results and estimate hardware requirements/compute cost, in addition to a high-level methods overview.
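As a starting point for interpreting results, the sketch below reads VCF-style text and separates passing variant records from gVCF reference blocks. It assumes that reference blocks carry `.` in the ALT column and an `END` key in INFO, and that passing records are marked `PASS` in the FILTER column. The example records, the `Record` type, and the helper names are hypothetical, and real Strelka output contains additional FORMAT and sample columns not handled here.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    filters: str
    info: Dict[str, str]


def parse_vcf_lines(lines: Iterable[str]) -> Iterator[Record]:
    """Yield one Record per data line, skipping headers and blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt, _qual, filters, info_str = line.split("\t")[:8]
        info: Dict[str, str] = {}
        if info_str != ".":
            for item in info_str.split(";"):
                # Flag-style INFO keys have no "=", so their value stays empty.
                key, _, value = item.partition("=")
                info[key] = value
        yield Record(chrom, int(pos), ref, alt, filters, info)


def is_reference_block(rec: Record) -> bool:
    """Assumed gVCF convention: no alternate allele on a reference record."""
    return rec.alt == "."


def is_passing_variant(rec: Record) -> bool:
    return rec.alt != "." and rec.filters == "PASS"


# Made-up records for illustration, not real Strelka output.
EXAMPLE = "\n".join([
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\t.\t.\tPASS\tEND=150",
    "chr1\t151\t.\tC\tT\t80\tPASS\t.",
    "chr1\t200\t.\tG\tGA\t5\tLowGQX\t.",
])

records = list(parse_vcf_lines(EXAMPLE.splitlines()))
passing = [r for r in records if is_passing_variant(r)]
blocks = [r for r in records if is_reference_block(r)]
print([(r.chrom, r.pos, r.ref, r.alt) for r in passing])
print([(r.chrom, r.pos, r.info.get("END")) for r in blocks])
```

For compressed output, the same parser can be fed lines from `gzip.open(path, "rt")`.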

License

Strelka source code is provided under the GPLv3 license. Strelka includes several third-party packages provided under other open source licenses; please see COPYRIGHT.txt for additional details.

Strelka Code Development

For Strelka code development and debugging details, see the Strelka developer guide. This includes details on Strelka's development protocols, special build instructions, recommended workflows for investigating calls, and internal documentation details.
