ga4gh / Benchmarking Tools

Licence: apache-2.0
Repository for the GA4GH Benchmarking Team work developing standardized benchmarking methods for germline small variant calls

Germline Small Variant Benchmarking Tools and Standards

This repository hosts the work of the Global Alliance for Genomics and Health (GA4GH) Benchmarking Team, which is developing standardized performance metrics and tools for benchmarking germline small variant calls. The Team includes representatives from sequencing technology developers, government agencies, academic bioinformatics researchers, clinical laboratories, and commercial technology and bioinformatics developers. We have worked towards solutions for several challenges in benchmarking variant calls: (1) defining high-confidence variant calls and regions that can serve as a benchmark, (2) developing comparison tools that are robust to differing representations of the same variant, (3) defining performance metrics such as false positives and false negatives with respect to different matching stringencies, and (4) developing methods to stratify performance by variant type and genome context. We also provide links to our reference benchmarking engines and their implementations, as well as to benchmarking datasets.

A manuscript from the GA4GH Benchmarking Team describing best practices for benchmarking germline small variant calls is on bioRxiv, and we ask that you cite this publication in any work using these tools: https://doi.org/10.1101/270157

Note: This site is still a work in progress.

Standards and Definitions

See doc/standards/ for the current benchmarking standards and definitions.
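As a rough illustration of the kind of performance metrics these standards cover, the sketch below computes recall, precision, and F1 from comparison counts. It assumes the commonly used definitions (recall = TP / (TP + FN), precision = TP / (TP + FP)) and counts true positives separately on the truth and query sides, since differing variant representations can make the two counts differ. The `Counts` class and the function names are hypothetical; the definitions in doc/standards/ remain authoritative.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Comparison counts for one variant type or stratum (hypothetical container)."""
    truth_tp: int  # truth variants matched by the query
    query_tp: int  # query variants matched to the truth
    fn: int        # truth variants missed by the query
    fp: int        # query variants with no match in the truth


def recall(c: Counts) -> float:
    """Fraction of truth variants recovered; NaN if there are no truth variants."""
    denom = c.truth_tp + c.fn
    return c.truth_tp / denom if denom else float("nan")


def precision(c: Counts) -> float:
    """Fraction of query variants that are correct; NaN if there are no query calls."""
    denom = c.query_tp + c.fp
    return c.query_tp / denom if denom else float("nan")


def f1(c: Counts) -> float:
    """Harmonic mean of precision and recall; NaN if it is undefined."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else float("nan")


# Made-up counts, for illustration only.
example = Counts(truth_tp=90, query_tp=90, fn=10, fp=30)
print(recall(example), precision(example), f1(example))
```

Which variants count as matched depends on the matching stringency chosen for the comparison, so the same call set can yield different counts under different stringencies.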

Reference tool implementations

The primary reference implementation of the GA4GH benchmarking methods is hap.py. It lets users choose between vcfeval (recommended) and xcmp as the comparison engine, and use GA4GH stratification BED files to assess performance in different genome contexts. A web-based implementation of this tool is available as the GA4GH Benchmarking app from peter.krusche on precisionFDA.
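To make the engine and stratification choices concrete, here is a minimal sketch that assembles a hap.py command line without running it. The helper `build_happy_command` and all file names are hypothetical, and the flags shown (`-r`, `-f`, `-o`, `--engine`, `--stratification`) should be checked against `hap.py --help` for the installed version. Depending on the installation, the vcfeval engine may need additional engine-specific options.

```python
import shlex
from typing import List, Optional


def build_happy_command(
    truth_vcf: str,
    query_vcf: str,
    reference_fasta: str,
    confident_bed: str,
    output_prefix: str,
    engine: str = "vcfeval",
    stratification_tsv: Optional[str] = None,
) -> List[str]:
    """Assemble (but do not run) a hap.py invocation as an argument list."""
    cmd = [
        "hap.py", truth_vcf, query_vcf,
        "-r", reference_fasta,   # reference FASTA
        "-f", confident_bed,     # confident (benchmark) regions
        "-o", output_prefix,     # prefix for output files
        "--engine", engine,      # comparison engine: vcfeval or xcmp
    ]
    if stratification_tsv is not None:
        # File listing the stratification regions to report on.
        cmd += ["--stratification", stratification_tsv]
    return cmd


# Placeholder file names, for illustration only.
cmd = build_happy_command(
    "truth.vcf.gz", "query.vcf.gz", "reference.fa",
    "confident.bed", "benchmark_out",
    stratification_tsv="stratification.tsv",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting issues.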

Other reference implementations following the standards outlined above are available at tools/. These are submodules which link to the original tool repositories.

Benchmarking Intermediate Files

The benchmarking process involves several steps and inputs. In doc/ref-impl/, we standardize intermediate formats for specifying truth sets, stratification regions, and intermediate outputs from comparison tools.

Benchmarking resources

In resources/, we provide files useful in the benchmarking process. Currently, this includes links to benchmarking calls and datasets from Genome in a Bottle and Illumina Platinum Genomes, as well as standardized BED files describing potentially difficult regions for performance stratification.
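As a simplified picture of how a stratification BED file can be used, the sketch below tests whether a variant position falls inside any region of a BED file. It relies on the standard conventions that BED intervals are 0-based and half-open while VCF positions are 1-based. The function names are hypothetical, and only a single position is tested; this is not how the reference tools handle variants that span several bases.

```python
from typing import Dict, Iterable, List, Tuple

Regions = Dict[str, List[Tuple[int, int]]]


def parse_bed(lines: Iterable[str]) -> Regions:
    """Read BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    regions: Regions = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def in_regions(regions: Regions, chrom: str, vcf_pos: int) -> bool:
    """Return True if a 1-based VCF position lies inside any BED interval."""
    zero_based = vcf_pos - 1  # convert VCF POS to BED coordinates
    return any(start <= zero_based < end for start, end in regions.get(chrom, []))


# Made-up region, for illustration only.
difficult = parse_bed(["chr1\t100\t200"])
print(in_regions(difficult, "chr1", 101))  # first base covered by the interval
print(in_regions(difficult, "chr1", 201))  # first base past the interval
```

The linear scan keeps the sketch short; a large region set would call for sorted intervals or an interval tree.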
