aquaskyline / 16gt

Simultaneous detection of SNPs and Indels using a 16-genotype probabilistic model

Setup

Docker

git clone https://github.com/aquaskyline/16GT.git
cd 16GT
docker build --no-cache .
docker images

Use the IMAGE ID printed by docker images in place of <docker-id> below:

docker run -it --privileged <docker-id> /bin/bash
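
As an alternative to copying the IMAGE ID by hand, the image can be tagged at build time and the tag reused when starting the container. The sketch below only prints the two commands rather than running them. The tag 16gt:local and the host directory mounted at /data are hypothetical choices for illustration, not part of 16GT; -t (build) and -v (run) are standard Docker options.

```shell
# Hypothetical tag and host data directory; adjust both to your setup.
IMAGE_TAG="16gt:local"
DATA_DIR="$PWD/data"

# Print the Docker commands instead of running them, so the sketch is safe
# to try without Docker installed. Remove the leading `echo` to execute.
docker_commands() {
  echo docker build --no-cache -t "$IMAGE_TAG" .
  echo docker run -it --privileged -v "$DATA_DIR:/data" "$IMAGE_TAG" /bin/bash
}

docker_commands
```

Mounting a host directory this way would make the reference FASTA and BAM files visible inside the container under /data.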

Once inside the Docker container, index the reference genome:

cd /16GT/SOAP3-dp
./soap3-dp-builder <path-to-ref-gen-fasta>
./BGS-Build <path-to-ref-gen-fasta>.index

Call variants from an aligned, indexed BAM file:

cd /16GT
./bam2snapshot -i <path-to-ref-gen-fasta>.index -b <aligned-bam-file> -o <output-prefix>
./snapshotSnpcaller  -i <path-to-ref-gen-fasta>.index  -o <output-prefix>
perl txt2vcf.pl <output-prefix>.txt <pro-id> <path-to-ref-gen-fasta> > <output>.vcf
perl filterVCF.pl <output>.vcf > <output>.filtered.vcf
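
A quick sanity check after filtering is to compare how many variant records the raw and filtered VCF files contain. The helper below is a minimal sketch, not part of 16GT: it counts every line that is not a '#' header line. The demo file is made up for illustration.

```shell
# Count the variant records in a VCF: every line that is not a '#' header.
count_vcf_records() {
  grep -vc '^#' "$1"
}

# Tiny made-up VCF, written to a temporary file purely for demonstration.
demo_vcf=$(mktemp)
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t50\t.\t.\n'
  printf 'chr1\t200\t.\tCT\tC\t40\t.\t.\n'
} > "$demo_vcf"

count_vcf_records "$demo_vcf"
```

Running the same function on <output>.vcf and <output>.filtered.vcf shows how many calls the filter removed.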

16GT

16GT is a variant caller that uses a 16-genotype probabilistic model to unify SNP and indel calling in a single algorithm. It is easy to use: the default parameters fit most use cases with the human genome. For the detailed parameters of each module, run the module to print its usage information.
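
To make the number 16 concrete, the sketch below enumerates one possible genotype set. It assumes the 16 genotypes are the ten unordered pairs of bases plus six indel-bearing genotypes (AX, CX, GX, TX, XX, XY, with X and Y standing for indel alleles), which is how the 16GT publication is understood to define them; this README does not list them, so treat the labels as an assumption.

```shell
# Enumerate the genotype labels under the assumption stated above.
list_genotypes() {
  bases="A C G T"
  i=0
  for a in $bases; do
    j=0
    for b in $bases; do
      # Keep each unordered base pair once (AC but not CA).
      if [ "$j" -ge "$i" ]; then echo "$a$b"; fi
      j=$((j + 1))
    done
    i=$((i + 1))
  done
  # Assumed indel-bearing genotypes; X and Y denote indel alleles.
  for g in AX CX GX TX XX XY; do echo "$g"; done
}

list_genotypes | tr '\n' ' '
echo
```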

Quick start

Inputs: genome.fa, alignments.bam. Output: .vcf

0. Install

git clone https://github.com/aquaskyline/16GT
cd 16GT
make
# Tested on Ubuntu 14.04 and CentOS 6.7 with GCC 4.7.2
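
Before running make, it can help to confirm that the build tools are on PATH. This is a minimal sketch, assuming git, make, gcc and perl are what the steps in this README need; the exact dependency list is not stated here, and missing_tools is a hypothetical helper name.

```shell
# Print each named tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed tool list; an empty result means everything was found.
missing_tools git make gcc perl
```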

1. Build reference index

git clone https://github.com/aquaskyline/SOAP3-dp.git
cd SOAP3-dp
make SOAP3-Builder
make BGS-Build
soap3-dp-builder genome.fa
BGS-Build genome.fa.index

2. Convert BAM to SNAPSHOT

bam2snapshot -i genome.fa.index -b alignments.bam -o output/prefix

3. Call

snapshotSnpcaller -i genome.fa.index -o output/prefix
perl txt2vcf.pl output/prefix.txt sampleName genome.fa > <output>.vcf
perl filterVCF.pl <output>.vcf dbSNP.vcf.gz > <output>.filtered.vcf
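
Steps 2 and 3 can be generated for a given sample with a small helper. The sketch below only prints the commands, so it can be inspected before anything runs. The function name quickstart_commands is hypothetical, and naming the VCF files after the output prefix is an assumption made here for convenience. Paths containing spaces are not handled.

```shell
# Print the Quick start commands for one sample.
# Arguments: reference FASTA, BAM file, output prefix, sample name, dbSNP VCF.
quickstart_commands() {
  ref=$1 bam=$2 prefix=$3 sample=$4 dbsnp=$5
  cat <<EOF
bam2snapshot -i $ref.index -b $bam -o $prefix
snapshotSnpcaller -i $ref.index -o $prefix
perl txt2vcf.pl $prefix.txt $sample $ref > $prefix.vcf
perl filterVCF.pl $prefix.vcf $dbsnp > $prefix.filtered.vcf
EOF
}

quickstart_commands genome.fa alignments.bam output/prefix sampleName dbSNP.vcf.gz
```

Piping the printed commands into sh -e would run them in order and stop at the first failing step.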

Exome variant calling

Inputs: genome.fa, alignments.bam, region.bed. Outputs: region.bin, .vcf

RegionIndexBuilder genome.fa.index region.bed region.bin -bed    # use -gff instead if the region file is in GFF format
bam2snapshot -i genome.fa.index -b alignments.bam -o output/prefix -e region.bin
snapshotSnpcaller -i genome.fa.index -o output/prefix -e region.bin
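
Before building region.bin, it can be useful to check that region.bed covers the expected number of bases. The sketch below is not part of 16GT. It relies only on the standard BED layout (tab-separated chromosome, 0-based start, exclusive end) and sums end minus start over all intervals, without merging overlaps. The demo intervals are made up.

```shell
# Sum the lengths of all intervals in a BED file (end - start per line),
# skipping comment, track and browser lines.
bed_total_bases() {
  awk -F '\t' '!/^(#|track|browser)/ && NF >= 3 { total += $3 - $2 }
               END { print total + 0 }' "$1"
}

# Three made-up intervals of 100, 150 and 50 bases.
demo_bed=$(mktemp)
printf 'chr1\t100\t200\nchr1\t500\t650\nchr2\t0\t50\n' > "$demo_bed"

bed_total_bases "$demo_bed"
```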

License

GPLv3
