aquaskyline / 16gt
Simultaneous detection of SNPs and Indels using a 16-genotype probabilistic model
Stars: ✭ 26
Programming Languages
perl
Projects that are alternatives of or similar to 16gt
Pygeno
Personalized Genomics and Proteomics. Main diet: Ensembl, side dishes: SNPs
Stars: ✭ 261 (+903.85%)
Mutual labels: bioinformatics, vcf, genome
companion
This repository has been archived; the currently maintained version is at https://github.com/iii-companion/companion
Stars: ✭ 21 (-19.23%)
Mutual labels: bioinformatics, genome
catch
A package for designing compact and comprehensive capture probe sets.
Stars: ✭ 55 (+111.54%)
Mutual labels: bioinformatics, genome
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (+0%)
Mutual labels: bioinformatics, vcf
Cyvcf2
cython + htslib == fast VCF and BCF processing
Stars: ✭ 243 (+834.62%)
Mutual labels: bioinformatics, vcf
Genometools
GenomeTools genome analysis system.
Stars: ✭ 186 (+615.38%)
Mutual labels: bioinformatics, genome
GenomeAnalysisModule
Welcome to the website and github repository for the Genome Analysis Module. This website will guide the learning experience for trainees in the UBC MSc Genetic Counselling Training Program, as they embark on a journey to learn about analyzing genomes.
Stars: ✭ 19 (-26.92%)
Mutual labels: bioinformatics, genome
Vcfanno
annotate a VCF with other VCFs/BEDs/tabixed files
Stars: ✭ 259 (+896.15%)
Mutual labels: bioinformatics, vcf
Abyss
🔬 Assemble large genomes using short reads
Stars: ✭ 219 (+742.31%)
Mutual labels: bioinformatics, genome
Helmsman
highly-efficient & lightweight mutation signature matrix aggregation
Stars: ✭ 19 (-26.92%)
Mutual labels: bioinformatics, vcf
Deepvariant
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
Stars: ✭ 2,404 (+9146.15%)
Mutual labels: bioinformatics, genome
Scaff10X
Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads
Stars: ✭ 21 (-19.23%)
Mutual labels: bioinformatics, genome
Karyoploter
karyoploteR - An R/Bioconductor package to plot arbitrary data along the genome
Stars: ✭ 192 (+638.46%)
Mutual labels: bioinformatics, genome
SVCollector
Method to optimally select samples for validation and resequencing
Stars: ✭ 20 (-23.08%)
Mutual labels: bioinformatics, vcf
Survivor
Toolset for SV simulation, comparison and filtering
Stars: ✭ 180 (+592.31%)
Mutual labels: bioinformatics, vcf
Ribbon
A genome browser that shows long reads and complex variants better
Stars: ✭ 184 (+607.69%)
Mutual labels: bioinformatics, genome
TypeTE
Genotyping of segregating mobile elements insertions
Stars: ✭ 15 (-42.31%)
Mutual labels: bioinformatics, vcf
Htslib
C library for high-throughput sequencing data formats
Stars: ✭ 529 (+1934.62%)
Mutual labels: bioinformatics, vcf
Setup
docker
git clone https://github.com/aquaskyline/16GT.git
cd 16GT
docker build --no-cache .
docker images
Use the "IMAGE ID" shown by docker images in place of <docker-id> below:
docker run -it --privileged <docker-id> /bin/bash
Once inside the container, index the reference genome:
cd /16GT/SOAP3-dp
./soap3-dp-builder <path-to-ref-gen-fasta>
./BGS-Build <path-to-ref-gen-fasta>.index
Call variants from an aligned, indexed BAM file:
cd /16GT
./bam2snapshot -i <path-to-ref-gen-fasta>.index -b <aligned-bam-file> -o <output-prefix>
./snapshotSnpcaller -i <path-to-ref-gen-fasta>.index -o <output-prefix>
perl txt2vcf.pl <output-prefix>.txt <pro-id> <path-to-ref-gen-fasta> > <output>.vcf
perl filterVCF.pl <output>.vcf > <output>.filtered.vcf
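The image-ID lookup in the Docker steps above can be avoided by tagging the image at build time, and the reference and BAM files can be made visible inside the container with a bind mount. The sketch below only prints the commands unless DRY_RUN=0 is set. The tag name 16gt, the host directory in DATA_DIR, the /data mount point, and the helper names are all assumptions for illustration, not part of the 16GT repository.

```shell
# Print each command; execute it only when DRY_RUN=0 (default: dry run).
run() {
    printf '%s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

IMAGE_TAG=16gt                      # hypothetical tag name
DATA_DIR=${DATA_DIR:-$PWD/data}     # hypothetical host directory holding FASTA and BAM

build_and_enter() {
    # -t names the image so no IMAGE ID has to be copied by hand.
    run docker build --no-cache -t "$IMAGE_TAG" .
    # -v exposes the host data directory as /data inside the container.
    run docker run -it --privileged -v "$DATA_DIR:/data" "$IMAGE_TAG" /bin/bash
}

# Run from inside the cloned 16GT directory.
build_and_enter
```

With this layout the reference would be addressed as /data/<ref>.fa inside the container, and any output written under /data remains on the host after the container exits.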
16GT
16GT is a variant caller that uses a 16-genotype probabilistic model to unify SNP and indel calling in a single algorithm. It is easy to use: the default parameters fit most use cases with the human genome. For the detailed parameters of each module, run the module to print its usage information.
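This README does not list the 16 genotypes. As an illustration of how that count can arise, the sketch below assumes the model extends the 10 diploid SNP genotypes (AA, AC, ..., TT) with six indel-bearing ones (AX, CX, GX, TX, XX, XY), where X and Y stand for indel alleles. That breakdown is an assumption here and should be checked against the 16GT publication; the function name is illustrative.

```shell
# Enumerate 16 diploid genotypes over the alleles A, C, G, T and an indel
# allele X, plus XY for two different indel alleles at the same site.
# ASSUMPTION: this breakdown is not stated in the README.
list_genotypes() {
    local a b emit
    local alleles="A C G T X"
    for a in $alleles; do
        emit=0
        for b in $alleles; do
            # Start emitting at the diagonal so each unordered pair appears once.
            if [ "$a" = "$b" ]; then emit=1; fi
            if [ "$emit" = "1" ]; then printf '%s%s\n' "$a" "$b"; fi
        done
    done
    # Two distinct indel alleles.
    printf 'XY\n'
}

list_genotypes | tr '\n' ' '; echo
```

The 15 unordered pairs over five alleles plus XY give 16 entries in total.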
Quick start
Inputs: genome.fa, alignments.bam. Output: .vcf
0. Install
git clone https://github.com/aquaskyline/16GT
cd 16GT
make
# Tested on Ubuntu 14.04 and CentOS 6.7 with GCC 4.7.2
1. Build reference index
git clone https://github.com/aquaskyline/SOAP3-dp.git
cd SOAP3-dp
make SOAP3-Builder
make BGS-Build
soap3-dp-builder genome.fa
BGS-Build genome.fa.index
2. Convert BAM to SNAPSHOT
bam2snapshot -i genome.fa.index -b alignments.bam -o output/prefix
3. Call
snapshotSnpcaller -i genome.fa.index -o output/prefix
perl txt2vcf.pl output/prefix.txt sampleName genome.fa > <output>.vcf
perl filterVCF.pl <output>.vcf dbSNP.vcf.gz > <output>.filtered.vcf
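Steps 1 to 3 above can be chained in a small wrapper script. The sketch below assumes the 16GT and SOAP3-dp binaries are on PATH and that txt2vcf.pl and filterVCF.pl are in the current directory. The function names, the DRY_RUN switch, and the choice to name the VCF files after the output prefix are illustrative assumptions. By default it only prints the commands it would run.

```shell
set -eu

# Print each command; execute it only when DRY_RUN=0 (default: dry run).
run() {
    printf '%s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Same as run, but redirect the command's stdout to the file given first.
run_to() {
    local out=$1
    shift
    printf '%s > %s\n' "$*" "$out"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@" > "$out"
    fi
}

# Step 1: build the reference index (needed once per reference).
build_index() {
    local ref=$1
    run soap3-dp-builder "$ref"
    run BGS-Build "$ref.index"
}

# Steps 2 and 3: snapshot the BAM, call variants, convert to VCF, filter.
call_variants() {
    local ref=$1 bam=$2 prefix=$3 sample=$4 dbsnp=$5
    run bam2snapshot -i "$ref.index" -b "$bam" -o "$prefix"
    run snapshotSnpcaller -i "$ref.index" -o "$prefix"
    run_to "$prefix.vcf" perl txt2vcf.pl "$prefix.txt" "$sample" "$ref"
    run_to "$prefix.filtered.vcf" perl filterVCF.pl "$prefix.vcf" "$dbsnp"
}

build_index genome.fa
call_variants genome.fa alignments.bam output/prefix sampleName dbSNP.vcf.gz
```

Setting DRY_RUN=0 in the environment executes the commands instead of printing them; because of set -eu the script stops at the first failing step.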
Exome variant calling
Inputs: genome.fa, alignments.bam, region.bed. Outputs: region.bin, .vcf
RegionIndexBuilder genome.fa.index region.bed region.bin -bed/-gff
bam2snapshot -i genome.fa.index -b alignments.bam -o output/prefix -e region.bin
snapshotSnpcaller -i genome.fa.index -o output/prefix -e region.bin
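For reference, region.bed in the commands above is a standard BED file: tab-separated chromosome, 0-based start, and end. The trailing -bed/-gff on RegionIndexBuilder presumably selects which of the two input formats is being supplied. The sketch below writes a two-interval BED file and runs a basic sanity check before indexing; the coordinates, chromosome names, and the check_bed helper are made up for illustration, and chromosome names must match those in genome.fa.

```shell
# Write a two-interval BED file (tab-separated: chrom, 0-based start, end).
# Coordinates are illustrative only.
printf 'chr1\t10000\t10500\n' > region.bed
printf 'chr1\t20000\t20800\n' >> region.bed

# Sanity check: every line has at least three columns, numeric coordinates,
# and start < end. Header lines (track/browser) are not handled here.
check_bed() {
    awk -F '\t' '
        NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 { bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

if check_bed region.bed; then
    echo "region.bed looks well-formed"
fi
```

A file that passes this check can then be handed to RegionIndexBuilder with the -bed option as shown above.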
License
GPLv3