mozack / Abra2

Licence: mit
ABRA2

Programming Languages

java

Projects that are alternatives of or similar to Abra2

Deepvariant
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
Stars: ✭ 2,404 (+3598.46%)
Mutual labels:  ngs, dna
Gatk
Official code repository for GATK versions 4 and up
Stars: ✭ 1,002 (+1441.54%)
Mutual labels:  ngs, dna
Ugene
UGENE is free open-source cross-platform bioinformatics software
Stars: ✭ 112 (+72.31%)
Mutual labels:  ngs, dna
Galaxy
Data intensive science for everyone.
Stars: ✭ 812 (+1149.23%)
Mutual labels:  ngs, dna
Htsjdk
A Java API for high-throughput sequencing data (HTS) formats.
Stars: ✭ 220 (+238.46%)
Mutual labels:  ngs, dna
PHAT
Pathogen-Host Analysis Tool - A modern Next-Generation Sequencing (NGS) analysis platform
Stars: ✭ 17 (-73.85%)
Mutual labels:  ngs, dna
STing
Ultrafast sequence typing and gene detection from NGS raw reads
Stars: ✭ 15 (-76.92%)
Mutual labels:  ngs, dna
catch
A package for designing compact and comprehensive capture probe sets.
Stars: ✭ 55 (-15.38%)
Mutual labels:  ngs, dna
Dna 3d Engine
3d engine implementation in DNA code!
Stars: ✭ 493 (+658.46%)
Mutual labels:  dna
Restez
😴 📂 Create and Query a Local Copy of GenBank in R
Stars: ✭ 22 (-66.15%)
Mutual labels:  dna
Icewater
16,432 Free Yara rules created by
Stars: ✭ 324 (+398.46%)
Mutual labels:  dna
Htslib
C library for high-throughput sequencing data formats
Stars: ✭ 529 (+713.85%)
Mutual labels:  ngs
Fusiondirect.jl
(No maintenance) Detect gene fusion directly from raw fastq files
Stars: ✭ 23 (-64.62%)
Mutual labels:  ngs
Deeptools
Tools to process and analyze deep sequencing data.
Stars: ✭ 448 (+589.23%)
Mutual labels:  ngs
Fastp
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
Stars: ✭ 966 (+1386.15%)
Mutual labels:  ngs
Jvarkit
Java utilities for Bioinformatics
Stars: ✭ 313 (+381.54%)
Mutual labels:  ngs
Pyfaidx
Efficient pythonic random access to fasta subsequences
Stars: ✭ 307 (+372.31%)
Mutual labels:  dna
Flu Prediction
Predicting Future Influenza Virus Sequences with Machine Learning
Stars: ✭ 20 (-69.23%)
Mutual labels:  dna
Ngsdist
Estimation of pairwise distances under a probabilistic framework
Stars: ✭ 6 (-90.77%)
Mutual labels:  ngs
Ngsf
Estimation of per-individual inbreeding coefficients under a probabilistic framework
Stars: ✭ 10 (-84.62%)
Mutual labels:  ngs

ABRA2

ABRA2 is an updated implementation of ABRA featuring:

  • RNA support
  • Improved scalability (human whole genomes are now supported)
  • Improved accuracy
  • Improved stability and usability (BWA is no longer required to run ABRA, although we still recommend BWA as the initial aligner for DNA)

Manuscript: https://doi.org/10.1093/bioinformatics/btz033

Running

ABRA2 requires Java 8.

We recommend running from a pre-compiled release. Go to the Releases tab to download a recent version.

DNA

Sample command for DNA:

java -Xmx16G -jar abra2.jar --in normal.bam,tumor.bam --out normal.abra.bam,tumor.abra.bam --ref hg38.fa --threads 8 --targets targets.bed --tmpdir /your/tmpdir > abra.log

The above accepts normal.bam and tumor.bam as input and outputs sorted, realigned BAM files named normal.abra.bam and tumor.abra.bam.

  • Input files must be sorted by coordinate and indexed
  • Output files are sorted
  • The tmpdir may grow large. Be sure you have sufficient space there (at least equal to the input file size)
  • The targets argument is not required. When omitted, the entire genome will be eligible for realignment.
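
The input requirements above can be checked before launching a long run. The following is a minimal sketch, not part of ABRA2: the helper names (is_coordinate_sorted, total_bytes, free_bytes) are hypothetical, and it assumes a POSIX shell with awk, wc and df available. It relies on the SAM header convention that a coordinate-sorted file carries SO:coordinate on its @HD line.

```shell
# Succeeds if the SAM header read from stdin declares coordinate sorting,
# i.e. the @HD line has a tab-separated field equal to SO:coordinate.
is_coordinate_sorted() {
    awk -F'\t' '
        /^@HD/ { for (i = 2; i <= NF; i++) if ($i == "SO:coordinate") found = 1 }
        END { exit !found }
    '
}

# Prints the combined size in bytes of the files given as arguments.
total_bytes() {
    total=0
    for f in "$@"; do
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Prints the free space in bytes on the filesystem holding the given directory.
free_bytes() {
    df -Pk "$1" | awk 'NR == 2 { printf "%.0f\n", $4 * 1024 }'
}

# Usage sketch (assumes samtools is on the PATH and that the index sits next
# to the BAM file as normal.bam.bai, which is an assumption):
#   samtools view -H normal.bam | is_coordinate_sorted || echo "normal.bam is not coordinate sorted"
#   [ -e normal.bam.bai ] || samtools index normal.bam
#   [ "$(free_bytes /your/tmpdir)" -ge "$(total_bytes normal.bam tumor.bam)" ] \
#       || echo "tmpdir may be too small"
```

The space check only compares against the input size, which the list above gives as a lower bound; actual tmpdir usage may be higher.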

RNA

ABRA2 is capable of utilizing junction information to aid in assembly and realignment. It has been tested only on STAR output to date.

Sample command for RNA:

java -Xmx16G -jar abra2.jar --in star.bam --out star.abra.bam --ref hg38.fa --junctions bam --threads 8 --gtf gencode.v26.annotation.gtf --dist 500000 --sua --tmpdir /your/tmpdir > abra2.log 2>&1

Here, star.bam is the input BAM file and star.abra.bam is the output BAM file.

Junctions observed during alignment can be passed in using the --junctions param. The input file format is similar to that of the SJ.out.tab file output by STAR. If bam is specified, ABRA2 will identify splice junctions from the BAM file on the fly. Note that the SJ.out.tab file contains only junctions deemed "high quality" by STAR. The complete set of splice junctions can be identified using the program abra.cadabra.SpliceJunctionCounter.

Annotated junctions can be passed in using the --gtf param. See: https://www.gencodegenes.org/releases/current.html
It is beneficial to use both of the junction-related options.
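
For readers unfamiliar with STAR's SJ.out.tab, the sketch below documents its column layout and shows one way to inspect or subset it. The helper name junctions_min_unique is hypothetical. Because the output keeps the SJ.out.tab layout, it could in principle be given to --junctions, but whether extra filtering helps ABRA2 is not something this README addresses, so treat it as an exploration aid.

```shell
# Column layout of STAR's SJ.out.tab (tab-separated, no header line):
#   1 chromosome
#   2 first base of the intron (1-based)
#   3 last base of the intron (1-based)
#   4 strand (0 = undefined, 1 = +, 2 = -)
#   5 intron motif code
#   6 annotated (0 = unannotated, 1 = annotated)
#   7 number of uniquely mapped reads crossing the junction
#   8 number of multi-mapped reads crossing the junction
#   9 maximum spliced alignment overhang

# Reads SJ.out.tab rows on stdin and prints the rows supported by at least
# $1 uniquely mapped reads (column 7). Rows are printed unchanged.
junctions_min_unique() {
    awk -F'\t' -v min="$1" '$7 + 0 >= min + 0'
}

# Usage sketch:
#   junctions_min_unique 3 < SJ.out.tab > SJ.min3.tab
```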

Known indels can be passed in using the --in-vcf argument. Unannotated junctions originally identified as splices by the aligner may be converted to deletions if a known deletion is matched. Consider this option if you have indels detected from DNA for the same sample / subject. Large datasets are not recommended with this option (e.g. don't pass in dbSNP).
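
One way to keep the known-indel input small is to reduce a per-sample DNA call set to its indel records before passing it to --in-vcf. The sketch below is an illustration under assumptions: the helper name keep_indels and the file names are hypothetical, the VCF is assumed to be uncompressed, and this README does not say whether ABRA2 needs SNVs removed.

```shell
# Reads a VCF on stdin and prints the header lines plus every record in which
# at least one ALT allele has a different length from REF (an insertion or deletion).
keep_indels() {
    awk -F'\t' '
        /^#/ { print; next }
        {
            n = split($5, alts, ",")
            for (i = 1; i <= n; i++) {
                # Only plain base-string alleles are compared; symbolic (<DEL>),
                # missing (.) and spanning-deletion (*) alleles are skipped.
                if (alts[i] ~ /^[ACGTNacgtn]+$/ && length(alts[i]) != length($4)) {
                    print
                    next
                }
            }
        }
    '
}

# Usage sketch (file names are placeholders):
#   keep_indels < sample.dna.vcf > sample.dna.indels.vcf
#   ... then add: --in-vcf sample.dna.indels.vcf
```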
