
BenLangmead / Bowtie2

Licence: gpl-3.0
A fast and sensitive gapped read aligner

Projects that are alternatives of or similar to Bowtie2

Arvados
An open source platform for managing and analyzing biomedical big data
Stars: ✭ 274 (-24.93%)
Mutual labels:  bioinformatics, genomics
varsome-api-client-python
Example client programs for Saphetor's VarSome annotation API
Stars: ✭ 21 (-94.25%)
Mutual labels:  bioinformatics, genomics
fermikit
De novo assembly based variant calling pipeline for Illumina short reads
Stars: ✭ 98 (-73.15%)
Mutual labels:  bioinformatics, genomics
bacnet
BACNET is a Java based platform to develop website for multi-omics analysis
Stars: ✭ 12 (-96.71%)
Mutual labels:  bioinformatics, genomics
Gwa tutorial
A comprehensive tutorial about GWAS and PRS
Stars: ✭ 303 (-16.99%)
Mutual labels:  bioinformatics, genomics
Megahit
Ultra-fast and memory-efficient (meta-)genome assembler
Stars: ✭ 343 (-6.03%)
Mutual labels:  bioinformatics, genomics
gff3toembl
Converts Prokka GFF3 files to EMBL files for uploading annotated assemblies to EBI
Stars: ✭ 27 (-92.6%)
Mutual labels:  bioinformatics, genomics
awesome-genetics
A curated list of awesome bioinformatics software.
Stars: ✭ 60 (-83.56%)
Mutual labels:  bioinformatics, genomics
Pyfaidx
Efficient pythonic random access to fasta subsequences
Stars: ✭ 307 (-15.89%)
Mutual labels:  bioinformatics, genomics
Vcfanno
annotate a VCF with other VCFs/BEDs/tabixed files
Stars: ✭ 259 (-29.04%)
Mutual labels:  bioinformatics, genomics
EarlGrey
Earl Grey: A fully automated TE curation and annotation pipeline
Stars: ✭ 25 (-93.15%)
Mutual labels:  bioinformatics, genomics
Pygeno
Personalized Genomics and Proteomics. Main diet: Ensembl, side dishes: SNPs
Stars: ✭ 261 (-28.49%)
Mutual labels:  bioinformatics, genomics
dna-traits
A fast 23andMe genome text file parser, now superseded by arv
Stars: ✭ 64 (-82.47%)
Mutual labels:  bioinformatics, genomics
Jvarkit
Java utilities for Bioinformatics
Stars: ✭ 313 (-14.25%)
Mutual labels:  bioinformatics, genomics
tiptoft
Predict plasmids from uncorrected long read data
Stars: ✭ 27 (-92.6%)
Mutual labels:  bioinformatics, genomics
GenomeAnalysisModule
Welcome to the website and github repository for the Genome Analysis Module. This website will guide the learning experience for trainees in the UBC MSc Genetic Counselling Training Program, as they embark on a journey to learn about analyzing genomes.
Stars: ✭ 19 (-94.79%)
Mutual labels:  bioinformatics, genomics
chromap
Fast alignment and preprocessing of chromatin profiles
Stars: ✭ 93 (-74.52%)
Mutual labels:  bioinformatics, genomics
netSmooth
netSmooth: A Network smoothing based method for Single Cell RNA-seq imputation
Stars: ✭ 23 (-93.7%)
Mutual labels:  bioinformatics, genomics
Bio.jl
[DEPRECATED] Bioinformatics and Computational Biology Infrastructure for Julia
Stars: ✭ 257 (-29.59%)
Mutual labels:  bioinformatics, genomics
Postgui
A React web application to query and share any PostgreSQL database.
Stars: ✭ 260 (-28.77%)
Mutual labels:  bioinformatics, genomics


Overview

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads from about 50 characters up to hundreds or thousands of characters, and at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM index to keep its memory footprint small: for the human genome, the footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.

Obtaining Bowtie 2

Bowtie 2 is available from various package managers, notably Bioconda. With Bioconda installed, you should be able to install Bowtie 2 with conda install bowtie2.

Containerized versions of Bowtie 2 are also available via the Biocontainers project (e.g. via Docker Hub).

You can also download Bowtie 2 sources and binaries from the "releases" tab on this page. Binaries are available for the x86_64 architecture running Linux, Mac OS X, and Windows. We are planning on adding experimental support for ARM-64 in an upcoming release. If you plan to compile Bowtie 2 yourself, make sure you have the TBB and zlib libraries installed. See the Building from source section of the manual for details.

Getting started

Looking to try out Bowtie 2? Check out the Bowtie 2 UI (currently in beta).

Alignment

bowtie2 takes a Bowtie 2 index and a set of sequencing read files and outputs a set of alignments in SAM format.
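As a rough illustration of working with SAM output, the sketch below counts mapped and unmapped records by testing bit 0x4 (segment unmapped) of the FLAG field, which is the second tab-separated column in SAM. The function name `count_mapped` and the three sample records are made up for this example; they are not Bowtie 2 output.

```shell
# count_mapped: read SAM on stdin and report how many alignment records
# have FLAG bit 0x4 (unmapped) clear or set. Hypothetical helper name.
count_mapped() {
    awk -F '\t' '
        /^@/ { next }                              # skip header lines
        int($2 / 4) % 2 == 1 { unmapped++; next }  # bit 0x4 set: unmapped
        { mapped++ }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    '
}

# Made-up SAM records for demonstration only.
{
    printf '@HD\tVN:1.0\tSO:unsorted\n'
    printf 'r1\t0\tref\t1\t42\t5M\t*\t0\t0\tACGTA\tIIIII\n'
    printf 'r2\t16\tref\t7\t42\t5M\t*\t0\t0\tTTGCA\tIIIII\n'
    printf 'r3\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTT\tIIIII\n'
} | count_mapped
```

The function counts records rather than reads, so the totals can differ from the read count if a run reports more than one alignment per read.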

"Alignment" is the process by which we discover how and where the read sequences are similar to the reference sequence. An "alignment" is a result from this process, specifically: an alignment is a way of "lining up" some or all of the characters in the read with some characters from the reference in a way that reveals how they're similar. For example:

  Read:      GACTGGGCGATCTCGACTTCG
             |||||  |||||||||| |||
  Reference: GACTG--CGATCTCGACATCG

Here, dashes represent gaps and vertical bars mark positions where the aligned characters match.
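The middle line of such a display can be derived mechanically from the two gapped strings. The sketch below is only an illustration of the notation, not part of Bowtie 2; `match_line` is a hypothetical helper name, and it assumes both strings have already been padded with dashes to the same length.

```shell
# match_line READ REF: print '|' where the aligned characters are identical
# and not a gap, and a space everywhere else. Hypothetical helper name.
match_line() {
    awk -v rd="$1" -v ref="$2" 'BEGIN {
        out = ""
        for (i = 1; i <= length(rd); i++) {
            a = substr(rd, i, 1)
            b = substr(ref, i, 1)
            out = out ((a == b && a != "-") ? "|" : " ")
        }
        print out
    }'
}

# Reproduces the middle line of the example above.
match_line GACTGGGCGATCTCGACTTCG GACTG--CGATCTCGACATCG
```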

We use alignment to make an educated guess as to where a read originated with respect to the reference genome. It's not always possible to determine this with certainty. For instance, if the reference genome contains several long stretches of As (AAAAAAAAA etc.) and the read sequence is a short stretch of As (AAAAAAA), we cannot know for certain exactly where in the sea of As the read originated.
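To make the ambiguity concrete, the sketch below counts how many start positions give an exact match of a read within a reference string. It uses a naive scan purely for illustration; Bowtie 2 itself searches with an FM index, and `count_placements` is a hypothetical helper name.

```shell
# count_placements READ REF: count the start positions in REF where READ
# matches exactly, including overlapping positions. Naive scan for
# illustration only; hypothetical helper name.
count_placements() {
    awk -v rd="$1" -v ref="$2" 'BEGIN {
        n = 0
        last = length(ref) - length(rd) + 1
        for (i = 1; i <= last; i++)
            if (substr(ref, i, length(rd)) == rd) n++
        print n
    }'
}

# A 7-base poly-A read against a 9-base poly-A stretch.
count_placements AAAAAAA AAAAAAAAA
```

With these inputs the read fits equally well at three different offsets, so no single position can be called the true origin.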

Examples

# Aligning unpaired reads
bowtie2 -x example/index/lambda_virus -U example/reads/longreads.fq

# Aligning paired reads
bowtie2 -x example/index/lambda_virus -1 example/reads/reads_1.fq -2 example/reads/reads_2.fq
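The commands above print SAM to standard output. A minimal sketch of a wrapper that writes the alignments to a file with `-S` follows; the wrapper name `align_unpaired`, the `DRY_RUN` variable, and the output file name are assumptions made for this example, not part of Bowtie 2.

```shell
# align_unpaired INDEX READS OUT_SAM
# Runs bowtie2 on unpaired reads and writes SAM to OUT_SAM via -S.
# With DRY_RUN=1 the command is printed instead of executed.
# Hypothetical wrapper; assumes bowtie2 is on PATH when actually run.
align_unpaired() {
    set -- bowtie2 -x "$1" -U "$2" -S "$3"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command that would be run for the lambda virus example.
DRY_RUN=1 align_unpaired example/index/lambda_virus example/reads/longreads.fq longreads.sam
```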

Building an index

bowtie2-build builds a Bowtie 2 index from a set of DNA sequences. bowtie2-build outputs a set of 6 files with suffixes .1.bt2, .2.bt2, .3.bt2, .4.bt2, .rev.1.bt2, and .rev.2.bt2. For a large index, these suffixes end in .bt2l instead of .bt2. These files together constitute the index: they are all that is needed to align reads to that reference. The original sequence FASTA files are no longer used by Bowtie 2 once the index is built.
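Because all six files are needed, it can be handy to confirm that they are present before aligning. The sketch below lists the expected file names for a given index basename and checks that each exists; `index_files` and `index_complete` are hypothetical helper names, and the optional second argument (`bt2` by default, `bt2l` for a large index) is a convention of this example only.

```shell
# index_files BASENAME [EXT]: print the six expected index file names.
# EXT defaults to bt2; pass bt2l for a large index. Hypothetical helper.
index_files() {
    base=$1
    ext=${2:-bt2}
    for part in 1 2 3 4 rev.1 rev.2; do
        printf '%s.%s.%s\n' "$base" "$part" "$ext"
    done
}

# index_complete BASENAME [EXT]: succeed only if every expected file exists.
index_complete() {
    index_files "$@" | while IFS= read -r f; do
        [ -e "$f" ] || { echo "missing: $f" >&2; exit 1; }
    done
}

# List the file names expected for the lambda virus example index.
index_files example/index/lambda_virus
```

`index_complete` stops at the first missing file and names it on standard error.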

Bowtie 2's .bt2 index format is different from Bowtie 1's .ebwt format, and they are not compatible with each other.

Examples

# Building a small index
bowtie2-build example/reference/lambda_virus.fa example/index/lambda_virus

# Building a large index
bowtie2-build --large-index example/reference/lambda_virus.fa example/index/lambda_virus

Index inspection

bowtie2-inspect extracts information from a Bowtie 2 index about what kind of index it is and what reference sequences were used to build it. When run without any options, the tool will output a FASTA file containing the sequences of the original references (with all non-A/C/G/T characters converted to Ns). It can also be used to extract just the reference sequence names using the -n/--names option or a more verbose summary using the -s/--summary option.

Examples

# Inspecting a lambda_virus index (small index) and outputting the summary
bowtie2-inspect --summary example/index/lambda_virus

# Inspecting the entire lambda virus index (large index)
bowtie2-inspect --large-index example/index/lambda_virus

Publications

Bowtie 2 Papers

Related Publications

Related Work

Check out the Bowtie 2 UI, a Shiny frontend to the Bowtie 2 command line.
