TimoLassmann / kalign

Licence: GPL-3.0 license

A fast multiple sequence alignment program.

Labels

bioinformatics sequence-alignment sequence-analysis

Kalign

Kalign is a fast multiple sequence alignment program for biological sequences.

Installation

Release Tarball

Download the tarball from the releases page, then:

tar -zxvf kalign-<version>.tar.gz
cd kalign-<version>
mkdir build 
cd build
cmake .. 
make 
make test 
make install

On macOS, install Homebrew, then:

brew install cmake 
git clone https://github.com/TimoLassmann/kalign.git
cd kalign
mkdir build
cd build 
cmake ..
make 
make test 
make install

Usage

The command line interface of Kalign accepts the following options:

Usage: kalign  -i <seq file> -o <out aln> 

Options:

   --format           : Output format. [Fasta]
   --type             : Alignment type (rna, dna, internal). [rna]
                        Options: protein, divergent (protein) 
                                 rna, dna, internal (nuc). 
   --gpo              : Gap open penalty. []
   --gpe              : Gap extension penalty. []
   --tgpe             : Terminal gap extension penalty. []
   -n/--nthreads      : Number of threads. [4]
   --version (-V/-v)  : Prints version. [NA]

Kalign expects the input to be a set of unaligned sequences in fasta format or aligned sequences in aligned fasta, MSF or clustal format. If the sequences are already aligned, kalign will remove all gap characters and re-align the sequences.
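As a rough illustration of the expected input, the sketch below writes a small unaligned FASTA file and aligns it with the -i/-o form shown in the usage text. The file names and the three short DNA sequences are made up for this example, and the alignment step only runs if a kalign binary is found on the PATH.

```shell
# Write a tiny unaligned FASTA file (hypothetical sequences, for illustration only).
cat > example.fa <<'EOF'
>seq1
ACGTACGTGA
>seq2
ACGTCGTGA
>seq3
ACGACGTGAA
EOF

# Each record starts with a '>' header line, so counting headers counts sequences.
n_seqs=$(grep -c '^>' example.fa)
echo "example.fa holds ${n_seqs} sequences"

# Align only when kalign is installed; the output lands in example.afa.
if command -v kalign >/dev/null 2>&1; then
  kalign -i example.fa -o example.afa
fi
```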

By default, Kalign automatically detects whether the input sequences are protein or DNA and selects appropriate alignment parameters.

The --type option gives users more direct control over the alignment parameters. Currently there are five core options:

- protein: uses the CorBLOSUM66_13plus substitution matrix (default for protein sequences)
- divergent: uses the Gonnet 250 substitution matrix
- dna: default DNA parameters
  - 5 match score
  - -4 mismatch score
  - -8 gap open penalty
  - -6 gap extension penalty
  - 0 terminal gap extension penalty
- internal: same as dna, but with the terminal gap penalty set to 8 to encourage gaps within the sequences
- rna: parameters optimised for RNA alignments

The --gpo, --gpe and --tgpe options can be used to fine-tune these parameters further.
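To get a feel for how the default DNA values interact, here is a minimal sketch that scores one pair of already aligned rows. It is not Kalign's code: the score_pair helper is hypothetical, and it assumes a common affine convention in which an internal gap of length k adds gpo + k * gpe, while a gap touching either end of the alignment adds only k * tgpe. Kalign's internal convention may differ.

```shell
# score_pair ROW_A ROW_B
# Scores two equal-length gapped rows with the default DNA values listed above.
# Assumed model (not taken from Kalign's source): internal gap of length k
# adds gpo + k*gpe; a gap touching an alignment end adds k*tgpe.
score_pair() {
  awk -v a="$1" -v b="$2" \
      -v match_s=5 -v mismatch_s=-4 -v gpo=-8 -v gpe=-6 -v tgpe=0 '
  BEGIN {
    n = length(a); score = 0; i = 1
    while (i <= n) {
      x = substr(a, i, 1); y = substr(b, i, 1)
      if (x != "-" && y != "-") {
        # Residue against residue: match or mismatch.
        score += (x == y) ? match_s : mismatch_s
        i++
        continue
      }
      # Gap run: extend while the same row keeps showing a gap.
      row = (x == "-") ? 1 : 2
      j = i
      while (j < n && substr((row == 1 ? a : b), j + 1, 1) == "-") j++
      run = j - i + 1
      if (i == 1 || j == n) score += run * tgpe   # terminal gap
      else score += gpo + run * gpe               # internal gap
      i = j + 1
    }
    print score
  }'
}

score_pair ACGT AC-T
```

Under these assumptions the call above adds 5 + 5 + (-8 - 6) + 5, so a single internal gap costs nearly as much as three matches earn, whereas the same gap at either end would cost nothing.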

Examples

Passing sequences via stdin:

cat input.fa | kalign -f fasta > out.afa

Combining multiple input files:

kalign seqsA.fa seqsB.fa seqsC.fa -f fasta > combined.afa

Align sequences and output the alignment in MSF format:

kalign -i BB11001.tfa -f msf  -o out.msf

Align sequences and output the alignment in clustal format:

kalign -i BB11001.tfa -f clu -o out.clu

Re-align sequences in an existing alignment:

kalign -i BB11001.msf  -o out.afa

Reformat existing alignment:

kalign -i BB11001.msf -r afa -o out.afa

Kalign library

To incorporate Kalign into your own projects you can link to the library like this:

find_package(kalign)
target_link_libraries(<target> kalign::kalign)

Alternatively, you can include the kalign code directly in your project and link with:

if (NOT TARGET kalign)
  add_subdirectory(<path_to_kalign>/kalign EXCLUDE_FROM_ALL)
endif ()
target_link_libraries(<target> kalign::kalign)

Benchmark results

Here are some benchmark results. The code to reproduce these figures can be found here.

Balibase

Bralibase

Please cite:

Lassmann, Timo. Kalign 3: multiple sequence alignment of large data sets. Bioinformatics (2019). pdf

Other papers:

Lassmann, Timo, Oliver Frings, and Erik LL Sonnhammer. Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features. Nucleic acids research 37.3 (2008): 858-865. Pubmed
Lassmann, Timo, and Erik LL Sonnhammer. Kalign: an accurate and fast multiple sequence alignment algorithm. BMC bioinformatics 6.1 (2005): 298. Pubmed
