
lh3 / unimap

Licence: MIT license
An experimental fork of minimap2 optimized for assembly-to-reference alignment

Programming Languages

C, Roff, Makefile

Getting Started

# compile and install
git clone https://github.com/lh3/unimap
cd unimap && make

# simple use cases (-b24 below saves memory for this toy example)
./unimap -b24 -c test/MT-human.fa test/MT-orang.fa > out.paf

# prebuild index; recommended as unimap indexing is slower than minimap2 indexing
./unimap -b24 -d MT-human.umi test/MT-human.fa
./unimap -c MT-human.umi test/MT-orang.fa > out.paf

# use presets (no test data)
unimap -cxasm5 --cs -t8 ref.fa asm.fa         # if asm.fa is near identical to ref.fa
unimap -cxhifi --cs -t8 ref.fa hifi-reads.fa  # HiFi reads to reference alignment
unimap -cxont  --cs -t8 ref.fa ont-reads.fa   # Nanopore reads to reference alignment
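
The commands above write alignments in PAF. As a quick sanity check on such a file, the sketch below prints one summary line per record. It assumes unimap's PAF output keeps the minimap2 column layout (column 10 is the number of matching bases, column 11 the alignment block length); the helper name paf_summary is hypothetical and not part of unimap.

```shell
# Hypothetical helper, not shipped with unimap.
# Prints: query name, target name, alignment block length, approximate identity.
# Assumes minimap2-style PAF columns (10 = matching bases, 11 = block length).
paf_summary() {
    awk -F'\t' '
        # Skip records with a zero block length to avoid dividing by zero.
        $11 > 0 { printf "%s\t%s\t%d\t%.4f\n", $1, $6, $11, $10 / $11 }
    ' "$1"
}
```

For example, `paf_summary out.paf | sort -k3,3nr | head` would list the longest alignment blocks first.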

Introduction

Unimap is a fork of minimap2 optimized for assembly-to-reference alignment. It integrates the minigraph chaining algorithm and can align through long INDELs (up to 100kb by default) much faster than minimap2. Unimap is a better fit for resolving segmental duplications and is recommended over minimap2 for alignment between high-quality assemblies.
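
To see which long INDELs an alignment actually spans, one option is to scan the CIGAR strings in the PAF output. The following is a minimal sketch, assuming the file was produced with -c and that the CIGAR is stored in a cg:Z: tag as in minimap2; the helper name paf_long_indels and its default threshold of 1000 bases are illustrative choices, not unimap features.

```shell
# Hypothetical helper, not shipped with unimap.
# Usage: paf_long_indels FILE.paf [MIN_LEN]
# Prints: query name, target name, operation (I or D), length
# for every CIGAR insertion or deletion of at least MIN_LEN bases.
paf_long_indels() {
    awk -F'\t' -v min="${2:-1000}" '
    {
        # Optional tags start at column 13; look for the CIGAR tag.
        cigar = ""
        for (i = 13; i <= NF; i++)
            if (substr($i, 1, 5) == "cg:Z:") cigar = substr($i, 6)

        # Consume the CIGAR one <length><op> pair at a time.
        while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
            len = substr(cigar, 1, RLENGTH - 1) + 0
            op  = substr(cigar, RLENGTH, 1)
            if ((op == "I" || op == "D") && len >= min)
                printf "%s\t%s\t%s\t%d\n", $1, $6, op, len
            cigar = substr(cigar, RLENGTH + 1)
        }
    }' "$1"
}
```

Calling `paf_long_indels out.paf 10000` would restrict the report to gaps of 10 kb or longer. The sketch reports sizes only; tracking reference coordinates would require accumulating the lengths of the preceding operations.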

Unimap does not replace minimap2 for other types of alignment. It drops support for multi-part indexes and short-read mapping. Its long-read alignment differs from minimap2's but is not necessarily better. For now, unimap is best seen as a specialized minimap2.

Notes

  • With the default asm5 preset, unimap may align a highly diverged region as a long insertion followed by a long deletion. Truvari may identify two false positive calls in this case, but these arguably are not errors.

  • Unimap takes ~5 minutes to index a human genome, slower than minimap2. It is recommended to save the index for faster startup.

  • The default ont preset has been tuned for more recent Nanopore reads at ~95% accuracy.
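
The advice about saving the index can be wrapped in a small shell function. This is only a sketch: the function name unimap_cached, the UNIMAP environment variable, and the convention of storing the index next to the reference with a .umi suffix are assumptions made for illustration. It uses only the -d and -c options shown in Getting Started.

```shell
# Hypothetical wrapper, not shipped with unimap.
# Usage: unimap_cached REF.fa QUERY.fa > out.paf
# UNIMAP may name the binary to run (defaults to "unimap" on PATH).
unimap_cached() {
    ref=$1
    query=$2
    idx="${ref%.*}.umi"          # e.g. ref.fa -> ref.umi (assumed naming)
    bin="${UNIMAP:-unimap}"

    # Build the index only if it is missing, empty, or older than the reference.
    if [ ! -s "$idx" ] || [ "$ref" -nt "$idx" ]; then
        # Keep anything the indexing step prints out of the PAF stream.
        "$bin" -d "$idx" "$ref" >&2 || return 1
    fi

    # Align against the saved index; PAF goes to stdout.
    "$bin" -c "$idx" "$query"
}
```

Presets and thread counts are left out on purpose. Indexing parameters are presumably stored in the index, as they are in minimap2, so an index built with one set of options should not be silently reused with another.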
