voutcn / Megahit

Licence: gpl-3.0
Ultra-fast and memory-efficient (meta-)genome assembler

MEGAHIT

MEGAHIT is an ultra-fast and memory-efficient NGS assembler. It is optimized for metagenomes, but also works well on generic single genome assembly (small or mammalian size) and single-cell assembly.

Installation

Conda

conda install -c bioconda megahit

Pre-built binaries for x86_64 Linux

wget https://github.com/voutcn/megahit/releases/download/v1.2.9/MEGAHIT-1.2.9-Linux-x86_64-static.tar.gz
tar zvxf MEGAHIT-1.2.9-Linux-x86_64-static.tar.gz
cd MEGAHIT-1.2.9-Linux-x86_64-static/bin/
./megahit --test  # run on a toy dataset
./megahit -1 MY_PE_READ_1.fq.gz -2 MY_PE_READ_2.fq.gz -o MY_OUTPUT_DIR

Pre-built docker image

# in the directory with the input reads
docker run -v $(pwd):/workspace -w /workspace --user $(id -u):$(id -g) vout/megahit \
  megahit -1 MY_PE_READ_1.fq.gz -2 MY_PE_READ_2.fq.gz -o MY_OUTPUT_DIR

Building from source

Prerequisites

  • For building: zlib, cmake >= 2.8, g++ >= 4.8.4
  • For running: gzip and bzip2

git clone https://github.com/voutcn/megahit.git
cd megahit
git submodule update --init
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release  # add -DCMAKE_INSTALL_PREFIX=MY_PREFIX if needed
make -j4
make simple_test  # will test MEGAHIT with a toy dataset
# make install if needed
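
Before configuring the build, it can help to confirm that the command-line tools named above are on PATH. The helper below is only a sketch: missing_tools is a hypothetical name, it checks for presence and not for the minimum versions listed under Prerequisites, and it cannot see zlib, which is a library and not a command.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all of them are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage sketch (left as comments so that sourcing this file runs nothing):
# missing_tools git cmake make g++   # needed for building
# missing_tools gzip bzip2           # needed for running
```

An empty result means every listed tool was found; any name printed needs to be installed before running cmake.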

Usage

Basic usage

megahit -1 pe_1.fq -2 pe_2.fq -o out  # 1 paired-end library
megahit --12 interleaved.fq -o out  # 1 interleaved paired-end library
megahit -1 a1.fq,b1.fq,c1.fq -2 a2.fq,b2.fq,c2.fq -r se1.fq,se2.fq -o out # 3 paired-end libraries + 2 SE libraries
megahit_core contig2fastg 119 out/intermediate_contigs/k119.contig.fa > k119.fastg # get FASTG from the intermediate contigs of k=119
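
With many libraries, typing the comma-separated lists for -1, -2 and -r by hand is error-prone. A minimal sketch of a helper that builds such a list is shown here; join_by_comma is a hypothetical name, and it assumes file names contain no commas.

```shell
# Join all arguments with commas: a1.fq b1.fq c1.fq -> a1.fq,b1.fq,c1.fq
# The body runs in a subshell so the IFS change does not leak to the caller.
join_by_comma() (
  IFS=,
  printf '%s\n' "$*"
)

# Usage sketch (left as comments so that sourcing this file runs nothing):
# r1=$(join_by_comma a1.fq b1.fq c1.fq)
# r2=$(join_by_comma a2.fq b2.fq c2.fq)
# megahit -1 "$r1" -2 "$r2" -o out
```

The files given to -1 and -2 are matched by position, so both lists must be built in the same library order.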

The contigs can be found in final.contigs.fa in the output directory.
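
A quick sanity check on the result is to count the contigs and their total length. The sketch below uses only awk and is not part of MEGAHIT; contig_stats is a hypothetical name, and it assumes a plain (uncompressed) FASTA file with Unix line endings.

```shell
# Print "<number of contigs> <total bases>" for a FASTA file.
contig_stats() {
  awk '
    /^>/ { n++; next }          # header line: one more contig
    { bases += length($0) }     # sequence line: add its length
    END { printf "%d %d\n", n, bases }
  ' "$1"
}

# Usage sketch (left as a comment so that sourcing this file runs nothing):
# contig_stats out/final.contigs.fa
```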

Advanced usage

  • --kmin-1pass: use if sequencing depth is low and too much memory is used when building the graph of k_min
  • --presets meta-large: use if the metagenome is complex (i.e., biodiversity is high, as in soil metagenomes)
  • --cleaning-rounds 1 --disconnect-ratio 0: get a less pruned assembly (usually with shorter contigs)
  • --continue -o out: resume an interrupted job from out
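
In scripted or batch environments, the --continue option can be wrapped so that the same command either starts a job or resumes it. This is a sketch under stated assumptions, not part of MEGAHIT: run_or_resume and the MEGAHIT variable are hypothetical names, and the presence of the output directory is taken as the only sign of an earlier run.

```shell
# Start a new run, or resume one if the output directory already exists.
# $1 is the output directory; the remaining arguments are used for a fresh run.
# MEGAHIT (hypothetical variable) overrides the executable name; setting it to
# "echo" gives a dry run that prints the arguments instead of assembling.
run_or_resume() {
  out=$1
  shift
  if [ -d "$out" ]; then
    "${MEGAHIT:-megahit}" --continue -o "$out"
  else
    "${MEGAHIT:-megahit}" "$@" -o "$out"
  fi
}

# Usage sketch (left as a comment so that sourcing this file runs nothing):
# run_or_resume out -1 pe_1.fq -2 pe_2.fq
```

If the directory exists but was not created by MEGAHIT, the resume branch is expected to fail, so a stricter check may be needed in practice.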

To see the full manual, run megahit without parameters or with -h.

Also, our wiki may be helpful.

Publications

  • Li, D., Liu, C-M., Luo, R., Sadakane, K., and Lam, T-W. (2015). MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. doi: 10.1093/bioinformatics/btv033 [PMID: 25609793].
  • Li, D., Luo, R., Liu, C-M., Leung, C-M., Ting, H-F., Sadakane, K., Yamashita, H., and Lam, T-W. (2016). MEGAHIT v1.0: A Fast and Scalable Metagenome Assembler driven by Advanced Methodologies and Community Practices. Methods.

License

This project is licensed under the GPLv3 License - see the LICENSE file for details.
