adigenova / fast-sg

Licence: MIT license
Fast-SG: An alignment-free algorithm for ultrafast scaffolding graph construction from short or long reads.

Fast-SG

Fast-SG is an alignment-free algorithm for ultrafast scaffolding graph construction from short or long reads.

Compilation Instructions

Get Fast-SG code

git clone https://github.com/adigenova/fastsg

Get KMC (k-mer counter)

Mac users

Obtain pre-compiled binaries from:

wget  https://github.com/refresh-bio/KMC/releases/download/v3.0.0/KMC3.mac.tar.gz 

or compile yourself following the instructions provided in:

https://github.com/refresh-bio/KMC

Linux users

Obtain pre-compiled binaries from:

wget https://github.com/refresh-bio/KMC/releases/download/v3.0.0/KMC3.linux.tar.gz

or compile yourself following the instructions provided in:

https://github.com/refresh-bio/KMC

After getting KMC3, put the binaries inside the Fast-SG directory, specifically in:

$Fast-SG/KMC/bin/kmc

$Fast-SG/KMC/bin/kmc_dump
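
The placement step above can be sketched as a small shell function. This is only an illustration: `install_kmc` and its two arguments are hypothetical names, and the directory where the KMC archive was unpacked is an assumption, since the layout of the archive is not described here.

```shell
# Hypothetical helper: copy the two KMC binaries into the layout Fast-SG expects.
# $1 = directory holding the unpacked kmc and kmc_dump binaries (assumed location)
# $2 = root of the Fast-SG checkout
install_kmc() {
    src=$1
    dest=$2/KMC/bin
    # Refuse to continue if either binary is missing from the source directory.
    for bin in kmc kmc_dump; do
        if [ ! -f "$src/$bin" ]; then
            echo "missing $src/$bin" >&2
            return 1
        fi
    done
    mkdir -p "$dest" || return 1
    # Copy each binary and make sure it is executable.
    for bin in kmc kmc_dump; do
        cp "$src/$bin" "$dest/$bin" && chmod +x "$dest/$bin" || return 1
    done
    # Print the destination so the caller can see where the binaries went.
    echo "$dest"
}
```

For example, `install_kmc ./kmc3 ./fastsg` would populate `./fastsg/KMC/bin`, assuming the archive was unpacked into `./kmc3` and the repository was cloned into `./fastsg`.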

Compile Fast-SG

make all

Requires a C++ compiler; compilation was tested with g++ version 5.3 (Linux) and clang version 4.2 (Mac OSX).

Test Fast-SG (small test)

make test

Libraries used

Currently, Fast-SG uses the following open-source C++ libraries:

1.- kseqcpp (https://github.com/ctSkennerton/kseqcpp.git checkout cfa50bcd17bbcb3225d431df4a2c1396f58a0993)

2.- ntHash (https://github.com/bcgsc/ntHash.git checkout ff326a8c9ccf6186f42c1f49950c1ebaadbd7f7a)

3.- BBHash (https://github.com/rizkg/BBHash.git checkout 99c905828a58fa119979df5c26bdbea93f0a7696)

4.- quasi_dictionary (https://github.com/pierrepeterlongo/quasi_dictionary.git checkout 9e8c64b150b129035f92d010a12085bd6c9490f0)

Usage instructions

FAST-SG.pl is the wrapper script used to run FastSG++.

Mandatory arguments

FAST-SG requires 4 mandatory arguments:

1.- The k-mer size (-k) restricted to the range [12-256]

2.- The set of contig sequences (-r) in FASTA format.

3.- The output prefix (-p)

4.- The read configuration file (-l) having the following format (space separated):

Short reads:

#type libID Path(fwd) Path(rev) SAM(1:single 2:paired)	

short lib1 example/ecoli.ill-sim.fwd.fq.gz example/ecoli.ill-sim.rev.fq.gz 1

Long reads:

#type libID Path(long read) Insert-sizes SAM(1:single 2:paired)	

long pac example/ECOLI-PACSEQ.subset.fasta.gz 1000,2000,3000,5000 1

long ont example/ECOLI-ONT-1D.subset.fasta.gz 1000,2000,3000,5000 1

An example of a read configuration file can be found in example/ecoli-reads.txt
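
A quick sanity check of a read configuration file can catch formatting mistakes before a long run. The sketch below is not part of Fast-SG: `check_read_config` is a hypothetical helper, and its rules are inferred only from the format lines above (five whitespace-separated fields, type `short` or `long`, comma-separated integer insert sizes for long reads, and a final flag of 1 or 2).

```shell
# Hypothetical checker for a Fast-SG read configuration file.
# Rules are inferred from the format lines above; Fast-SG itself may accept more.
check_read_config() {
    awk '
        BEGIN { bad = 0 }
        /^#/ || NF == 0 { next }    # skip comment lines and blank lines
        NF != 5 { printf "line %d: expected 5 fields, got %d\n", NR, NF; bad = 1; next }
        $1 != "short" && $1 != "long" { printf "line %d: unknown type %s\n", NR, $1; bad = 1 }
        $1 == "long" && $4 !~ /^[0-9]+(,[0-9]+)*$/ { printf "line %d: bad insert sizes %s\n", NR, $4; bad = 1 }
        $5 != "1" && $5 != "2" { printf "line %d: last field must be 1 or 2\n", NR; bad = 1 }
        END { exit bad }            # non-zero exit status if any line failed
    ' "$1"
}
```

Running `check_read_config example/ecoli-reads.txt` prints one message per offending line and returns a non-zero status if any line fails a check.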

Running Fast-SG (example):

Using a single k-mer

./FAST-SG.pl  -k 15 -l example/ecoli-reads.txt -r example/ecoli-illumina.fa.gz -p test

Using a range of k-mers

./FAST-SG.pl  -k 15-25:5 -l example/ecoli-reads.txt -r example/ecoli-illumina.fa.gz -p test
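
To make the range syntax concrete, the sketch below expands a `-k` value into individual k-mer sizes. It assumes that `A-B:S` means "from A to B in steps of S", which the text above does not state explicitly, and `expand_kmers` is a hypothetical helper rather than part of FAST-SG.pl. The bounds check uses the [12-256] range given under the mandatory arguments.

```shell
# Hypothetical helper: expand a -k value into the list of k-mer sizes.
# Assumes "A-B:S" means sizes from A to B in steps of S (not confirmed by the source).
expand_kmers() {
    spec=$1
    case $spec in
        *-*:*)
            start=${spec%%-*}      # text before the first "-"
            rest=${spec#*-}        # text after the first "-"
            end=${rest%%:*}        # text before the ":"
            step=${rest#*:}        # text after the ":"
            ;;
        *)
            start=$spec; end=$spec; step=1
            ;;
    esac
    # Every part must be a non-empty string of digits.
    for n in "$start" "$end" "$step"; do
        case $n in
            ''|*[!0-9]*) echo "invalid k-mer spec: $spec" >&2; return 1 ;;
        esac
    done
    # Enforce the documented [12-256] range and a usable step.
    if [ "$start" -lt 12 ] || [ "$end" -gt 256 ] || [ "$start" -gt "$end" ] || [ "$step" -lt 1 ]; then
        echo "k-mer sizes must lie in [12-256]: $spec" >&2
        return 1
    fi
    k=$start
    list=
    while [ "$k" -le "$end" ]; do
        list="$list${list:+ }$k"
        k=$((k + step))
    done
    echo "$list"
}
```

Under that assumption, `expand_kmers 15-25:5` prints `15 20 25`, and `expand_kmers 15` prints `15`.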

Help and additional options

./FAST-SG.pl  --help

Hybrid assembly of NA12878

The wiki page provides a full example of the use of Fast-SG for the hybrid assembly of NA12878.

Licence

Fast-SG is distributed under the MIT licence.

Citation

Alex Di Genova, Gonzalo A Ruz, Marie-France Sagot, Alejandro Maass; Fast-SG: an alignment-free algorithm for hybrid assembly, GigaScience, Volume 7, Issue 5, 1 May 2018, giy048.
