
lh3 / Etrf

Exact Tandem Repeat Finder (not a TRF replacement)

Programming Languages

c

Projects that are alternatives to or similar to Etrf

Pretzel
Javascript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-25.71%)
Mutual labels:  bioinformatics
Vdjviz
A lightweight immune repertoire browser
Stars: ✭ 21 (-40%)
Mutual labels:  bioinformatics
Protr
Comprehensive toolkit for generating various numerical features of protein sequences
Stars: ✭ 30 (-14.29%)
Mutual labels:  bioinformatics
Nonpareil
Estimate metagenomic coverage and sequence diversity
Stars: ✭ 26 (-25.71%)
Mutual labels:  bioinformatics
Awesome Sequencing Tech Papers
A collection of publications on comparison of high-throughput sequencing technologies.
Stars: ✭ 21 (-40%)
Mutual labels:  bioinformatics
Workshop
Weekly research group seminar
Stars: ✭ 28 (-20%)
Mutual labels:  bioinformatics
Metacache
memory efficient, fast & precise taxonomic classification system for metagenomic read mapping
Stars: ✭ 26 (-25.71%)
Mutual labels:  bioinformatics
Bwa
Burrows-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
Stars: ✭ 970 (+2671.43%)
Mutual labels:  bioinformatics
Minimap2
A versatile pairwise aligner for genomic and spliced nucleotide sequences
Stars: ✭ 912 (+2505.71%)
Mutual labels:  bioinformatics
Cytometry Clustering Comparison
R scripts to reproduce analyses in our paper comparing clustering methods for high-dimensional cytometry data
Stars: ✭ 30 (-14.29%)
Mutual labels:  bioinformatics
Scispacy
A full spaCy pipeline and models for scientific/biomedical documents.
Stars: ✭ 855 (+2342.86%)
Mutual labels:  bioinformatics
Uncurl python
UNCURL is a tool for single cell RNA-seq data analysis.
Stars: ✭ 13 (-62.86%)
Mutual labels:  bioinformatics
Rasusa
Randomly subsample sequencing reads to a specified coverage
Stars: ✭ 28 (-20%)
Mutual labels:  bioinformatics
Taxadb
🐣 locally query the ncbi taxonomy
Stars: ✭ 26 (-25.71%)
Mutual labels:  bioinformatics
Fastp
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
Stars: ✭ 966 (+2660%)
Mutual labels:  bioinformatics
16gt
Simultaneous detection of SNPs and Indels using a 16-genotype probabilistic model
Stars: ✭ 26 (-25.71%)
Mutual labels:  bioinformatics
Sevenbridges R
Seven Bridges API Client, CWL Schema, Meta Schema, and SDK Helper in R
Stars: ✭ 27 (-22.86%)
Mutual labels:  bioinformatics
Genevalidator
GeneValidator: Identify problems with predicted genes
Stars: ✭ 34 (-2.86%)
Mutual labels:  bioinformatics
Metasra Pipeline
MetaSRA: normalized sample-specific metadata for the Sequence Read Archive
Stars: ✭ 33 (-5.71%)
Mutual labels:  bioinformatics
Sv Callers
Snakemake-based workflow for detecting structural variants in WGS data
Stars: ✭ 28 (-20%)
Mutual labels:  bioinformatics

Etrf is a simple tool for finding exact tandem repeats (i.e. repeats without mismatches or gaps in the repeat unit) in DNA sequences. It has only two parameters: the maximum repeat unit length and the minimum total repeat length. For each unit length, etrf scans the input sequence and obtains a list of non-overlapping regions, each at least twice as long as the unit. When two regions identified with different unit lengths overlap, etrf chooses the longer one, or the one found with the shorter unit length if the two regions are of equal length.
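The procedure described above can be illustrated with a minimal Python sketch. This is not etrf's actual C implementation: the function names `find_regions` and `etrf_sketch`, the half-open `(start, end, unit)` tuples, and the greedy overlap resolution are all assumptions made for illustration. The sketch also resolves overlaps between regions of the same unit length in the same greedy pass, and it does not treat ambiguous bases such as N specially.

```python
def find_regions(seq, unit, min_len):
    """Return (start, end, unit) for maximal stretches with exact period `unit`.

    A stretch qualifies if it spans at least two copies of the unit and
    at least `min_len` bases. Coordinates are 0-based, end-exclusive.
    """
    regions = []
    run = 0  # consecutive positions i where seq[i] == seq[i - unit]
    for i in range(unit, len(seq) + 1):
        if i < len(seq) and seq[i] == seq[i - unit]:
            run += 1
            continue
        # The run just ended; the region is the run plus the unit before it.
        length = run + unit
        if run >= unit and length >= min_len:
            regions.append((i - length, i, unit))
        run = 0
    return regions


def etrf_sketch(seq, max_unit, min_len):
    """Hypothetical driver: scan every unit length, then resolve overlaps."""
    seq = seq.upper()
    candidates = []
    for unit in range(1, max_unit + 1):
        candidates.extend(find_regions(seq, unit, min_len))
    # Assumed tie-breaking: longer region first, then shorter unit length.
    candidates.sort(key=lambda r: (-(r[1] - r[0]), r[2], r[0]))
    chosen = []
    for start, end, unit in candidates:
        # Keep a candidate only if it overlaps nothing already chosen.
        if all(end <= s or start >= e for s, e, _ in chosen):
            chosen.append((start, end, unit))
    return sorted(chosen)


if __name__ == "__main__":
    print(etrf_sketch("GGACACACACTTTTTTG", max_unit=4, min_len=4))
    # prints [(2, 10, 2), (10, 16, 1)]
```

In this example the AC repeat at positions 2 to 10 is found with unit lengths 2 and 4, and the T run at positions 10 to 16 is found with unit lengths 1, 2 and 3. Because each pair of candidates has equal length, the shorter unit is reported in both cases.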

Because it cannot find impure tandem repeats, etrf does not replace more sophisticated tools such as TRF or ULTRA. Nonetheless, because etrf implements an exact algorithm, it avoids ambiguity in the definition of repeats and its behavior is predictable. Etrf is also faster: it can process a human genome in 15 minutes on a single CPU thread.
