GreenleafLab / motifmatchr

Licence: GPL-3.0 License
Fast motif matching in R

motifmatchr

Introduction

motifmatchr is an R package for fast motif matching, using C++ code from the MOODS library. The MOODS library was developed by Pasi Rastas, Janne Korhonen, and Petri Martinmäki. The core C++ code from MOODS version 1.9.3 is included in this repository.

Note on recent function name changes

The motifmatchr package recently switched from snake_case to camelCase. All exported functions now use camelCase; for example, match_motifs is now matchMotifs. If you are following the current documentation with an earlier version of the package, either update the package or keep this discrepancy in mind. The change was made to comply with Bioconductor naming preferences.
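To tell which naming style an installed copy uses, one option is to look at the names the package exports. The sketch below is only an illustration: motifmatchr_api_style is a hypothetical helper, not part of motifmatchr, and it checks for nothing beyond the two function names mentioned above.

```r
# Hypothetical helper (not part of motifmatchr): report which naming style
# the installed version of a package exports.
motifmatchr_api_style <- function(pkg = "motifmatchr") {
  # Return early if the package cannot be loaded at all
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return("not installed")
  }
  exports <- getNamespaceExports(pkg)
  if ("matchMotifs" %in% exports) {
    "camelCase"
  } else if ("match_motifs" %in% exports) {
    "snake_case"
  } else {
    "unknown"
  }
}

motifmatchr_api_style()
```

If this reports "snake_case", either update the package or replace matchMotifs with match_motifs when following the examples in this document.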

Installation

Installation is easiest with the devtools package, whose install_github function installs motifmatchr directly from GitHub:

devtools::install_github("GreenleafLab/motifmatchr")

Several required packages are installed along the way. One of the dependencies requires the gsl system library; if it is not already installed, you may need to install it separately.

matchMotifs

The primary method of motifmatchr is matchMotifs. This method has two mandatory arguments:

  1. Position weight matrices or position frequency matrices, stored in the PWMatrix, PFMatrix, PWMatrixList, or PFMatrixList objects from the TFBSTools package

  2. Either a set of genomic ranges (GenomicRanges or RangedSummarizedExperiment object) or a set of sequences (either DNAStringSet, DNAString, or simple character vector)

If the second argument is a set of genomic ranges, a genome sequence is also required. If the genomic ranges include seqinfo, the genome specified in the seqinfo is used by default (provided the relevant BSgenome package is installed). Otherwise, supply one of the following: a short string naming the genome build (if the corresponding BSgenome package is installed), a BSgenome object, a DNAStringSet object, or a FaFile object pointing to a FASTA file.
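As a minimal sketch of the two input styles, the example below first matches against sequences given as a plain character vector, where no genome is involved, and then against ranges with an explicit BSgenome object. The random sequences and the random_seq helper are made up for illustration. The second part runs only if the BSgenome.Hsapiens.UCSC.hg19 package happens to be installed.

```r
library(motifmatchr)

# Example motifs shipped with the package
data(example_motifs, package = "motifmatchr")

# Two random 200-bp sequences (illustrative input only; random_seq is a
# hypothetical helper, not part of motifmatchr)
set.seed(1)
random_seq <- function(n) {
  paste(sample(c("A", "C", "G", "T"), n, replace = TRUE), collapse = "")
}
seqs <- c(random_seq(200), random_seq(200))

# Sequences as a character vector: no genome argument is supplied
ix_seqs <- matchMotifs(example_motifs, seqs)
dim(motifMatches(ix_seqs))  # rows are sequences, columns are motifs

# Ranges plus an explicit BSgenome object, guarded so the sketch still runs
# when the hg19 genome package is not installed
if (requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE)) {
  peaks <- GenomicRanges::GRanges(
    seqnames = c("chr1", "chr2", "chr2"),
    ranges = IRanges::IRanges(start = c(76585873, 42772928, 100183786),
                              width = 500)
  )
  hg19 <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19
  ix_ranges <- matchMotifs(example_motifs, peaks, genome = hg19)
}
```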

The method can return three possible outputs, depending on the out argument:

  1. (Default, with out = "matches") Boolean matrix indicating which ranges/sequences contain which motifs, stored as "matches" in the assays slot of a SummarizedExperiment object

  2. (out = "scores") Same as (1) plus two additional assays: a matrix with the highest motif score within each range/sequence (a score is reported only if a match is present) and a matrix with the number of motif matches.

  3. (out = "positions") A GenomicRangesList with the ranges of all matches within the input ranges/sequences.
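The three output modes can be compared side by side on a small sequence input. This is a sketch rather than a reference: the random sequences are illustrative, and motifScores and motifCounts are assumed here to be exported accessors alongside motifMatches. They are not named in the text above, so check them against your installed version.

```r
library(motifmatchr)

# Example motifs shipped with the package
data(example_motifs, package = "motifmatchr")

# Three random 300-bp sequences (illustrative input only)
set.seed(2)
seqs <- vapply(1:3, function(i) {
  paste(sample(c("A", "C", "G", "T"), 300, replace = TRUE), collapse = "")
}, character(1))

# Default output: logical matches only
res_matches <- matchMotifs(example_motifs, seqs)
motifMatches(res_matches)

# out = "scores": matches plus a score matrix and a match-count matrix
# (motifScores and motifCounts are assumed accessors, see the note above)
res_scores <- matchMotifs(example_motifs, seqs, out = "scores")
motifMatches(res_scores)
motifScores(res_scores)
motifCounts(res_scores)

# out = "positions": locations of every match; the container class can
# depend on the input type, so inspect it rather than assume it
res_pos <- matchMotifs(example_motifs, seqs, out = "positions")
class(res_pos)
```

Requesting "matches" is the lightest option when only presence or absence is needed; "scores" and "positions" carry progressively more detail.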

Quickstart

library(motifmatchr)
library(GenomicRanges)

# load some example motifs
data(example_motifs, package = "motifmatchr") 

# Make a set of peaks
peaks <- GRanges(seqnames = c("chr1","chr2","chr2"),
                 ranges = IRanges(start = c(76585873,42772928,100183786),
                                  width = 500))

# Get motif matches for example motifs in peaks
motif_ix <- matchMotifs(example_motifs, peaks, genome = "hg19") 
motifMatches(motif_ix) # Extract matches matrix from SummarizedExperiment result

# Get motif positions within peaks for example motifs in peaks 
motif_ix <- matchMotifs(example_motifs, peaks, genome = "hg19",
                         out = "positions") 

More information

For a more detailed overview, see the package vignette.
