slowkow / CENTIPEDE.tutorial

Licence: other
🐛 How to use CENTIPEDE to determine if a transcription factor is bound.

CENTIPEDE Tutorial

CENTIPEDE fits a Bayesian hierarchical mixture model to learn the transcription factor (TF)-specific distribution of experimental data in a particular cell type, for a set of candidate binding sites described by a motif.

This is a practical tutorial for running CENTIPEDE with DNase-Seq data. It explains how to prepare the data and how to run the analysis. The goal is to predict whether a putative transcription factor binding site is actually bound. For details about the statistical models underlying the method, see Pique-Regi et al. (2011).
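
As rough intuition for how a two-component mixture turns data into a "bound or not" prediction, the sketch below computes the posterior probability that a site is bound, given a read count. It is a deliberately simplified toy and not CENTIPEDE's actual model: the function name `posterior_bound()`, the Poisson likelihoods, the rates, and the prior are all assumptions chosen for illustration.

```r
# Toy two-component mixture: NOT the CENTIPEDE likelihood.
# All parameter values are illustrative assumptions.
posterior_bound <- function(count, prior = 0.1,
                            rate_bound = 10, rate_unbound = 2) {
  # Likelihood of the observed count under each component
  lik_bound   <- dpois(count, lambda = rate_bound)
  lik_unbound <- dpois(count, lambda = rate_unbound)
  # Bayes' rule: weight each likelihood by its prior probability
  prior * lik_bound / (prior * lik_bound + (1 - prior) * lik_unbound)
}

# Sites with more reads get a higher posterior probability of being bound
post <- posterior_bound(c(0, 5, 15))
print(post)
```

In this toy, the posterior rises with the read count because the bound component has the higher rate. The real model combines several sources of evidence in a more elaborate way, as described in the paper.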

Read the tutorial online or download the PDF:

This repository has functions to ease the use of CENTIPEDE:

  • centipede_data() converts data to the format required for CENTIPEDE.
  • parse_region() parses a string like "chr1:123-456".
  • read_bedGraph() reads a bedGraph file with 4 columns: chrom, start, end, score.
  • read_fimo() reads a text file output by FIMO and selects sites that meet a significance threshold.
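
To make the input formats concrete, here is a minimal base-R sketch of what parsing a region string and reading a four-column bedGraph file could look like. The names `parse_region_sketch()` and `read_bedgraph_sketch()` are hypothetical and written only for illustration. They are not the package's implementations, whose arguments and return types may differ. The reader assumes a tab-delimited file with no `track` header line.

```r
# Hypothetical helper: split "chr1:123-456" into chrom, start, end.
parse_region_sketch <- function(region) {
  m <- regmatches(region, regexec("^([^:]+):([0-9,]+)-([0-9,]+)$", region))[[1]]
  if (length(m) != 4) {
    stop("Expected a region like 'chr1:123-456', got: ", region)
  }
  list(
    chrom = m[2],
    # Strip thousands separators such as "1,000" before converting
    start = as.integer(gsub(",", "", m[3])),
    end   = as.integer(gsub(",", "", m[4]))
  )
}

# Hypothetical helper: read a 4-column, tab-delimited bedGraph file.
read_bedgraph_sketch <- function(file) {
  read.table(
    file,
    header = FALSE,
    sep = "\t",
    col.names = c("chrom", "start", "end", "score"),
    colClasses = c("character", "integer", "integer", "numeric")
  )
}

# Usage with a small temporary file
region <- parse_region_sketch("chr1:123-456")
tmp <- tempfile(fileext = ".bedGraph")
writeLines(c("chr1\t100\t101\t3", "chr1\t101\t102\t0.5"), tmp)
bg <- read_bedgraph_sketch(tmp)
```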

I also provide example data that you can use to follow the tutorial:

  • cen is a list with two items:
    • cen$mat is a matrix of read-start counts for 3,337 genomic regions.
    • cen$regions is a data frame describing those regions.
  • site_cons is a vector with mean conservation scores for the 3,337 regions, computed across 100 vertebrates.
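
The way these objects line up can be illustrated with a toy stand-in. The sketch below does not load the package data. It builds a hypothetical list `toy_cen` and vector `toy_cons` with 5 regions instead of 3,337, and the matrix width and the column names of `regions` are assumptions made for illustration.

```r
set.seed(1)
n_regions <- 5   # the real example data has 3,337 regions
n_columns <- 20  # assumed width, for illustration only

toy_cen <- list(
  # One row of read-start counts per region
  mat = matrix(rpois(n_regions * n_columns, lambda = 0.2),
               nrow = n_regions, ncol = n_columns),
  # One row of metadata per region (column names are assumptions)
  regions = data.frame(
    chrom = rep("chr1", n_regions),
    start = seq(1000L, by = 500L, length.out = n_regions),
    stringsAsFactors = FALSE
  )
)

# One mean conservation score per region
toy_cons <- runif(n_regions)

# Row i of the matrix, row i of the data frame, and element i of the
# vector are all expected to describe the same region.
same_length <- nrow(toy_cen$mat) == nrow(toy_cen$regions) &&
  length(toy_cons) == nrow(toy_cen$mat)
print(same_length)
```

Running the same consistency check on the real `cen` and `site_cons` objects before fitting a model is a cheap way to catch misaligned inputs.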

Installation

Install CENTIPEDE by running this in your shell (not within an R session):

wget http://download.r-forge.r-project.org/src/contrib/CENTIPEDE_1.2.tar.gz
R CMD INSTALL CENTIPEDE_1.2.tar.gz

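Next, install the tutorial package by running this in an R session:

# Installing CENTIPEDE from R-Forge within R didn't work for me,
# so use the shell commands above instead.
# install.packages("CENTIPEDE", repos="http://R-Forge.R-project.org")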

install.packages("devtools")
devtools::install_github("slowkow/CENTIPEDE.tutorial")
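
After installing, it can help to confirm that both packages are available before starting the tutorial. The snippet below is a small sketch using base R's `requireNamespace()`. The helper name `check_installed()` is hypothetical and not part of either package.

```r
# Hypothetical helper: report which packages can be loaded.
check_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

status <- check_installed(c("CENTIPEDE", "CENTIPEDE.tutorial"))
print(status)

if (!all(status)) {
  message("Missing packages: ", paste(names(status)[!status], collapse = ", "))
}
```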