
vivekbhr / Subread_to_DEXSeq

License: GPL-3.0
Scripts to import your featureCounts output into DEXSeq

Programming Languages

python
139335 projects - #7 most used programming language
r
7636 projects

Projects that are alternatives of or similar to Subread to DEXSeq

CoNekT
CoNekT (short for Co-expression Network Toolkit) is a platform to browse co-expression data and enable cross-species comparisons.
Stars: ✭ 17 (-26.09%)
Mutual labels:  rna-seq
CellNet
CellNet: network biology applied to stem cell engineering
Stars: ✭ 39 (+69.57%)
Mutual labels:  rna-seq
Protocols-4pub
Multi-omics analysis protocols by Lyu.
Stars: ✭ 37 (+60.87%)
Mutual labels:  rna-seq
cellSNP
Pileup biallelic SNPs from single-cell and bulk RNA-seq data
Stars: ✭ 42 (+82.61%)
Mutual labels:  rna-seq
RNASeq
RNASeq pipeline
Stars: ✭ 30 (+30.43%)
Mutual labels:  rna-seq
ORNA
Fast in-silico normalization algorithm for NGS data
Stars: ✭ 21 (-8.7%)
Mutual labels:  rna-seq
alevin-fry
🐟 🔬🦀 alevin-fry is an efficient and flexible tool for processing single-cell sequencing data, currently focused on single-cell transcriptomics and feature barcoding.
Stars: ✭ 78 (+239.13%)
Mutual labels:  rna-seq
kana
Single cell analysis in the browser
Stars: ✭ 81 (+252.17%)
Mutual labels:  rna-seq
MINTIE
Method for Identifying Novel Transcripts and Isoforms using Equivalence classes, in cancer and rare disease.
Stars: ✭ 24 (+4.35%)
Mutual labels:  rna-seq
scaden
Deep Learning based cell composition analysis with Scaden.
Stars: ✭ 61 (+165.22%)
Mutual labels:  rna-seq
picardmetrics
🚦 Run Picard on BAM files and collate 90 metrics into one file.
Stars: ✭ 38 (+65.22%)
Mutual labels:  rna-seq
CellO
CellO: Gene expression-based hierarchical cell type classification using the Cell Ontology
Stars: ✭ 34 (+47.83%)
Mutual labels:  rna-seq
poreplex
A versatile sequenced read processor for nanopore direct RNA sequencing
Stars: ✭ 74 (+221.74%)
Mutual labels:  rna-seq
dropClust
Version 2.1.0 released
Stars: ✭ 19 (-17.39%)
Mutual labels:  rna-seq
TCC-GUI
📊 Graphical User Interface for TCC package
Stars: ✭ 35 (+52.17%)
Mutual labels:  rna-seq
gene-oracle
Feature extraction algorithm for genomic data
Stars: ✭ 13 (-43.48%)
Mutual labels:  rna-seq
NGS
Next-Gen Sequencing tools from the Horvath Lab
Stars: ✭ 30 (+30.43%)
Mutual labels:  rna-seq
rna-seq-kallisto-sleuth
A Snakemake workflow for differential expression analysis of RNA-seq data with Kallisto and Sleuth.
Stars: ✭ 56 (+143.48%)
Mutual labels:  rna-seq
grape-nf
An automated RNA-seq pipeline using Nextflow
Stars: ✭ 30 (+30.43%)
Mutual labels:  rna-seq
ngs-in-bioc
A course on Analysing Next Generation (/High Throughput etc..) Sequencing data using Bioconductor
Stars: ✭ 37 (+60.87%)
Mutual labels:  rna-seq

Subread_to_DEXSeq

Vivek Bhardwaj

These scripts provide a way to use featureCounts output with DEXSeq.

The directory contains two scripts:

  1. dexseq_prepare_annotation2.py : The same as the "dexseq_prepare_annotation.py" script that ships with DEXSeq, but with an added option (-f) to also write a featureCounts-readable GTF file.

  2. load_SubreadOutput.R : Provides a function, "DEXSeqDataSetFromFeatureCounts", that loads the output of featureCounts as a DEXSeqDataSet (dxd) object.

Usage example

1) Prepare annotation

Syntax :

python dexseq_prepare_annotation2.py -f <featurecounts.gtf> <input.gtf> <dexseq_counts.gff>

Example :

python dexseq_prepare_annotation2.py -f dm6_ens76_flat.gtf dm6_ens76.gtf dm6_ens76_flat.gff

This produces two files: "dm6_ens76_flat.gff" (the flattened annotation for DEXSeq) and "dm6_ens76_flat.gtf" (the same annotation in a form featureCounts can read).
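If the annotation step is part of a scripted pipeline, the command above can be assembled from Python. The following is only a sketch: the helper name `prepare_annotation` and its `dry_run` flag are hypothetical, and it assumes the script is in the working directory and that `python` on the PATH is the interpreter you want.

```python
import subprocess

def prepare_annotation(input_gtf, dexseq_gff, featurecounts_gtf,
                       script="dexseq_prepare_annotation2.py", dry_run=True):
    """Build (and optionally run) the annotation command. Hypothetical helper."""
    # Argument order follows the syntax line:
    #   -f <featurecounts.gtf> <input.gtf> <dexseq_counts.gff>
    cmd = ["python", script, "-f", featurecounts_gtf, input_gtf, dexseq_gff]
    if not dry_run:
        # check=True raises CalledProcessError if the script exits non-zero
        subprocess.run(cmd, check=True)
    return cmd

# File names taken from the example above; dry_run=True only builds the command.
cmd = prepare_annotation("dm6_ens76.gtf", "dm6_ens76_flat.gff", "dm6_ens76_flat.gtf")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.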

2) Count using Subread (command line)

We use the -f option to count reads at the feature (exon) level instead of summarizing them per gene.

We can use the -O option to also count reads that overlap multiple exons (similar to DEXSeq's dexseq_count.py).

/path/to/subread/bin/featureCounts -f -O -s 2 -p -T 40 \
-F GTF -a dm6_ens76_flat.gtf \
-o dm6_fCount.out Cont_1.bam Cont_2.bam Test_1.bam Test_2.bam
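To inspect the count table before loading it into R, a short Python sketch like the one below can help. It relies on the standard featureCounts output layout (one leading "#" comment line, then a header with Geneid, Chr, Start, End, Strand, Length, followed by one column per BAM file). The function name `read_featurecounts` and the toy table are made up for illustration, and integer counts are assumed (the default, without fractional counting).

```python
import csv
import io
from collections import Counter

def read_featurecounts(handle):
    """Parse featureCounts output into (sample_columns, rows). Illustrative only."""
    # Drop the leading '#' line that records the program version and command.
    lines = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    # Columns 1-6 are Geneid, Chr, Start, End, Strand, Length;
    # every remaining column holds the counts for one BAM file.
    samples = header[6:]
    rows = []
    for rec in reader:
        rows.append({
            "gene": rec[0],
            "start": int(rec[2]),   # with -f, each row is a single feature
            "end": int(rec[3]),
            "counts": [int(x) for x in rec[6:]],
        })
    return samples, rows

# Made-up table with two exon rows for one gene (not real data).
toy = (
    "# Program:featureCounts; Command: ...\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tCont_1.bam\tTest_1.bam\n"
    "geneA\t2L\t100\t200\t+\t101\t12\t30\n"
    "geneA\t2L\t300\t450\t+\t151\t7\t0\n"
)
samples, rows = read_featurecounts(io.StringIO(toy))
features_per_gene = Counter(r["gene"] for r in rows)
print(samples)
print(dict(features_per_gene))
```

Because -f is set, the same gene ID appears on several rows, one per counted feature, which is the layout an exon-level analysis needs.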

3) Load into DEXSeq

This script requires the R packages dplyr and DEXSeq to be installed.

In R, prepare a sampleData data.frame whose row names are the sample names used for featureCounts, with columns for condition and any other variables you want to use in the DEXSeq design matrix.

Example :

source("load_SubreadOutput.R")
samp <- data.frame(row.names = c("cont_1","cont_2","test_1","test_2"), 
                        condition = rep(c("control","trt"),each=2))
dxd.fc <- DEXSeqDataSetFromFeatureCounts("dm6_fCount.out",
                                         flattenedfile = "dm6_ens76_flat.gtf",sampleData = samp)

This will create a dxd object that you can use for DEXSeq analysis.
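A quick consistency check before loading can catch a sampleData table that does not line up with the count file. The sketch below is an assumption-laden illustration, not part of the repository: `pair_samples` is a hypothetical helper, and it assumes the sampleData rows are listed in the same order as the BAM files were passed to featureCounts, which this README does not state explicitly.

```python
def pair_samples(count_columns, sample_names):
    """Pair count columns with sample names by position (assumed ordering)."""
    if len(count_columns) != len(sample_names):
        raise ValueError(
            f"{len(count_columns)} count columns but {len(sample_names)} samples"
        )
    return list(zip(count_columns, sample_names))

# BAM names from the featureCounts command and row names from the R example.
count_columns = ["Cont_1.bam", "Cont_2.bam", "Test_1.bam", "Test_2.bam"]
sample_names = ["cont_1", "cont_2", "test_1", "test_2"]

pairs = pair_samples(count_columns, sample_names)
for bam, name in pairs:
    print(f"{bam} -> {name}")
```

Printing the pairs side by side makes a swapped or missing sample easy to spot by eye.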

Results

On a real Drosophila dataset (mapped to dm6), I compared the output from featureCounts (two modes) with that of DEXSeq's dexseq_count.py.

In unique mode, fragments overlapping multiple features are not counted, while in multi mode, they are counted.
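The difference between the two modes can be illustrated with a toy counter. This is a deliberately simplified sketch, not how featureCounts is implemented: it ignores strand, paired-end logic and minimum-overlap settings, treats coordinates as closed intervals, and uses made-up features and fragments.

```python
def count_fragments(fragments, features, allow_multi_overlap):
    """Count fragments per feature in 'unique' or 'multi' mode (toy model)."""
    counts = {name: 0 for name, _, _ in features}
    for frag_start, frag_end in fragments:
        # Features whose closed interval overlaps the fragment
        hits = [name for name, start, end in features
                if frag_start <= end and frag_end >= start]
        if len(hits) == 1 or (hits and allow_multi_overlap):
            # unique mode: only single-feature fragments are counted
            # multi mode (-O): every overlapped feature gets the fragment
            for name in hits:
                counts[name] += 1
    return counts

# Two adjacent exon bins and three fragments; the middle fragment spans both.
features = [("E001", 100, 200), ("E002", 201, 300)]
fragments = [(120, 180), (190, 230), (250, 290)]

unique = count_fragments(fragments, features, allow_multi_overlap=False)
multi = count_fragments(fragments, features, allow_multi_overlap=True)
print("unique:", unique)  # unique: {'E001': 1, 'E002': 1}
print("multi: ", multi)   # multi:  {'E001': 2, 'E002': 2}
```

In this toy case the fragment spanning the exon boundary is dropped in unique mode and counted once for each exon in multi mode, which is why the two modes can give different exon-level totals.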

Dispersion Estimates

Results

Number of differentially expressed exons at 10% FDR. The output from featureCounts is highly similar to that of dexseq_count.py when reads overlapping multiple features are counted (-O option).
