LiuLabUB / HMMRATAC

License: GPL-3.0
HMMRATAC peak caller for ATAC-seq data

Programming Languages

java
shell

HMMRATAC

Quick Start

Assume that you have a BAM file named ATACseq.bam, produced by an aligner such as bwa mem.

  1. Sort the BAM file to get an ATACseq.sorted.bam file:

    samtools sort ATACseq.bam -o ATACseq.sorted.bam

  2. Index the sorted BAM file to get an ATACseq.sorted.bam.bai file:

    samtools index ATACseq.sorted.bam ATACseq.sorted.bam.bai

  3. Extract the genome information (chromosome sizes) from the BAM header to get a genome.info file:

    samtools view -H ATACseq.sorted.bam | perl -ne 'if(/^@SQ.*?SN:(\S+)\s+LN:(\d+)/){print $1,"\t",$2,"\n"}' > genome.info

  4. Run HMMRATAC on the sorted BAM ATACseq.sorted.bam, the BAM index file ATACseq.sorted.bam.bai, and the genome information file genome.info:

    java -jar HMMRATAC_V1.2.4_exe.jar -b ATACseq.sorted.bam -i ATACseq.sorted.bam.bai -g genome.info

  5. Filter the HMMRATAC output by score, if desired. The appropriate score threshold depends on the dataset, the score type, and user preference. To keep only peaks with a score of at least 10:

    awk -v OFS="\t" '$13>=10 {print}' NAME_peaks.gappedPeak > NAME.filteredPeaks.gappedPeak

    To filter the summit file by the same threshold:

    awk -v OFS="\t" '$5>=10 {print}' NAME_summits.bed > NAME.filteredSummits.bed

    NOTE: HMMRATAC reports all peaks that match the structure defined by the model, including weak peaks. Filtering by score retains the stronger peaks. A lower score threshold gives higher sensitivity and lower precision; a higher threshold gives lower sensitivity and higher precision.
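Because the right threshold depends on the dataset, it can help to see how many records survive at several candidate cutoffs before committing to one. The following is a minimal sketch, not part of HMMRATAC: the function name `count_by_threshold` is hypothetical, and it assumes tab-separated input with the score in the column given as the first argument (column 13 for the gappedPeak file and column 5 for the summit file, as in step 5).

```shell
#!/usr/bin/env bash
# Hypothetical helper: count records whose score meets each cutoff.
# Usage: count_by_threshold COLUMN CUTOFF [CUTOFF ...] < file
# Prints one line per cutoff: "<cutoff><TAB><records kept>".
count_by_threshold() {
    local col="$1"
    shift
    awk -F'\t' -v col="$col" -v cutoffs="$*" '
        BEGIN { n = split(cutoffs, t, " ") }
        {
            # Compare the score numerically against every cutoff.
            for (i = 1; i <= n; i++)
                if ($col + 0 >= t[i] + 0) kept[i]++
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s\t%d\n", t[i], kept[i] + 0
        }
    '
}
```

For example, `count_by_threshold 13 5 10 20 < NAME_peaks.gappedPeak` would print the number of peaks kept at thresholds 5, 10, and 20, and `count_by_threshold 5 5 10 20 < NAME_summits.bed` would do the same for the summit file.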

Samtools can be downloaded here: http://www.htslib.org/download/
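If Perl is not available, the genome information file from step 3 can also be built with awk. This is a sketch under the assumption that the header follows the SAM convention of tab-separated `@SQ` lines carrying `SN:` (sequence name) and `LN:` (length) tags; the function name `sq_to_genome_info` is hypothetical. It splits on tabs, so sequence names containing characters such as `.` or `-` are kept, and the two tags may appear in any order.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn SAM header text on stdin into
# "<sequence name><TAB><length>" lines, one per @SQ record.
sq_to_genome_info() {
    awk -F'\t' '
        $1 == "@SQ" {
            name = ""; len = ""
            # Scan the tags of this @SQ line for SN: and LN:.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/)      name = substr($i, 4)
                else if ($i ~ /^LN:/) len  = substr($i, 4)
            }
            if (name != "" && len != "") print name "\t" len
        }
    '
}
```

Usage would mirror step 3: `samtools view -H ATACseq.sorted.bam | sq_to_genome_info > genome.info`.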

Be sure to run HMMRATAC using the executable file, found here: https://github.com/LiuLabUB/HMMRATAC/releases

For details on how to run HMMRATAC, see HMMRATAC_Guide.md, which walks through all parameters, output files, input requirements, and troubleshooting.

HMMRATAC requires paired-end data; single-end data will not work. HMMRATAC is designed to process ATAC-seq data that has not undergone any size selection, either physical or in silico, which should be standard practice for any ATAC-seq analysis. Size-selected data can still be processed by HMMRATAC (see the --trim option in HMMRATAC_Guide.md).
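Since single-end data will not work, a quick check of the alignment flags before running HMMRATAC can save a failed run. The sketch below is an illustration rather than part of HMMRATAC: the function name `count_paired` is hypothetical, and it relies on the SAM convention that the FLAG value is in column 2 and that bit 0x1 marks a read as paired, so an odd FLAG means a paired read.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read SAM records on stdin and print
# "<paired records><TAB><total records>".
count_paired() {
    awk -F'\t' '
        # Skip header lines in case the input was produced with -h.
        $1 !~ /^@/ {
            total++
            # FLAG bit 0x1 (read paired) is set exactly when FLAG is odd.
            if (int($2) % 2 == 1) paired++
        }
        END { printf "%d\t%d\n", paired + 0, total + 0 }
    '
}
```

A sample of the file is usually enough, for example `samtools view ATACseq.sorted.bam | head -n 100000 | count_paired`. If the first number is 0, the data is single-end. The same count for the whole file is available directly from `samtools view -c -f 1 ATACseq.sorted.bam`.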

If you use HMMRATAC in your research, please cite the following paper:
