Hera-T

We introduce Hera-T, a fast and accurate tool for estimating gene abundances in single-cell data generated by the 10X-Chromium protocol. By devising a new strategy for aligning reads to both transcriptome and genome references, Hera-T reduces both running time and memory consumption by 10- to 100-fold while giving results similar to CellRanger's. Hera-T also handles some difficult spliced-alignment scenarios that CellRanger fails to address and therefore obtains better accuracy than CellRanger. When the reads in those scenarios are excluded, the Hera-T and CellRanger results have correlation scores > 0.99.
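
The correlation claim can be checked on any pair of per-gene count vectors. The sketch below computes a Pearson correlation in plain Python. The choice of Pearson (rather than, say, Spearman) and the toy counts are assumptions made for illustration, since the text does not say which score was used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-gene counts for one cell (not real Hera-T or CellRanger output).
tool_a_counts = [0, 3, 12, 7, 150, 42]
tool_b_counts = [0, 4, 11, 7, 148, 45]
print(round(pearson(tool_a_counts, tool_b_counts), 4))
```

In practice the two vectors would come from the count matrices written by each tool, restricted to the same genes and cell barcodes.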

License

Hera-T is distributed under the BioTuring License. See the LICENSE file for details.

Pre-built indexes

Install

sh ./build.sh

Usage

Usage: ./hera-T count [options] -x <idx_name> -1 <R1> -2 <R2>
Option:
-t	: Number of threads
-o	: Output directory name
-p	: Output file prefix
-l	: Library types
		0: 10X-Chromium 3' (v2) protocol
		1: 10X-Chromium 3' (v3) protocol
Example: ./hera-T count -t 32 -o ./result -x index/grch37 -l 0 -1 lane_0.read_1.fq lane_1.read_1.fq -2 lane_0.read_2.fq lane_1.read_2.fq
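
When a run has many lanes, the `-1` and `-2` lists must stay in the same order. The following Python sketch builds the argument list from a FASTQ directory, using only the flags shown in the usage text. The helper names (`pair_fastqs`, `build_count_command`) are hypothetical, and the `*_R1_*.fastq.gz` / `*_R2_*.fastq.gz` naming pattern is an assumption based on common Illumina file names.

```python
from pathlib import Path


def pair_fastqs(fastq_dir):
    """Return (R1 files, R2 files) as parallel lists, sorted by file name."""
    r1_paths = sorted(Path(fastq_dir).glob("*_R1_*.fastq.gz"))
    if not r1_paths:
        raise FileNotFoundError(f"no *_R1_*.fastq.gz files in {fastq_dir}")
    r2_paths = []
    for r1 in r1_paths:
        # Swap the last "_R1_" in the file name to find the mate file.
        r2 = r1.with_name("_R2_".join(r1.name.rsplit("_R1_", 1)))
        if not r2.exists():
            raise FileNotFoundError(f"no R2 mate for {r1}")
        r2_paths.append(r2)
    return [str(p) for p in r1_paths], [str(p) for p in r2_paths]


def build_count_command(fastq_dir, index_prefix, out_dir,
                        threads=8, library_type=0, binary="./hera-T"):
    """Build the argv list for `hera-T count` (flags taken from the usage text)."""
    r1_files, r2_files = pair_fastqs(fastq_dir)
    return [
        binary, "count",
        "-t", str(threads),
        "-o", str(out_dir),
        "-x", str(index_prefix),
        "-l", str(library_type),  # 0: 3' v2 chemistry, 1: 3' v3 chemistry
        "-1", *r1_files,
        "-2", *r2_files,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building a list rather than a shell string avoids quoting problems with unusual file names.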

Example run

1k Brain Cells from an E18 Mouse (v2 chemistry)

Download link: http://cf.10xgenomics.com/samples/cell-exp/3.0.0/neuron_1k_v2/neuron_1k_v2_fastqs.tar

~ » ls -lah cr_mm10_210/*
-rw-rw-r--@ 1 bioturing  staff   2.5G Nov 14  2018 cr_mm10_210/cr_mm10_210.bwt
-rw-rw-r--@ 1 bioturing  staff   176M Nov 14  2018 cr_mm10_210/cr_mm10_210.fasta
-rw-rw-r--@ 1 bioturing  staff   1.8G Nov 14  2018 cr_mm10_210/cr_mm10_210.hash
-rw-rw-r--@ 1 bioturing  staff   862M Nov 14  2018 cr_mm10_210/cr_mm10_210.info
-rw-rw-r--@ 1 bioturing  staff   356B Nov 14  2018 cr_mm10_210/cr_mm10_210.log

~ » ./hera-T count -t 32 -o tmp -x cr_mm10_210/cr_mm10_210 \
		   -l 0 \
		   -1 neuron_1k_v2_fastqs/neuron_1k_v2_S1_L001_R1_001.fastq.gz \
		      neuron_1k_v2_fastqs/neuron_1k_v2_S1_L002_R1_001.fastq.gz \
		   -2 neuron_1k_v2_fastqs/neuron_1k_v2_S1_L001_R2_001.fastq.gz \
		      neuron_1k_v2_fastqs/neuron_1k_v2_S1_L002_R2_001.fastq.gz
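
The value passed to `-x` is a path prefix: in the listing above, `cr_mm10_210/cr_mm10_210` expands to the `.bwt`, `.fasta`, `.hash` and `.info` files. A small pre-flight check such as the Python sketch below can catch a wrong prefix before a long run starts. The function name is hypothetical, and treating these four extensions as required (and the `.log` file as optional) is an assumption.

```python
from pathlib import Path

# Extensions seen in the directory listing above; whether Hera-T needs
# every one of them at run time is an assumption.
INDEX_EXTENSIONS = (".bwt", ".fasta", ".hash", ".info")


def missing_index_files(index_prefix):
    """Return the expected index files that do not exist for a given -x prefix."""
    prefix = str(index_prefix)
    return [
        prefix + ext
        for ext in INDEX_EXTENSIONS
        if not Path(prefix + ext).is_file()
    ]
```

For example, `missing_index_files("cr_mm10_210/cr_mm10_210")` returns an empty list when all four files are present, and the paths of the missing ones otherwise.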

Credit

Hera-T is developed and maintained at BioTuring Inc.

Pre-print

Thang Tran, Thao Truong, Hy Vuong, Son Pham, “Hera-T: an efficient and accurate approach for quantifying gene abundances from 10X-Chromium data with high rates of non-exonic reads”, bioRxiv, 2019. doi: https://doi.org/10.1101/530501

How to get help

The preferred way to report problems or ask questions about Hera-T is the issue tracker. Before posting an issue or question, please look through the FAQs and the existing issues (open and closed); your question may already have been answered.

If you are reporting a problem, please include the HeraT.log file and, if possible, some details about your dataset.

In case you prefer personal communication, please send an email to [email protected].

Change logs

2018-12-24 (0.1.2) (deprecated):

* Init repo

2018-12-25 (0.1.3) (deprecated):

* Add library types selection
* Write program description to matrix.mtx file

2018-12-27 (0.1.4) (deprecated):

* Fix memory leak in version 0.1.3

2018-12-28 (0.2.0) (deprecated):

* Support Chromium 3' v3 library

2019-03-20 (0.2.1) (release candidate):

* Fix random crash (change from buggy semaphore to lock)

2019-03-25 (0.2.2) (release candidate):

* Fix open all files at once