OpenGene / GeneFuse

Licence: MIT license
Gene fusion detection and visualization

GeneFuse

A tool to detect and visualize target gene fusions by scanning FASTQ files directly. This tool accepts FASTQ files and reference genome as input, and outputs detected fusion results in TEXT, JSON and HTML formats.

Take a quick glance at the informative report

Get the genefuse program

install with Bioconda

conda install -c bioconda genefuse

download binary

This binary is for Linux systems only: http://opengene.org/GeneFuse/genefuse

# this binary was compiled on CentOS, and tested on CentOS/Ubuntu
wget http://opengene.org/GeneFuse/genefuse
chmod a+x ./genefuse

or compile from source

# get source (you can also use browser to download from master or releases)
git clone https://github.com/OpenGene/genefuse.git

# build
cd genefuse
make

# Install
sudo make install

Usage

Provide the following arguments to run genefuse:

  • the reference genome FASTA file, specified by -r or --ref=
  • the fusion setting file, specified by -f or --fusion=
  • the FASTQ file(s): specify the read1 file with -1 or --read1= (sufficient for single-end data); for paired-end data, also specify the read2 file with -2 or --read2=
  • use -h or --html= to specify the file name of the HTML report (note that -h selects the HTML report; help is printed with -? or --help)
  • use -j or --json= to specify the file name of the JSON report
  • the plain-text result is printed directly to STDOUT; you can redirect it to a file with >

Example

genefuse -r hg19.fasta -f genes/druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h report.html > result
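
When genefuse is driven from a pipeline script, the same command line can be assembled programmatically. The following is a minimal sketch, not part of GeneFuse itself: the helper name build_genefuse_command and its parameter names are hypothetical, and it uses only the options documented in this README.

```python
import shlex


def build_genefuse_command(ref, fusion, read1, read2=None,
                           html=None, json_report=None, threads=None):
    """Return the genefuse argument list using the documented options.

    Hypothetical helper for illustration; it only assembles arguments
    and does not run genefuse.
    """
    cmd = ["genefuse", "-r", ref, "-f", fusion, "-1", read1]
    if read2 is not None:          # only needed for paired-end data
        cmd += ["-2", read2]
    if html is not None:           # -h is the HTML report, not help
        cmd += ["-h", html]
    if json_report is not None:
        cmd += ["-j", json_report]
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


# Same arguments as the example command above.
cmd = build_genefuse_command(
    ref="hg19.fasta",
    fusion="genes/druggable.hg19.csv",
    read1="genefuse.R1.fq.gz",
    read2="genefuse.R2.fq.gz",
    html="report.html",
)
print(shlex.join(cmd))
```

To actually run it, the list can be passed to subprocess.run with stdout pointed at an open file, which plays the role of the > redirection in the shell example.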

Reference genome

The reference genome should be a single FASTA file containing all chromosome data. This file must not be compressed. For human data, the hg19/GRCh37 or hg38/GRCh38 assembly is typically used; both are available from public genome repositories such as the UCSC Genome Browser.
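
Since the reference must be an uncompressed FASTA file, a quick sanity check before a long run can save time. The sketch below is an assumption-laden illustration, not a GeneFuse feature: the function name looks_like_plain_fasta is hypothetical, and it only inspects the first two bytes (the gzip magic number and the leading '>' of a FASTA header).

```python
import gzip
import os
import tempfile


def looks_like_plain_fasta(path):
    """Return True if the file starts with '>' and is not gzip-compressed."""
    with open(path, "rb") as handle:
        head = handle.read(2)
    if head == b"\x1f\x8b":        # gzip magic bytes: still compressed
        return False
    return head[:1] == b">"        # FASTA records begin with '>'


# Demo on two tiny throwaway files (file names and sequence are made up).
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "ref.fasta")
    gz_path = os.path.join(tmp, "ref.fasta.gz")
    with open(plain_path, "w") as handle:
        handle.write(">chr1\nACGT\n")
    with gzip.open(gz_path, "wt") as handle:
        handle.write(">chr1\nACGT\n")
    plain_ok = looks_like_plain_fasta(plain_path)
    gz_ok = looks_like_plain_fasta(gz_path)

print(plain_ok, gz_ok)
```

This does not verify that every chromosome is present; it only catches the common mistake of passing a .gz file.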

Fusion file

The fusion file lists the target genes with their genomic coordinates, each followed by its exons. A sample is:

>EML4_ENST00000318522.5,chr2:42396490-42559688
1,42396490,42396776
2,42472645,42472827
3,42483641,42483770
4,42488261,42488434
5,42490318,42490446
...

>ALK_ENST00000389048.3,chr2:29415640-30144432
1,30142859,30144432
2,29940444,29940563
3,29917716,29917880
4,29754781,29754982
5,29606598,29606725
...

The coordinate system should be consistent with the reference genome.
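
The layout shown in the sample can be read with a few lines of Python. This is a sketch inferred from the sample alone, not GeneFuse's own parser: the names parse_fusion_file and exons_outside_gene are hypothetical, and it assumes the gene symbol is everything before the first underscore in the header, that exon lines are number,start,end, and that the file contains no elision lines.

```python
def parse_fusion_file(lines):
    """Parse fusion-file lines into a list of gene records (dicts)."""
    genes = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header: >GENE_TRANSCRIPT,chrom:start-end
            name, region = line[1:].split(",", 1)
            gene, _, transcript = name.partition("_")
            chrom, span = region.split(":")
            start, end = (int(x) for x in span.split("-"))
            genes.append({"gene": gene, "transcript": transcript,
                          "chrom": chrom, "start": start, "end": end,
                          "exons": []})
        else:
            # Exon line: number,start,end (belongs to the latest header)
            number, exon_start, exon_end = (int(x) for x in line.split(","))
            genes[-1]["exons"].append((number, exon_start, exon_end))
    return genes


def exons_outside_gene(record):
    """Return exons whose coordinates fall outside the gene's range."""
    return [exon for exon in record["exons"]
            if exon[1] < record["start"] or exon[2] > record["end"]]


# First two exons of each gene from the sample above (truncated on purpose).
sample = """\
>EML4_ENST00000318522.5,chr2:42396490-42559688
1,42396490,42396776
2,42472645,42472827

>ALK_ENST00000389048.3,chr2:29415640-30144432
1,30142859,30144432
2,29940444,29940563
""".splitlines()

genes = parse_fusion_file(sample)
for record in genes:
    print(record["gene"], record["chrom"], len(record["exons"]),
          exons_outside_gene(record))
```

A non-empty list from exons_outside_gene is a hint that the exon coordinates and the gene range come from different assemblies.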

Fusion files provided in this package

Four fusion files are provided with genefuse:

  1. genes/druggable.hg19.csv: all druggable fusion genes based on the hg19/GRCh37 reference assembly.
  2. genes/druggable.hg38.csv: all druggable fusion genes based on the hg38/GRCh38 reference assembly.
  3. genes/cancer.hg19.csv: all COSMIC curated fusion genes (http://cancer.sanger.ac.uk/cosmic/fusion) based on the hg19/GRCh37 reference assembly.
  4. genes/cancer.hg38.csv: all COSMIC curated fusion genes (http://cancer.sanger.ac.uk/cosmic/fusion) based on the hg38/GRCh38 reference assembly.

Notes:

  • genefuse runs much faster with the druggable gene list than with the cancer gene list, because the druggable genes are only a small subset of the cancer genes. Use the druggable list if you only care about fusions relevant to personalized cancer medicine.
  • The cancer gene list should be enough for most cancer-related studies, since all COSMIC curated fusion genes are included.
  • If you want to create a custom gene list, follow the instructions in the next section.

Create a fusion file based on hg19 or hg38

If you'd like to create a custom fusion file, you can use scripts/make_fusion_genes.py.
Because the script uses a refFlat.txt file to determine the genomic coordinates of exons, you need to download refFlat.txt from the UCSC Genome Browser in advance. Choose the hg19 or hg38 version to match your reference genome.

Make sure to decompress the file to plain text before you continue.

In the input gene list file, each gene should be on its own line. By default, the longest transcript is used. You can select a different transcript by adding the transcript ID after the gene, separated by a tab or a space. Each gene should be given as its official HGNC gene symbol, and each transcript as an NCBI RefSeq transcript ID.

An example of gene list file:

BRCA2	NM_000059
FAM155A
IRS2
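
To make the format concrete, the sketch below reads such a gene list into (gene, transcript) pairs. It illustrates the rules described above and is not the code in scripts/make_fusion_genes.py; the function name parse_gene_list is hypothetical.

```python
def parse_gene_list(lines):
    """Return (gene, transcript_or_None) pairs from gene-list lines."""
    entries = []
    for raw in lines:
        fields = raw.split()       # splits on tabs or spaces
        if not fields:             # skip blank lines
            continue
        gene = fields[0]
        # No transcript given means the longest transcript is used.
        transcript = fields[1] if len(fields) > 1 else None
        entries.append((gene, transcript))
    return entries


# The example gene list from above.
entries = parse_gene_list(["BRCA2\tNM_000059", "FAM155A", "IRS2"])
for gene, transcript in entries:
    print(gene, transcript or "(longest transcript)")
```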

Once both the input gene list file (gene_list.txt) and the refFlat.txt file are prepared, use the following command to generate a user-defined fusion file (fusion.csv):

python3 scripts/make_fusion_genes.py gene_list.txt -r /path/to/refflat -o fusion.csv

HTML report

GeneFuse can generate informative, interactive HTML pages that visualize the fusions with the following information:

  • the fusion genes, along with their transcripts.
  • the inferred break point, with reference genome coordinates.
  • the inferred fusion protein, with all exons and the transcription direction.
  • the supporting reads, with all bases colorized according to their quality scores.
  • the number of supporting reads, and how many of them are unique (the rest may be duplicates).

An HTML report example

See the HTML page of this picture: http://opengene.org/GeneFuse/report.html

All options

options:
  -1, --read1       read1 file name (string)
  -2, --read2       read2 file name (string [=])
  -f, --fusion      fusion file name, in CSV format (string)
  -r, --ref         reference fasta file name (string)
  -u, --unique      specify the least supporting read number is required to report a fusion, default is 2 (int [=2])
  -d, --deletion    specify the least deletion length of a intra-gene deletion to report, default is 50 (int [=50])
  -h, --html        file name to store HTML report, default is genefuse.html (string [=genefuse.html])
  -j, --json        file name to store JSON report, default is genefuse.json (string [=genefuse.json])
  -t, --thread      worker thread number, default is 4 (int [=4])
  -?, --help        print this message

Cite GeneFuse

If you used GeneFuse in your work, you can cite it as:

Shifu Chen, Ming Liu, Tanxiao Huang, Wenting Liao, Mingyan Xu and Jia Gu. GeneFuse: detection and visualization of target gene fusions from DNA sequencing data. International Journal of Biological Sciences, 2018; 14(8): 843-848. doi: 10.7150/ijbs.24626
