murphycj / Agfusion

Licence: mit
Python package to annotate and visualize gene fusions.

Annotate Gene Fusion (AGFusion)

Check out the web app: https://www.agfusion.app

AGFusion is a Python package for annotating gene fusions from the human or mouse genomes. Given the reference genome, the two gene partners, and the fusion junction coordinates, AGFusion outputs the following:

  • FASTA files of cDNA, CDS, and protein sequences.
  • Visualizations of the protein domain and exon architectures of the fusion transcripts.
  • Tables listing the coordinates of protein features and exons included in the fusion.
  • Optionally, exon structure and protein domain visualizations of the wild-type versions of the fusion gene partners.

Some other things to know:

  • AGFusion automatically predicts the functional effect of the gene fusion (e.g., in-frame or out-of-frame).
  • By default, annotation is done only for canonical gene isoforms, but there is an option to annotate all combinations of gene isoforms, including non-canonical ones.
  • All gene and protein annotation is from Ensembl.
  • Supports up to Ensembl release 95
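
The in-frame versus out-of-frame call mentioned above can be illustrated with a small sketch. This is not AGFusion's own code. It assumes a simplified model in which both junctions fall inside coding sequence, and the function name and its arguments are hypothetical.

```python
def classify_frame(cds_bases_5prime: int, cds_offset_3prime: int) -> str:
    """Classify a fusion as in-frame or out-of-frame (simplified model).

    cds_bases_5prime: coding bases retained from the 5' gene, counted
        from the start codon up to the junction.
    cds_offset_3prime: coding bases of the 3' gene that lie upstream of
        the junction and are therefore lost in the fusion.
    """
    if cds_bases_5prime < 0 or cds_offset_3prime < 0:
        raise ValueError("base counts must be non-negative")
    # The 3' partner keeps its reading frame only if the 5' segment ends
    # in the same codon phase at which the 3' segment resumes.
    if cds_bases_5prime % 3 == cds_offset_3prime % 3:
        return "in-frame"
    return "out-of-frame"


print(classify_frame(300, 150))
print(classify_frame(301, 150))
```

Real annotation also has to handle junctions in UTRs or introns, which this sketch deliberately ignores.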

Examples

Basic Usage

You just need to provide the two fusion gene partners (gene symbol, Ensembl ID, or Entrez gene ID), their predicted fusion junctions in genomic coordinates, and the genome build. You can also specify particular transcripts by Ensembl transcript ID or RefSeq ID.

Example usage from the command line:

agfusion annotate \
  --gene5prime DLG1 \
  --gene3prime BRAF \
  --junction5prime 31684294 \
  --junction3prime 39648486 \
  -db agfusion.mus_musculus.87.db \
  -o DLG1-BRAF

The protein domain structure of the DLG1-BRAF fusion:

[Image: protein domain structure of the DLG1-BRAF fusion]

The exon structure of the DLG1-BRAF fusion:

[Image: exon structure of the DLG1-BRAF fusion]
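
If you drive AGFusion from a Python script, the command line above can be assembled as an argument list. The sketch below is only one way to do it: the helper name build_annotate_command is hypothetical, and it uses only the flags shown in the example above.

```python
import shlex


def build_annotate_command(gene5prime, gene3prime, junction5prime,
                           junction3prime, database, output_dir):
    """Return the argument list for one `agfusion annotate` call."""
    return [
        "agfusion", "annotate",
        "--gene5prime", gene5prime,
        "--gene3prime", gene3prime,
        "--junction5prime", str(junction5prime),
        "--junction3prime", str(junction3prime),
        "-db", database,
        "-o", output_dir,
    ]


command = build_annotate_command(
    "DLG1", "BRAF", 31684294, 39648486,
    "agfusion.mus_musculus.87.db", "DLG1-BRAF",
)
# Print the shell-quoted form so it can be inspected before running.
print(shlex.join(command))
```

The list can then be passed to subprocess.run(command, check=True), which avoids shell quoting problems.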

Plotting wild-type protein and exon structure

You can additionally plot the wild-type protein and exon structures for each gene with the --WT flag.

agfusion annotate \
   -g5 ENSMUSG00000022770 \
   -g3 ENSMUSG00000002413 \
   -j5 31684294 \
   -j3 39648486 \
   -db agfusion.mus_musculus.87.db \
   -o DLG1-BRAF \
   --WT

Canonical gene isoforms

By default, AGFusion plots only the canonical gene isoforms, but you can tell AGFusion to include non-canonical isoforms with the --noncanonical flag.

agfusion annotate \
  -g5 ENSMUSG00000022770 \
  -g3 ENSMUSG00000002413 \
  -j5 31684294 \
  -j3 39648486 \
  -db agfusion.mus_musculus.87.db \
  -o DLG1-BRAF \
  --noncanonical

Input from fusion-finding algorithms

You can provide the output files of fusion-finding algorithms as input. Currently supported algorithms are:

  • Bellerophontes
  • BreakFusion
  • ChimeraScan
  • ChimeRScope
  • deFuse
  • EricScript
  • FusionCatcher
  • FusionHunter
  • FusionMap
  • InFusion
  • JAFFA
  • MapSplice (only if --gene-gtf specified)
  • STAR-Fusion
  • TopHat-Fusion

Below is an example for FusionCatcher.

agfusion batch \
  -f final-list_candidate-fusion-genes.txt \
  -a fusioncatcher \
  -o test \
  -db agfusion.mus_musculus.87.db
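
When several samples have been processed by the same fusion finder, the batch call above can be generated per sample. The following is a minimal sketch under assumptions: the helper name, the per-sample directory layout, and the "-agfusion" output suffix are hypothetical, and only the flags from the FusionCatcher example are used.

```python
from pathlib import Path


def build_batch_commands(fusion_files, database, algorithm="fusioncatcher"):
    """Return one `agfusion batch` argument list per fusion-finder output file."""
    commands = []
    for fusion_file in fusion_files:
        # Name the output directory after the directory holding the result file.
        sample = Path(fusion_file).parent.name
        commands.append([
            "agfusion", "batch",
            "-f", str(fusion_file),
            "-a", algorithm,
            "-o", f"{sample}-agfusion",
            "-db", database,
        ])
    return commands


commands = build_batch_commands(
    ["sampleA/final-list_candidate-fusion-genes.txt",
     "sampleB/final-list_candidate-fusion-genes.txt"],
    "agfusion.mus_musculus.87.db",
)
for cmd in commands:
    print(" ".join(cmd))
```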

Graphical parameters

You can change domain names and colors:

agfusion annotate \
  -g5 ENSMUSG00000022770 \
  -g3 ENSMUSG00000002413 \
  -j5 31684294 \
  -j3 39648486 \
  -db agfusion.mus_musculus.87.db \
  -o DLG1-BRAF \
  --recolor "Pkinase_Tyr;red" --recolor "L27_1;blue" \
  --rename "Pkinase_Tyr;Kinase" --rename "L27_1;L27"

[Image: DLG1-BRAF protein domain plot with renamed and recolored domains]
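
Each --recolor and --rename value in the command above is a "domain;value" pair. The sketch below shows how such pairs could be collected into a lookup table. It is an illustration only: the function name is hypothetical and this is not necessarily how AGFusion parses the options internally.

```python
def parse_domain_options(values):
    """Turn ["domain;value", ...] strings into a {domain: value} dict."""
    mapping = {}
    for value in values:
        name, separator, replacement = value.partition(";")
        if not separator or not name or not replacement:
            raise ValueError(f"expected 'domain;value', got {value!r}")
        mapping[name] = replacement
    return mapping


recolor = parse_domain_options(["Pkinase_Tyr;red", "L27_1;blue"])
rename = parse_domain_options(["Pkinase_Tyr;Kinase", "L27_1;L27"])
print(recolor)
print(rename)
```

Note that the semicolon must be quoted on the command line, as in the example, because an unquoted semicolon would end the shell command.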

You can rescale the protein length so that images of two different fusions have appropriate relative lengths when plotted side by side:

agfusion annotate \
  -g5 ENSMUSG00000022770 \
  -g3 ENSMUSG00000002413 \
  -j5 31684294 \
  -j3 39648486 \
  -db agfusion.mus_musculus.87.db \
  -o DLG1-BRAF \
  --recolor "Pkinase_Tyr;red" --recolor "L27_1;blue" \
  --rename "Pkinase_Tyr;Kinase" --rename "L27_1;L27" \
  --scale 2000

agfusion annotate \
  -g5 FGFR2 \
  -g3 DNM3 \
  -j5 130167703 \
  -j3 162019992 \
  -db agfusion.mus_musculus.87.db \
  -o FGFR2-DNM3 \
  --recolor "Pkinase_Tyr;red" \
  --rename "Pkinase_Tyr;Kinase" \
  --scale 2000

[Images: rescaled protein domain plots of the DLG1-BRAF and FGFR2-DNM3 fusions]

Installation

First, install pyensembl (and the other dependencies listed under Dependencies), then download the reference genome you will use by running one of the following.

For GRCh38/hg38:
pyensembl install --species homo_sapiens --release 87

For GRCh37/hg19:
pyensembl install --species homo_sapiens --release 75

For GRCm38/mm10:
pyensembl install --species mus_musculus --release 87

Then you can install AGFusion:

pip install agfusion

Finally, download the AGFusion database for your reference genome.

For GRCh38/hg38:
agfusion download -g hg38

For GRCh37/hg19:
agfusion download -g hg19

For GRCm38/mm10:
agfusion download -g mm10

You can view all supported species and Ensembl releases with agfusion download -a. Due to limitations in pyensembl, the maximum supported Ensembl release is 87.
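
The three installation steps can be summarized per genome build. The sketch below encodes the build-to-release pairs listed above and returns the commands in order. The helper name and the table name are hypothetical. It only builds the commands and does not run them.

```python
# Genome build -> (pyensembl species, Ensembl release), as listed above.
GENOME_BUILDS = {
    "hg38": ("homo_sapiens", 87),
    "hg19": ("homo_sapiens", 75),
    "mm10": ("mus_musculus", 87),
}


def setup_commands(build):
    """Return the installation commands for one genome build, in order."""
    if build not in GENOME_BUILDS:
        raise ValueError(f"unsupported genome build: {build!r}")
    species, release = GENOME_BUILDS[build]
    return [
        ["pyensembl", "install", "--species", species, "--release", str(release)],
        ["pip", "install", "agfusion"],
        ["agfusion", "download", "-g", build],
    ]


for command in setup_commands("mm10"):
    print(" ".join(command))
```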

Dependencies

  • python 2.7, 3.5
  • matplotlib>=1.5.0
  • pandas>=0.18.1
  • biopython>=1.67
  • future>=0.16.0
  • pyensembl>=1.1.0

Troubleshooting

Problem: I get a warning message like the following:

2017-08-28 15:02:51,377 - AGFusion - WARNING - No cDNA sequence available for AC073283.4! Will not print cDNA sequence for the AC073283.4-MSH2 fusion. You might be working with an outdated pyensembl. Update the package and rerun 'pyensembl install'

Solution: Run the following to update the pyensembl package and database:

git clone git@github.com:hammerlab/pyensembl.git
cd pyensembl
sudo pip install .
pyensembl install --release (your-release) --species (your-species)

License

MIT license

Citing AGFusion

You can cite bioRxiv: http://dx.doi.org/10.1101/080903
