KarchinLab / 2020plus

License: Apache-2.0
Classifies genes as oncogenes, tumor suppressor genes, or non-driver genes using Random Forests.

20/20+

About

Next-generation DNA sequencing of the exome has detected hundreds of thousands of small somatic variants (SSVs) in cancer. However, distinguishing genes that contain driver mutations from genes that merely carry passenger SSVs in a cohort of sequenced cancer samples requires sophisticated computational approaches. 20/20+ integrates many features indicative of positive selection to predict oncogenes and tumor suppressor genes from small somatic variants. The features capture mutational clustering, conservation, in silico pathogenicity scores of mutations, mutation consequence types, protein interaction network connectivity, and other covariates (e.g., replication timing). In contrast to methods based on mutation rate, 20/20+ uses ratiometric features, which normalize by the total number of mutations in a gene. This decouples the features from gene-level differences in background mutation rate. 20/20+ uses Monte Carlo simulations to evaluate the significance of random forest scores, estimating a p-value from an empirical null distribution.
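The two ideas in the paragraph above, ratiometric features and an empirical p-value, can be illustrated with a minimal Python sketch. This is not the 20/20+ implementation: the function names `ratiometric_features` and `empirical_p_value`, the toy consequence labels, and the add-one pseudocount in the p-value are all assumptions made for illustration, and the null scores are simply supplied as a list rather than produced by a real simulation.

```python
from collections import Counter


def ratiometric_features(consequences):
    """Return the fraction of a gene's mutations in each consequence type.

    `consequences` is a list with one label per observed mutation
    (hypothetical labels such as "missense" or "nonsense").
    Dividing by the gene's total mutation count makes the features
    independent of how many mutations the gene has overall.
    """
    total = len(consequences)
    if total == 0:
        return {}
    counts = Counter(consequences)
    return {ctype: n / total for ctype, n in counts.items()}


def empirical_p_value(observed_score, null_scores):
    """Estimate a one-sided p-value from an empirical null distribution.

    Counts the null scores at least as large as the observed score.
    The +1 in numerator and denominator is an assumed convention that
    keeps the estimate above zero; 20/20+ may use a different one.
    """
    n_extreme = sum(1 for s in null_scores if s >= observed_score)
    return (n_extreme + 1) / (len(null_scores) + 1)


# Two toy genes with a 10-fold difference in mutation count but the
# same mix of consequence types yield identical ratiometric features.
gene_a = ["missense"] * 8 + ["nonsense"] * 2
gene_b = ["missense"] * 80 + ["nonsense"] * 20
features_a = ratiometric_features(gene_a)
features_b = ratiometric_features(gene_b)

# A score compared against four hypothetical simulated null scores.
p_value = empirical_p_value(0.9, [0.1, 0.2, 0.95, 0.3])
```

Because both toy genes have the same proportions of consequence types, their feature dictionaries match even though one gene has ten times as many mutations, which is the sense in which ratiometric features are insensitive to gene-level mutation counts.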

Documentation

Please see the documentation on Read the Docs.

Releases

You can download releases on GitHub.

Installation

20/20+ is designed to run on Linux operating systems.

We recommend that you install the dependencies for 20/20+ through conda. Once conda is installed, setting up the environment is done as follows:

$ conda env create -f environment_python.yml  # install dependencies for python
$ source activate 2020plus  # activate the 20/20+ conda environment
$ conda install r r-randomForest rpy2  # install the R related dependencies

Each time you want to run 20/20+, you first need to activate the "2020plus" conda environment.

$ source activate 2020plus

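The 20/20+ conda environment can also be deactivated when you are finished.

$ source deactivate 2020plus

On conda 4.4 and later, "conda activate 2020plus" and "conda deactivate" are the preferred equivalents of the "source activate" and "source deactivate" commands.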