dmis-lab / Resimnet

Licence: apache-2.0
Implementation of ReSimNet for drug response similarity prediction

ReSimNet

A PyTorch implementation of the paper

ReSimNet: Drug Response Similarity Prediction using Siamese Neural Networks
Jeon and Park et al., 2018

Abstract

Traditional drug discovery approaches identify a target for a disease and find a compound that binds to the target. In this approach, structures of compounds are considered as the most important features because it is assumed that similar structures will bind to the same target. Therefore, structural analogs of the drugs that bind to the target are selected as drug candidates. However, even though compounds are not structural analogs, they may achieve the desired response. A new drug discovery method based on drug response, which can complement the structure-based methods, is needed.

We implemented a Siamese neural network called ReSimNet that takes two chemical compounds as input and predicts their CMap score, which we use to measure the transcriptional response similarity of the two compounds. ReSimNet learns the embedding vector of a chemical compound in a transcriptional response space. ReSimNet is trained to minimize the difference between the cosine similarity of the embedding vectors of the two compounds and the CMap score of the two compounds. ReSimNet can find pairs of compounds that are similar in response even though they may have dissimilar structures. In our quantitative evaluation, ReSimNet outperformed the baseline machine learning models. The ReSimNet ensemble model achieves a Pearson correlation of 0.518 and a precision@1% of 0.989. In addition, in the qualitative analysis, we tested ReSimNet on the ZINC15 database and showed that ReSimNet successfully identifies chemical compounds that are relevant to a prototype drug whose mechanism of action is known.
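
The training objective described above can be illustrated with a minimal sketch in plain Python. It assumes a squared-error form for "the difference" between the cosine similarity and the CMap score; the abstract does not state the exact loss, and the function names and the 3-dimensional embeddings are hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


def pair_loss(emb_a, emb_b, cmap_score):
    """Squared error between the predicted similarity and the CMap score.

    The squared-error form is an assumption of this sketch.
    """
    return (cosine_similarity(emb_a, emb_b) - cmap_score) ** 2


# Hypothetical embeddings for two compounds (not real model output)
emb_a = [1.0, 0.0, 0.0]
emb_b = [1.0, 1.0, 0.0]
loss = pair_loss(emb_a, emb_b, cmap_score=0.9)
```

In the actual model the embeddings would come from the shared (Siamese) encoder applied to each compound; here they are fixed lists so the arithmetic is easy to follow.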

Pipeline

Full Pipeline

Requirements

Git Clone & Initial Setting

Clone the source code and create the folders for your data and models.

# clone the source code on your directory
$ git clone https://github.com/jhyuklee/ReSimNet.git
$ cd ReSimNet

# make folder to save and load your data
$ cd tasks
$ mkdir -p data

# make folder to save and load your model
$ cd ..
$ mkdir -p results

Download Files You Need to Run ReSimNet

Dataset for Training

Pre-Trained Models

All 10 Models for Ensemble

Example Input Pairs

  • examples.csv (244 bytes)
    Save this file to ./ReSimNet/tasks/data/pairs/examples.csv

Click the link "Download the FingerPrint Representation".

Training the ReSimNet

# Train a new model.
$ bash train.sh

# Train the new ensemble models.
$ bash train_ensemble.sh

CMap Score Prediction using ReSimNet

Given your own fingerprint pairs, ReSimNet predicts a CMap score for each pair. Running download.sh and predict.sh first downloads the pretrained ReSimNet model along with sample datasets, then saves a result file of predicted CMap scores.

# Save scores of sample pair data
$ bash predict_example.sh

The input fingerprint pair file must be a .csv file in which each row has two columns denoting the two fingerprints of a pair. Place the files under './tasks/data/pairs/'.

# Sample Fingerprints (./tasks/data/pairs/examples.csv)
id1,id2
BRD-K43164539,BRD-A45333398
BRD-K83289131,BRD-K82484965
BRD-K06817181,BRD-A41112154
BRD-K06817181,BRD-K67977190
BRD-K06817181,BRD-A87125127
BRD-K68095457,BRD-K38903228
BRD-K68095457,BRD-K01902415
BRD-K68095457,BRD-K06817181
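
Before running prediction, it may help to check that a pair file matches the two-column layout shown above. The following is a minimal sketch using only the Python standard library; `read_pairs` is a hypothetical helper, not part of the ReSimNet code, and it assumes the header is exactly `id1,id2` as in the sample.

```python
import csv
import io


def read_pairs(handle):
    """Return a list of (id1, id2) tuples from a two-column pair CSV."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != ["id1", "id2"]:
        raise ValueError(f"expected header id1,id2 but got {reader.fieldnames}")
    pairs = []
    # Data rows start on line 2 because line 1 is the header
    for line_no, row in enumerate(reader, start=2):
        id1, id2 = row["id1"], row["id2"]
        if not id1 or not id2:
            raise ValueError(f"line {line_no}: both columns are required")
        pairs.append((id1.strip(), id2.strip()))
    return pairs


# First two rows of the sample file, inlined so the sketch runs on its own
sample = io.StringIO(
    "id1,id2\n"
    "BRD-K43164539,BRD-A45333398\n"
    "BRD-K83289131,BRD-K82484965\n"
)
pairs = read_pairs(sample)
```

For a file on disk, the same function can be called with `open("./tasks/data/pairs/examples.csv", newline="")` in place of the in-memory sample.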

Predicted CMap scores are saved one per row in the file './results/input-pair-file.model-name.csv'.

# Sample results (./results/examples.csv.ReSimNet7.csv)
prediction
0.9146181344985962
0.9301251173019409
0.8519644737243652
0.9631381034851074
0.7272981405258179

CMap Score Prediction of ZINC using ReSimNet

# Save scores of sample pair data
$ bash predict_zinc.sh

Click the link "Download the ZINC files".

  • zinc-test.zip (8KB)
    Save this file to ./ReSimNet/tasks/data/pairs_zinc/zinc-test.zip and unzip.

# Sample Zinc files (./tasks/data/pairs_zinc/zinc-test/AACA.csv)
,smiles,zinc_id,inchikey,mwt,logp,reactive,purchasable,tranche_name,features,fingerprint
17,CC1NNC(=S)NN1,ZINC000018204142,BYIXAEICDPEBOP-UHFFFAOYSA-N,132.192,-1.181,10,50,AACA,,000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000001100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000010000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Click the link "Download the example pairings".

  • example_drugs.csv (7KB)
    Save this file to ./ReSimNet/tasks/data/pairs_zinc/example_drugs.csv

# Sample example files (./tasks/data/pairs_zinc/example_drugs.csv)
pair,fp
ZINC18279871,0000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000010000000000000000000000000000000000000000000000000010000000000001001000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000001000000000000000000000000000000000000000000000000000000000000000000000010000000000010000000000000000000000000000000000001000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000100000000000000000000000000000000000001000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000001000000000100000000000000000000000000000000000000000000000100000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000100000000000000000100000000000000001000000000000000000000000000000000000000000000000000000000000000000000000010000000000000100000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000100000000000000000000000000000100100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000
ZINC3938668,00000100000000000000000000000100000000000000000000000000000000000000000000100000100000001000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000100010000010000000000000000000000000000000000000000000000000000000000000100001001000000000000000000000000000101000010000000010000000000000000000000000000000001000000000000000000000000000000001000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000010000000000000000000000001000100000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000101000000000100000000001000000000000000000000000000000000000010000010000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000100000000000000000100000000100000000000000010000100000000000000000100000000000000000000000000000100000000000000100000000100000000001000000000000000001001000000000000000000000000000100000001000000000000000001010000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000001000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000010000000000000000000000100000000000000000010100000000000000000000000000000000000000000000000000010001000000100000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000001000000001000000010000000010000000000000000000000000000000000000010000000000000000000000100001000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000011000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000100000
000000000010000000000000000000000000000000000010000000000000
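
The fingerprint columns in the samples above are long strings of 0 and 1 characters. As a minimal sketch for inspecting them, the hypothetical helper below lists the positions of the set bits. It assumes each character is one bit and that positions count from zero, left to right; the 10-character string is a toy value, not a real fingerprint.

```python
def on_bits(fingerprint):
    """Return the zero-based positions of the '1' characters in a bit string."""
    bad = set(fingerprint) - {"0", "1"}
    if bad:
        raise ValueError(f"unexpected characters in fingerprint: {sorted(bad)}")
    return [i for i, ch in enumerate(fingerprint) if ch == "1"]


# Toy bit string for illustration only
toy = "0010010000"
positions = on_bits(toy)
```

A check like this can also catch a fingerprint that was truncated or contains stray characters before the file is passed to the prediction script.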

Predicted CMap scores are saved one per row in the file './results/input-pair-file.model-name.csv'.

# Sample results (./results/AACA.csv.ReSimNet7.csv)
pair1,pair2,prediction
ZINC000018204142,ZINC18279871,0.90729403
ZINC000018204142,ZINC3938668,0.91043824
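
Once a result file like the one above exists, a common next step is to rank the pairs by predicted CMap score. The sketch below does this with the standard library; `rank_predictions` is a hypothetical helper, and it assumes the `pair1,pair2,prediction` header shown in the sample results.

```python
import csv
import io


def rank_predictions(handle, top_n=None):
    """Return (pair1, pair2, score) tuples sorted by score, highest first."""
    reader = csv.DictReader(handle)
    rows = [(r["pair1"], r["pair2"], float(r["prediction"])) for r in reader]
    rows.sort(key=lambda r: r[2], reverse=True)
    return rows if top_n is None else rows[:top_n]


# The two sample result rows, inlined so the sketch runs on its own
sample = io.StringIO(
    "pair1,pair2,prediction\n"
    "ZINC000018204142,ZINC18279871,0.90729403\n"
    "ZINC000018204142,ZINC3938668,0.91043824\n"
)
ranked = rank_predictions(sample)
```

With the two sample rows, the ZINC3938668 pair comes first because its predicted score is higher.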

License

Apache License 2.0
