All Projects → shenwanxiang → bidd-molmap

shenwanxiang / bidd-molmap

Licence: other
MolMap: An Efficient Convolutional Neural Network Based Molecular Deep Learning Tool

Programming Languages

HTML
75241 projects
Jupyter Notebook
11667 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to bidd-molmap

Rcpi
Molecular informatics toolkit with a comprehensive integration of bioinformatics and cheminformatics tools for drug discovery.
Stars: ✭ 22 (-78.43%)
Mutual labels:  fingerprint, drug-discovery
dhash-vips
vips-powered ruby gem to measure images similarity, implementing dHash and IDHash algorithms
Stars: ✭ 75 (-26.47%)
Mutual labels:  fingerprint
Fingerlock
Android fingerprint authentication library
Stars: ✭ 203 (+99.02%)
Mutual labels:  fingerprint
pinot
Probabilistic Inference for NOvel Therapeutics
Stars: ✭ 15 (-85.29%)
Mutual labels:  drug-discovery
Wafw00f
WAFW00F allows one to identify and fingerprint Web Application Firewall (WAF) products protecting a website.
Stars: ✭ 2,983 (+2824.51%)
Mutual labels:  fingerprint
fingerprintjs-pro-ios
Official iOS/tvOS agent & SDK for accurate device identification, created for the Fingerprint Pro identification API.
Stars: ✭ 35 (-65.69%)
Mutual labels:  fingerprint
Online Privacy Test Resource List
Privacy Online Test and Resource Compendium (POTARC) 🕵🏻
Stars: ✭ 185 (+81.37%)
Mutual labels:  fingerprint
CycleTLS
Spoof TLS/JA3 fingerprints in GO and Javascript
Stars: ✭ 362 (+254.9%)
Mutual labels:  fingerprint
Jupyter Dock
Jupyter Dock is a set of Jupyter Notebooks for performing molecular docking protocols interactively, as well as visualizing, converting file formats and analyzing the results.
Stars: ✭ 179 (+75.49%)
Mutual labels:  drug-discovery
Fingerprint
Android library that simplifies the process of fingerprint authentications.
Stars: ✭ 68 (-33.33%)
Mutual labels:  fingerprint
graph-neural-networks-for-drug-discovery
odr.chalmers.se/handle/20.500.12380/256629?locale=en
Stars: ✭ 78 (-23.53%)
Mutual labels:  drug-discovery
Cmsprint
CMS和中间件指纹库
Stars: ✭ 232 (+127.45%)
Mutual labels:  fingerprint
node-dvbtee
MPEG2 transport stream parser for Node.js with support for television broadcast PSIP tables and descriptors
Stars: ✭ 24 (-76.47%)
Mutual labels:  descriptors
Cordova Plugin Touch Id
💅 👱‍♂️ Forget passwords, use a fingerprint scanner!
Stars: ✭ 209 (+104.9%)
Mutual labels:  fingerprint
chemprop
Fast and scalable uncertainty quantification for neural molecular property prediction, accelerated optimization, and guided virtual screening.
Stars: ✭ 75 (-26.47%)
Mutual labels:  drug-discovery
Fingerprintjs
Browser fingerprinting library with the highest accuracy and stability.
Stars: ✭ 15,481 (+15077.45%)
Mutual labels:  fingerprint
Cordova Plugin Fingerprint Aio
👆 📱 Cordova Plugin for fingerprint sensors (and FaceID) with Android and iOS support
Stars: ✭ 236 (+131.37%)
Mutual labels:  fingerprint
MolDQN-pytorch
A PyTorch Implementation of "Optimization of Molecules via Deep Reinforcement Learning".
Stars: ✭ 58 (-43.14%)
Mutual labels:  drug-discovery
NDD
Drug-Drug Interaction Predicting by Neural Network Using Integrated Similarity
Stars: ✭ 25 (-75.49%)
Mutual labels:  drug-discovery
duff
Pure OCaml implementation of libXdiff (Rabin's fingerprint)
Stars: ✭ 20 (-80.39%)
Mutual labels:  fingerprint

License: MIT Documentation Status Build Status DOI Codeocean Paper PyPI version

MolMap

MolMap is generated by the following steps:

  • Step1: Input structures
  • Step2: Feature extraction
  • Step3: Feature pairwise distance calculation --> cosine, correlation, jaccard
  • Step4: Feature 2D embedding --> umap, tsne, mds
  • Step5: Feature grid arrangement --> grid, scatter
  • Step5: Transform --> minmax, standard

MolMap Fmaps for compounds

fmap_dynamicly

Construction of the MolMap Objects


molmap

The MolMapNet Architecture


net

Installation


  1. install rdkit and tamp first(create a molmap env):
conda create -c conda-forge -n molmap rdkit python=3.7
conda activate molmap
conda install -c tmap tmap
pip install molmap
  1. ChemBench (optional, if you wish to use the dataset and the split induces in this paper).

  2. If you have gcc problems when you install molmap, please installing g++ first:

sudo apt-get install g++

Out-of-the-Box Usage


code

import molmap
# Define your molmap
mp_name = './descriptor.mp'
mp = molmap.MolMap(ftype = 'descriptor', fmap_type = 'grid',
                   split_channels = True,   metric='cosine', var_thr=1e-4)
# Fit your molmap
mp.fit(method = 'umap', verbose = 2)
mp.save(mp_name) 
# Visulization of your molmap
mp.plot_scatter()
mp.plot_grid()
# Batch transform 
from molmap import dataset
data = dataset.load_ESOL()
smiles_list = data.x # list of smiles strings
X = mp.batch_transform(smiles_list,  scale = True, 
                       scale_method = 'minmax', n_jobs=8)
Y = data.y 
print(X.shape)
# Train on your data and test on the external test set
from molmap.model import RegressionEstimator
from sklearn.utils import shuffle 
import numpy as np
import pandas as pd
def Rdsplit(df, random_state = 888, split_size = [0.8, 0.1, 0.1]):
    base_indices = np.arange(len(df)) 
    base_indices = shuffle(base_indices, random_state = random_state) 
    nb_test = int(len(base_indices) * split_size[2]) 
    nb_val = int(len(base_indices) * split_size[1]) 
    test_idx = base_indices[0:nb_test] 
    valid_idx = base_indices[(nb_test):(nb_test+nb_val)] 
    train_idx = base_indices[(nb_test+nb_val):len(base_indices)] 
    print(len(train_idx), len(valid_idx), len(test_idx)) 
    return train_idx, valid_idx, test_idx 
# split your data
train_idx, valid_idx, test_idx = Rdsplit(data.x, random_state = 888)
trainX = X[train_idx]
trainY = Y[train_idx]
validX = X[valid_idx]
validY = Y[valid_idx]
testX = X[test_idx]
testY = Y[test_idx]

# fit your model
clf = RegressionEstimator(n_outputs=trainY.shape[1], 
                          fmap_shape1 = trainX.shape[1:], 
                          dense_layers = [128, 64], gpuid = 0) 
clf.fit(trainX, trainY, validX, validY)

# make prediction
testY_pred = clf.predict(testX)
rmse, r2 = clf._performance.evaluate(testX, testY)
print(rmse, r2)

Out-of-the-Box Performances


Dataset Task Metric MoleculeNet (GCN Best Model) Chemprop (D-MPNN model) MolMapNet (MMNB model)
ESOL RMSE 0.580 (MPNN) 0.555 0.575
FreeSolv RMSE 1.150 (MPNN) 1.075 1.155
Lipop RMSE 0.655 (GC) 0.555 0.625
PDBbind-F RMSE 1.440 (GC) 1.391 0.721
PDBbind-C RMSE 1.920 (GC) 2.173 0.931
PDBbind-R RMSE 1.650 (GC) 1.486 0.889
BACE ROC_AUC 0.806 (Weave) N.A. 0.849
HIV ROC_AUC 0.763 (GC) 0.776 0.777
PCBA PRC_AUC 0.136 (GC) 0.335 0.276
MUV PRC_AUC 0.109 (Weave) 0.041 0.096
ChEMBL ROC_AUC N.A. 0.739 0.750
Tox21 ROC_AUC 0.829 (GC) 0.851 0.845
SIDER ROC_AUC 0.638 (GC) 0.676 0.68
ClinTox ROC_AUC 0.832 (GC) 0.864 0.888
BBBP ROC_AUC 0.690 (Weave) 0.738 0.739
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].