simonfqy / PADME

Licence: MIT license
This is the repository containing the source code for my Master's thesis research, about predicting drug-target interaction using deep learning.

Programming Languages

Python, Shell, C

PADME: A Deep Learning-Based Framework for Drug-Target Interaction Prediction

This is the repository containing the source code for my Master's thesis research, namely predicting drug-target interaction using Deep Neural Networks. The name PADME stands for "Protein And Drug Molecule interaction prEdiction", and also happens to be the name of the heroine of the Star Wars prequel trilogy. The paper can be found here: https://arxiv.org/abs/1807.09741

It currently depends on a version of the DeepChem Python package released in November 2017. Once I am done with the first version of the current paper, I will need to make major modifications so that it is compatible with the current version of DeepChem. The dcCustom folder is a package that inherits some classes from DeepChem. Some of the implementations are customized, hence the name dcCustom, which means "Customized version of DeepChem".

The Python script driver.py at the top level is in charge of calling functions in dcCustom to execute the program. I assume a Linux system. The .sh files, whose names all start with the word drive, call driver.py and specify the options to pass to the program, such as the dataset to be analyzed, the model to be used, and whether cross-validation should be performed. Like DeepChem, PADME cannot use multiple GPUs to parallelize a single task, so running one process per GPU is the most efficient choice. Otherwise only one GPU does the work, while the extra GPUs have their memory completely occupied without doing anything useful. For this reason, CUDA_VISIBLE_DEVICES is set in each .sh file, so that multiple GPUs can be used with each one running its own process. To run the program, simply type the path to the corresponding shell script on the Linux command line.
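The one-process-per-GPU idea can also be sketched from Python. This is only an illustration and not code from the repository: the helper names gpu_env, plan_jobs and launch are hypothetical, as are the script names in the usage note. It gives each child process an environment in which CUDA_VISIBLE_DEVICES exposes exactly one GPU.

```python
import os
import subprocess


def gpu_env(gpu_id, base_env=None):
    """Copy an environment and expose exactly one GPU to the child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def plan_jobs(scripts, gpu_ids):
    """Pair each script with its own GPU; refuse to share a GPU between jobs."""
    if len(scripts) > len(gpu_ids):
        raise ValueError("more scripts than GPUs")
    return list(zip(scripts, gpu_ids))


def launch(scripts, gpu_ids):
    """Start every script in its own process and wait for all of them."""
    procs = [
        subprocess.Popen(["bash", script], env=gpu_env(gpu))
        for script, gpu in plan_jobs(scripts, gpu_ids)
    ]
    return [proc.wait() for proc in procs]  # exit codes, in launch order
```

For example, launch(["./drive_a.sh", "./drive_b.sh"], [0, 1]) would run two (hypothetically named) scripts on GPUs 0 and 1. Because the .sh files in the repository already set CUDA_VISIBLE_DEVICES themselves, a value set inside a script would take precedence over the one passed in here.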

The protein descriptor used is PSC (Protein Sequence Composition). The descriptors are stored as files in the respective dataset folders, such as /full_toxcast. You can specify the path of the protein sequence descriptor files in the .sh scripts.
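To give a rough idea of what a sequence-composition descriptor looks like, here is a minimal sketch. It assumes the descriptor is built from the frequencies of single residues, residue pairs and residue triples over the 20 standard amino acids; the exact definition and normalisation behind the descriptor files in this repository are not described here and may differ. The function names kmer_composition and sequence_composition are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_composition(sequence, k):
    """Return the frequency of every possible k-mer, in a fixed order."""
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    # Slide a window of length k over the sequence; windows containing
    # non-standard residues are skipped.
    windows = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    valid = [w for w in windows if w in counts]
    for w in valid:
        counts[w] += 1
    total = len(valid)
    return [counts[m] / total if total else 0.0 for m in kmers]


def sequence_composition(sequence, max_k=3):
    """Concatenate the k-mer frequencies for k = 1 .. max_k."""
    features = []
    for k in range(1, max_k + 1):
        features.extend(kmer_composition(sequence.upper(), k))
    return features
```

With max_k=3 the vector has 20 + 400 + 8000 = 8420 entries per protein, regardless of sequence length, which is what makes a descriptor of this kind usable as a fixed-size model input.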

Currently it works fine for graphconvreg, weave_regression, tf_regression, and mpnn. The classification models, such as weave and graphconv, still need updates before they work correctly.

You must first have DeepChem installed for PADME to work correctly, which in turn requires you to install TensorFlow.

Other folders, such as oldCode and phase1, are not related to PADME; they belong to the first phase of my project. You can ignore them.

Built with

Python - Processing data and constructing the deep learning models

Author

simonfqy (Qingyuan Feng)
