pcko1 / Deep-Drug-Coder

License: MIT
A tensorflow.keras generative neural network for de novo drug design, published as a first-author paper in Nature Machine Intelligence while the author worked at AstraZeneca.

Programming Languages

Python
Jupyter Notebook

Projects that are alternatives of or similar to Deep-Drug-Coder

chemicalx
A PyTorch and TorchDrug based deep learning library for drug pair scoring.
Stars: ✭ 176 (+23.08%)
Mutual labels:  drug-discovery, smiles-strings
pysmiles
A lightweight python-only library for reading and writing SMILES strings
Stars: ✭ 95 (-33.57%)
Mutual labels:  smiles-strings
Word-Level-Eng-Mar-NMT
Translating English sentences to Marathi using Neural Machine Translation
Stars: ✭ 37 (-74.13%)
Mutual labels:  lstm-neural-networks
ReinventCommunity
No description or website provided.
Stars: ✭ 103 (-27.97%)
Mutual labels:  denovo-design
lstm-numpy
Vanilla LSTM with numpy
Stars: ✭ 17 (-88.11%)
Mutual labels:  lstm-neural-networks
Audio Classification using LSTM
Classification of Urban Sound Audio Dataset using LSTM-based model.
Stars: ✭ 47 (-67.13%)
Mutual labels:  lstm-neural-networks
bidd-molmap
MolMap: An Efficient Convolutional Neural Network Based Molecular Deep Learning Tool
Stars: ✭ 102 (-28.67%)
Mutual labels:  drug-discovery
screenlamp
screenlamp is a Python toolkit for hypothesis-driven virtual screening
Stars: ✭ 20 (-86.01%)
Mutual labels:  drug-discovery
GLaDOS
Web Interface for ChEMBL @ EMBL-EBI
Stars: ✭ 28 (-80.42%)
Mutual labels:  drug-discovery
DeepLearning-PadhAI
All the code files related to the deep learning course from PadhAI
Stars: ✭ 88 (-38.46%)
Mutual labels:  lstm-neural-networks
py4chemoinformatics
Python for chemoinformatics
Stars: ✭ 78 (-45.45%)
Mutual labels:  drug-discovery
Hierarchical-attention-network
My implementation of "Hierarchical Attention Networks for Document Classification" in Keras
Stars: ✭ 26 (-81.82%)
Mutual labels:  lstm-neural-networks
bitcoin-prediction
bitcoin prediction algorithms
Stars: ✭ 21 (-85.31%)
Mutual labels:  lstm-neural-networks
contextualLSTM
Contextual LSTM for NLP tasks like word prediction and word embedding creation for Deep Learning
Stars: ✭ 28 (-80.42%)
Mutual labels:  lstm-neural-networks
Predict-next-word
An LSTM example using tensorflow to predict the next word in a text
Stars: ✭ 37 (-74.13%)
Mutual labels:  lstm-neural-networks
Quantifying-ESG-Alpha-using-Scholar-Big-Data-ICAIF-2020
Quantifying ESG Alpha using Scholar Big Data: An Automated Machine Learning Approach.
Stars: ✭ 42 (-70.63%)
Mutual labels:  lstm-neural-networks
amr
Official adversarial mixup resynthesis repository
Stars: ✭ 31 (-78.32%)
Mutual labels:  autoencoders
ImagesSequencesPredictions
Radar echo extrapolation with ConvLSTM: train the model and extrapolate.
Stars: ✭ 59 (-58.74%)
Mutual labels:  lstm-neural-networks
gnn-lspe
Source code for GNN-LSPE (Graph Neural Networks with Learnable Structural and Positional Representations), ICLR 2022
Stars: ✭ 165 (+15.38%)
Mutual labels:  molecules
continuous Bernoulli
C programs for the simulator, transformation, and test statistic of the continuous Bernoulli distribution; also covers the continuous binomial and continuous trinomial distributions.
Stars: ✭ 22 (-84.62%)
Mutual labels:  autoencoders

DeepDrugCoder (DDC): Heteroencoder for molecular encoding and de novo generation

Python 3.6 · License: GPL v3 · DOI

NOTE: The code now supports only tensorflow-gpu >= 2.0.


Code for the paper Direct Steering of de novo Molecular Generation using Descriptor Conditional Recurrent Neural Networks (cRNNs).

Cheers if you were brought here by this blog post. If not, give it a read :)


Deep learning has acquired considerable momentum over the past couple of years in the domain of de novo drug design. In particular, transfer and reinforcement learning have demonstrated the capability of steering the generative process towards chemical regions of interest. In this work, we propose a simple approach to the focused generative task by constructing a conditional recurrent neural network (cRNN). For this purpose, we aggregate selected molecular descriptors along with a QSAR-based bioactivity label and transform them into the initial LSTM states before starting the generation of SMILES strings that are focused towards the desired properties. We thus tackle the inverse QSAR problem directly by training on molecular descriptors, instead of iteratively optimizing around a set of candidate molecules. The trained cRNNs are able to generate molecules near multiple specified conditions, while maintaining an output that is more focused than that of traditional RNNs yet less focused than that of autoencoders. The method shows promise for applications in both scaffold hopping and ligand series generation, depending on whether the cRNN is trained on calculated scalar molecular properties or on structural fingerprints. This also demonstrates that fingerprint-to-molecule decoding is feasible, leading to molecules that are similar, if not identical, to the ones the fingerprints originated from. Additionally, the cRNN is able to generate a larger fraction of predicted active compounds against the DRD2 receptor than an RNN trained with the transfer learning model.
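To make the conditioning mechanism concrete, here is a minimal tensorflow.keras sketch of the core idea: a descriptor vector is mapped through dense layers into the initial hidden and cell states of a decoder LSTM, which is then trained with teacher forcing on one-hot SMILES characters. All dimensions and layer names below are illustrative assumptions, not the repository's actual architecture.

    from tensorflow.keras import layers, Model

    # Illustrative sizes (assumptions, not taken from the paper or repo)
    n_cond = 8        # molecular descriptors + QSAR bioactivity label
    latent_dim = 256  # width of the decoder LSTM
    vocab_size = 35   # SMILES character vocabulary
    max_len = 100     # maximum SMILES length

    # Map the condition vector to the initial LSTM states
    cond = layers.Input(shape=(n_cond,), name="conditions")
    h0 = layers.Dense(latent_dim, activation="relu", name="to_h0")(cond)
    c0 = layers.Dense(latent_dim, activation="relu", name="to_c0")(cond)

    # Teacher-forced decoder: one-hot SMILES in, next-character probabilities out
    tokens = layers.Input(shape=(max_len, vocab_size), name="smiles_in")
    seq = layers.LSTM(latent_dim, return_sequences=True)(
        tokens, initial_state=[h0, c0])
    out = layers.Dense(vocab_size, activation="softmax")(seq)

    crnn = Model([cond, tokens], out)
    crnn.compile(optimizer="adam", loss="categorical_crossentropy")

At sampling time the same dense layers seed the states from the target descriptors (or a fingerprint), and characters are drawn autoregressively until an end token appears.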

Only a GPU version of the model is supported, so you need access to a GPU to use it.

Please refer to the demo notebooks for usage details.

[Figure from the manuscript]


Installation

  • Install git-lfs as instructed here; it is required to download the datasets.
  • Clone the repo and navigate into it.
  • Create the predefined Python 3.6 conda environment with conda env create -f env/ddc_env.yml. This ensures you get the correct versions of RDKit and cudatoolkit.
  • Run pip install . to install the remaining dependencies and add the package to the Python path.
  • Add the environment to Jupyter's kernel drop-down list with python -m ipykernel install --user --name ddc_env --display-name "ddc_env (python_3.6.7)".

Usage

Activate the environment, then import the package in Python:

conda activate ddc_env
from ddc_pub import ddc_v3 as ddc

Methods

The main methods of the model object are listed below; a combined usage sketch follows the list.

  • fit(): Fit a DDC model to the dataset.
  • vectorize(): Convert a binary RDKit molecule to its one-hot-encoded representation.
  • transform(): Encode a vectorized molecule into its latent representation.
  • predict(): Decode a latent representation into a SMILES string and return its Negative Log-Likelihood (NLL).
  • predict_batch(): Decode a list of latent representations into SMILES strings and return their NLLs.
  • get_smiles_nll(): Back-calculate the NLL of a known SMILES string, as if it were sampled by the biased decoder.
  • get_smiles_nll_batch(): Back-calculate the NLLs of a batch of known SMILES strings, as if they were sampled by the biased decoder.
  • summary(): Display essential architectural parameters.
  • get_graphs(): Export model graphs to .png files using pydot and graphviz (might fail).
  • save(): Save the model as a .zip archive.
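A minimal end-to-end sketch using the method names above. The model class name, constructor arguments, the temp parameter, and all paths are assumptions; check the demo notebooks for the real call patterns.

    from rdkit import Chem
    from ddc_pub import ddc_v3 as ddc

    # Load a pre-trained model (class name and path are hypothetical)
    model = ddc.DDC(model_name="models/pretrained_model")
    model.summary()  # display essential architectural parameters

    # Encode a molecule: vectorize() expects binary RDKit molecules
    mol = Chem.MolFromSmiles("CCO")          # ethanol
    vec = model.vectorize([mol.ToBinary()])  # one-hot-encoded representation
    latent = model.transform(vec)            # latent representation

    # Decode back to a SMILES string together with its NLL
    # (the temp argument is an assumed sampling-temperature knob)
    smiles, nll = model.predict(latent[0], temp=0)

    # Back-calculate the NLL of a known SMILES under the biased decoder
    nll_known = model.get_smiles_nll("c1ccccc1")

    model.save("my_model")  # writes a .zip archive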

Issues

Please report all installation or usage issues by opening an issue in this repo.

Cite

Kotsias, P.-C. et al. Direct steering of de novo molecular generation with descriptor conditional recurrent neural networks. Nat. Mach. Intell. 2 (2020).

or in bibtex:

@article{Kotsias2020,
  author    = {Kotsias, Panagiotis-Christos and others},
  issn      = {2522-5839},
  journal   = {Nature Machine Intelligence},
  publisher = {Springer US},
  title     = {Direct steering of de novo molecular generation with descriptor conditional recurrent neural networks},
  url       = {https://doi.org/10.1038/s42256-020-0174-5},
  volume    = {2},
  year      = {2020}
}