
aqlaboratory / Rgn

License: MIT
Recurrent Geometric Networks for end-to-end differentiable learning of protein structure

Programming Languages

Python

Projects that are alternatives of or similar to Rgn

Dlpython course
Examples for the course "Programming Deep Neural Networks in Python"
Stars: ✭ 266 (-11.92%)
Mutual labels:  deep-neural-networks
Deep Diamond
A fast Clojure Tensor & Deep Learning library
Stars: ✭ 288 (-4.64%)
Mutual labels:  deep-neural-networks
Deep Learning Uncertainty
Literature survey, paper reviews, experimental setups and a collection of implementations for baselines methods for predictive uncertainty estimation in deep learning models.
Stars: ✭ 296 (-1.99%)
Mutual labels:  deep-neural-networks
Rad
RAD: Reinforcement Learning with Augmented Data
Stars: ✭ 268 (-11.26%)
Mutual labels:  deep-neural-networks
Parakeet
PAddle PARAllel text-to-speech toolKIT (supporting WaveFlow, WaveNet, Transformer TTS and Tacotron2)
Stars: ✭ 279 (-7.62%)
Mutual labels:  deep-neural-networks
Dab
Data Augmentation by Backtranslation (DAB) ヽ( •_-)ᕗ
Stars: ✭ 294 (-2.65%)
Mutual labels:  deep-neural-networks
L2c
Learning to Cluster. A deep clustering strategy.
Stars: ✭ 262 (-13.25%)
Mutual labels:  deep-neural-networks
Yolo V2 Pytorch
YOLO for object detection tasks
Stars: ✭ 302 (+0%)
Mutual labels:  deep-neural-networks
Bigdata18
Transfer learning for time series classification
Stars: ✭ 284 (-5.96%)
Mutual labels:  deep-neural-networks
Cascaded Fcn
Source code for the MICCAI 2016 Paper "Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields"
Stars: ✭ 296 (-1.99%)
Mutual labels:  deep-neural-networks
Bmw Tensorflow Inference Api Gpu
This is a repository for an object detection inference API using the Tensorflow framework.
Stars: ✭ 277 (-8.28%)
Mutual labels:  deep-neural-networks
Awesome Distributed Deep Learning
A curated list of awesome Distributed Deep Learning resources.
Stars: ✭ 277 (-8.28%)
Mutual labels:  deep-neural-networks
Adversarial Examples Pytorch
Implementation of Papers on Adversarial Examples
Stars: ✭ 293 (-2.98%)
Mutual labels:  deep-neural-networks
Twitter Sent Dnn
Deep Neural Network for Sentiment Analysis on Twitter
Stars: ✭ 270 (-10.6%)
Mutual labels:  deep-neural-networks
Agentnet
Deep Reinforcement Learning library for humans
Stars: ✭ 298 (-1.32%)
Mutual labels:  deep-neural-networks
Awesome Speech Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Stars: ✭ 257 (-14.9%)
Mutual labels:  deep-neural-networks
Sednn
deep learning based speech enhancement using keras or pytorch, make it easy to use
Stars: ✭ 288 (-4.64%)
Mutual labels:  deep-neural-networks
Attention is all you need
Transformer of "Attention Is All You Need" (Vaswani et al. 2017) by Chainer.
Stars: ✭ 303 (+0.33%)
Mutual labels:  deep-neural-networks
Awesome Deep Vision Web Demo
A curated list of awesome deep vision web demo
Stars: ✭ 298 (-1.32%)
Mutual labels:  deep-neural-networks
Model Compression Papers
Papers for deep neural network compression and acceleration
Stars: ✭ 296 (-1.99%)
Mutual labels:  deep-neural-networks

Recurrent Geometric Networks

This is the reference (TensorFlow) implementation of recurrent geometric networks (RGNs), described in the paper End-to-end differentiable learning of protein structure.

Installation and requirements

Extract all files in the model directory into a single location and use protling.py, described further below, to train new models and predict structures. Below are the language requirements and package dependencies:

  • Python 2.7
  • TensorFlow >= 1.4 (tested up to 1.12)
  • setproctitle
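Since the codebase is pinned to an older TensorFlow range, a small helper (purely illustrative, not part of the repository) can check whether an installed version falls within the tested 1.4–1.12 window:

```python
def supported(version_string, low=(1, 4), high=(1, 12)):
    """Return True if a TensorFlow version string falls in the tested range."""
    # Compare only the major and minor components, e.g. "1.12.0" -> (1, 12)
    version = tuple(int(x) for x in version_string.split(".")[:2])
    return low <= version <= high

print(supported("1.12.0"))  # True
print(supported("2.4.1"))   # False
```

In practice you would pass `tensorflow.__version__` to this helper after importing TensorFlow.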

Usage

The protling.py script facilitates training of and prediction with RGN models. Below are typical use cases. The script also accepts a number of command-line options whose functionality can be queried using the --help option.

Train a new model or continue training an existing model

RGN models are described using a configuration file that controls hyperparameters and architectural choices. For a list of available options and their descriptions, see its documentation. Once a configuration file has been created, along with a suitable dataset (download a ready-made ProteinNet data set or create a new one from scratch using the convert_to_tfrecord.py script), the following directory structure must be created:

<baseDirectory>/runs/<runName>/<datasetName>/<configurationFile>
<baseDirectory>/data/<datasetName>/[training,validation,testing]

where the first path points to the configuration file and the second to the directories containing the training, validation, and (optionally) testing sets. Note that <runName> and <datasetName> are user-defined variables, specified in the configuration file, that encode the names of the model and dataset, respectively.
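The layout above can be sketched in a few lines of Python. The run and dataset names below are purely illustrative; they must match the values set in your configuration file:

```python
# Sketch of creating the expected RGN directory layout. "demo_run" and
# "demo_dataset" are hypothetical placeholders for <runName> and <datasetName>.
import os
import tempfile

base = os.path.join(tempfile.mkdtemp(), "rgn_base")  # stands in for <baseDirectory>
run_name = "demo_run"                                # <runName>
dataset = "demo_dataset"                             # <datasetName>

# Directory that will hold the configuration file
run_dir = os.path.join(base, "runs", run_name, dataset)
os.makedirs(run_dir)

# Data splits read by protling.py (place only one training "thinning" at a time)
for split in ("training", "validation", "testing"):
    os.makedirs(os.path.join(base, "data", dataset, split))
```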

Training of a new model can then be invoked by calling:

python protling.py <configurationFilePath> -d <baseDirectory>

Download a pre-trained model for an example of a correctly defined directory structure. Note that ProteinNet training sets come in multiple "thinnings" and only one should be used at a time by placing it in the main training directory.

To resume training an existing model, run the command above for a previously trained model with saved checkpoints.

Predict sequences in ProteinNet TFRecords format using a trained model

To predict the structures of proteins already in ProteinNet TFRecord format using an existing model with a saved checkpoint, call:

python protling.py <configFilePath> -d <baseDirectory> -p -g0

This predicts the structures of the dataset specified in the configuration file. By default only the validation set is predicted, but this can be changed with the -e option, e.g. -e weighted_testing to predict the test set. The -g0 option runs prediction on the GPU with index 0; if a different GPU is available, change the index accordingly.

Predict structure of a single new sequence using a trained model

If all you have is a single sequence for which you wish to make a prediction, there are multiple steps that must be performed. First, a PSSM needs to be created by running JackHMMer (or a similar tool) against a sequence database, the resulting PSSM must be combined with the sequence in a ProteinNet record, and the file must be converted to the TFRecord format. Predictions can then be made as previously described.

Below is an example of how to do this using the supplied scripts (in data_processing) and one of the pre-trained models, assumed to be unzipped in <baseDirectory>. HMMER must also be installed. The raw sequence databases (<fastaDatabase>) used in building PSSMs can be obtained from here. The script below assumes that <sequenceFile> only contains a single sequence in the FASTA file format.

jackhmmer.sh <sequenceFile> <fastaDatabase>
python convert_to_proteinnet.py <sequenceFile>
python convert_to_tfrecord.py <sequenceFile>.proteinnet <sequenceFile>.tfrecord 42
cp <sequenceFile>.tfrecord <baseDirectory>/data/<datasetName>/testing/
python protling.py <baseDirectory>/runs/<runName>/<datasetName>/<configurationFile> -d <baseDirectory> -p -e weighted_testing -g0

The first line searches the supplied database for matches to the supplied sequence and extracts a PSSM from the results, generating multiple new files. The second line uses these to construct a text-based ProteinNet file (with 42 entries per evolutionary profile, compatible with the pre-trained RGN models). The third line converts the file to TFRecord format, the fourth copies it to the testing directory of a pre-trained model, and the fifth predicts the structure using the pre-trained RGN model. The outputs will be placed in <baseDirectory>/runs/<runName>/<datasetName>/<latestIterationNumber>/outputsTesting/ and consist of two files: a .tertiary file containing the atomic coordinates, and a .recurrent_states file containing the RGN latent representation of the sequence.
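Assuming the .tertiary output follows the ProteinNet text convention (three whitespace-separated lines holding the x, y, and z coordinates of the backbone atoms in sequence order), a minimal loader might look like this; verify the assumption against your own output files before relying on it:

```python
# Minimal sketch for loading a .tertiary file, assuming the ProteinNet text
# convention: one line each for the x, y and z coordinates of all atoms.
def load_tertiary(path):
    with open(path) as f:
        rows = [[float(v) for v in line.split()] for line in f if line.strip()]
    assert len(rows) == 3, "expected one line each for x, y and z"
    # Transpose into a list of (x, y, z) tuples, one per atom
    return list(zip(*rows))

# Tiny self-contained demonstration with two fake atoms
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.tertiary")
with open(path, "w") as f:
    f.write("0.0 100.0\n0.0 0.0\n0.0 50.0\n")

coords = load_tertiary(path)
print(coords)  # [(0.0, 0.0, 0.0), (100.0, 0.0, 50.0)]
```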

Pre-trained models

Below we make available pre-trained RGN models, using the ProteinNet 7–12 datasets, as checkpointed TF graphs. These models are identical to the ones used to report results in the Cell Systems paper, except for the CASP 11 model, which differs slightly because it was trained with a newer codebase.

CASP7 CASP8 CASP9 CASP10 CASP11 CASP12

To train new models from scratch using the same hyperparameter choices as the above models, use the appropriate configuration file from here.

PyTorch implementation

The reference RGN implementation is currently only available in TensorFlow; however, the OpenProtein project implements various aspects of the RGN model in PyTorch, and PyTorch-RGN is a work-in-progress implementation of the RGN model.

Reference

End-to-end differentiable learning of protein structure, Cell Systems 2019

Funding

This work was supported by NIGMS grant P50GM107618 and NCI grant U54-CA225088.
