
malllabiisc / ProteinGCN

License: Apache-2.0
ProteinGCN: Protein model quality assessment using Graph Convolutional Networks




Source code for the paper: ProteinGCN: Protein model quality assessment using Graph Convolutional Networks

Overview of ProteinGCN: Given a protein structure, ProteinGCN first builds a protein graph and uses a GCN to learn atom embeddings. It then pools the atom embeddings into residue-level embeddings, which are passed through a non-linear fully connected layer to predict the local scores. The residue embeddings are further pooled into a global protein embedding, which is used in the same way to predict the global score.
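The two pooling stages in this overview can be illustrated with a minimal sketch. It assumes mean pooling and uses toy 2-dimensional embeddings; the pooling operator, the helper names mean_pool and pool_atoms_to_residues, and the data are illustrative assumptions, not taken from the ProteinGCN code. The GCN and the fully connected prediction layers are omitted.

```python
from statistics import fmean


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors element-wise."""
    return [fmean(column) for column in zip(*vectors)]


def pool_atoms_to_residues(atom_embeddings, atom_to_residue):
    """Group atom embeddings by residue index and mean-pool each group.

    Assumption: mean pooling. The source only says the embeddings are pooled.
    """
    groups = {}
    for embedding, residue in zip(atom_embeddings, atom_to_residue):
        groups.setdefault(residue, []).append(embedding)
    # Return residue embeddings ordered by residue index.
    return [mean_pool(groups[r]) for r in sorted(groups)]


# Toy input: 4 atoms with 2-dimensional embeddings, belonging to 2 residues.
atom_embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 10.0]]
atom_to_residue = [0, 0, 1, 1]

# Stage 1: atom embeddings -> residue embeddings (inputs for local scores).
residue_embeddings = pool_atoms_to_residues(atom_embeddings, atom_to_residue)

# Stage 2: residue embeddings -> one protein embedding (input for global score).
protein_embedding = mean_pool(residue_embeddings)

print(residue_embeddings)  # [[2.0, 3.0], [6.0, 8.0]]
print(protein_embedding)   # [4.0, 5.5]
```

In the real model each residue embedding would feed the local-score predictor and the protein embedding would feed the global-score predictor.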

Dependencies

  • Compatible with PyTorch 1.0 and Python 3.x.
  • Dependencies can be installed using the requirements.txt file.

Dataset:

  • We train the ProteinGCN model on Rosetta-300k and test it on both the Rosetta-300k and CASP13 datasets for local (residue-level) and global quality assessment predictions.

Training model:

  1. Install all the requirements by executing pip install -r requirements.txt.

  2. Install the required protein .pdb processing library by executing sh preprocess.sh, which clones and installs the corresponding GitHub repository.

  3. Next, run python preprocess_pdb_to_pkl.py, which creates the .pkl files used for model training from the dataset. It defaults to the sample dataset provided with the code at ./data/. To use the original datasets, change the paths accordingly.

  4. To start a training run:

python train.py trial_run --epochs 10

On success, this creates a folder named trial_run under ./data/pkl/results/ containing the test results, test_results.csv (each row holds the protein model name, target global score, predicted global score, target local scores, and predicted local scores), and the best model checkpoint, model_best.pth.tar. The remaining training arguments and their defaults are in arguments.py. Multi-GPU training on a single server is supported by default through PyTorch DataParallel; to enable it, list the GPUs to use in the CUDA_VISIBLE_DEVICES environment variable.

  5. To get the final Pearson correlation scores, run:

python correlation.py -file ./data/pkl/results/trial_run/test_results.csv
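
As a rough illustration of the metric reported by the correlation command above, the sketch below computes a Pearson correlation between target and predicted global scores. It is not the code in correlation.py: the function name pearson and the score values are made up for illustration, and reading the actual test_results.csv layout is left out.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centered products and squares (the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only, not results from the paper or the repository.
target_global = [0.2, 0.4, 0.6, 0.8]
predicted_global = [0.25, 0.35, 0.65, 0.75]

r = pearson(target_global, predicted_global)
print(f"Pearson r = {r:.3f}")
```

The same function could be applied per protein model to the target and predicted local scores, once those columns have been parsed into lists of floats.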

Running inference using pretrained ProteinGCN on new models:

  1. To run inference on new models, first apply preprocessing steps 1-3 above to the new data. This converts the .pdb files to the pickle files required by the model. Depending on the use case, some changes to preprocess_pdb_to_pkl.py may be required:

    1. Evaluating the performance of ProteinGCN on new models: To evaluate model performance, ground-truth global and local scores must be available for the new models. The function get_targets should be changed to extract these targets for a given protein .pdb filename.
    2. Using ProteinGCN to predict scores for new models: In this use case there may be no ground-truth global and local scores, so get_targets should be modified to return a fixed value (say 1) for the global and local scores. Correlations cannot be calculated in this case.
  2. We have published our best ProteinGCN model, which was trained on the Rosetta-300k dataset. To run this pretrained model on the preprocessed data, execute:

python train.py trial_testrun --pretrained ./pretrained/pretrained.pth.tar --epochs 0 --train 0 --val 0 --test 1

The data directory currently defaults to the sample data provided with the repository. To point it at new data, change the corresponding directory arguments in arguments.py.
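
For the prediction-only use case described above, a placeholder for get_targets might look like the following sketch. The name get_targets_placeholder, its parameters, and the (global score, list of local scores) return shape are assumptions made for illustration; check the actual get_targets in preprocess_pdb_to_pkl.py and match its real signature and return type.

```python
def get_targets_placeholder(pdb_filename, num_residues, fill_value=1.0):
    """Return dummy targets for a protein model that has no ground truth.

    Hypothetical signature: the real get_targets may take different
    arguments and return its scores in a different structure.
    """
    if num_residues < 1:
        raise ValueError("num_residues must be at least 1")
    # pdb_filename is unused here; it is kept only because the source says
    # targets are looked up from a given protein pdb filename.
    global_score = fill_value
    local_scores = [fill_value] * num_residues  # one dummy score per residue
    return global_score, local_scores


# Example with a made-up filename and residue count.
global_score, local_scores = get_targets_placeholder("model_0001.pdb", num_residues=5)
print(global_score, local_scores)
```

Because these targets are constants, any correlation computed against them is meaningless; only the predicted columns of the results file are useful in this mode.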

Please cite the following paper if you use this code in your work.

@article {Sanyal2020.04.06.028266,
	author = {Sanyal, Soumya and Anishchenko, Ivan and Dagar, Anirudh and Baker, David and Talukdar, Partha},
	title = {ProteinGCN: Protein model quality assessment using Graph Convolutional Networks},
	year = {2020},
	doi = {10.1101/2020.04.06.028266},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2020/04/07/2020.04.06.028266},
	journal = {bioRxiv}
}

For any clarification, comments, or suggestions please create an issue or contact Soumya.
