a-r-j / Graphein

Licence: MIT
Protein Graph Library

DOI: 10.1101/2020.07.15.204701 | Project Status: Active (the project has reached a stable, usable state and is being actively developed) | License: MIT

Documentation | Paper

Protein Graph Library

This package builds several types of graph-based representations of proteins. It is compatible with standard formats and provides graph objects designed for ease of use with popular deep learning libraries.

What's New?

  • Protein Graph Visualisation!
  • RNA Graph Construction from Dotbracket notation

Example usage

Creating a Protein Graph

from graphein.construct_graphs import ProteinGraph

# Initialise ProteinGraph class (set get_contacts_path to your local GetContacts installation)
pg = ProteinGraph(granularity='CA', insertions=False, keep_hets=True,
                  node_featuriser='meiler', get_contacts_path='/Users/arianjamasb/github/getcontacts',
                  pdb_dir='examples/pdbs/',
                  contacts_dir='examples/contacts/',
                  exclude_waters=True, covalent_bonds=False, include_ss=True)

# Create residue-level graphs. Chain selection is either 'all' or a list e.g. ['A', 'B', 'D'] specifying the polypeptide chains to capture

# DGLGraph From PDB Accession Number
graph = pg.dgl_graph_from_pdb_code('3eiy', chain_selection='all')
# DGLGraph From PDB file
graph = pg.dgl_graph_from_pdb_file(file_path='examples/pdbs/pdb3eiy.pdb', contact_file='examples/contacts/3eiy_contacts.tsv', chain_selection='all')

# Create atom-level graphs
graph = pg._make_atom_graph(pdb_code='3eiy', graph_type='bigraph')

Creating a Protein Mesh

from graphein.construct_meshes import ProteinMesh
# Initialise ProteinMesh class
pm = ProteinMesh()

# PyTorch3D Mesh Object from PDB Code
verts, faces, aux = pm.create_mesh(pdb_code='3eiy', out_dir='examples/meshes/')
# PyTorch3D Mesh Object from PDB File
verts, faces, aux = pm.create_mesh(pdb_file='examples/pdbs/pdb3eiy.pdb')

Creating an RNA Graph

from graphein.construct_graphs import RNAGraph
# Initialise RNAGraph Constructor
rg = RNAGraph()
# Build the graph from a dotbracket & optional sequence
rna = rg.dgl_graph_from_dotbracket('..(((((..(((...)))..)))))...', sequence='UUGGAGUACACAACCUGUACACUCUUUC')
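
To make the dot-bracket input concrete, here is a minimal sketch of how such a string can be turned into graph edges. It is not Graphein's implementation: `dotbracket_to_edges` is a hypothetical helper, it assumes only `(`, `)` and `.` appear (no pseudoknot brackets), and it uses the standard stack-matching approach for nested parentheses.

```python
def dotbracket_to_edges(dotbracket, sequence=None):
    """Return (backbone_edges, basepair_edges) for a dot-bracket string.

    Hypothetical helper for illustration; nodes are 0-based nucleotide indices.
    """
    if sequence is not None and len(sequence) != len(dotbracket):
        raise ValueError("sequence and dot-bracket string must have equal length")

    # Consecutive nucleotides are linked along the backbone.
    backbone = [(i, i + 1) for i in range(len(dotbracket) - 1)]

    basepairs = []
    open_positions = []  # stack of indices of unmatched '('
    for i, symbol in enumerate(dotbracket):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {i}")
            # The most recent unmatched '(' pairs with this ')'.
            basepairs.append((open_positions.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {i}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return backbone, basepairs


backbone, basepairs = dotbracket_to_edges(
    "..(((((..(((...)))..)))))...",
    sequence="UUGGAGUACACAACCUGUACACUCUUUC",
)
print(len(backbone), "backbone edges;", len(basepairs), "base-pair edges")
```

Keeping backbone and base-pair edges in separate lists makes it easy to give them different edge types when building a graph object.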

Parameters

Graphs can be constructed according to walks through the graph in the figure below.

granularity: {'CA', 'CB', 'atom'} - node-level granularity of the graph
insertions: bool - keep atoms with multiple insertion positions
keep_hets: bool - keep heteroatoms (HETATM records)
node_featuriser: {'meiler', 'kidera'} - low-dimensional embeddings of amino acid physicochemical properties
pdb_dir: path to PDB files
contacts_dir: path to contact files generated by GetContacts
get_contacts_path: path to the GetContacts installation
exclude_waters: bool - exclude structural waters (set to False to retain them)
covalent_bonds: bool - keep covalent bond edges; if False, use only intramolecular interactions
include_ss: bool - compute protein secondary structure (SS) and surface features with DSSP and assign them as node features
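
The allowed values listed above can be checked before a constructor is called. The sketch below is not part of Graphein: `check_protein_graph_kwargs` is a hypothetical helper that encodes only the options documented in this section and reports any keyword argument that falls outside them.

```python
# Options copied from the parameter list in this README.
ALLOWED_VALUES = {
    "granularity": {"CA", "CB", "atom"},
    "node_featuriser": {"meiler", "kidera"},
}
BOOLEAN_FLAGS = (
    "insertions", "keep_hets", "exclude_waters", "covalent_bonds", "include_ss",
)


def check_protein_graph_kwargs(kwargs):
    """Return a list of human-readable problems (empty if none were found).

    Hypothetical helper; only keys present in `kwargs` are checked.
    """
    problems = []
    for name, allowed in ALLOWED_VALUES.items():
        if name in kwargs:
            value = kwargs[name]
            if not isinstance(value, str) or value not in allowed:
                problems.append(f"{name}={value!r} is not one of {sorted(allowed)}")
    for name in BOOLEAN_FLAGS:
        if name in kwargs and not isinstance(kwargs[name], bool):
            problems.append(
                f"{name} should be a bool, got {type(kwargs[name]).__name__}"
            )
    return problems


config = dict(granularity="CA", insertions=False, keep_hets=True,
              node_featuriser="meiler", exclude_waters=True,
              covalent_bonds=False, include_ss=True)
print(check_protein_graph_kwargs(config))
```

Returning a list of problems rather than raising on the first one lets a caller report every misconfigured option at once.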

Installation

  1. Create env:

    conda create --name graphein python=3.7
    conda activate graphein
    
  2. Install GetContacts

    Installation Instructions

    MacOS

     # Install get_contact_ticc.py dependencies
     $ conda install scipy numpy scikit-learn matplotlib pandas cython seaborn
     $ pip install ticc==0.1.4
      
     # Install vmd-python dependencies
     $ conda install netcdf4 numpy pandas seaborn expat tk=8.5  # Alternatively use pip
     $ brew install netcdf pyqt # Assumes https://brew.sh/ is installed
    
     # Install vmd-python library
     $ conda install -c conda-forge vmd-python
    
     # Set up getcontacts library
     $ git clone https://github.com/getcontacts/getcontacts.git
     $ echo "export PATH=`pwd`/getcontacts:\$PATH" >> ~/.bash_profile
     $ source ~/.bash_profile
    
     # Test installation
     $ cd getcontacts/example/5xnd
     $ get_dynamic_contacts.py --topology 5xnd_topology.pdb \
                               --trajectory 5xnd_trajectory.dcd \
                               --itypes hb \
                               --output 5xnd_hbonds.tsv
    

    Linux

       
      # Make sure you have git and conda installed and then run
    
      # Install get_contact_ticc.py dependencies
      conda install scipy numpy scikit-learn matplotlib pandas cython
      pip install ticc==0.1.4
      
      # Set up vmd-python library
      conda install -c https://conda.anaconda.org/rbetz vmd-python
      
      # Set up getcontacts library
      git clone https://github.com/getcontacts/getcontacts.git
      echo "export PATH=`pwd`/getcontacts:\$PATH" >> ~/.bashrc
      source ~/.bashrc
    
    
    
  3. Install Biopython & RDKit:

    N.B. DGLLife requires rdkit==2018.09.3

    conda install biopython
    conda install -c conda-forge rdkit==2018.09.3
    
  4. Install DSSP:

    We use DSSP for computing some protein features

    $ conda install -c salilab dssp
    
  5. Install PyTorch, DGL and DGL LifeSci:

    N.B. Make sure to install the build that matches your CUDA version

    # Install PyTorch: MacOS
    $ conda install pytorch torchvision -c pytorch                      # Only CPU Build
    
    # Install PyTorch: Linux
    $ conda install pytorch torchvision cpuonly -c pytorch              # For CPU Build
    $ conda install pytorch torchvision cudatoolkit=9.2 -c pytorch      # For CUDA 9.2 Build
    $ conda install pytorch torchvision cudatoolkit=10.1 -c pytorch     # For CUDA 10.1 Build
    $ conda install pytorch torchvision cudatoolkit=10.2 -c pytorch     # For CUDA 10.2 Build
    
    # Install DGL. N.B. We require 0.4.3 until compatibility with DGL 0.5.0+ is implemented
    $ pip install dgl==0.4.3
    
    # Install DGL LifeSci
    $ conda install -c dglteam dgllife
    
  6. Install PyTorch Geometric:

    $ pip install torch-scatter==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
    $ pip install torch-sparse==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
    $ pip install torch-cluster==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
    $ pip install torch-spline-conv==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
    $ pip install torch-geometric
    

    Here, ${CUDA} and ${TORCH} should be replaced by your CUDA version (cpu, cu92, cu101 or cu102) and PyTorch version (1.4.0, 1.5.0 or 1.6.0), respectively.

    N.B. Follow the instructions in the Torch-Geometric Docs to install the versions appropriate to your CUDA version.

  7. Install PyMOL and IPyMOL:

    $ conda install -c schrodinger pymol
    $ git clone https://github.com/cxhernandez/ipymol
    $ cd ipymol
    $ pip install . 
    

    N.B. The PyPI package appears to lag behind the GitHub repository. Constructing meshes requires functionality that is not present in the PyPI package, so install from the repository as shown.

  8. Install graphein:

    $ git clone https://www.github.com/a-r-j/graphein
    $ cd graphein
    $ pip install -e .
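
For step 6 above, the `${CUDA}` and `${TORCH}` substitution can be written out as a small script. This is only a sketch under the assumption that the wheel index URL pattern and version lists quoted in that step still apply; `pyg_install_commands` is a hypothetical helper that prints the commands rather than running them.

```python
PYG_EXTENSIONS = ("torch-scatter", "torch-sparse", "torch-cluster", "torch-spline-conv")
SUPPORTED_CUDA = ("cpu", "cu92", "cu101", "cu102")   # values listed in step 6
SUPPORTED_TORCH = ("1.4.0", "1.5.0", "1.6.0")        # values listed in step 6


def pyg_install_commands(cuda, torch_version):
    """Build the pip commands from step 6 for one CUDA / PyTorch combination."""
    if cuda not in SUPPORTED_CUDA:
        raise ValueError(f"cuda must be one of {SUPPORTED_CUDA}, got {cuda!r}")
    if torch_version not in SUPPORTED_TORCH:
        raise ValueError(
            f"torch_version must be one of {SUPPORTED_TORCH}, got {torch_version!r}"
        )
    index = f"https://pytorch-geometric.com/whl/torch-{torch_version}.html"
    commands = [
        f"pip install {package}==latest+{cuda} -f {index}"
        for package in PYG_EXTENSIONS
    ]
    commands.append("pip install torch-geometric")
    return commands


for command in pyg_install_commands("cu101", "1.6.0"):
    print(command)
```

Printing the commands first makes it easy to confirm the CUDA and PyTorch versions before anything is installed.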
    

Citing Graphein

Please consider citing Graphein if it proves useful in your work.

@article{Jamasb2020,
  doi = {10.1101/2020.07.15.204701},
  url = {https://doi.org/10.1101/2020.07.15.204701},
  year = {2020},
  month = jul,
  publisher = {Cold Spring Harbor Laboratory},
  author = {Arian Rokkum Jamasb and Pietro Lio and Tom Blundell},
  title = {Graphein - a Python Library for Geometric Deep Learning and Network Analysis on Protein Structures}
}