aimat-lab / gcnn_keras

Licence: MIT license
Graph convolution with tf.keras

Keras Graph Convolution Neural Networks

A set of layers for graph convolutions in TensorFlow Keras that use RaggedTensors.

General | Requirements | Installation | Documentation | Implementation details | Literature | Datasets | Examples | Issues | Citing | References

General

The kgcnn package contains several layer classes for building graph convolution models. Some models are provided as examples. Documentation is generated in docs. Comments, suggestions, and help are very welcome!

Requirements

kgcnn usually requires the latest version of tensorflow, which for simplicity is listed only as an extra requirement in setup.py. Other Python dependencies are listed in the setup.py requirements and are installed automatically. The following packages must be installed manually for full functionality:

  • tensorflow>=2.4.1
  • rdkit>=2020.03.4
  • openbabel>=3.0.1
  • pymatgen>=??.??.??
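
Since the packages above are not installed automatically, it can help to check which of them are present before using the corresponding features. The following is a minimal sketch using only the standard library; the helper name `optional_package_status` is hypothetical, and it assumes each package is importable under the name it is listed with.

```python
import importlib.util

# Packages listed above as manual installs.
# Assumption: the import name equals the listed package name.
OPTIONAL_PACKAGES = ("tensorflow", "rdkit", "openbabel", "pymatgen")


def optional_package_status(names=OPTIONAL_PACKAGES):
    """Return {name: bool}, True if the module can be located without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in optional_package_status().items():
    print(f"{name:12s} {'found' if found else 'missing'}")
```

This only reports whether a module can be found; it does not compare installed versions against the minimum versions listed above.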

Installation

Clone the repository or download the latest release and install it in editable mode:

pip install -e ./gcnn_keras

or install the latest release from the Python Package Index:

pip install kgcnn

Documentation

Auto-documentation is generated at https://kgcnn.readthedocs.io/en/latest/index.html.

Implementation details

Representation

Graph convolutions are most frequently used for either node or graph classification. In terms of size, one has to consider either a single large graph, e.g. a citation network, or many small (batched) graphs such as molecules. Graphs can be represented by an index list of connections plus feature information. Typical quantities in tensor format that describe a graph are listed below.

  • nodes: Node-list of shape (batch, [N], F) where N is the number of nodes and F is the node feature dimension.
  • edges: Edge-list of shape (batch, [M], F) where M is the number of edges and F is the edge feature dimension.
  • indices: Connection-list of shape (batch, [M], 2) where M is the number of edges. The indices denote a connection of incoming or receiving node i and outgoing or sending node j as (i, j).
  • state: Graph state information of shape (batch, F) where F denotes the feature dimension.

A major issue for graphs is their flexible size and shape when using mini-batches. For a graph implementation in the spirit of Keras, the batch dimension should also be kept between layers. This is realized by using RaggedTensors.
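
The idea of keeping a batch dimension for graphs of different size can be pictured without TensorFlow: all per-node entries are stored in one flat list, and a second list records how many entries belong to each graph. The sketch below is only a plain-Python illustration of this values-plus-row-lengths idea; the function names are hypothetical and not part of kgcnn or TensorFlow.

```python
def to_flat_values(batch):
    """Flatten a batch of variable-length lists into (values, row_lengths)."""
    values = [item for graph in batch for item in graph]
    row_lengths = [len(graph) for graph in batch]
    return values, row_lengths


def from_flat_values(values, row_lengths):
    """Rebuild the per-graph lists from flat values and row lengths."""
    batch, start = [], 0
    for length in row_lengths:
        batch.append(values[start:start + length])
        start += length
    return batch


# Node features (F=1) of three graphs with 2, 3 and 1 nodes.
nodes = [[[0.5], [1.0]], [[0.1], [0.2], [0.3]], [[2.0]]]
values, row_lengths = to_flat_values(nodes)
print(row_lengths)  # [2, 3, 1]
print(len(values))  # 6
```

Layers that act on each node independently can work on the flat `values` list directly, while the row lengths are enough to recover which node belongs to which graph.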

Input

Graph tensors for edge indices or attributes of multiple graphs are passed to the model as ragged tensors of shape (batch, None, Dim), where Dim denotes a fixed feature or index dimension. Such a ragged tensor has ragged_rank=1 with one ragged dimension indicated by None and is built from a value tensor plus a partition tensor. For example, the graph structure is represented by an index list of shape (batch, None, 2) with the index of the incoming or receiving node i and the outgoing or sending node j as (i, j). Note that an additional edge (j, i) is required for undirected graphs. A ragged constant can easily be created and passed to a model:

import tensorflow as tf
import numpy as np
idx = [[[0, 1], [1, 0]], [[0, 1], [1, 2], [2, 0]], [[0, 0]]]  # batch_size=3
# Get ragged tensor of shape (3, None, 2)
print(tf.ragged.constant(idx, ragged_rank=1, inner_shape=(2, )).shape)
print(tf.RaggedTensor.from_row_lengths(np.concatenate(idx), [len(i) for i in idx]).shape) 
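
Because undirected graphs need both (i, j) and (j, i), an edge list that stores each connection only once has to be extended before it is turned into a ragged tensor. A minimal plain-Python sketch of that step follows; `add_reverse_edges` is a hypothetical helper, not a kgcnn function.

```python
def add_reverse_edges(edge_indices):
    """Return the edge list with a reversed copy (j, i) for every edge (i, j).

    Duplicates are dropped and the order of first appearance is kept.
    """
    seen = set()
    result = []
    for i, j in edge_indices:
        for edge in ((i, j), (j, i)):
            if edge not in seen:
                seen.add(edge)
                result.append(list(edge))
    return result


directed = [[0, 1], [1, 2], [2, 0]]
print(add_reverse_edges(directed))
# [[0, 1], [1, 0], [1, 2], [2, 1], [2, 0], [0, 2]]
```

Any edge attributes would have to be duplicated in the same order, which this sketch does not handle.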

Model

Models can be set up in a functional way. Example of message passing built from fundamental operations:

import tensorflow.keras as ks
from kgcnn.layers.gather import GatherNodes
from kgcnn.layers.modules import DenseEmbedding, LazyConcatenate  # ragged support
from kgcnn.layers.pooling import PoolingLocalMessages, PoolingNodes

n = ks.layers.Input(shape=(None, 3), name='node_input', dtype="float32", ragged=True)
ei = ks.layers.Input(shape=(None, 2), name='edge_index_input', dtype="int64", ragged=True)

n_in_out = GatherNodes()([n, ei])
node_messages = DenseEmbedding(10, activation='relu')(n_in_out)
node_updates = PoolingLocalMessages()([n, node_messages, ei])
n_node_updates = LazyConcatenate(axis=-1)([n, node_updates])
n_embedd = DenseEmbedding(1)(n_node_updates)
g_embedd = PoolingNodes()(n_embedd)

message_passing = ks.models.Model(inputs=[n, ei], outputs=g_embedd)

or by subclassing the message passing base layer, where only message_function and update_nodes must be implemented:

from kgcnn.layers.conv.message import MessagePassingBase
from kgcnn.layers.modules import DenseEmbedding, LazyAdd

class MyMessageNN(MessagePassingBase):
  def __init__(self, units, **kwargs):
    super(MyMessageNN, self).__init__(**kwargs)
    self.dense = DenseEmbedding(units)
    self.add = LazyAdd(axis=-1)
  
  def message_function(self, inputs, **kwargs):
    n_in, n_out, edges = inputs
    return self.dense(n_out)
  
  def update_nodes(self, inputs, **kwargs):
    nodes, nodes_update = inputs
    return self.add([nodes, nodes_update])
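
What the gather, message, pooling and update steps compute can be pictured for a single graph in plain Python. This is a conceptual sketch only, not kgcnn's implementation: it assumes the message is simply the sending node's feature vector (no trainable dense layer), that messages are summed at the receiving node i, and that the update adds the pooled messages to the old features. The aggregation that the kgcnn layers actually apply may differ.

```python
def message_passing_step(nodes, edge_indices):
    """One sum-aggregation step on a single, non-empty graph.

    nodes: list of feature vectors, one per node.
    edge_indices: list of (i, j) with receiving node i and sending node j.
    """
    dim = len(nodes[0])
    aggregated = [[0.0] * dim for _ in nodes]
    for i, j in edge_indices:
        message = nodes[j]  # gather: features of the sending node j
        for k in range(dim):
            aggregated[i][k] += message[k]  # pool: sum at the receiving node i
    # update: add the pooled messages to the old node features
    return [[n + a for n, a in zip(node, agg)]
            for node, agg in zip(nodes, aggregated)]


nodes = [[1.0], [2.0], [4.0]]
edges = [[0, 1], [1, 0], [1, 2]]
print(message_passing_step(nodes, edges))  # [[3.0], [7.0], [4.0]]
```

Node 2 receives no message in this example, so its features stay unchanged.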

Literature

Versions of a number of published models, and variants thereof, are implemented in literature.

Datasets

How to construct ragged tensors is shown above. Moreover, some data handling classes are given in kgcnn.data. Graphs are represented by a dictionary of (numpy) tensors GraphDict and are stored in a list MemoryGraphList. Both must fit into memory and are supposed to be handled just like a python dict or list, respectively.

from kgcnn.data.base import GraphDict, MemoryGraphList
# Single graph.
graph = GraphDict({"edge_indices": [[0, 1], [1, 0]]})
print(graph)
# List of graph dicts.
graph_list = MemoryGraphList([graph, {"edge_indices": [[0, 0]]}, {}])
graph_list.clean(["edge_indices"])  # Remove graphs without property
graph_list.obtain_property("edge_indices")  # opposite is assign_property()
graph_list.tensor([{"name": "edge_indices", "ragged": True}]) # config of layers.Input; makes copy.

The MemoryGraphDataset inherits from MemoryGraphList but must be initialized with file information on disk that points to a data_directory for the dataset. The data_directory can contain a subdirectory for files and/or a single file such as a CSV file:

├── data_directory
    ├── file_directory
    │   ├── *.*
    │   └── ... 
    ├── file_name
    └── dataset_name.pickle

A base dataset class is created with path and name information:

from kgcnn.data.base import MemoryGraphDataset
dataset = MemoryGraphDataset(data_directory="ExampleDir/", 
                             dataset_name="Example",
                             file_name=None, file_directory=None)

The subclasses QMDataset, MoleculeNetDataset and GraphTUDataset additionally provide the functions required for their specific dataset type to convert and process files such as '.txt', '.sdf', '.xyz' etc. Most subclasses implement prepare_data() and read_in_memory() with dataset-dependent arguments. An example for MoleculeNetDataset is shown below. For more details, see the tutorials in notebooks.

from kgcnn.data.moleculenet import MoleculeNetDataset
# File directory and files must exist. 
# Here 'ExampleDir' and 'ExampleDir/data.csv' with columns "smiles" and "label".
dataset = MoleculeNetDataset(dataset_name="Example",
                             data_directory="ExampleDir/",
                             file_name="data.csv")
dataset.prepare_data(overwrite=True, smiles_column_name="smiles", add_hydrogen=True,
                     make_conformers=True, optimize_conformer=True, num_workers=None)
dataset.read_in_memory(label_column_name="label",  add_hydrogen=False, 
                       has_conformers=True)
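
The MoleculeNetDataset example above expects 'ExampleDir/data.csv' with the columns "smiles" and "label" to exist already. A minimal sketch for creating such a file with the standard library is given below; the helper name `write_example_csv` is hypothetical, and the three molecules and their labels are made up for illustration only.

```python
import csv
from pathlib import Path


def write_example_csv(data_directory="ExampleDir", file_name="data.csv"):
    """Create data_directory/file_name with the columns 'smiles' and 'label'."""
    directory = Path(data_directory)
    directory.mkdir(parents=True, exist_ok=True)
    # Ethanol, benzene and acetic acid with made-up labels.
    rows = [("CCO", 0), ("c1ccccc1", 1), ("CC(=O)O", 0)]
    path = directory / file_name
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["smiles", "label"])
        writer.writerows(rows)
    return path


# Usage: write_example_csv() creates ExampleDir/data.csv in the working directory.
```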

data.datasets contains graph learning benchmark datasets as subclasses, which are downloaded from popular graph archives such as TUDatasets or MoleculeNet. The subclasses GraphTUDataset2020 and MoleculeNetDataset2018 download and read the available datasets by name. There are also specific subclasses for individual datasets to handle additional processing or downloading from individual sources:

from kgcnn.data.datasets.MUTAGDataset import MUTAGDataset
dataset = MUTAGDataset()  # inherits from GraphTUDataset2020

Downloaded datasets are stored in ~/.kgcnn/datasets on your computer. Please remove them manually if they are no longer required.
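
Manual cleanup of the download folder can be scripted with the standard library. The sketch below lists the entries in the cache with their size and removes a single entry on request. Both helper names are hypothetical, and nothing is assumed about the layout inside ~/.kgcnn/datasets beyond one entry per dataset.

```python
import shutil
from pathlib import Path

# Cache location named in the text above.
DEFAULT_CACHE = Path.home() / ".kgcnn" / "datasets"


def list_cached_datasets(cache_dir=DEFAULT_CACHE):
    """Return {entry name: size in bytes} for each entry in the cache."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return {}
    sizes = {}
    for entry in sorted(cache_dir.iterdir()):
        if entry.is_dir():
            sizes[entry.name] = sum(
                f.stat().st_size for f in entry.rglob("*") if f.is_file())
        else:
            sizes[entry.name] = entry.stat().st_size
    return sizes


def remove_cached_dataset(name, cache_dir=DEFAULT_CACHE):
    """Delete one cache entry; return True if something was removed."""
    target = Path(cache_dir) / name
    if target.is_dir():
        shutil.rmtree(target)
        return True
    if target.is_file():
        target.unlink()
        return True
    return False
```

Calling `list_cached_datasets()` first and then `remove_cached_dataset(name)` for selected entries avoids deleting the whole folder by accident.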

Examples

A set of example training scripts can be found in training. The training scripts are configurable with a hyperparameter config file and command line arguments for model and dataset.

Issues

Some known issues to be aware of when using kgcnn or building new models or layers with it:

  • RaggedTensor could not be used as a Keras model output (issue); this has been mostly resolved in TF 2.8.
  • Using RaggedTensors of arbitrary ragged rank outside of kgcnn.layers.modules can cause a significant performance decrease. We think this is due to shape checks during add, multiply or concatenate. We therefore use lazy add and concat in the kgcnn.layers.modules layers or operate directly on the value tensor where the rank allows it.
  • With tensorflow version <=2.5 there is a problem with numpy version >=1.20 that also affects kgcnn (issue).
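
The version combination named in the last item can be checked from version strings before training. The following is a rough sketch with hypothetical helper names; it compares only major and minor numbers, so every 2.5.x release counts as "<=2.5", which is an assumption about how the bound is meant.

```python
from itertools import takewhile


def major_minor(version):
    """Turn a version string such as '2.5.0' into the tuple (2, 5)."""
    parts = []
    for piece in version.split(".")[:2]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def numpy_conflict(tf_version, np_version):
    """True for tensorflow <= 2.5 combined with numpy >= 1.20."""
    return (major_minor(tf_version) <= (2, 5)
            and major_minor(np_version) >= (1, 20))


print(numpy_conflict("2.5.0", "1.21.0"))  # True
print(numpy_conflict("2.8.0", "1.21.0"))  # False
print(numpy_conflict("2.4.1", "1.19.5"))  # False
```

In an environment with both packages installed, the arguments would be `tf.__version__` and `np.__version__`.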

Citing

If you want to cite this repo, please refer to our paper:

@article{REISER2021100095,
title = {Graph neural networks in TensorFlow-Keras with RaggedTensor representation (kgcnn)},
journal = {Software Impacts},
pages = {100095},
year = {2021},
issn = {2665-9638},
doi = {https://doi.org/10.1016/j.simpa.2021.100095},
url = {https://www.sciencedirect.com/science/article/pii/S266596382100035X},
author = {Patrick Reiser and Andre Eberhard and Pascal Friederich}
}
