MLDroid / graph2vec_tf

Licence: other
graph2vec

This repository contains the TensorFlow implementation of our paper "graph2vec: Learning distributed representations of graphs". The paper is available at: https://arxiv.org/pdf/1707.05005.pdf

Dependencies

This code was developed in Python 2.7 and has been run and tested on Ubuntu 16.04. It uses the following Python packages:

  1. tensorflow (version == 1.4.0)
  2. networkx (version <= 2.0)
  3. scikit-learn (+scipy, +numpy)
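
The version pins above can be checked before running anything. The following is a minimal sketch, not part of the repository: the names `version_tuple`, `PINNED` and `satisfies` are hypothetical, and the parser only understands plain dotted numeric versions (suffixes such as `rc1` are ignored).

```python
# Hypothetical helper for checking the version pins listed in this README.
PINNED = {
    "tensorflow": ("==", "1.4.0"),
    "networkx": ("<=", "2.0"),
}


def version_tuple(text, width=3):
    """Convert '1.4.0' to (1, 4, 0), padding short versions with zeros."""
    parts = [int(p) for p in text.split(".")[:width] if p.isdigit()]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(installed, operator, required):
    """Return True if the installed version meets the pinned constraint."""
    have, want = version_tuple(installed), version_tuple(required)
    if operator == "==":
        return have == want
    if operator == "<=":
        return have <= want
    raise ValueError("unsupported operator: %r" % operator)
```

With the packages installed, a check could look like `satisfies(networkx.__version__, *PINNED["networkx"])`.
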
The procedure for setting up graph2vec is as follows:
1. Clone the repository (command: git clone https://github.com/MLDroid/graph2vec_tf.git)
2. Untar the data.tar.gz tarball (command: tar -xzf data.tar.gz)
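
If a shell is not convenient, the tarball can also be unpacked from Python with the standard `tarfile` module. This is a small sketch under the assumption that the archive is gzip-compressed, as its name suggests; the function name `extract_archive` is hypothetical and not part of the repository.

```python
import tarfile


def extract_archive(archive_path, dest_dir="."):
    """Extract a gzip-compressed tarball into dest_dir.

    Returns the list of member names found in the archive.
    Only use this on archives you trust.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names


# Example call, run from the repository root:
# extract_archive("data.tar.gz")
```
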
The procedure for obtaining rooted graph vectors using graph2vec and performing graph classification is as follows:
1. Move to the folder "src" (command: cd src). Also make sure that the datasets from the KDD 2015 paper (Deep Graph Kernels) are available in '../data/kdd_datasets/dir_graphs/'.
2. Run main.py --corpus <dataset of graph files> --class_labels_file_name <file containing class labels of graphs to be used for graph classification> to:
	* Generate the Weisfeiler-Lehman kernel's rooted subgraphs from all the graphs
	* Train the skipgram model to learn graph embeddings. The embeddings will be dumped in the ../embeddings/ folder
	* Perform graph classification using the graph embeddings generated in the above step
3. Examples:
	* python main.py --corpus ../data/kdd_datasets/mutag --class_labels_file_name ../data/kdd_datasets/mutag.Labels
	* python main.py --corpus ../data/kdd_datasets/proteins --class_labels_file_name ../data/kdd_datasets/proteins.Labels --batch_size 16 --embedding_size 128 --num_negsample 5
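
To give an intuition for the first step, here is a minimal pure-Python sketch of Weisfeiler-Lehman relabeling. It is an illustration only and makes several assumptions: the graph is given as an adjacency dictionary, labels are kept as readable strings rather than compressed identifiers, and the function names `wl_rooted_subgraphs` and `graph_document` are hypothetical. The code in this repository may represent graphs and labels differently.

```python
def wl_rooted_subgraphs(adjacency, labels, height):
    """Return, for every node, its WL labels for depths 0..height.

    adjacency: dict mapping node -> list of neighbouring nodes
    labels:    dict mapping node -> initial node label
    height:    number of relabeling iterations (cf. --wlk_h)
    """
    current = {node: str(labels[node]) for node in adjacency}
    features = {node: [current[node]] for node in adjacency}
    for _ in range(height):
        updated = {}
        for node, neighbours in adjacency.items():
            # A node's new label combines its own previous label with the
            # sorted previous labels of its neighbours.
            neighbour_labels = sorted(current[n] for n in neighbours)
            updated[node] = current[node] + "(" + ",".join(neighbour_labels) + ")"
        current = updated
        for node in adjacency:
            features[node].append(current[node])
    return features


def graph_document(adjacency, labels, height):
    """Flatten all rooted-subgraph labels of one graph into a 'document'."""
    features = wl_rooted_subgraphs(adjacency, labels, height)
    return [label for node in sorted(features) for label in features[node]]


# A three-node path C - O - C, relabeled once.
path = {0: [1], 1: [0, 2], 2: [1]}
atoms = {0: "C", 1: "O", 2: "C"}
print(graph_document(path, atoms, 1))
# ['C', 'C(O)', 'O', 'O(C,C)', 'C', 'C(O)']
```

Each graph then plays the role of a document and its rooted-subgraph labels play the role of words, which is what lets a skipgram model learn one embedding per graph.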

Other command line args:

optional arguments:
	-h, --help            show this help message and exit
	-c CORPUS, --corpus CORPUS
			        Path to directory containing graph files to be used
			        for graph classification or clustering
	-l CLASS_LABELS_FILE_NAME, --class_labels_file_name CLASS_LABELS_FILE_NAME
			        File name containing the name of the sample and the
			        class labels
	-o OUTPUT_DIR, --output_dir OUTPUT_DIR
			        Path to directory for storing output embeddings
	-b BATCH_SIZE, --batch_size BATCH_SIZE
			        Number of samples per training batch
	-e EPOCHS, --epochs EPOCHS
			        Number of iterations the whole dataset of graphs is
			        traversed
	-d EMBEDDING_SIZE, --embedding_size EMBEDDING_SIZE
			        Intended graph embedding size to be learnt
	-neg NUM_NEGSAMPLE, --num_negsample NUM_NEGSAMPLE
			        Number of negative samples to be used for training
	-lr LEARNING_RATE, --learning_rate LEARNING_RATE
			        Learning rate to optimize the loss function

	--wlk_h WLK_H         Height of WL kernel (i.e., degree of rooted subgraph
			        features to be considered for representation learning)
	-lf LABEL_FILED_NAME, --label_filed_name LABEL_FILED_NAME
			        Label field to be used for coloring nodes in graphs
			        using WL kernel
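
For readers who want to script around these options, the flags listed above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the help text, not the repository's actual parser: the function name `build_parser` is hypothetical, the argument types are assumptions, and defaults are deliberately omitted because the help text does not state them.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the options in the help text above."""
    parser = argparse.ArgumentParser(description="graph2vec (illustrative parser)")
    parser.add_argument("-c", "--corpus",
                        help="Path to directory containing graph files")
    parser.add_argument("-l", "--class_labels_file_name",
                        help="File containing sample names and class labels")
    parser.add_argument("-o", "--output_dir",
                        help="Path to directory for storing output embeddings")
    # The numeric types below are assumptions based on the descriptions.
    parser.add_argument("-b", "--batch_size", type=int,
                        help="Number of samples per training batch")
    parser.add_argument("-e", "--epochs", type=int,
                        help="Number of passes over the whole dataset")
    parser.add_argument("-d", "--embedding_size", type=int,
                        help="Intended graph embedding size")
    parser.add_argument("-neg", "--num_negsample", type=int,
                        help="Number of negative samples used for training")
    parser.add_argument("-lr", "--learning_rate", type=float,
                        help="Learning rate used to optimize the loss")
    parser.add_argument("--wlk_h", type=int,
                        help="Height of the WL kernel")
    # The flag name is spelled exactly as in the help text.
    parser.add_argument("-lf", "--label_filed_name",
                        help="Label field used for coloring nodes")
    return parser


args = build_parser().parse_args(
    ["--corpus", "../data/kdd_datasets/mutag", "--batch_size", "16", "-neg", "5"]
)
print(args.corpus, args.batch_size, args.num_negsample)
```

Options that are not passed come back as `None` in this sketch, so a caller can tell which values were set explicitly.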

Contact

In case of queries, please email: [email protected] OR [email protected]

Reference

Please consider citing the following paper when you use this code.
@article{narayanangraph2vec,
  title={graph2vec: Learning distributed representations of graphs},
  author={Narayanan, Annamalai and Chandramohan, Mahinthan and Venkatesan, Rajasekar and Chen, Lihui and Liu, Yang}
}