
nphdang / GE-FSG

Licence: other
Graph Embedding via Frequent Subgraphs

Programming Language: Python

Projects that are alternatives of or similar to GE-FSG

Awesome Graph Classification
A collection of important graph embedding, classification and representation learning papers with implementations.
Stars: ✭ 4,309 (+10948.72%)
Mutual labels:  graph-embedding, graph-classification, graph-representation-learning
doc2vec-golang
doc2vec and word2vec (word embedding representations) implemented in Go.
Stars: ✭ 33 (-15.38%)
Mutual labels:  word2vec, doc2vec
FEATHER
The reference implementation of FEATHER from the CIKM '20 paper "Characteristic Functions on Graphs: Birds of a Feather, from Statistical Descriptors to Parametric Models".
Stars: ✭ 34 (-12.82%)
Mutual labels:  graph-embedding, graph-classification
walklets
A lightweight implementation of Walklets from "Don't Walk Skip! Online Learning of Multi-scale Network Embeddings" (ASONAM 2017).
Stars: ✭ 94 (+141.03%)
Mutual labels:  word2vec, graph-embedding
QGNN
Quaternion Graph Neural Networks (ACML 2021) (Pytorch and Tensorflow)
Stars: ✭ 31 (-20.51%)
Mutual labels:  graph-classification, graph-representation-learning
resolutions-2019
A list of data mining and machine learning papers that I implemented in 2019.
Stars: ✭ 19 (-51.28%)
Mutual labels:  graph-embedding, graph-classification
dgcnn
Clean & Documented TF2 implementation of "An end-to-end deep learning architecture for graph classification" (M. Zhang et al., 2018).
Stars: ✭ 21 (-46.15%)
Mutual labels:  graph-embedding, graph-classification
FSCNMF
An implementation of "Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks".
Stars: ✭ 16 (-58.97%)
Mutual labels:  word2vec, graph-embedding
Embedding
A summary of embedding model code and study notes.
Stars: ✭ 25 (-35.9%)
Mutual labels:  word2vec, doc2vec
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-23.08%)
Mutual labels:  word2vec, doc2vec
doc2vec-api
document embedding and machine learning script for beginners
Stars: ✭ 92 (+135.9%)
Mutual labels:  word2vec, doc2vec
Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
Stars: ✭ 2,936 (+7428.21%)
Mutual labels:  word2vec, graph-embedding
RolX
An alternative implementation of Recursive Feature and Role Extraction (KDD11 & KDD12)
Stars: ✭ 52 (+33.33%)
Mutual labels:  word2vec, graph-embedding
altair
Assessing Source Code Semantic Similarity with Unsupervised Learning
Stars: ✭ 42 (+7.69%)
Mutual labels:  word2vec, doc2vec
Word2vec
Python interface to Google word2vec
Stars: ✭ 2,370 (+5976.92%)
Mutual labels:  word2vec, doc2vec
Movietaster Open
A practical movie recommend project based on Item2vec.
Stars: ✭ 253 (+548.72%)
Mutual labels:  word2vec
Vaaku2Vec
Language Modeling and Text Classification in Malayalam Language using ULMFiT
Stars: ✭ 68 (+74.36%)
Mutual labels:  word2vec
Aravec
AraVec is a pre-trained distributed word representation (word embedding) open source project which aims to provide the Arabic NLP research community with free to use and powerful word embedding models.
Stars: ✭ 239 (+512.82%)
Mutual labels:  word2vec
Book deeplearning in pytorch source
Stars: ✭ 236 (+505.13%)
Mutual labels:  word2vec
GNNLens2
Visualization tool for Graph Neural Networks
Stars: ✭ 155 (+297.44%)
Mutual labels:  graph-representation-learning

GE-FSG: Learning Graph Embeddings via Frequent Subgraphs

This is the implementation of the GE-FSG method in the paper "Learning Graph Representation via Frequent Subgraphs", SDM 2018: https://epubs.siam.org/doi/10.1137/1.9781611975321.35

Introduction

A graph consists of nodes and edges. Each graph has a label, called the graph label. Similarly, each node or edge can also have a label, called a node label or edge label, respectively. For example, a chemical compound is a graph whose nodes correspond to the atoms of the compound and whose edges correspond to the chemical bonds between them.
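For illustration, such a labeled graph can be captured with plain Python dictionaries; the labels below are hypothetical examples, not taken from any dataset shipped with GE-FSG:

```python
# A minimal labeled-graph representation: one graph label,
# per-node labels (e.g. atom types), and per-edge labels (e.g. bond types).
graph = {
    "label": "mutagenic",                 # graph label (e.g. compound class)
    "nodes": {0: "C", 1: "C", 2: "O"},    # node id -> node label
    "edges": {(0, 1): "single",           # (node, node) -> edge label
              (1, 2): "double"},
}

# Labels are looked up directly:
assert graph["nodes"][2] == "O"
assert graph["edges"][(1, 2)] == "double"
```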

To apply machine learning tasks such as classification and clustering to graphs, we need to represent each graph as a feature vector since machine learning methods typically require vectors as their input. This task is challenging since graphs have no feature vectors by default.

We propose GE-FSG, which learns feature vectors (also called embeddings or representations) for graphs. GE-FSG combines a recently introduced neural document embedding model with a traditional pattern mining technique. It has two main steps: (1) decompose each graph into a set of frequent subgraphs (FSGs) and (2) learn an embedding for each graph by predicting the FSGs it contains. As a result, graphs that contain similar FSGs are mapped to nearby points in the vector space.
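To make the two steps concrete, here is a minimal, self-contained sketch. The actual implementation learns dense embeddings with gensim's Doc2Vec (see main.py); this sketch instead substitutes a simple multi-hot bag-of-FSGs vector and cosine similarity, with hypothetical FSG identifiers, purely to illustrate why graphs sharing FSGs end up close together:

```python
from math import sqrt

# After step (1), each graph is reduced to the set of frequent subgraphs
# (FSGs) it contains. The FSG identifiers below are hypothetical.
graph_fsgs = {
    "g1": {"fsg_ring", "fsg_c_c_bond", "fsg_c_o_bond"},
    "g2": {"fsg_ring", "fsg_c_c_bond"},      # shares FSGs with g1
    "g3": {"fsg_n_h_bond", "fsg_c_n_bond"},  # no FSGs in common with g1
}

# Build a fixed FSG vocabulary and turn each graph into a multi-hot vector.
# (GE-FSG learns dense embeddings with a Doc2Vec-style model instead; this
# sparse vector only illustrates why shared FSGs imply nearby embeddings.)
vocab = sorted(set().union(*graph_fsgs.values()))

def embed(fsgs):
    return [1.0 if f in fsgs else 0.0 for f in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

vecs = {g: embed(s) for g, s in graph_fsgs.items()}

# g1 and g2 share FSGs, so they end up closer together than g1 and g3.
assert cosine(vecs["g1"], vecs["g2"]) > cosine(vecs["g1"], vecs["g3"])
```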

GE-FSG: Main idea

Graph visualization demonstration

Graph embeddings learnt by GE-FSG and other methods are visualized using t-SNE.

Graph visualization

Installation

  1. Microsoft .NET Framework 4.0 (to run C# code to mine frequent subgraphs)
  2. gensim 3.4 (to run Doc2Vec model)
  3. networkx 2.1 (to read graphs in GraphML format)

How to run

  • Run "python main.py" to learn graph embeddings and classify graphs (note that you may need to change variables such as the dataset, the minimum support threshold, and the embedding dimension in the code)
  • Run "python .\utilities\convert_graphs\gen_graph_dgk.py" to convert graph format used by Deep Graph Kernel to graph format used by GE-FSG
  • Run "python .\utilities\convert_graphs\gen_graph_gk.py" to convert graph format used by Graph Kernel Suite to graph format used by GE-FSG

For Linux environment

  • Please use the implementation in folder "For_Linux"

Tool to mine frequent subgraphs

  • File "fsg_miner.exe" can be used as a standalone tool to discover frequent subgraphs
  • It runs fast since the mining is parallelized
  • Its source code is in "fsg_miner.zip"
  • Its parameters are as follows:
        -graphset <file>
        use graphs from <file> to mine FSGs
        -graphlabel <file>
        obtain graph labels from <file>
        -minsup <float>
        set minimum support threshold in [0,1]; default is 0.5
        -fsg <file>
        save discovered FSGs to <file> (optional)
        -output <file>
        convert each graph to a set of FSGs and save it to <file> (optional)

Reference

Dang Nguyen, Wei Luo, Tu Dinh Nguyen, Svetha Venkatesh, Dinh Phung (2018). Learning Graph Representation via Frequent Subgraphs. SDM 2018, San Diego, USA. SIAM, 306-314
