
benedekrozemberczki / FEATHER

License: MIT
The reference implementation of FEATHER from the CIKM '20 paper "Characteristic Functions on Graphs: Birds of a Feather, from Statistical Descriptors to Parametric Models".

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to FEATHER

Awesome Graph Classification
A collection of important graph embedding, classification and representation learning papers with implementations.
Stars: ✭ 4,309 (+12573.53%)
Mutual labels:  deepwalk, network-embedding, graph-kernel, node2vec, graph-embedding, graph-classification, graph2vec, node-embedding
resolutions-2019
A list of data mining and machine learning papers that I implemented in 2019.
Stars: ✭ 19 (-44.12%)
Mutual labels:  deepwalk, network-embedding, graph-kernel, node2vec, graph-embedding, graph-classification, node-classification, node-embedding
FSCNMF
An implementation of "Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks".
Stars: ✭ 16 (-52.94%)
Mutual labels:  data-mining, deepwalk, network-embedding, node2vec, graph-embedding, graph2vec, node-embedding
walklets
A lightweight implementation of Walklets from "Don't Walk Skip! Online Learning of Multi-scale Network Embeddings" (ASONAM 2017).
Stars: ✭ 94 (+176.47%)
Mutual labels:  deepwalk, node2vec, graph-embedding, node-classification, node-embedding, graph-convolution
PDN
The official PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf '21)
Stars: ✭ 44 (+29.41%)
Mutual labels:  deepwalk, graph-classification, node-classification, graph2vec
RolX
An alternative implementation of Recursive Feature and Role Extraction (KDD11 & KDD12)
Stars: ✭ 52 (+52.94%)
Mutual labels:  deepwalk, node2vec, graph-embedding, node-embedding
Awesome Community Detection
A curated list of community detection research papers with implementations.
Stars: ✭ 1,874 (+5411.76%)
Mutual labels:  deepwalk, networkx, node2vec
Euler
A distributed graph deep learning framework.
Stars: ✭ 2,701 (+7844.12%)
Mutual labels:  network-embedding, node2vec, graph-embedding
REGAL
Representation learning-based graph alignment based on implicit matrix factorization and structural embeddings
Stars: ✭ 78 (+129.41%)
Mutual labels:  representation-learning, network-embedding, node-embedding
M-NMF
An implementation of "Community Preserving Network Embedding" (AAAI 2017)
Stars: ✭ 119 (+250%)
Mutual labels:  deepwalk, representation-learning, node2vec
QGNN
Quaternion Graph Neural Networks (ACML 2021) (Pytorch and Tensorflow)
Stars: ✭ 31 (-8.82%)
Mutual labels:  graph-classification, node-classification
NMFADMM
A sparsity aware implementation of "Alternating Direction Method of Multipliers for Non-Negative Matrix Factorization with the Beta-Divergence" (ICASSP 2014).
Stars: ✭ 39 (+14.71%)
Mutual labels:  deepwalk, node2vec
TriDNR
Tri-Party Deep Network Representation, IJCAI-16
Stars: ✭ 72 (+111.76%)
Mutual labels:  network-embedding, graph-embedding
Friends-Recommender-In-Social-Network
Friends Recommendation and Link Prediction in Social Network
Stars: ✭ 33 (-2.94%)
Mutual labels:  networkx, network-embedding
Graphembedding
Implementation and experiments of graph embedding algorithms.
Stars: ✭ 2,461 (+7138.24%)
Mutual labels:  deepwalk, node2vec
Alink
Alink is a machine learning algorithm platform based on Flink, developed by the PAI team of the Alibaba computing platform.
Stars: ✭ 2,936 (+8535.29%)
Mutual labels:  data-mining, graph-embedding
dgcnn
Clean & Documented TF2 implementation of "An end-to-end deep learning architecture for graph classification" (M. Zhang et al., 2018).
Stars: ✭ 21 (-38.24%)
Mutual labels:  graph-embedding, graph-classification
ethereum-privacy
Profiling and Deanonymizing Ethereum Users
Stars: ✭ 37 (+8.82%)
Mutual labels:  representation-learning, network-embedding
Awesome Network Embedding
A curated list of network embedding techniques.
Stars: ✭ 2,379 (+6897.06%)
Mutual labels:  representation-learning, network-embedding
Evalne
Source code for EvalNE, a Python library for evaluating Network Embedding methods.
Stars: ✭ 67 (+97.06%)
Mutual labels:  data-mining, representation-learning

FEATHER


The Python reference implementation of FEATHER and FEATHER-G from the CIKM '20 paper Characteristic Functions on Graphs: Birds of a Feather, from Statistical Descriptors to Parametric Models.


Abstract

In this paper, we propose a flexible notion of characteristic functions defined on graph vertices to describe the distribution of vertex features at multiple scales. We introduce FEATHER, a computationally efficient algorithm to calculate a specific variant of these characteristic functions where the probability weights of the characteristic function are defined as the transition probabilities of random walks. We argue that features extracted by this procedure are useful for node level machine learning tasks. We discuss the pooling of these node representations, resulting in compact descriptors of graphs that can serve as features for graph classification algorithms. We analytically prove that FEATHER describes isomorphic graphs with the same representation and exhibits robustness to data corruption. Using the node feature characteristic functions we define parametric models where evaluation points of the functions are learned parameters of supervised classifiers. Experiments on real world large datasets show that our proposed algorithm creates high quality representations, performs transfer learning efficiently, exhibits robustness to hyperparameter changes, and scales linearly with the input size.
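
As a rough sketch of the core idea (and not the reference implementation in src/), the r-scale characteristic function of a node feature can be computed by weighting the cosine and sine parts at each evaluation point with r-step random-walk transition probabilities; pooling the resulting node descriptors (e.g. by averaging) gives a FEATHER-G style graph descriptor. The helper name and the uniform evaluation grid below are illustrative assumptions.

import numpy as np
import networkx as nx

def feather_sketch(graph, x, order=5, eval_points=25, theta_max=2.5):
    # Illustrative FEATHER-style descriptors for one node feature vector x:
    # the characteristic function weights are r-step random-walk transition probabilities.
    thetas = np.linspace(0.01, theta_max, eval_points)   # evaluation points
    A = nx.to_numpy_array(graph)                         # adjacency matrix (graph assumed connected)
    P = A / A.sum(axis=1, keepdims=True)                 # one-step transition matrix
    grid = np.outer(x, thetas)                           # theta * x_u for every node u and evaluation point
    real, imag = np.cos(grid), np.sin(grid)              # real and imaginary parts of exp(i * theta * x_u)
    P_r, descriptors = np.eye(graph.number_of_nodes()), []
    for _ in range(order):
        P_r = P_r @ P                                    # r-step transition probabilities
        descriptors.append(np.concatenate([P_r @ real, P_r @ imag], axis=1))
    return np.concatenate(descriptors, axis=1)           # one row of descriptors per node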

This repository provides the reference implementation for FEATHER as described in the paper:

Characteristic Functions on Graphs: Birds of a Feather, from Statistical Descriptors to Parametric Models. Benedek Rozemberczki and Rik Sarkar. CIKM, 2020.

The datasets are also available on SNAP.

The model is now also available in the package Karate Club.
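
Below is a minimal usage sketch of the Karate Club version; it assumes the FeatherNode and FeatherGraph estimators with the usual fit/get_embedding interface and uses randomly generated, connected inputs.

import numpy as np
import networkx as nx
from karateclub import FeatherNode, FeatherGraph   # assumed Karate Club class names

# Node-level FEATHER on a small connected random graph with random node features.
graph = nx.newman_watts_strogatz_graph(100, 5, 0.3)
features = np.random.uniform(0, 1, (100, 8))
node_model = FeatherNode()
node_model.fit(graph, features)
node_embedding = node_model.get_embedding()

# Graph-level FEATHER-G on a small collection of connected graphs.
graphs = [nx.newman_watts_strogatz_graph(30, 4, 0.2) for _ in range(10)]
graph_model = FeatherGraph()
graph_model.fit(graphs)
graph_embedding = graph_model.get_embedding()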

Table of Contents

  1. Citing
  2. Requirements
  3. Options
  4. Examples

Citing

If you find FEATHER useful in your research, please consider citing the following paper:

@inproceedings{feather,
               title={{Characteristic Functions on Graphs: Birds of a Feather, from Statistical Descriptors to Parametric Models}},
               author={Benedek Rozemberczki and Rik Sarkar},
               year={2020},
               pages={1325–1334},
               booktitle={Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM '20)},
               organization={ACM},
}

Requirements

The codebase is implemented in Python 3.5.2. The package versions used for development are listed below.

networkx          2.4
tqdm              4.28.1
numpy             1.15.4
pandas            0.23.4
texttable         1.5.0
scipy             1.1.0
argparse          1.1.0

Input

Node level

The code takes an input graph in a csv file. Every row indicates an edge between two nodes separated by a comma. The first row is a header. Nodes should be indexed starting with 0.
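
For illustration, such an edge list could be loaded as follows (the column names and the pandas/NetworkX route are assumptions, not necessarily what src/main.py does):

import pandas as pd
import networkx as nx

edges = pd.read_csv("input/edges/ER_edges.csv")     # header row, then one comma-separated edge per row
graph = nx.from_edgelist(edges.values.tolist())     # undirected graph over the 0-indexed nodes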

The feature matrix is assumed to be dense and stored as a csv with comma separators. It has a header, and the rows are ordered by node identifier (increasing). It should look like this:

Feature 1   Feature 2   Feature 3   Feature 4
3           0           1.37        1
1           1           2.54        -11
2           0           1.08        -12
1           1           1.22        -4
...         ...         ...         ...
5           0           2.47        21
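
A minimal, illustrative way to read such a dense feature csv into a NumPy array (assuming the rows already appear in increasing node-ID order):

import numpy as np
import pandas as pd

features = pd.read_csv("input/features/ER_features.csv")   # header row with the feature names
X = np.asarray(features.values, dtype=np.float64)          # one row of features per node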

Graph level

The graphs are stored in a JSON file where keys are graph identifiers and values are edge lists. Graph identifiers are consecutive and start with 0. Each individual graph has nodes which are indexed starting with 0. We assume that graphs are connected.

{ "0": [[0, 1], [1, 2], [2, 3]],
  "1": [[0, 1], [1, 2], [2, 0]],
  ...
  "n": [[0, 1], [1, 2]]}
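
For illustration, the JSON above could be turned into a list of NetworkX graphs like this (a sketch, not the loader used in src/; keys are read as strings and sorted numerically):

import json
import networkx as nx

with open("input/graphs/ER_graphs.json") as source:
    edge_lists = json.load(source)                   # {"graph id": edge list}
graphs = [nx.from_edgelist(edge_lists[key]) for key in sorted(edge_lists, key=int)]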

Options

Learning the embedding is handled by the src/main.py script which provides the following command line arguments.

Input and output options

  --graph-input      STR   Input edge list csv.      Default is `input/edges/ER_edges.csv`.
  --feature-input    STR   Input features csv.       Default is `input/features/ER_features.csv`.
  --graphs           STR   Input graphs json.        Default is `input/graphs/ER_graphs.json`.
  --output           STR   Embedding output path.    Default is `output/ER_node_embedding.csv`.

Model options

  --model-type    STR      FEATHER or FEATHER-G model.          Default is  `FEATHER`.
  --eval-points   INT      Number of evaluation points.         Default is  25.
  --order         INT      Number of adjacency matrix powers.   Default is  5.
  --theta-max     FLOAT    Maximal evaluation point value.      Default is  2.5.

Examples

Training a FEATHER model.

$ python src/main.py

Changing the scale parameter to decrease the number of adjacency matrix powers.

$ python src/main.py --order 3

Decreasing the number of evaluation points.

$ python src/main.py --eval-points 15

Training a graph-level FEATHER-G model on the default dataset.

$ python src/main.py --model-type FEATHER-G --output output/ER_graph_embedding.csv

License

MIT License.

