benedekrozemberczki / LabelPropagation

License: GPL-3.0
A NetworkX implementation of Label Propagation from a "Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks" (Physical Review E 2008).

Label Propagation

A NetworkX implementation of "Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks" (Physical Review E 2008).

Abstract

Community detection and analysis is an important methodology for understanding the organization of various real-world networks and has applications in problems as diverse as consensus formation in social communities or the identification of functional modules in biochemical networks. Currently used algorithms that identify the community structures in large-scale real-world networks require a priori information such as the number and sizes of communities or are computationally expensive. In this paper we investigate a simple label propagation algorithm that uses the network structure alone as its guide and requires neither optimization of a pre-defined objective function nor prior information about the communities. In our algorithm every node is initialized with a unique label and at every step each node adopts the label that most of its neighbors currently have. In this iterative process densely connected groups of nodes form a consensus on a unique label to form communities. We validate the algorithm by applying it to networks whose community structures are known. We also demonstrate that the algorithm takes an almost linear time and hence it is computationally less expensive than what was possible so far.

The model is now also available in the package Karate Club.

This repository provides an implementation for Label Propagation as described in the paper:

Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks. Usha Nandini Raghavan, Reka Albert, Soundar Kumara. Physical Review E, 2008. [Paper]
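The update rule described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration with unit weights, not the code in `src/label_propagation.py`: the function name `label_propagation`, the adjacency-dictionary input, and the rule that a node keeps its current label when that label is already among the most frequent ones are all assumptions made for this example.

```python
import random
from collections import Counter


def label_propagation(adjacency, rounds=30, seed=42):
    """Cluster nodes by repeated majority voting over neighbor labels.

    adjacency: dict mapping each node to the set of its neighbors
    (hypothetical input format chosen for this sketch).
    Returns a dict mapping each node to its final label.
    """
    rng = random.Random(seed)
    # Every node starts with a unique label: its own identifier.
    labels = {node: node for node in adjacency}
    nodes = list(adjacency)
    for _ in range(rounds):
        rng.shuffle(nodes)  # visit nodes in a new random order each round
        changed = False
        for node in nodes:
            neighbors = adjacency[node]
            if not neighbors:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[n] for n in neighbors)
            top = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == top)
            if labels[node] in candidates:
                continue  # already holds one of the most frequent labels
            # Break ties among the most frequent labels at random.
            labels[node] = rng.choice(candidates)
            changed = True
        if not changed:
            break  # every node already holds a most-frequent neighbor label
    return labels


# Two disjoint triangles: labels cannot cross between them.
triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
labels = label_propagation(triangles)
print(len(set(labels.values())))  # 2: one label per triangle
```

The defaults for `rounds` and `seed` mirror the command line defaults of this repository; on connected graphs the number of communities found can vary with the seed.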

Requirements

The codebase is implemented in Python 3.5.2 | Anaconda 4.2.0 (64-bit). Package versions used for development are listed below.

networkx          2.4
tqdm              4.28.1
numpy             1.15.4
pandas            0.23.4
jsonschema        2.6.0
python-louvain    0.11
texttable         0.15.0

Datasets

The code takes an input graph in a csv file. Every row indicates an edge between two nodes separated by a comma. The first row is a header. Nodes should be indexed starting with 0. A sample graph for the `Facebook Politicians` dataset is included in the `data/` directory.
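As an illustration of this input format, the sketch below parses an edge list with a header row into an undirected adjacency dictionary using only the standard library. The helper name `read_edge_list` and the header names `node_1,node_2` are hypothetical; the repository itself loads the graph with its own code.

```python
import csv
import io


def read_edge_list(handle):
    """Build an undirected adjacency dict from CSV rows of two node ids."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    adjacency = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        source, target = int(row[0]), int(row[1])
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


# Header names here are made up for the example; nodes are indexed from 0.
sample = io.StringIO("node_1,node_2\n0,1\n1,2\n2,0\n")
graph = read_edge_list(sample)
```

To read a file on disk instead, pass `open(path, newline="")` as the handle.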

Options

Creating a clustering is handled by the `src/label_propagation.py` script, which provides the following command line arguments.

Model options

  --input               STR    Input graph path.                          Default is `data/politician_edges.csv`.
  --assignment-output   STR    Node-cluster assignment dictionary path.   Default is `output/politician.json`.
  --weighting           STR    Weighting strategy.                        Default is `overlap`.
  --rounds              INT    Number of iterations.                      Default is 30.
  --seed                INT    Initial seed.                              Default is 42.
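For readers who want to reuse these options in their own script, the table above could be expressed with `argparse` roughly as follows. This is a sketch and not the repository's actual parser: the function name `build_parser` is hypothetical, and the weighting flag is spelled `--weighting` as in the usage examples.

```python
import argparse


def build_parser():
    """Return a parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Run label propagation.")
    parser.add_argument("--input", type=str,
                        default="data/politician_edges.csv",
                        help="Input graph path.")
    parser.add_argument("--assignment-output", type=str,
                        default="output/politician.json",
                        help="Node-cluster assignment dictionary path.")
    parser.add_argument("--weighting", type=str, default="overlap",
                        help="Weighting strategy.")
    parser.add_argument("--rounds", type=int, default=30,
                        help="Number of iterations.")
    parser.add_argument("--seed", type=int, default=42,
                        help="Initial seed.")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--rounds", "100", "--seed", "32"])
```

Note that `argparse` exposes `--assignment-output` as the attribute `args.assignment_output`.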

Examples

The following commands create cluster assignments and write them to disk.

Creating communities for the default dataset with the default hyperparameter settings.

$ python src/label_propagation.py

Using unit weighted label propagation.

$ python src/label_propagation.py --weighting unit

Changing the random seed.

$ python src/label_propagation.py --seed 32

Using label propagation with 100 iteration rounds.

$ python src/label_propagation.py --rounds 100
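Once an assignment file has been written, it can be turned into a list of members per community. The sketch below assumes the JSON file is a flat object that maps node ids to cluster labels; that layout and the helper name `group_by_cluster` are assumptions for this example, so check the actual output file before relying on it.

```python
import json
from collections import defaultdict


def group_by_cluster(assignment):
    """Invert a node -> label mapping into label -> sorted node list."""
    clusters = defaultdict(list)
    for node, label in assignment.items():
        clusters[label].append(int(node))  # JSON object keys are strings
    return {label: sorted(members) for label, members in clusters.items()}


# Inline stand-in for the contents of an assignment file (assumed layout).
raw = '{"0": 2, "1": 2, "2": 2, "3": 5, "4": 5}'
communities = group_by_cluster(json.loads(raw))
```

With a real file, replace `json.loads(raw)` with `json.load(open(path))` for the path passed to `--assignment-output`.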

License

GPL-3.0