All Projects → DerwenAI → disparity_filter

DerwenAI / disparity_filter

Licence: MIT License
Implements a disparity filter in Python, based on graphs in NetworkX, to extract the multiscale backbone of a complex weighted network (Serrano, et al., 2009)

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to disparity filter

kglib
TypeDB-ML is the Machine Learning integrations library for TypeDB
Stars: ✭ 523 (+2976.47%)
Mutual labels:  graphs, graph-networks
Osmnx
OSMnx: Python for street networks. Retrieve, model, analyze, and visualize street networks and other spatial data from OpenStreetMap.
Stars: ✭ 3,357 (+19647.06%)
Mutual labels:  graphs, networkx
gqlalchemy
GQLAlchemy is a library developed with the purpose of assisting in writing and running queries on Memgraph. GQLAlchemy supports high-level connection to Memgraph as well as modular query builder.
Stars: ✭ 39 (+129.41%)
Mutual labels:  graphs, networkx
Stellargraph
StellarGraph - Machine Learning on Graphs
Stars: ✭ 2,235 (+13047.06%)
Mutual labels:  graphs, networkx
Algorithms
Free hands-on course with the implementation (in Python) and description of several computational, mathematical and statistical algorithms.
Stars: ✭ 117 (+588.24%)
Mutual labels:  graphs, networkx
football-graphs
Graphs and passing networks in football.
Stars: ✭ 81 (+376.47%)
Mutual labels:  graphs, networkx
Graph nets
Build Graph Nets in Tensorflow
Stars: ✭ 5,051 (+29611.76%)
Mutual labels:  graphs, graph-networks
nxontology
NetworkX-based Python library for representing ontologies
Stars: ✭ 45 (+164.71%)
Mutual labels:  graphs, networkx
gcnn keras
Graph convolution with tf.keras
Stars: ✭ 47 (+176.47%)
Mutual labels:  graphs, graph-networks
WordMat
WordMat is an add-in to MicroSoft Word enabling math functionality
Stars: ✭ 25 (+47.06%)
Mutual labels:  graphs
visa-chart-components
Visa Chart Components (VCC) is an accessibility focused, framework agnostic set of data experience design systems components for the web. VCC attempts to provide a toolset to enable developers to build equal data experiences for everyone, everywhere.
Stars: ✭ 81 (+376.47%)
Mutual labels:  graphs
dlaudio
Master thesis: Structured Auto-Encoder with application to Music Genre Recognition (code)
Stars: ✭ 14 (-17.65%)
Mutual labels:  graphs
netdiff
Python library for parsing network topology data (eg: dynamic routing protocols, OpenVPN, NetJSON, CNML) and detect changes.
Stars: ✭ 66 (+288.24%)
Mutual labels:  networkx
ntds 2016
Material for the EPFL master course "A Network Tour of Data Science", edition 2016.
Stars: ✭ 96 (+464.71%)
Mutual labels:  graphs
silky-charts
A silky smooth D3/React library
Stars: ✭ 38 (+123.53%)
Mutual labels:  graphs
VIATRA-Generator
An efficient graph solver for generating well-formed models
Stars: ✭ 21 (+23.53%)
Mutual labels:  graphs
vf3lib
VF3 Algorithm - The fastest algorithm to solve subgraph isomorphism on large and dense graphs
Stars: ✭ 58 (+241.18%)
Mutual labels:  graphs
sr graph
A simple, one-file, header-only, C++ utility for graphs, curves and histograms.
Stars: ✭ 67 (+294.12%)
Mutual labels:  graphs
flownetwork
A python package for flow network analysis
Stars: ✭ 22 (+29.41%)
Mutual labels:  networkx
lovelace-plotly-graph-card
Highly customisable Lovelace card to display interactive graphs. Brings scrolling, zooming, and much more!
Stars: ✭ 38 (+123.53%)
Mutual labels:  graphs

disparity_filter

Implements a disparity filter in Python, using graphs in NetworkX, based on multiscale backbone networks:

"Extracting the multiscale backbone of complex weighted networks"
M. Ángeles Serrano, Marián Boguña, Alessandro Vespignani
https://arxiv.org/pdf/0904.2389.pdf

The disparity filter exploits local heterogeneity and local correlations among weights to extract the network backbone by considering the relevant edges at all the scales present in the system. The methodology preserves an edge whenever its intensity is a statistically not compatible with respect to a null hypothesis of uniform randomness for at least one of the two nodes the edge is incident to, which ensures that small nodes in terms of strength are not neglected. As result, the disparity filter reduces the number of edges in the original network significantly keeping, at the same time, almost all the weight and a large fraction of nodes. As well, this filter preserves the cut-off of the degree distribution, the form of the weight distribution, and the clustering coefficient.

This project is similar to, albeit providing different features than:

Implementation Details

If you are new to multiscale backbone analysis, think of this as analogous to centrality calculated on the edges of a graph rather than its nodes. In other words, consider this as a "dual" of the problem typically faced in social networks. By managing cuts through a process of iterating between measures of centrality and disparity respectively, one can scale a large, noisy graph into something more amenable for work with ontology -- especially as a way to clean up input for neural networks.

The code expects each node to have a required label attribute, which is a string unique within all of the nodes in the graph. Each edge is expected to have a weight attribute, a decimal in the range of [0.0, 1.0] which represents the relative weight of that edge's relationship.

After calculating the disparity metrics, each node get assigned a strength attribute, which is the sum of its outbound edges' weights. Each edge gets assigned the following attributes:

  • norm_weight: ratio of the edge[weight]/source_node[strength]
  • alpha: disparity alpha metric
  • alpha_ptile: percentile for alpha, compared across the graph

One important distinction is that this implementation comes from work in NLP and ontology, where graphs tend to become relatively "noisy" and there are many graphs generated through automation which need to be filtered. NLP applications had tended to reuse graph techniques from social graph analysis, such as connected components, centrality, cuts based on the relative degree of nodes -- while applications which combine NLP plus ontology tend to need information based on the edges.

In particular, this implementation focuses on directed graphs, and uses quantile analysis to adjust graph cuts. The original paper showed how to make cuts using the raw alpha values, which depended on manual (human) decisions. However, that is less than ideal for applications in machine learning, where more automation is typically required. Use of quantiles allows for a form of "normalization" for threshold values, so that cuts can be performed more consistently when automated.

This implementation also integrates support for working with neighborhood attention sets (NES) and other mechanisms for working with semantics and ontologies.

Getting Started

python3 -m venv venv
source venv/bin/activate

python3 -m pip install -U pip wheel
python3 -m pip install -r requirements.txt

Example

The running default main() function:

python3 disparity.py

This will:

  1. generate a random graph (using a seed) of 100 nodes, each with < 10 edges
  2. calculate the significance (alpha) for the disparity filter
  3. calculate quantiles for alpha
  4. cut edges below the 50th percentile (median) for alpha
  5. cut nodes with degree < 2
graph: 100 nodes 489 edges

	ptile	alpha
	0.00	0.0000
	0.10	0.0305
	0.20	0.0624
	0.30	0.1027
	0.40	0.1512
	0.50	0.2159
	0.60	0.3222
	0.70	0.4821
	0.80	0.7102
	0.90	0.9998

filter: percentile 0.50, min alpha 0.2159, min degree 2

graph: 89 nodes 235 edges

In practice, adjust those thresholds as needed before making a cut on a graph. This mechanism provides a "dial" to adjust the scale of the multiscale backbone of the graph.

Contributors

Please use the Issues section to ask questions or report any problems.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].