All Projects → lqhl → PowerWalk

lqhl / PowerWalk

Licence: other
Personalized PageRank (PPR) on GraphLab PowerGraph

Projects that are alternatives of or similar to PowerWalk

spotify-song-recommender
A Spotify song recommendation engine built with the power of graph analytics.
Stars: ✭ 34 (+142.86%)
Mutual labels:  graph-algorithms
NGCF-PyTorch
PyTorch Implementation for Neural Graph Collaborative Filtering
Stars: ✭ 200 (+1328.57%)
Mutual labels:  graph-algorithms
graphs
Graph algorithms written in Go
Stars: ✭ 60 (+328.57%)
Mutual labels:  graph-algorithms
InterviewPrep
A repository containing link of good interview questions
Stars: ✭ 54 (+285.71%)
Mutual labels:  graph-algorithms
HuaweiCodeCraft2020
2020华为软件精英挑战赛
Stars: ✭ 14 (+0%)
Mutual labels:  graph-algorithms
Grafatko
An app for creating and visualizing graphs and graph-related algorithms.
Stars: ✭ 22 (+57.14%)
Mutual labels:  graph-algorithms
Graph Data Science
Source code for the Neo4j Graph Data Science library of graph algorithms.
Stars: ✭ 251 (+1692.86%)
Mutual labels:  graph-algorithms
PathFinder-Visualization
📟 React and p5, maze generation and path finding visualization
Stars: ✭ 12 (-14.29%)
Mutual labels:  graph-algorithms
Advanced-Shortest-Paths-Algorithms
Java Code for Contraction Hierarchies Algorithm, A-Star Algorithm and Bidirectional Dijkstra Algorithm. Tested and Verified Code.
Stars: ✭ 63 (+350%)
Mutual labels:  graph-algorithms
PGD
A Parallel Graphlet Decomposition Library for Large Graphs
Stars: ✭ 68 (+385.71%)
Mutual labels:  graph-algorithms
RioGNN
Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks
Stars: ✭ 46 (+228.57%)
Mutual labels:  graph-algorithms
blossom
Edmonds's blossom algorithm for maximum weight matching in undirected graphs
Stars: ✭ 16 (+14.29%)
Mutual labels:  graph-algorithms
pathfinding-visualizer
Website built using React Framework for visualizing Pathfinding and Maze Generation Algorithms.
Stars: ✭ 33 (+135.71%)
Mutual labels:  random-walk
nodegraph
NodeGraph - A simple directed graph with visualization UI.
Stars: ✭ 21 (+50%)
Mutual labels:  graph-algorithms
dingo
A python library for metabolic networks sampling and analysis
Stars: ✭ 29 (+107.14%)
Mutual labels:  random-walk
Pygraphblas
GraphBLAS for Python
Stars: ✭ 252 (+1700%)
Mutual labels:  graph-algorithms
edgebundle
R package implementing edge bundling algorithms
Stars: ✭ 100 (+614.29%)
Mutual labels:  graph-algorithms
jsgraph
Deprecated: Use the @encapsule/arccore package that includes the graph library
Stars: ✭ 42 (+200%)
Mutual labels:  graph-algorithms
DGFraud-TF2
A Deep Graph-based Toolbox for Fraud Detection in TensorFlow 2.X
Stars: ✭ 84 (+500%)
Mutual labels:  graph-algorithms
rustgraphblas
rust-library to wrap GraphBLAS.h
Stars: ✭ 23 (+64.29%)
Mutual labels:  graph-algorithms

PowerWalk

Most methods for Personalized PageRank (PPR) precompute and store all accurate PPR vectors, and at query time, return the ones of interest directly. However, the storage and computation of all accurate PPR vectors can be prohibitive for large graphs, especially in caching them in memory for real-time online querying. We propose a distributed framework, PowerWalk, that strikes a better balance between offline indexing and online querying. The offline indexing attains a fingerprint of the PPR vector of each vertex by performing billions of "short" random walks in parallel across a cluster of machines. We prove that our indexing method has an exponential convergence, achieving the same precision with previous methods using a much smaller number of random walks. At query time, the new PPR vector is composed by a linear combination of related fingerprints, in a highly efficient vertex-centric decomposition manner. Interestingly, the resulting PPR vector is much more accurate than its offline counterpart because it actually uses more random walks in its estimation. More importantly, we show that such decomposition for a batch of queries can be very efficiently processed using a shared decomposition. Our implementation takes advantage of advanced distributed graph engines and it outperforms the state-of-the-art algorithms by orders of magnitude. Particularly, it responses to tens of thousands of queries on graphs with billions of edges in just a few seconds.

For the detailed description of our algorithm please refer to our paper on CIKM'16:

PowerWalk: Scalable Personalized PageRank via Random Walks with Vertex-Centric Decomposition
Qin Liu, Zhenguo Li, John C.S. Lui, Jiefeng Cheng.
The 25th ACM International Conference on Information and Knowledge Management (CIKM), 2016

This repository contains our implementation of the online querying algorithm built atop a modified version of GraphLab PowerGraph. Our code is placed in apps/ms-ppr.

Our implementation of offline indexing described in our paper is built on VENUS which is a disk-based graph processing system developed by me with Huawei Noah's Ark Lab. Unfortunately, VENUS cannot not be open sourced. In this repository, we include an implementation of our offline indexing algorithm atop PowerGraph. However, the offline indexing on PowerGraph is much slower than the version on VENUS. To solve this problem, I plan to provide a binary version, or migrate my program to GraphChi in future.

Compiling

For the detailed compiling process please refer to PowerGraph's readme file. Here is a simple example:

./configure
cd release/apps/ms-ppr/
make -j4

Running

Input file edges.txt:

1 2  # an edge from vertex 1 to 2
3 4  # an edge from vertex 3 to 4

Input file sources.txt:

3  # number of querying vertices
1
2
4

Online query without offline indexing:

./query-flow-v2 --graph /path/to/edges.txt \  # edge list
    --threshold 0.00001 --niters 8 \          # threshold and number of iterations
    --sources_file /path/to/sources.txt \     # querying vertices
    --num_sources 2 \                         # number of querying vertices (can be smaller than the one in the input file)
    --no_index true                           # run without offline index

Offline indexing:

./rw-fullpath --graph /path/to/edges.txt \  # edge list
    --niters 10 \                           # number of iterations
    --R 1 \                                 # number of random walks from each vertex
    --bin_prefix /path/to/ppr.index         # path to output graph and index

Online query with offline index:

./query-flow-v2 --graph /path/to/ppr.index \  # path to graph and index
    --sources_file /path/to/sources.txt \     # querying vertices
    --threshold 0.00001 --niters 6 \          # threshold and number of iterations
    --num_sources 2                           # number of querying vertices (can be smaller than the one in the input file)
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].