All Projects → dsg-uwaterloo → graphsurge

dsg-uwaterloo / graphsurge

Licence: Apache-2.0 license
Graphs analytics on collections of views!

Programming Languages

rust
11053 projects
shell
77523 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to graphsurge

overview
🤖 A collection of tools to make your Angular views more modular, scalable, and maintainable
Stars: ✭ 76 (+245.45%)
Mutual labels:  views
SuiteSparseGraphBLAS.jl
Sparse, General Linear Algebra for Graphs!
Stars: ✭ 79 (+259.09%)
Mutual labels:  graph-analytics
tododjangoccb
A todo application with django web framework with class based views and ajax modal crud
Stars: ✭ 32 (+45.45%)
Mutual labels:  views
Django Braces
Reusable, generic mixins for Django
Stars: ✭ 1,756 (+7881.82%)
Mutual labels:  views
yii2-render-many
Trait for Yii Framework 2
Stars: ✭ 14 (-36.36%)
Mutual labels:  views
gardenia
GARDENIA: Graph Analytics Repository for Designing Efficient Next-generation Accelerators
Stars: ✭ 22 (+0%)
Mutual labels:  graph-analytics
django-speedinfo
Django views profiler for small projects
Stars: ✭ 55 (+150%)
Mutual labels:  views
pgx-samples
Applications using Parallel Graph AnalytiX (PGX) from Oracle Labs
Stars: ✭ 39 (+77.27%)
Mutual labels:  graph-analytics
MySQL-cheatsheet
Cheatsheet for MySQL
Stars: ✭ 43 (+95.45%)
Mutual labels:  views
scarf
Toolkit for highly memory efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas scale datasets with millions of cells on laptop.
Stars: ✭ 54 (+145.45%)
Mutual labels:  graph-analytics
laravel-compile-views
Missing view:compile command for laravel [ABANDONED]
Stars: ✭ 20 (-9.09%)
Mutual labels:  views
SceneryView
🏜 The scenery is unique here~ 自定义View,圆、三角形、云朵,平移及旋转动画。
Stars: ✭ 30 (+36.36%)
Mutual labels:  views
rack-component
Handle HTTP requests with modular, React-style components, in any Rack app
Stars: ✭ 68 (+209.09%)
Mutual labels:  views
Django
The Web framework for perfectionists with deadlines.
Stars: ✭ 61,277 (+278431.82%)
Mutual labels:  views
ci-theme
UPDATED: Now themes are independent from application with the use of Actions and Filters :D .. The branch WP-Like is not available for public usage but you may contact me if you want a copy of it.
Stars: ✭ 20 (-9.09%)
Mutual labels:  views
simpleDjangoProject
simpleDjangoProject
Stars: ✭ 30 (+36.36%)
Mutual labels:  views
Tigr
Transforming Graphs for Efficient Irregular Graph Processing on GPUs
Stars: ✭ 34 (+54.55%)
Mutual labels:  graph-analytics
snabberb
A simple component view framework for Ruby Opal based on Snabbdom
Stars: ✭ 41 (+86.36%)
Mutual labels:  views
GraphScope
🔨 🍇 💻 🚀 GraphScope: A One-Stop Large-Scale Graph Computing System from Alibaba 来自阿里巴巴的一站式大规模图计算系统 图分析 图查询 图机器学习
Stars: ✭ 1,899 (+8531.82%)
Mutual labels:  graph-analytics
laravel-counters
Counters Management for laravel project.
Stars: ✭ 43 (+95.45%)
Mutual labels:  views

Graphsurge: Graph Analytics on View Collections using Differential Computations

Continuous integration

Graphsurge is a new system for performing analytical computations on multiple snapshots or views of large-scale static property graphs. Graphsurge allows users to create view collections, a set of related views of a graph created by applying filter predicates on node and edge properties, and run analytical computations on all the views of a collection efficiently.

Graphsurge is built on top of Timely Dataflow and Differential Dataflow, which provides two huge benefits:

  • Differential Dataflow can incrementally maintain the results for any general computation, including cyclic or iterative computations (which include many graph algorithms such as Connected Components. Analytical computations in Graphsurge are expressed using Differential operators and enables reusing computation results across the views of a collection instead of running computations from scratch on each view. This results in huge savings on the total runtime.
  • We use the Timely execution engine to seamlessly scale both the materialization of view collections and running analytical computations to a distributed environment, similar to using other execution frameworks such as Spark or Flink.

Graphsurge stores view collections using a form of delta encoding, where the data for a view GVi represent its difference with the previous view GVi-1. This representation can also be directly used as inputs to Differential Dataflow computations.

In general, the runtime of a black-box differential computation (such as the user-defined computations in Graphsurge) is correlated with the total number of diffs of a view collection. Graphsurge enables 2 key optimizations based on this observation:

  • Collection Ordering: The total number of diffs of a view collection depends on the order the views (similar views placed next to each other will generate less diffs) and we want to reorder a given set of views to get the lowest number of diffs. This Collection Ordering Problem is related to the Consecutive Block Minimization Problem, which is NP-Hard! Graphsurge solves this problem using a constant-factor approximation algorithm (resulting in up to 10x less diffs in our experiments).

  • Adaptive Collection Splitting: Maintaining computation results unsurprisingly implies an overhead for Differential Dataflow, as it needs to check the entire history of a key to determine the effect of a new update. This overhead is especially large for cases where the number of diffs of a view are high, or for computations (like PageRank) which results in a large number of output changes even for small number of input updates. In such cases, it is faster to run the computation on a view from scratch instead of trying to reuse results from previous views.

    Graphsurge keeps track of the correlation between the number of the diffs and the actual computation time when running differentially and also when rerunning from scratch. It uses a linear regression model to adaptively decide at runtime to split the view collection at the point where rerunning from scratch is predicted to be faster than to continue running differentially.

More details on our techniques and experimental results can be found in our paper.

Using Graphsurge

Graphsurge is written in Rust. To run the Graphsurge cli, download and build the binary:

$ git clone https://github.com/dsg-uwaterloo/graphsurge && cd graphsurge
$ cargo build --release
$ ./target/bin/graphsurge

Set the number of worker threads and process id:

graphsurge> SET THREADS 4 AND PROCESS_ID 0;

Load a graph:

graphsurge> LOAD GRAPH WITH
    VERTICES FROM 'data/small_properties/vertices.txt' and
    EDGES FROM 'data/small_properties/edges.txt'
    COMMENT '#';

Create a view collection:

graphsurge> CREATE VIEW COLLECTION Years WHERE
    [year <= 2000 and u.country = 'canada' and v.country = 'canada'],
    [year <= 2005 and u.country = 'canada' and v.country = 'canada'],
    [year <= 2010 and u.country = 'canada' and v.country = 'canada'];

Run computations:

$ mkdir bfs_results
graphsurge> RUN COMPUTATION wcc ON COLLECTION Years SAVE RESULTS TO 'bfs_results';

Running in a distributed environment:

To run Graphsurge on multiple machines, say on 2 hosts server1 and server2, start Graphsurge and set the process ids:

# On server1
graphsurge> SET THREADS 32 AND PROCESS_ID 0;
# On server2
graphsurge> SET THREADS 32 AND PROCESS_ID 1;

Then run the same queries on both of them. Make sure that server1 and server2 can access each other at the specified port.

graphsurge> LOAD GRAPH WITH
    VERTICES FROM 'data/small_properties/vertices.txt' and
    EDGES FROM 'data/small_properties/edges.txt'
    COMMENT '#';
graphsurge> CREATE VIEW COLLECTION Years WHERE
    [year <= 2000 and u.country = 'canada' and v.country = 'canada'],
    [year <= 2005 and u.country = 'canada' and v.country = 'canada'],
    [year <= 2010 and u.country = 'canada' and v.country = 'canada']
    HOSTS 'server1:9000' 'server2:9000';
graphsurge> RUN ARRANGED_DIFFERENTIAL COMPUTATION wcc on COLLECTION Years
    HOSTS 'server1:9000' 'server2:9000';

The same process can be repeated for additional hosts machines.

Writing new computations:

Graphsurge already has implementations for a set of common graph algorithms. New computations can be written using the Analytics Computation API. You can see examples of how to use the API for bfs and scc.

Check the experiments folder for examples on how to use Graphsurge.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].