All Projects → GraphIt-DSL → Graphit

GraphIt-DSL / Graphit

Licence: other
GraphIt - A High-Performance Domain Specific Language for Graph Analytics

Projects that are alternatives of or similar to Graphit

opensbli
A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures.
Stars: ✭ 56 (-77.95%)
Mutual labels:  parallel-computing, high-performance-computing, code-generation
Awesome Tensor Compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
Stars: ✭ 490 (+92.91%)
Mutual labels:  compiler, code-generation, high-performance-computing
Tiramisu
A polyhedral compiler for expressing fast and portable data parallel algorithms
Stars: ✭ 685 (+169.69%)
Mutual labels:  compiler, code-generation
Idris Elixir
A code-generator for Idris that targets Elixir
Stars: ✭ 56 (-77.95%)
Mutual labels:  compiler, code-generation
Awesome Machine Learning In Compilers
Must read research papers and links to tools and datasets that are related to using machine learning for compilers and systems optimisation
Stars: ✭ 168 (-33.86%)
Mutual labels:  compiler, parallel-computing
Mini C
Dr Strangehack, or: how to write a self-hosting C compiler in 10 hours
Stars: ✭ 372 (+46.46%)
Mutual labels:  compiler, code-generation
Turboscript
Super charged typed JavaScript dialect for parallel programming which compiles to WebAssembly
Stars: ✭ 487 (+91.73%)
Mutual labels:  compiler, parallel-computing
Reading
A list of computer-science readings I recommend
Stars: ✭ 1,919 (+655.51%)
Mutual labels:  compiler, parallel-computing
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-40.16%)
Mutual labels:  parallel-computing, high-performance-computing
linnea
Linnea is an experimental tool for the automatic generation of optimized code for linear algebra problems.
Stars: ✭ 60 (-76.38%)
Mutual labels:  high-performance-computing, code-generation
t8code
Parallel algorithms and data structures for tree-based AMR with arbitrary element shapes.
Stars: ✭ 37 (-85.43%)
Mutual labels:  parallel-computing, high-performance-computing
ParallelUtilities.jl
Fast and easy parallel mapreduce on HPC clusters
Stars: ✭ 28 (-88.98%)
Mutual labels:  parallel-computing, high-performance-computing
Rascal
The implementation of the Rascal meta-programming language (including interpreter, type checker, parser generator, compiler and JVM based run-time system)
Stars: ✭ 284 (+11.81%)
Mutual labels:  compiler, code-generation
Feelpp
💎 Feel++: Finite Element Embedded Language and Library in C++
Stars: ✭ 229 (-9.84%)
Mutual labels:  parallel-computing, high-performance-computing
Sundials
SUNDIALS is a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. This is a mirror of current releases, and development will move here eventually. Pull requests are welcome for bug fixes and minor changes.
Stars: ✭ 194 (-23.62%)
Mutual labels:  parallel-computing, high-performance-computing
Fcc
Fedjmike's C Compiler
Stars: ✭ 101 (-60.24%)
Mutual labels:  compiler, code-generation
Dash
DASH, the C++ Template Library for Distributed Data Structures with Support for Hierarchical Locality for HPC and Data-Driven Science
Stars: ✭ 134 (-47.24%)
Mutual labels:  parallel-computing, high-performance-computing
Opencoarrays
A parallel application binary interface for Fortran 2018 compilers.
Stars: ✭ 151 (-40.55%)
Mutual labels:  parallel-computing, high-performance-computing
course
高性能并行编程与优化 - 课件
Stars: ✭ 1,610 (+533.86%)
Mutual labels:  parallel-computing, high-performance-computing
PSyclone
Domain-specific compiler for Finite Difference/Volume/Element Earth-system models in Fortran
Stars: ✭ 67 (-73.62%)
Mutual labels:  parallel-computing, high-performance-computing

GraphIt Domain Specific Language and Compiler Build Status

GraphIt is a high-performance Graph DSL. Our website has more detailed tutorials and documentations for the language.

Dependencies

To build GraphIt you need to install CMake 3.5.0 or greater. This dependency alone will allow you to build GraphIt and generate high-performance C++ implementations. Currently, we support both Python 2.7 and Python 3 for the end-to-end tests.

To compile the generated C++ implementations with support for parallelism, you need CILK and OPENMP. One easy way to set up both CILK and OPENMP is to use intel parallel compiler (icpc). The compiler is free for students. There are also open source CILK (g++ >= 5.3.0 with support for Cilk Plus), and OPENMP implementations.

(Optional) To use NUMA optimizations on multi-socket machines, libnuma needs to be installed (on Ubuntu, sudo apt-get install libnuma-dev). We do note, a good number of optimized implementations do not require enabling NUMA optimizations. You can give GraphIt a try even if you do not have libnuma installed.

(Optional) To use the python bindings for GraphIt, you need to install the following packages -

  • python3 (version >= 3.5)
  • scipy (can be installed using pip3)
  • pybind11 (can be installed using pip3)

If you are a mac user who recently upgraded to macOS Mojave, and are having issues with unable to find header files "string.h" or "wchar.h" when using cmake, c++ compiler, or the python scripts that uses the c++ compilers, maybe this post will help. As always, let us know if you have any issues with building and using GraphIt.

Build Graphit

To perform an out-of-tree build of Graphit do:

After you have cloned the directory:

    cd graphit
    mkdir build
    cd build
    cmake ..
    make

Currently, we do require the build directory to be in the root project directory for some unit tests to work. To run the C++ test suite do (all tests should pass):

    cd build/bin
    ./graphit_test

To run the Python end-to-end test suite:

Start from the root directory and change to the build directory

(All tests would pass, but some would generate error messages from the g++ compiler. This is expected.) The project tests should support both Python 2.x and Python 3.x.

    cd build
    python python_tests/test.py
    python python_tests/test_with_schedules.py

(Optional) To test the python bindings, the following extra commands can be run from the GraphIt root directory. You do NOT need this for compiling and running stand alone regular GraphIt programs.

    cd build
    export PYTHONPATH=.
    python3 python_tests/pybind_test.py

When running test_with_schedules.py, commands used for compiling GraphIt files, compiling the generated C++ file, and running the compiled binary file are printed. You can reproduce each test and examine the generated C++ files by typing the printed commands in the shell (make sure you are in the build/bin directory). You can also selectively enable a specific test using the TestSuite commands. We provide examples of enabling a subset of Python tests in the comments of the main function in test_with_schedules.py.

Note when running test.py, some error message may be printed during the run that are expected. We have expected to fail tests that print certain error messages. Please check the final output. test_with_schedules.py might take a few minutes to run.

Compile GraphIt Programs

GraphIt compiler currently generates a C++ output file from the .gt input GraphIt programs. To compile an input GraphIt file with schedules in the same file (assuming the build directory is in the root project directory).

    cd build/bin
    python graphitc.py -f ../../test/input_with_schedules/pagerank_benchmark.gt -o test.cpp

To compile an input algorithm file and another separate schedule file (some of the test files have hardcoded paths to test inputs, be sure to modify that or change the directory you run the compiled files)

The example below compiles the algorithm file (../../test/input/pagerank.gt), with a separate schedule file (../../test/input_with_schedules/pagerank_pull_parallel.gt)

   cd build/bin
   python graphitc.py -a ../../test/input/pagerank_with_filename_arg.gt -f ../../test/input_with_schedules/pagerank_pull_parallel.gt -o test.cpp

Compile and Run Generated C++ Programs

To compile a serial version, you can use reguar g++ with support of c++14 standard to compile the generated C++ file (assuming it is named test.cpp).

    # assuming you are still in the bin directory under build/bin. If not, just do cd build/bin from the root of the directory
    g++ -std=c++14 -I ../../src/runtime_lib/ -O3 test.cpp  -o test
    ./test ../../test/graphs/4.el

To compile a parallel version of the c++ program, you will need both CILK and OPENMP. OPENMP is required for programs using NUMA optimized schedule (configApplyNUMA enabled) and static parallel optimizations (static-vertex-parallel option in configApplyParallelization). All other programs can be compiled with CILK. For analyzing large graphs (e.g., twitter, friendster, webgraph) on NUMA machines, numacl -i all improves the parallel performance. For smaller graphs, such as LiveJournal and Road graphs, not using numactl can be faster.

    # assuming you are still in the bin directory under build/bin. If not, just do cd build/bin from the root of the directory

    # compile and run with CILK
      # icpc
      icpc -std=c++14 -I ../../src/runtime_lib/ -DCILK -O3 test.cpp -o test

      # g++ (gcc) with cilk support
      g++ -std=c++14 -I ../../src/runtime_lib/ -DCILK -fcilkplus -lcilkrts -O3 test.cpp -o test

      # to run the compiled binary on a small test graph, 4.el
      numactl -i all ./test ../../test/graphs/4.el

    # compile and run with OPENMP
      # icpc
      icpc -std=c++14 -I ../../src/runtime_lib/ -DOPENMP -qopenmp -O3 test.cpp -o test

      # g++ (gcc) with openmp support
      g++ -std=c++14 -I ../../src/runtime_lib/ -DOPENMP -fopenmp -O3 test.cpp -o test

      # to run the compiled binary on a small test graph, 4.el
      numactl -i all ./test ../../test/graphs/4.el

    # compile and run with NUMA optimizations (only works with OPENMP and needs libnuma).
      # Sometimes -lnuma will have to come after the test.cpp file
      # icpc
      icpc -std=c++14 -I ../../src/runtime_lib/ -DOPENMP -DNUMA -qopenmp  -O3 test.cpp -lnuma -o test

      # g++ (gcc)
      g++ -std=c++14 -I ../../src/runtime_lib/ -DOPENMP -DNUMA -fopenmp -O3 test.cpp -lnuma -o test

      # to run with NUMA enabled on a small test graph, 4.el
      OMP_PLACES=sockets ./test ../../test/graphs/4.el

You should see some running times printed. The pagerank example files require a command-line argument for the input graph file. If you see a segfault, then it probably means you did not specify an input graph.

Evaluate GraphIt's Performance

The algorithms we used for benchmarking, such as PageRank, PageRankDelta, BFS, Connected Components, Single Source Shortest Paths and Collaborative Filtering are in the apps directory. These files include ONLY the algorithm and NO schedules. You need to use the appropriate schedules for the specific algorithm and input graph to get the best performance.

Detailed instructions for replicating the OOPSLA 2018 GraphIt paper performance is here. In the OOPSLA paper (Table 8), we described the schedules used for each algorithm on each graph on a dual socket system with Intel Xeon E5-2695 v3 CPUs with 12 cores each for a total of 24 cores and 48 hyper-threads. The system has 128GB of DDR3-1600 memory and 30 MB last level cache on each socket, and runs with Transparent Huge Pages (THP) enabled. The best schedule for a different machine can be different. You might need to try a few different set of schedules for the best performance.

Detailed instructions for replicating the results in our CGO 2020 Optimizing Ordered Graph Algorithms with GraphIt paper is here. We collected the results from the same hardware platform as our OOPSLA 18 paper.

If you just want to replicate the performance of our bucket fusion optimization for SSSP with delta stepping proposed in the CGO 2020 paper, you can also just use the GAP benchmark suite. We have since integrated our optimization into the GAP benchmark suite. You can find a bucket fusion optimized SSSP in the GAP repo here.

A Note on the Performance of SSSP We used weight 1 for all the weighted graphs in the original OOPSLA 2018 paper for convenience. We later realized that this decision impacted the performance of SSSP, especially on the road networks. Since then, we have used random weights for the social networks and original weights for the road networks in the CGO 2020 paper. Please refer to our CGO 2020 paper for the performance of SSSP with BellmanFord and DeltaStepping. We have also improved the performance of SSSP on road networks significantly with our new bucket fusion optimization. Please see this closed issue for more details.

In the schedules shown in Table 8 of the OOPSLA paper, the keyword ’Program’ and the continuation symbol ’->’ are omitted. ’ca’ is the abbreviation for ’configApply’. Note that configApplyNumSSG uses an integer parameter (X) which is dependent on the graph size and the cache size of a system. For example, the complete schedule used for CC on Twitter graph is the following (X is tuned to the cache size)

schedule:
    program->configApplyDirection("s1", "SparsePush-DensePull")->configApplyParallelization("s1", "dynamic-vertex-parallel")->configApplyDenseVertexSet("s1","bitvector", "src-vertexset", "DensePull");
    program->configApplyNumSSG("s1", "fixed-vertex-count",  X, "DensePull");

The test/input and test/input_with_schedules directories contain many examples of the algorithm and schedule files. Use them as references when writing your own schedule.

We provide more detailed instructions on evaluating the code generation and performance capability of GraphIt in graphit/graphit_eval/GraphIt_Evaluation_Guide.md. In the guide, we provide instructions for using a series of scripts that make it easier for people to evaluate GraphIt.

Input Graph Formats

GraphIt reuses GAPBS input formats. Specifically, we have tested with edge list file (.el), weighted edge list file (.wel), binary edge list (.sg), and weighted binary edge list (.wsg) formats. Users can use the converters in GAPBS (GAPBS/src/converter.cc) to convert other graph formats into the supported formats, or convert weighted and unweighted edge list files into their respective binary formats. We have provided sample input graph files in the graphit/test/graphs/ directory. The python tests use the sample input files.

Autotuning GraphIt Schedules

Please refer to README.md in graphit/autotune for more details. The autotuner is still somewhat experimental. Please read the instructions carefully before trying it out.

Publications

  • GraphIt-A High-Performance DSL for Graph Analytics OOPSLA 2018
  • Optimizing ordered graph algorithms with GraphIt CGO 2020
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].