
LLNL / ravel

Licence: other
Ravel MPI trace visualization tool

Programming Languages

C++
36643 projects - #6 most used programming language

Projects that are alternatives of or similar to ravel

Hpcinfo
Information about many aspects of high-performance computing. Wiki content moved to ~/docs.
Stars: ✭ 171 (+557.69%)
Mutual labels:  mpi
Batch Shipyard
Simplify HPC and Batch workloads on Azure
Stars: ✭ 240 (+823.08%)
Mutual labels:  mpi
GenomicsDB
Highly performant data storage in C++ for importing, querying and transforming variant data with C/C++/Java/Spark bindings. Used in gatk4.
Stars: ✭ 77 (+196.15%)
Mutual labels:  mpi
Primecount
🚀 Fast prime counting function implementations
Stars: ✭ 193 (+642.31%)
Mutual labels:  mpi
Abyss
🔬 Assemble large genomes using short reads
Stars: ✭ 219 (+742.31%)
Mutual labels:  mpi
azurehpc
This repository provides easy automation scripts for building an HPC environment in Azure. It also includes examples to build an e2e environment and run some of the key HPC benchmarks and applications.
Stars: ✭ 102 (+292.31%)
Mutual labels:  mpi
Tomsfastmath
TomsFastMath is a fast public domain, open source, large integer arithmetic library written in portable ISO C.
Stars: ✭ 169 (+550%)
Mutual labels:  mpi
t8code
Parallel algorithms and data structures for tree-based AMR with arbitrary element shapes.
Stars: ✭ 37 (+42.31%)
Mutual labels:  mpi
Dmtcp
DMTCP: Distributed MultiThreaded CheckPointing
Stars: ✭ 229 (+780.77%)
Mutual labels:  mpi
api-spec
API Specifications
Stars: ✭ 30 (+15.38%)
Mutual labels:  mpi
Timemory
Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template API is essentially a framework for creating tools: it is designed to provide a unifying interface for recording various performance measurements alongside data logging and interfaces to other tools.
Stars: ✭ 192 (+638.46%)
Mutual labels:  mpi
Mpi.jl
MPI wrappers for Julia
Stars: ✭ 197 (+657.69%)
Mutual labels:  mpi
hp2p
Heavy Peer To Peer: an MPI-based benchmark for network diagnostics
Stars: ✭ 17 (-34.62%)
Mutual labels:  mpi
Mpi Operator
Kubernetes Operator for Allreduce-style Distributed Training
Stars: ✭ 190 (+630.77%)
Mutual labels:  mpi
ParMmg
Distributed parallelization of 3D volume mesh adaptation
Stars: ✭ 19 (-26.92%)
Mutual labels:  mpi
Libgrape Lite
🍇 A C++ library for parallel graph processing 🍇
Stars: ✭ 169 (+550%)
Mutual labels:  mpi
Aff3ct
A fast simulator and a library dedicated to channel coding.
Stars: ✭ 240 (+823.08%)
Mutual labels:  mpi
Theano-MPI
MPI Parallel framework for training deep learning models built in Theano
Stars: ✭ 55 (+111.54%)
Mutual labels:  mpi
az-hop
The Azure HPC On-Demand Platform provides an HPC Cluster Ready solution
Stars: ✭ 33 (+26.92%)
Mutual labels:  mpi
Foundations of HPC 2021
This repository collects the materials from the course "Foundations of HPC", 2021, at the Data Science and Scientific Computing Department, University of Trieste
Stars: ✭ 22 (-15.38%)
Mutual labels:  mpi

Ravel

Ravel is a trace visualization tool for MPI with recent experimental support for Charm++. Ravel is unique in that it shows not only physical timelines, but also logical ones structured to better capture the intended organization of communication operations. Ravel calculates logical structure from Open Trace Format or Charm++ Projections logs and presents the results using multiple coordinated views.

In logical time, all operations are colored by a metric. The default metric for MPI is lateness, which measures the difference in exit time between an operation and its peers at the same logical timestep.
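
As an illustration only (this is not Ravel's implementation; the type and function names below are hypothetical), lateness at a logical step can be computed by grouping events by their assigned step and subtracting the earliest exit time in each group:

// Hypothetical sketch: lateness = an event's exit time minus the earliest
// exit time among its peers at the same logical step.
#include <map>
#include <vector>

struct Event {
    int step;          // logical timestep assigned during stepping
    double exit_time;  // physical exit timestamp of the operation
    double lateness;   // metric value to fill in
};

void computeLateness(std::vector<Event>& events) {
    // Earliest exit time observed at each logical step.
    std::map<int, double> earliest;
    for (const Event& e : events) {
        auto it = earliest.find(e.step);
        if (it == earliest.end() || e.exit_time < it->second)
            earliest[e.step] = e.exit_time;
    }
    // Lateness is the gap to the earliest peer at the same step.
    for (Event& e : events)
        e.lateness = e.exit_time - earliest[e.step];
}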

Ravel Logical and Physical Timelines

Installation

Ravel depends on:

To install:

$ git clone https://github.com/scalability-llnl/ravel.git
$ mkdir ravel/build
$ cd ravel/build
$ cmake -DCMAKE_INSTALL_PREFIX=/path/to/install/directory ..
$ make
$ make install

If a dependency is not found, add its install directory to the CMAKE_PREFIX_PATH environment variable.
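
For example (the dependency path below is only a placeholder), you can export the variable and then re-run CMake:

$ export CMAKE_PREFIX_PATH=/path/to/dependency/install:$CMAKE_PREFIX_PATH
$ cmake -DCMAKE_INSTALL_PREFIX=/path/to/install/directory ..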

Usage

Opening a Trace

Before opening a trace, check your settings under Options->OTF Importing. These options affect the logical organization that Ravel determines. Once you are happy with your options, use File->Open Trace and navigate to your .otf, .otf2, or .sts file.

Partitions

Ravel partitions the trace into fine-grained communication phases -- sets of communication operations that must belong together -- and imposes a happened-before ordering between these partitions to better reflect how developers think of the phases separately.

  • Automatically determine partitions: use happened-before relationships and the following options:
    • use Waitall heuristic: OTF version 1 only; groups all uninterrupted send operations before each Waitall into the same phase.
    • merge Partitions by call tree: Merges communication operations up the call stack until reaching a call containing multiple such operations. MPI only.
    • merge Partitions to complete Leaps: Avoids sparse partitions by forcing each rank to be active at each distance in the phase DAG. Useful for bulk synchronous codes. MPI only.
      • skip Leaps that cannot be merged: Relaxes the leap merge when it cannot find a next merge.
    • merge across global steps: This merge happens after stepping, so it does not affect the logical structure, but groups MPI ranks that cover the same logical step. MPI only.
    • Charm++ break functions: Force the breaking of partitions at the given comma-separated list of Charm++ entry methods.
  • Partition at function breaks: Use if you know your phases are contained in a given function. List the function.

Other Options

  • Matching send/recv must have the same message size: Requires that matching sends and receives report the same message size. Uncheck this for Scalasca-generated OTF2.
  • Idealized order of receives: Change the order of receives from their true physical time order to an idealized one in each phase. We recommend this for Charm++. It is not compatible with clustering.
  • Advanced stepping within a partition: Align sends based on happened-before structure rather than as early as possible. MPI only.
  • Coalesce Isends: Groups neighboring MPI_Isends into a single operation which may send to multiple receive operations. We recommend this option as a default for all MPI traces.
  • Cluster processes: Shows a cluster view that clusters the processes by the active metric. This is useful for large process counts. MPI only.
    • Seed: Set seed for repeatable clustering.

Navigating Traces

The three timeline views support linked panning, zooming and selection. The overview shows the total metric value over time steps for the whole trace. Clicking and dragging in this view will select a span of timesteps in the other views.

Navigation          Control
Pan                 Left-click drag
Zoom in time        Mouse wheel
Zoom in processes   Shift + Mouse wheel
Zoom to rectangle   Right-click drag rectangle
Select operation    Right-click operation
Tool tips           Hover

The cluster view has a slider that changes the size of the neighborhood shown in the upper part of the view. The lower part of the view shows the clusters. Left-click a cluster to divide it into its children. Click on dendrogram nodes to collapse clusters. The dendrogram pertains to the left-most visible partition; clustering currently shows only the first partition rather than all partitions.

Saving Traces

All traces are saved in OTF2 and include only the information from the original trace that is used by Ravel. In addition, communication-related operations used for logical structure have an OTF2_AttributeList associated with their Leave events. These lists include phase and step values defining the logical structure used by Ravel, as well as any metric values computed for that operation. Metric values ending in _agg represent the calculated value for the aggregated non-communication work directly preceding that operation.
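
For example (the attribute names and values here are purely illustrative), the Leave event of an MPI_Send might carry phase = 3, step = 17, Lateness = 0.8, and Lateness_agg = 0.2, where Lateness_agg describes the non-communication work immediately preceding the send.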

Authors

Ravel was written by Kate Isaacs.

License

Ravel is released under the LGPL license. For more details see the LICENSE file.

LLNL-CODE-663885

Related Publications

Katherine E. Isaacs, Peer-Timo Bremer, Ilir Jusufi, Todd Gamblin, Abhinav Bhatele, Martin Schulz, and Bernd Hamann. Combing the Communication Hairball: Visualizing Parallel Execution Traces using Logical Time. IEEE Transactions on Visualization and Computer Graphics, Proceedings of InfoVis '14, 20(12):2349-2358, December 2014. DOI: 10.1109/TVCG.2014.2346456

Katherine E. Isaacs, Abhinav Bhatele, Jonathan Lifflander, David Boehme, Todd Gamblin, Bernd Hamann, and Peer-Timo Bremer. Recovering Logical Structure from Charm++ Event Traces. In Proceedings of the ACM/IEEE Conference on Supercomputing (SC15), November 2015. DOI: 10.1145/2807591.2807634

Katherine E. Isaacs, Todd Gamblin, Abhinav Bhatele, Martin Schulz, Bernd Hamann, and Peer-Timo Bremer. Ordering Traces Logically to Identify Lateness in Message Passing Programs. IEEE Transactions on Parallel and Distributed Systems, 27(3):829-840, March 2016. DOI: 10.1109/TPDS.2015.2417531
