tud-zih-energy / lo2s

License: GPL-3.0
Linux OTF2 Sampling - A Lightweight Node-Level Performance Monitoring Tool

lo2s - Lightweight Node-Level Performance Monitoring

lo2s creates parallel OTF2 traces with a focus on both the application view and the system view. The traces can contain any of the following information:

  • From running threads
    • Calling context samples based on instruction overflows
    • The calling context samples are annotated with the disassembled assembler instruction string
    • The framepointer-based call-path for each calling context sample
    • Per-thread performance counter readings
    • Which thread was scheduled on which CPU at what time
  • From the system
    • Metrics from tracepoints (e.g. the selected C-state or P-state)
    • The node-level system tree (cpus (HW-threads), cores, packages)
    • CPU power measurements (x86_energy)
    • Microarchitecture specific metrics (x86_adapt, per package or per core)
    • Arbitrary metrics through plugins (Score-P compatible)

In general, lo2s operates in either process monitoring mode or system monitoring mode.

In process monitoring mode, all information is grouped by each thread of a monitored process group - it shows you which CPU each monitored thread is running on. lo2s either acts as a prefix command that runs the process (and also tracks its children), or attaches to a running process.

In the system monitoring mode, information is grouped by logical CPU - it shows you which thread was running on a given CPU. Metrics are also shown per CPU.

In both modes, system-level metrics (e.g. tracepoints) are always grouped by their respective system hardware component.

Build Requirements

  • Linux [1]
  • OTF2 (>= 2.2)
  • libbfd
  • libiberty
  • CMake (>= 3.11)
  • A C++ Compiler with C++17 support and the std::filesystem library (GCC > 7, Clang > 5)

1: Use Linux >= 4.1 for best results. Older versions, even the ancient 2.6.32, will work, but with degraded time synchronization.

Optional Build Dependencies

  • x86_adapt for microarchitecture-specific metrics
  • x86_energy for CPU power metrics
  • libradare for disassembled instruction strings

Runtime Requirements

  • kernel.perf_event_paranoid must be at most 1 for process monitoring mode and at most 0 for system monitoring mode. A value of -1 enables the most features for non-root performance recording, at the cost of some security. Set it as follows (the value 1 shown here permits process monitoring only):

    sudo sysctl kernel.perf_event_paranoid=1

  • Tracepoints and system-wide monitoring on kernels older than 4.3 require access to debugfs. Grant permissions at your own discretion.

    sudo mount -t debugfs none /sys/kernel/debug
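
The permission rules above can be checked before starting a recording. The following is a minimal Python sketch and not part of lo2s: the helper names allowed_modes and needs_debugfs are hypothetical, the thresholds are copied from the requirements listed above, and the procfs path is the usual Linux location of this sysctl value.

```python
import platform
import re
from pathlib import Path

# Usual procfs location of the sysctl value on Linux.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def allowed_modes(paranoid):
    """Return the lo2s modes that a perf_event_paranoid value permits.

    Thresholds follow the README: <= 1 for process monitoring,
    <= 0 for system monitoring.
    """
    modes = []
    if paranoid <= 1:
        modes.append("process")
    if paranoid <= 0:
        modes.append("system")
    return modes


def needs_debugfs(kernel_release):
    """Return True for kernels older than 4.3, where tracepoints and
    system-wide monitoring require debugfs access.

    Only the leading "major.minor" part of the release string is parsed.
    """
    match = re.match(r"(\d+)\.(\d+)", kernel_release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {kernel_release!r}")
    return (int(match.group(1)), int(match.group(2))) < (4, 3)


if __name__ == "__main__":
    if PARANOID_PATH.exists():
        value = int(PARANOID_PATH.read_text().strip())
        print(f"perf_event_paranoid={value}, permitted modes: {allowed_modes(value)}")
    else:
        print("perf_event_paranoid not found on this system")
    try:
        print("debugfs needed:", needs_debugfs(platform.release()))
    except ValueError as error:
        print(error)
```

Keeping the two checks as pure functions makes them easy to reuse in a job script that decides whether to request system monitoring or to fall back to process monitoring.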

Installation

  • It is recommended to create an empty build directory anywhere.
  • cmake /path/to/lo2s
  • Configure cmake as usual, e.g. with ccmake .
  • make
  • make install
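
For unattended builds, the steps above could be scripted. This is a rough Python sketch under stated assumptions: build_steps and build_lo2s are hypothetical helper names, the interactive ccmake step is skipped so the default configuration is used, and make install may need elevated privileges depending on the install prefix.

```python
import subprocess
from pathlib import Path


def build_steps(source_dir):
    """Return the build commands from the list above as argv lists, in order."""
    return [
        ["cmake", str(source_dir)],
        ["make"],
        ["make", "install"],
    ]


def build_lo2s(source_dir, build_dir):
    """Run each step inside a separate build directory; stop on the first failure."""
    build_path = Path(build_dir)
    build_path.mkdir(parents=True, exist_ok=True)
    # Resolve the source path first, because the commands run with cwd=build_path.
    for argv in build_steps(Path(source_dir).resolve()):
        subprocess.run(argv, cwd=build_path, check=True)
```

A call such as build_lo2s("/path/to/lo2s", "build") would then configure, compile, and install in one go.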

Usage

To monitor a given application in process monitoring mode, run

  • lo2s -- ./a.out --app-args

To monitor all activity on a system, run

  • lo2s -a (stop the recording with Ctrl+C)

Usage with MPI

You can record simple traces from MPI programs, but lo2s does not record MPI communication. To create fully-featured MPI-aware traces, use Score-P.

  • lo2s mpirun ./a.out: creates one trace of mpirun, useful if mpirun is used locally on one node.
  • mpirun lo2s ./a.out: creates a separate trace for each process.

See man lo2s or lo2s --help for a full listing of options and usage.

Quirks

The perf_event_open kernel infrastructure has changed significantly over time, so it is already hard to keep track of which kernel version introduced which feature. Combine that with the many backports of particular features by different distributors, and you end up with a mess of options.

In an effort to stay compatible with older kernels, several quirks have been added to lo2s:

  1. The initial time synchronization between lo2s and the kernel-space perf is done with a hardware breakpoint. If your kernel doesn't support that, you can disable it using the CMake option USE_HW_BREAKPOINT_COMPAT.
  2. The clock source used for the kernel-space time measurements can be changed. If your kernel doesn't support that, you can disable it with the CMake option USE_PERF_CLOCKID.
  3. If you get the error message "event 'ref-cycles' is not available as a metric leader!", you can fall back to the bus-cycles metric as leader using --metric-leader bus-cycles.
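
The fallback in item 3 can be automated in a wrapper script. This is a hedged Python sketch, not lo2s functionality: the function names are hypothetical, and it assumes that lo2s prints the quoted message to stderr and exits without recording, which the text above does not confirm.

```python
import subprocess

# Error text quoted in item 3 above.
REF_CYCLES_ERROR = "event 'ref-cycles' is not available as a metric leader!"


def with_bus_cycles_leader(lo2s_argv):
    """Insert `--metric-leader bus-cycles` directly after the program name."""
    return [lo2s_argv[0], "--metric-leader", "bus-cycles", *lo2s_argv[1:]]


def should_retry_with_bus_cycles(stderr_text):
    """Return True if the captured stderr contains the ref-cycles error."""
    return REF_CYCLES_ERROR in stderr_text


def run_lo2s(lo2s_argv):
    """Run lo2s once; retry with bus-cycles as leader if ref-cycles is unavailable.

    Assumption: the failed first attempt exits early, so the monitored
    application is not executed twice.
    """
    result = subprocess.run(lo2s_argv, capture_output=True, text=True)
    if should_retry_with_bus_cycles(result.stderr):
        result = subprocess.run(
            with_bus_cycles_leader(lo2s_argv), capture_output=True, text=True
        )
    return result
```

For example, run_lo2s(["lo2s", "--", "./a.out"]) would retry as lo2s --metric-leader bus-cycles -- ./a.out when the error is detected. Because output is captured, the application's own stdout and stderr end up in the returned result rather than on the terminal.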

Working with traces

Traces can be visualized with Vampir. You can also process them with OTF2 or any of its tools. Native interfaces are available for C and Python.

Acknowledgements

This work is supported by the German Research Foundation (DFG) within the CRC 912 - HAEC.

Primary Reference

A description and use cases can be found in the following paper. Please cite this if you use lo2s for scientific work.

Thomas Ilsche, Robert Schöne, Mario Bielert, Andreas Gocht and Daniel Hackenberg: lo2s – Multi-Core System and Application Performance Analysis for Linux. In: Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications (HPCMASPA). 2017. DOI: 10.1109/CLUSTER.2017.116

Additional References

Thomas Ilsche, Marcus Hähnel, Robert Schöne, Mario Bielert and Daniel Hackenberg: Powernightmares: The Challenge of Efficiently Using Sleep States on Multi-Core Systems. In: 5th Workshop on Runtime and Operating Systems for the Many-core Era (ROME). 2017. DOI: 10.1007/978-3-319-75178-8_50

Thomas Ilsche, Robert Schöne, Philipp Joram, Mario Bielert and Andreas Gocht: System Monitoring with lo2s: Power and Runtime Impact of C-State Transitions. In: 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). DOI: 10.1109/IPDPSW.2018.00114

Name

The name lo2s is an acronym for Linux OTF2 Sampling.
