
nexB / tracecode-toolkit-strace

License: other
Trace software components, packages and files between Development/Source and Deployment/Distribution/Binaries codebases - strace build analysis

Programming Languages

Python
139335 projects - #7 most used programming language
Roff
2310 projects
Batchfile
5799 projects
Shell
77523 projects

Projects that are alternatives to or similar to tracecode-toolkit-strace

pexample
Building and packaging Python with Pants and PEX - an annotated example
Stars: ✭ 21 (+0%)
Mutual labels:  build
netlify-plugin-cache
⚡ Generic plugin for caching any files and/or folders between Netlify builds
Stars: ✭ 19 (-9.52%)
Mutual labels:  build
drevops
💧 + 🐳 + ✓✓✓ + 🤖 + ❤️ Build, Test, Deploy scripts for Drupal using Docker and CI/CD
Stars: ✭ 55 (+161.9%)
Mutual labels:  build
awesome-beam-monitoring
Curated list of awesome BEAM monitoring libraries and resources
Stars: ✭ 57 (+171.43%)
Mutual labels:  tracing
gcloud-opentracing
OpenTracing Tracer implementation for GCloud StackDriver in Go.
Stars: ✭ 44 (+109.52%)
Mutual labels:  tracing
postbuildscript-plugin
The PostBuildScript Jenkins plugin lets you execute a set of scripts at the end of the build depending on the build status.
Stars: ✭ 40 (+90.48%)
Mutual labels:  build
Carbon.Gulp
Carbon/Gulp is a delicious blend of tasks and build tools poured into Gulp to form a full-featured modern asset pipeline for Flow Framework and Neos CMS.
Stars: ✭ 15 (-28.57%)
Mutual labels:  build
airin
A framework for automated migration of your projects to Bazel build system.
Stars: ✭ 21 (+0%)
Mutual labels:  build
stagemonitor-kibana
Kibana-Plugin for stagemonitor trace visualization
Stars: ✭ 13 (-38.1%)
Mutual labels:  tracing
sp-build-tasks
👷 SharePoint front-end projects automation and tasks tool-belt
Stars: ✭ 15 (-28.57%)
Mutual labels:  build
eslint4b
ESLint which works in browsers.
Stars: ✭ 33 (+57.14%)
Mutual labels:  build
zig-header-gen
Automatically generate headers/bindings for other languages from Zig code
Stars: ✭ 40 (+90.48%)
Mutual labels:  build
go-sensor
🚀 Go Distributed Tracing & Metrics Sensor for Instana
Stars: ✭ 90 (+328.57%)
Mutual labels:  tracing
ruby-sensor
💎 Ruby Distributed Tracing & Metrics Sensor for Instana
Stars: ✭ 23 (+9.52%)
Mutual labels:  tracing
simple-targets-csx
⊙ A minimalist target runner for C# scripts.
Stars: ✭ 17 (-19.05%)
Mutual labels:  build
go2sky-plugins
The plugins of go2sky
Stars: ✭ 46 (+119.05%)
Mutual labels:  tracing
blindsight
Blindsight is a Scala logging API with DSL based structured logging, fluent logging, semantic logging, flow logging, and context aware logging.
Stars: ✭ 70 (+233.33%)
Mutual labels:  tracing
freedom-middleware-webpack2
A webpack2 build middleware for front-end projects, providing unified management of more than 95% of the build work in a front-end project
Stars: ✭ 35 (+66.67%)
Mutual labels:  build
platform-atmelavr
Atmel AVR: development platform for PlatformIO
Stars: ✭ 97 (+361.9%)
Mutual labels:  build
bpfps
A tool to list and diagnose bpf programs. (Who watches the watchers..? :)
Stars: ✭ 93 (+342.86%)
Mutual labels:  tracing

Tracing a build on Linux

TraceCode is a tool to analyze the traced execution of a build, so you can learn which files are built into binaries and ultimately deployed in your distributed software.

1. Tracing a build

See docs/README-build-tracing.rst for tracing a build

2. System requirements and installation

Ensure you have Python 2.7 installed::
python -V

If it is not installed, install it and ensure it is in your path. See your Linux distribution's documentation for details.

Ensure you have Graphviz 2.36+ installed and in your path::
dot -V

If it is not installed, install it and ensure it is in your path. See http://graphviz.org/ for details.

If not installed, you will see ERROR messages and the results are unlikely to be usable.

3. Install TraceCode

Get it from https://github.com/nexb/tracecode-build and unzip it. The path where this is unzipped will be referred to as {tracecode_dir} later in this document.

Then execute this command to set up TraceCode::
./configure
Finally run the built-in selftest to verify your installation::
py.test -vvs tests

4. Install strace
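
strace is packaged by every major Linux distribution; install it with your package manager and make sure it is in your path. A minimal sketch, assuming a Debian/Ubuntu or Fedora/RHEL style system (package names may differ elsewhere)::
# Debian/Ubuntu
sudo apt-get install strace
# Fedora/RHEL
sudo dnf install strace
# verify it is installed and in your path
strace -V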

5. Analyze your build

Analyzing a traced build is a multi-stage process that involves:
  • parsing and checking the initial traces,
  • optionally filtering the parsed traces,
  • optionally collecting the inventory of files read and written during the build,
  • creating the lists of source (input) and target (output) files for your build,
  • analyzing the build graph to determine the source-to-target relationships, such as source code files being built into a binary,
  • optionally creating graphical representations to visualize a subset of your build graph.

Each of these steps is performed by invoking tracecode from the command line with different options and arguments.

Run the trace analysis with:

tracecode <options> <command> <arguments>

For command help use:

tracecode -h

Tutorial

0. Trace a command
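
The authoritative strace invocation is documented in docs/README-build-tracing.rst; the sketch below only illustrates the general shape, assuming your build is started with make and the traces are written under ~/mytrace (both are placeholders). The trace must follow forks into one output file per process (-ff), decode file descriptors to paths (-y) and timestamp each syscall (-ttt)::
mkdir -p ~/mytrace
strace -ff -y -ttt -s 256 -o ~/mytrace/build-trace make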

1. Parse the collected raw traces

Create a new empty directory to store parsed traces. Then parse using the "parse" command:

tracecode parse <RAW TRACES DIR INPUT> <PARSED TRACES DIR OUTPUT>

This will parse the traces and ensure that they can be processed and are complete.
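
For example, assuming the raw traces were collected in ~/mytrace as in the tracing step above (the directory names are placeholders)::
mkdir -p ~/parsed
tracecode parse ~/mytrace ~/parsed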

2. Collect the inventory of files processed during the tracing

If the traces are consistent, the next step is to collect the inventories of file reads and writes. Use the "list" command (which would be better named "inventory"). It creates two files from a parsed trace: a list of files that are only read and a list of files that are written:

tracecode list <PARSED TRACES DIR INPUT> <READS OUTPUT FILE> <WRITES OUTPUT FILE>

The list command extracts all the paths used in the traces.
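
For example, using the parsed traces from the previous step (the output file names are placeholders)::
tracecode list ~/parsed ~/reads.txt ~/writes.txt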

3. optional but recommended: Filter your parsed traces

The next step is to review these reads and writes and decide which ones could be filtered out as they may not contribute interesting data to the build graph and the analysis.

This typically includes:

  • /etc/*
  • /proc/*
  • the build log files if any
  • Some standard things in /usr/* and similar

To do this, build a list of reads to ignore and a list of writes to ignore (usually patterns or plain path lists), put these two lists in two files, and use the filter command to filter out those reads and writes.

Beware of filtering too much: temporary files in /tmp you may want to keep, while certain makedepend files (.po, etc.) you may not care about.

When you first filter, filter to a new directory so that you do not replace the original full parsed traces yet; this lets you get comfortable with and refine your filtering.

Create a file that contains one line for each read or write you want to filter out or prune from the trace: either a full path as found in the reads or writes lists, or a pattern such as /etc/*, in which case everything matching /etc/* is filtered out, just as with glob patterns on the command line. Use one path or pattern per line. Note that a single-column CSV works fine too.
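
For example, an ignore-list file might look like this, with one path or glob pattern per line (these particular paths are only placeholders; which ones to prune depends on your build). Build one such file for reads and one for writes, then pass them to the filter command (see tracecode -h for its exact arguments)::
/etc/*
/proc/*
/usr/share/*
/home/builder/build.log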

4. optional: Guess sources and targets

You can use the "guess" command to guess sources and targets, but that is just a guess. Guessing works reasonably well on small, well-defined codebases, but is not likely to do well on larger ones.

The guess goes this way:
  • files that are only ever read from are likely the source/development files,
  • files that are only ever written to are likely the target/deployed files.

5. Assemble the inventory of sources and targets

Once you have filtered your parsed traces, you need to create one list of the files that are your sources (original development files) and another list of the files that are your targets (deployed files). Build these inventories each in a separate file. You can try the guess command, but that is just a rough guess based on the graph. The paths must have exactly the same structure as in the "list" output. The sources and targets should be among the reads and writes, so you can use those lists as input. Alternatively, you can keep the output of a find command run before your tracing (your sources) and again after it, then diff the two to find candidate targets (see the sketch below).

Use these lists again to build two new lists: one defining the development/source files and one defining the deployed/target files.
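
A minimal sketch of the find-and-diff approach mentioned above, assuming the codebase lives under ~/src (all paths here are placeholders); files that appear only in the "after" listing are candidate targets::
find ~/src -type f | sort > files-before.txt
# ... run the traced build ...
find ~/src -type f | sort > files-after.txt
# comm -13 keeps lines that appear only in the second file
comm -13 files-before.txt files-after.txt > candidate-targets.txt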

6. Analyze sources to targets transformations

Then you can run the analyze command to get the source-to-target deployment analysis.
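
Check tracecode -h for the analyze command's exact arguments. As an assumption only, not the documented interface, an invocation passing the parsed traces plus the sources and targets lists might look like this::
# argument order is a guess; confirm with: tracecode -h
tracecode analyze ~/parsed ~/sources.txt ~/targets.txt ~/analysis.csv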

7. optional: Graph a selected subset of sources to targets transformations

You can selectively create a graphic tracing the transformations from several sources to one target, or from several targets to one source (selectively, because this takes a long time to run and large graphics are impossible to visualize).
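
Graphviz is used for this step. Assuming the graphing command produces a Graphviz .dot file (the file name below is a placeholder), it can be rendered with the dot tool::
dot -Tpdf mygraph.dot -o mygraph.pdf
# or render to SVG for large graphs
dot -Tsvg mygraph.dot -o mygraph.svg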

FAQ:

Q: When parsing raw traces I am getting this error:

ERROR:tracecode:INCOMPLETE TRACE, 149249 orphaned trace(s) detected. First pid is: 3145728.

A: This is a serious error: it means that your trace is not coherent, because some process traces could not be related to the initial command's launch graph. This can happen if you mistakenly trace several commands and store the strace output in the same directory. You need to re-collect your traces, starting with a clean, empty directory.

Q: When parsing raw traces I am getting several warnings:

WARNING:tracecode:parse_line: Unable to decode descriptor for pid: 3097012, line: '1399882436.807573 dup2(5</extra/linux-2.6.32/scripts/mksysmap>, 255) = 255\n'

A: This is just a warning that you can ignore most of the time. Here, a file descriptor 255 does not (and cannot) exist, hence the warning.

Credits and related tools

This strace-based build tracer is essentially an implementation of the approach described in these papers and tools:

Sander van der Burg published an article and paper:

Technical Report TUD-SERG-2012-010, Software Engineering Research Group, Delft, The Netherlands, April 2012.

Later, a similar paper described the same approach:

The Chromium test team built "swarming.client", a test isolation tool using a similar approach, which was also a big inspiration for this tool:
This article provides some good background on the same topic: https://news.ycombinator.com/item?id=9356433

http://buildaudit.sourceforge.net/ is a related build tracing tool that handles ptrace directly, as opposed to relying on strace for tracing.

Electric Cloud is a tool that has some ways to track which files are accessed during a build, using ptrace or LD_PRELOAD (or a custom file system).

License

  • Apache-2.0 with an acknowledgement required to accompany the scan output.
  • Public domain CC-0 for reference datasets.
  • Multiple licenses (GPL2/3, LGPL, MIT, BSD, etc.) for third-party components.

See the NOTICE.txt file for more details and the thirdparty/ directory.
