scalacenter / Scalac Profiling

Licence: apache-2.0
Implementation of SCP-010.

Programming languages: scala, macros


Providing Better Compilation Performance Information

When compile times become a problem, how can Scala developers reason about the relation between their code and compile times?

Install

Add scalac-profiling to any sbt project by adding the following setting to its build:

addCompilerPlugin("ch.epfl.scala" %% "scalac-profiling" % "1.0.0")
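
The plugin's output is most useful when compiler statistics are also enabled. Below is a minimal build.sbt sketch, assuming Scala 2.12.5 or later (where the statistics support mentioned later in this README is part of mainstream scalac); the versions shown are illustrative:

// build.sbt: sketch only, adjust versions to match your build
addCompilerPlugin("ch.epfl.scala" %% "scalac-profiling" % "1.0.0")
scalacOptions += "-Ystatistics"  // print the compiler statistics summarised below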

How to use

To learn how to use the plugin, read Speeding Up Compilation Time with scalac-profiling on the scala-lang blog.

Compiler plugin options

All compiler plugin options are prefixed with -P:scalac-profiling:. A sketch of passing them through sbt follows the list below.

  • show-profiles: Show implicit searches and macro expansions by type and call-site.
  • sourceroot: Tell the plugin the source directory of the project. Example: -P:scalac-profiling:sourceroot:$PROJECT_BASE_DIR.
  • print-search-result: Print the result retrieved by an implicit search. Example: -P:scalac-profiling:print-search-result:$MACRO_ID.
  • generate-macro-flamegraph: Generate a flamegraph for macro expansions. The flamegraph for implicit searches is enabled by default.
  • print-failed-implicit-macro-candidates: Print trees of all failed implicit searches that triggered a macro expansion.
  • no-profiledb: Recommended. Don't generate a profiledb (this option will be enabled by default in a future release).
  • show-concrete-implicit-tparams: Use more concrete type parameters in the implicit search flamegraph. Note that it may change the shape of the flamegraph.
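
For instance, a few of these options can be enabled in sbt by appending them to scalacOptions. This is a sketch only; the sourceroot value here is just an example:

scalacOptions ++= Seq(
  "-P:scalac-profiling:no-profiledb",
  "-P:scalac-profiling:show-profiles",
  s"-P:scalac-profiling:sourceroot:${baseDirectory.value}"
)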

Goal of the project

The goal of this proposal is to help Scala developers optimize their codebases for compile time by spotting inefficient implicit searches, expensive macro expansions, and other causes of slow compilation and reduced developer productivity.

This repository holds the compiler plugin and a fork of mainstream scalac that will eventually be merged upstream. This work is prompted by Morgan Stanley's proposal and was approved at our last advisory board meeting.

Scalac status

The required changes to the compiler, Scalac, are the following:

  1. Collect all statistics and optimize checks.
  2. Initialize statistics per global.
  3. Add extra timers and counters.

Information about the setup

The project uses a forked scalac to compile both the compiler plugin and several OSS projects from the community. For now, the integration tests are Circe and Monocle; they help us inspect large profiling numbers and detect hot spots and misbehaviours.

If you think a particular codebase is a good candidate to become an integration test, please open an issue.

Plan

The proposal is divided into three main areas:

  1. Data generation and capture.
  2. Data visualisation and comparison.
  3. Reproducibility.

How do we tackle each of these problems to make the implementation successful?

Data generation and capture

The generation of data comes from the guts of the compiler. To optimize for impact, the collection of information is done in two different places (a compiler plugin and a forked scalac).

Project structure

  1. A forked scalac with patches to collect profiling information. These changes have since been merged upstream, so the fork is no longer required: they are already present in Scala 2.12.5.
  2. A compiler plugin to get information from the macro infrastructure independently of the used Scalac version.
  3. Profiledb readers and writers to allow IDEs and editors to read and write profiledb's.
  4. A proof-of-concept vscode integration that displays the data collected from the profiledb.
  5. An sbt plugin for reproducibility that warms up the compiler before profiling.

The work is split into two parts so that Scala developers who are stuck on older Scala versions can still use the compiler plugin to get some profiling information about macros.

This structure is also more practical: it allows us to evolve the compiler plugin faster and to keep there the pieces that cannot be merged upstream.

Data visualisation and comparison

The profiling data will be accessible in two different ways (provided that the pertinent profiling flags are enabled):

  1. A summary of the stats will be printed out in every compile run.
  2. A protobuf file will be generated at the root of the class files directory.
    • The file is generated via protobuf so that it's backwards and forwards binary compatible
    • The protobuf file will contain all the profiling information.

Why a protobuf file instead of a JSON file? Forwards and backwards binary compatibility is important -- we want our tooling to be able to read files generated by previous or upcoming versions of the compiler. Our goal is to create a single tool that all IDEs and third-party tools use to parse and interpret the statistics from JARs and compile runs.

We're collaborating with IntelliJ to provide some of the statistics within the IDE (e.g. macro invocations or implicit searches per line). We have some ideas for showing this information as a heat map in the future.

Reproducibility

Getting reproducible numbers is important for reasoning about the code and for identifying with certainty when a commit increases or decreases compile times.

To do so, several conditions must be met: the compiler must be warmed up, the load on the machine must be low, and the hardware must be tuned to disable features that make executions non-reproducible (like Turbo Boost).

However, this warm-up cannot be done in an isolated scenario, as Scalac's benchmarking infrastructure does, because that would not measure the overhead of the build tool invoking the compiler, which can be significant (e.g. in sbt).

As a result, reproducibility must be achieved in the build tool itself. The goal of this project is to provide an sbt plugin that warms up the compiler for a configurable amount of time. It also bundles recommendations and tips on how and where to run compilation.
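
To illustrate the underlying idea only (this is not the sbt plugin's API, and the alias name is arbitrary), a crude warm-up can be approximated in plain sbt with a command alias that compiles the project a few times before the run that is measured:

// build.sbt: an illustrative sketch; the sbt plugin automates and refines this
addCommandAlias(
  "warmupCompile",
  ";clean;compile;clean;compile;clean;compile"
)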

Collected data

In the following sections, I elaborate on the data that we want to extract from the compiler, as well as technical details for every section in the original proposal.

Information about macros

Per call-site, file and total:

  • [x] How many macros are expanded?
  • [x] How long do they take to run?
  • [x] How many tree nodes do macros create?

Information about implicit search

Getting hold of this information requires changes in mainstream scalac.

Per call-site, file and total:

  • [x] How many implicit searches are triggered per position?
  • [x] How many implicit searches are triggered for a given type?
  • [x] How long do implicit searches take to run?
  • [x] How many implicit searches fail?
  • [x] How many implicit searches succeed?
  • [x] What's the ratio of search failures/hits?

Results

These are the requirements that the proposal lays out.

Note that in some cases this plugin provides more information than requested by the original proposal.

What the proposal wants

  • [x] Compilation time in total (this is provided by -Ystatistics)
  • [x] Macro details
    • [x] Time per file
    • [x] Time per macro
      • [x] Invocations
      • [x] Per type
      • [x] Total time
    • [x] Flamegraph of all macros
  • [x] Implicit search details (time and number)
    • [x] By type
    • [x] By invocation (only number for now)
    • [x] By file (can be aggregated from the "by invocation" data)
    • [x] Flamegraph of all the implicit searches
  • [x] User time, kernel time, wall clock, I/O time.
    This feature was already provided by Scalac, implemented in this PR.
  • [x] Time for flagged features (for certain features – e.g. optimisation)
    • The best way to capture this information is to run the compiler with statistics enabled both with and without optimization, and compare the profiles. There are also some extra counters.
  • [x] Time resolving types from classpath
    • [x] Total