All Projects → kthohr → Stats

kthohr / Stats

Licence: apache-2.0
A C++ header-only library of statistical distribution functions.

Programming Languages

cpp
1120 projects
cpp11
221 projects

Projects that are alternatives of or similar to Stats

Quant Notes
Quantitative Interview Preparation Guide, updated version here ==>
Stars: ✭ 180 (-38.36%)
Mutual labels:  statistics, probability, numerical-methods
wink-statistics
Fast & numerically stable statistical analysis
Stars: ✭ 36 (-87.67%)
Mutual labels:  probability, stats
URT
Fast Unit Root Tests and OLS regression in C++ with wrappers for R and Python
Stars: ✭ 70 (-76.03%)
Mutual labels:  eigen, openmp
Shendusuipian
To know stats by heart
Stars: ✭ 275 (-5.82%)
Mutual labels:  statistics, probability
Optim
OptimLib: a lightweight C++ library of numerical optimization methods for nonlinear functions
Stars: ✭ 411 (+40.75%)
Mutual labels:  openmp, eigen
Kratos
Kratos Multiphysics (A.K.A Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos has BSD license and is written in C++ with extensive Python interface.
Stars: ✭ 558 (+91.1%)
Mutual labels:  openmp, numerical-methods
math-stats
A small library that does the statistics for your numbers.
Stars: ✭ 18 (-93.84%)
Mutual labels:  statistics, stats
Stdlib
✨ Standard library for JavaScript and Node.js. ✨
Stars: ✭ 2,749 (+841.44%)
Mutual labels:  statistics, stats
PyImpetus
PyImpetus is a Markov Blanket based feature subset selection algorithm that considers features both separately and together as a group in order to provide not just the best set of features but also the best combination of features
Stars: ✭ 83 (-71.58%)
Mutual labels:  statistics, probability
mongodb-info
MongoDB Info - command line tool to get stats.
Stars: ✭ 13 (-95.55%)
Mutual labels:  statistics, stats
Wallace Cli
Pretty CSS analytics on the CLI
Stars: ✭ 281 (-3.77%)
Mutual labels:  statistics, stats
Armadillo Code
Armadillo: fast C++ library for linear algebra & scientific computing - http://arma.sourceforge.net
Stars: ✭ 388 (+32.88%)
Mutual labels:  openmp, statistics
Stanford Cme 106 Probability And Statistics
VIP cheatsheets for Stanford's CME 106 Probability and Statistics for Engineers
Stars: ✭ 242 (-17.12%)
Mutual labels:  statistics, probability
Dive Into Ml System
Dive into machine learning system, start from reinventing the wheel.
Stars: ✭ 220 (-24.66%)
Mutual labels:  openmp, eigen
Sportsipy
A free sports API written for python
Stars: ✭ 229 (-21.58%)
Mutual labels:  statistics, stats
Data-Science-and-Machine-Learning-Resources
List of Data Science and Machine Learning Resource that I frequently use
Stars: ✭ 19 (-93.49%)
Mutual labels:  statistics, probability
stats
📊 Request statistics middleware that stores response times, status code counts, etc
Stars: ✭ 15 (-94.86%)
Mutual labels:  statistics, stats
Scikit Posthocs
Multiple Pairwise Comparisons (Post Hoc) Tests in Python
Stars: ✭ 186 (-36.3%)
Mutual labels:  statistics, stats
Stat
Statistics package for Go [DEPRECATED]
Stars: ✭ 198 (-32.19%)
Mutual labels:  statistics, stats
data-science-notes
Open-source project hosted at https://makeuseofdata.com to crowdsource a robust collection of notes related to data science (math, visualization, modeling, etc)
Stars: ✭ 52 (-82.19%)
Mutual labels:  statistics, probability

StatsLib   Mentioned in Awesome Cpp Build Status Coverage Status Codacy Badge License Documentation Status

StatsLib is a templated C++ library of statistical distribution functions, featuring unique compile-time computing capabilities and seamless integration with several popular linear algebra libraries.

Features:

  • A header-only library of probability density functions, cumulative distribution functions, quantile functions, and random sampling methods.
  • Functions are written in C++11 constexpr format, enabling the library to operate as both a compile-time and run-time computation engine.
  • Designed with a simple R-like syntax.
  • Optional vector-matrix functionality with wrappers to support:
  • Matrix-based operations are parallelizable with OpenMP.
  • Released under a permissive, non-GPL license.

Contents:

Distributions

Functions to compute the cdf, pdf, quantile, as well as random sampling methods, are available for the following distributions:

  • Bernoulli
  • Beta
  • Binomial
  • Cauchy
  • Chi-squared
  • Exponential
  • F
  • Gamma
  • Inverse-Gamma
  • Laplace
  • Logistic
  • Log-Normal
  • Normal (Gaussian)
  • Poisson
  • Student's t
  • Uniform
  • Weibull

In addition, pdf and random sampling functions are available for several multivariate distributions:

  • inverse-Wishart
  • Multivariate Normal
  • Wishart

Installation and Dependencies

StatsLib is a header-only library. Simply add the header files to your project using

#include "stats.hpp"

The only dependency is the latest version of GCEM and a C++11 compatible compiler.

Documentation

Full documentation is available online:

Documentation Status

A PDF version of the documentation is available here.

Jupyter Notebook

You can test the library online using an interactive Jupyter notebook:

Binder

Options

The following options should be declared before including the StatsLib header files.

  • For inline-only functionality (i.e., no constexpr specifiers):
#define STATS_GO_INLINE
  • OpenMP functionality is enabled by default if the _OPENMP macro is detected (e.g., by invoking -fopenmp with GCC or Clang). To explicitly enable OpenMP features use:
#define STATS_USE_OPENMP
  • To disable OpenMP functionality:
#define STATS_DONT_USE_OPENMP
  • To use StatsLib with Armadillo, Blaze or Eigen:
#define STATS_ENABLE_ARMA_WRAPPERS
#define STATS_ENABLE_BLAZE_WRAPPERS
#define STATS_ENABLE_EIGEN_WRAPPERS
  • To enable wrappers for std::vector:
#define STATS_ENABLE_STDVEC_WRAPPERS

Syntax and Examples

Functions are called using an R-like syntax. Some general rules:

  • density functions: stats::d*. For example, the Normal (Gaussian) density is called using
stats::dnorm(<value>,<mean parameter>,<standard deviation>);
  • cumulative distribution functions: stats::p*. For example, the Gamma CDF is called using
stats::pgamma(<value>,<shape parameter>,<scale parameter>);
  • quantile functions: stats::q*. For example, the Beta quantile is called using
stats::qbeta(<value>,<a parameter>,<b parameter>);
  • random sampling: stats::r*. For example, to generate a single draw from the Logistic distribution:
stats::rlogis(<location parameter>,<scale parameter>,<seed value or random number engine>);

All of these functions have matrix-based equivalents using Armadillo, Blaze, and Eigen dense matrices.

  • The pdf, cdf, and quantile functions can take matrix-valued arguments. For example,
// Using Armadillo:
arma::mat norm_pdf_vals = stats::dnorm(arma::ones(10,20),1.0,2.0);
  • The randomization functions (r*) can output random matrices of arbitrary size. For example, For example, the following code will generate a 100-by-50 matrix of iid draws from a Gamma(3,2) distribution:
// Armadillo:
arma::mat gamma_rvs = stats::rgamma<arma::mat>(100,50,3.0,2.0);
// Blaze:
blaze::DynamicMatrix<double> gamma_rvs = stats::rgamma<blaze::DynamicMatrix<double>>(100,50,3.0,2.0);
// Eigen:
Eigen::MatrixXd gamma_rvs = stats::rgamma<Eigen::MatrixXd>(100,50,3.0,2.0);
  • All matrix-based operations are parallelizable with OpenMP. For GCC and Clang compilers, simply include the -fopenmp option during compilation.

Seeding Values

Random number seeding is available in two forms: seed values and random number engines.

  • Seed values are passed as unsigned integers. For example, to generate a draw from a normal distribution N(1,2) with seed value 1776:
stats::rnorm(1,2,1776);
  • Random engines in StatsLib use the 64-bit Mersenne-Twister generator (std::mt19937_64) and are passed by reference. Example:
std::mt19937_64 engine(1776);
stats::rnorm(1,2,engine);

Examples

More examples with code:

// evaluate the normal PDF at x = 1, mu = 0, sigma = 1
double dval_1 = stats::dnorm(1.0,0.0,1.0);
 
// evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value
double dval_2 = stats::dnorm(1.0,0.0,1.0,true);
 
// evaluate the normal CDF at x = 1, mu = 0, sigma = 1
double pval = stats::pnorm(1.0,0.0,1.0);
 
// evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1
double qval = stats::qlaplace(0.1,0.0,1.0);

// draw from a t-distribution dof = 30
double rval = stats::rt(30);

// matrix output
arma::mat beta_rvs = stats::rbeta<arma::mat>(100,100,3.0,2.0);
// matrix input
arma::mat beta_cdf_vals = stats::pbeta(beta_rvs,3.0,2.0);

Compile-time Computing Capabilities

StatsLib is designed to operate equally well as a compile-time computation engine. Compile-time computation allows the compiler to replace function calls (e.g., dnorm(0,0,1)) with static values in the source code. That is, functions are evaluated during the compilation process, rather than at run-time. This capability is made possible due to the templated constexpr design of the library and can be verified by inspecting the assembly code generated by the compiler.

The compile-time features are enabled using the constexpr specifier. The example below computes the pdf, cdf, and quantile function of the Laplace distribution.

#include "stats.hpp"

int main()
{
    
    constexpr double dens_1  = stats::dlaplace(1.0,1.0,2.0); // answer = 0.25
    constexpr double prob_1  = stats::plaplace(1.0,1.0,2.0); // answer = 0.5
    constexpr double quant_1 = stats::qlaplace(0.1,1.0,2.0); // answer = -2.218875...

    return 0;
}

Assembly code generated by Clang without any optimization:

LCPI0_0:
	.quad	-4611193153885729483    ## double -2.2188758248682015
LCPI0_1:
	.quad	4602678819172646912     ## double 0.5
LCPI0_2:
	.quad	4598175219545276417     ## double 0.25000000000000006
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_main
	.p2align	4, 0x90
_main:                                  ## @main
	push	rbp
	mov	rbp, rsp
	xor	eax, eax
	movsd	xmm0, qword ptr [rip + LCPI0_0] ## xmm0 = mem[0],zero
	movsd	xmm1, qword ptr [rip + LCPI0_1] ## xmm1 = mem[0],zero
	movsd	xmm2, qword ptr [rip + LCPI0_2] ## xmm2 = mem[0],zero
	mov	dword ptr [rbp - 4], 0
	movsd	qword ptr [rbp - 16], xmm2
	movsd	qword ptr [rbp - 24], xmm1
	movsd	qword ptr [rbp - 32], xmm0
	pop	rbp
	ret

Author

Keith O'Hara

License

Apache Version 2

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].