
CovarianceEstimation.jl

Lightweight robust covariance estimation in Julia (MIT licensed). That is, if you have a data matrix X of size n×p corresponding to n observations with p features, this package helps you obtain a p×p estimate of the covariance matrix associated with that data.

Note: if you are interested in covariance estimation in the context of a linear regression, consider for now the package CovarianceMatrices.jl, which focuses on that case.

Quick start

using CovarianceEstimation

X = randn(5, 7)

S_uncorrected  = cov(SimpleCovariance(), X)
S_corrected    = cov(SimpleCovariance(corrected=true), X)

# using linear shrinkage with different targets
LSE = LinearShrinkage
# - Ledoit-Wolf target + shrinkage
method = LSE(ConstantCorrelation())
S_ledoitwolf = cov(method, X)
# - Chen target + shrinkage (using the more verbose call)
method = LSE(target=DiagonalCommonVariance(), shrinkage=:rblw)
S_chen_rblw = cov(method, X)
method = LSE(target=DiagonalCommonVariance(), shrinkage=:oas)
S_chen_oas = cov(method, X)

# a fixed shrinkage intensity can also be specified directly
method = LinearShrinkage(DiagonalUnitVariance(), 0.5)
S_05 = cov(method, X)
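
The other estimators described further below plug into the same cov(method, X) interface. A minimal sketch, assuming the default constructors (AnalyticalNonlinearShrinkage needs a reasonable number of observations, hence the larger sample):

# nonlinear shrinkage and the robust biweight estimator follow the same pattern
X2 = randn(50, 7)
S_nonlin   = cov(AnalyticalNonlinearShrinkage(), X2)
S_biweight = cov(BiweightMidcovariance(), X2)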

Currently supported algorithms

In this section, X is the data matrix of size n × p, S is the sample covariance matrix with S = (Xc' * Xc) / κ where κ is either n (uncorrected) or n - 1 (corrected), and Xc is the centred data matrix (see docs).
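
Spelled out in code, this definition reads as follows (a minimal sketch of the corrected case):

using Statistics: mean
X = randn(10, 3)
Xc = X .- mean(X, dims=1)           # centre each column
S  = (Xc' * Xc) / (size(X, 1) - 1)  # κ = n - 1 (corrected)
# S should match cov(SimpleCovariance(corrected=true), X)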

  • SimpleCovariance: basic corrected or uncorrected sample covariance (implemented in StatsBase.jl).

Time complexity: O(p²n) with a low constant

Sample covariance based methods

These methods build an estimator of the covariance derived from S. They are implemented using the abstract covariance estimation interface from StatsBase.jl.

  • LinearShrinkage: James-Stein type estimator of the form (1-λ)S+λF where F is a target and λ∈[0,1] a shrinkage intensity.
    • common targets are implemented following the taxonomy given in [1] along with Ledoit-Wolf optimal shrinkage intensities [2].
    • in the case of the DiagonalCommonVariance target, a Rao-Blackwellised Ledoit-Wolf shrinkage (:rblw) and Oracle-Approximating shrinkage (:oas) are also supported (see [3]).
    • Note: S is symmetric positive semi-definite, so if F is symmetric positive definite and λ is non-zero, the estimator obtained after shrinkage is also symmetric positive definite. For the diagonal targets DiagonalUnitVariance, DiagonalCommonVariance and DiagonalUnequalVariance, the target is necessarily SPD (see the sketch after this list).
  • AnalyticalNonlinearShrinkage: estimator of the form MΛM' where M and Λ are matrices derived from the eigen decomposition of S [4].
  • BiweightMidcovariance: robust estimator, described further in the documentation [5].
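
As an illustration of the linear shrinkage form and the SPD note above, here is a hand-rolled version of (1-λ)S + λF for the identity target (a sketch with an arbitrary intensity λ = 0.3, assuming the uncorrected sample covariance is used by default):

using LinearAlgebra: I, isposdef
X = randn(10, 3)
λ = 0.3
S = cov(SimpleCovariance(), X)      # uncorrected sample covariance
F = Matrix(1.0I, size(S)...)        # DiagonalUnitVariance target: the identity
S_shrunk = (1 - λ) * S + λ * F      # James-Stein type combination
isposdef(S_shrunk)                  # true: an SPD target makes the result SPD
# S_shrunk should match cov(LinearShrinkage(DiagonalUnitVariance(), 0.3), X)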

Time complexity:

  • Linear shrinkage: O(p²n) with a low constant (main cost is forming S)
  • Nonlinear shrinkage:
    • if p<n: O(p²n + n²) with a moderate constant (main cost is forming S and manipulating a matrix of n² elements)
    • if p>n: O(p³) with a low constant (main cost is computing the eigen decomposition of S).
  • Biweight midcovariance: O(p²n)

Other estimators (coming)

These are estimators that may be implemented in the future; see also this review paper.

  • Sparsity based estimators for either the covariance or the precision
  • Rank based approaches
  • POET
  • HAC
  • ...

For HAC (and other estimators of the covariance of regression coefficients) you can currently use the CovarianceMatrices.jl package.

Comparison to existing libraries

Rough benchmarks were run over random matrices of various sizes (40×20, 20×40, 400×200, 200×400). These benchmarks should, as usual, be taken with a pinch of salt, but a significant speedup can be expected for standard problems.

  • Sklearn (implements DiagonalCommonVariance target with oas and lw shrinkage)
    • average speedup: 5x
  • Corpcor (implements DiagonalUnequalVariance target with ss shrinkage)
    • average speedup: 22x
  • Ledoit-Wolf 1 (implements ConstantCorrelation target with lw shrinkage; Octave was used for the comparison)
    • average speedup: 12x
  • Ledoit-Wolf 2 (implements AnalyticalNonlinearShrinkage)
    • average speedup: 25x
  • Astropy (implements BiweightMidcovariance)
    • average speedup: 3x
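
To reproduce a rough timing locally, something along these lines works (a sketch assuming BenchmarkTools.jl is installed; the size is one of those used above):

using BenchmarkTools, CovarianceEstimation
X = randn(400, 200)
method = LinearShrinkage(target=DiagonalCommonVariance(), shrinkage=:oas)
@btime cov($method, $X);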

References

[1] J. Schäfer and K. Strimmer, A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics, Statistical Applications in Genetics and Molecular Biology, 2005.
[2] O. Ledoit and M. Wolf, Honey, I Shrunk the Sample Covariance Matrix, The Journal of Portfolio Management, 2004.
[3] Y. Chen, A. Wiesel, Y. C. Eldar and A. O. Hero, Shrinkage Algorithms for MMSE Covariance Estimation, IEEE Transactions on Signal Processing, 2010.
[4] O. Ledoit and M. Wolf, Analytical Nonlinear Shrinkage of Large-Dimensional Covariance Matrices, Annals of Statistics, 2020.
[5] T. C. Beers, K. Flynn and K. Gebhardt, Measures of Location and Scale for Velocities in Clusters of Galaxies - A Robust Approach, The Astronomical Journal, 1990.
