
brian-lau / highdim

License: GPL-3.0
Statistics for high-dimensional data (homogeneity, sphericity, independence, spherical uniformity)



highdim

A Matlab library for statistical testing of high-dimensional data, including one- and two-sample tests for homogeneity, uniformity, sphericity and independence. Of note are implementations of several modern tests suited to data whose dimensionality grows with the sample size and may exceed the number of samples.

Installation

Download highdim and add the resulting folder to your Matlab path. Folders prefixed by a + are packages that should not be explicitly added to your path, although their parent folder should be.

The Statistics toolbox is required.
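
A minimal sketch of the path setup described above. The install location `highdimRoot` is a hypothetical placeholder, and the two checks are optional conveniences, not part of highdim itself.

```matlab
% Hypothetical install location: replace with the folder you downloaded highdim into
highdimRoot = fullfile(pwd, 'highdim');

% Add only the top-level folder. MATLAB finds the +package folders inside it
% automatically, so they must not be added to the path themselves.
addpath(highdimRoot);

% Optional check that the three interfaces are now visible on the path
interfaces = {'DepTest1', 'DepTest2', 'UniSphereTest'};
for k = 1:numel(interfaces)
   if isempty(which(interfaces{k}))
      warning('highdim:setup', '%s not found; check highdimRoot', interfaces{k});
   end
end

% Optional check for the Statistics toolbox ('stats' is its short product name)
hasStats = ~isempty(ver('stats'));
```

If the warnings fire, the most likely cause is that `highdimRoot` points one level too high or too low relative to the folder that contains the `+` packages.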

Examples

The various tests are most easily accessed through three interfaces: DepTest1, DepTest2 and UniSphereTest for one-sample tests, two-sample tests and one-sample tests on the sphere, respectively.

Detailed simulations of size, power and comparisons between tests are available in the wiki. The examples below give an idea of what's available.

Multivariate (In)dependence, Sphericity and Homogeneity

% Independent, but non-spherical data
sigma = diag([ones(1,25),0.5*ones(1,5)]);
x = (sigma*randn(50,30)')';

% Independence tests (Han & Liu, 2014)
DepTest1(x,'test','spearman') 
DepTest1(x,'test','kendall') 

% Sphericity tests (Ledoit & Wolf, 2002; Wang & Yao, 2013; Zou et al., 2014)
DepTest1(x,'test','john')
DepTest1(x,'test','wang')
DepTest1(x,'test','sign')
DepTest1(x,'test','bcs')
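
To see why the data generated above are independent but non-spherical, it may help to look at the population covariance directly. The sketch below uses only base MATLAB; the variable names `Sigma`, `isDiagonal` and `U` are illustrative, and `U` is John's sphericity statistic evaluated on the population covariance, which is not necessarily how `DepTest1` computes or scales its statistics.

```matlab
% Each row of x is (sigma*z)' with z ~ N(0,I), so its covariance is sigma*sigma'
sigma = diag([ones(1,25),0.5*ones(1,5)]);
Sigma = sigma*sigma';

% Diagonal covariance: for Gaussian data the coordinates are independent
isDiagonal = isequal(Sigma, diag(diag(Sigma)));

% John's U = (1/p) * trace((Sigma/mean(diag(Sigma)) - I)^2)
% U is zero exactly when Sigma is a multiple of the identity (sphericity)
p = size(Sigma,1);
U = trace((Sigma/(trace(Sigma)/p) - eye(p))^2)/p;

fprintf('diagonal: %d, John''s U on population covariance: %.4f\n', isDiagonal, U);
```

Here `isDiagonal` is true while `U` is strictly positive, because the last five variances (0.25) differ from the first 25 (1).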

% Non-independent data, with ~0 correlation, from the same distribution
x = rand(200,1); y = rand(200,1);
xx = 0.5*(x+y)-0.5; yy = 0.5*(x-y);
corr(xx,yy)

% Two-sample independence tests (Gretton et al., 2008; Szekely & Rizzo, 2013)
DepTest2(xx,yy,'test','dcorr') % Distance correlation t-test
DepTest2(xx,yy,'test','hsic') % Hilbert Schmidt Independence Criterion

% Do the samples come from the same distribution? (Gretton et al., 2012; Szekely et al., 2007)
DepTest2(xx,yy,'test','mmd') % Maximum mean discrepancy
DepTest2(xx,yy,'test','energy') % statistical energy
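
The construction of `xx` and `yy` above yields variables that are uncorrelated yet dependent: their covariance is 0.25*(var(x) - var(y)) = 0, but since xx + yy = x - 0.5 and xx - yy = y - 0.5, the pair is confined to the diamond |xx| + |yy| <= 0.5. A minimal base-MATLAB sketch of both facts follows; `r`, `supportBound` and `rAbs` are illustrative names, and exact values vary from run to run.

```matlab
x = rand(200,1); y = rand(200,1);
xx = 0.5*(x+y)-0.5; yy = 0.5*(x-y);

% Linear correlation is close to zero
c = corrcoef(xx,yy);
r = c(1,2);

% The joint support is a diamond, so the coordinates constrain each other
supportBound = max(abs(xx) + abs(yy));

% The magnitudes are clearly (negatively) correlated, which exposes the dependence
cAbs = corrcoef(abs(xx),abs(yy));
rAbs = cAbs(1,2);

fprintf('corr = %.3f, max(|xx|+|yy|) = %.3f, corr of magnitudes = %.3f\n', ...
   r, supportBound, rAbs);
```

This is the kind of dependence that a correlation coefficient cannot see but that the distance correlation and HSIC tests are designed to pick up.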

% Independent data, different distributions
x = randn(200,1); y = rand(200,1);

% Two-sample Independence tests
DepTest2(x,y,'test','dcorr')
DepTest2(x,y,'test','hsic')

% Do the samples come from the same distribution?
DepTest2(x,y,'test','mmd')
DepTest2(x,y,'test','energy')
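
For intuition about the homogeneity tests, the energy distance between two univariate samples can be computed by hand. This is a sketch under stated assumptions: it uses the plain V-statistic form 2E|X-Y| - E|X-X'| - E|Y-Y'| with a simple permutation p-value, and the helper `energyDist`, the permutation count `nperm` and the other names are illustrative. The scaling and p-value method inside `DepTest2` may differ.

```matlab
x = randn(200,1); y = rand(200,1);

% V-statistic estimate of the energy distance for column vectors a and b
energyDist = @(a,b) 2*mean(mean(abs(bsxfun(@minus,a,b')))) ...
                    - mean(mean(abs(bsxfun(@minus,a,a')))) ...
                    - mean(mean(abs(bsxfun(@minus,b,b'))));

E = energyDist(x,y);

% Permutation null: pool the samples, reshuffle the labels, recompute
z = [x; y];
n = numel(x);
nperm = 199;
Eperm = zeros(nperm,1);
for k = 1:nperm
   idx = randperm(numel(z));
   Eperm(k) = energyDist(z(idx(1:n)), z(idx(n+1:end)));
end
pval = (1 + sum(Eperm >= E))/(nperm + 1);

fprintf('energy distance = %.4f, permutation p-value = %.3f\n', E, pval);
```

The estimate is non-negative by construction and grows as the two distributions separate. The absolute-value form shown here covers only univariate samples; multivariate data would need Euclidean norms between rows.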

Differences in multivariate means and covariances

% Two high-dimensional samples with sparse difference in covariance matrix (4 entries)
p = 50; n = 100;
sigma = zeros(p,p); % preallocate; also discards any sigma left over from earlier examples
for ii = 1:p
   for jj = 1:p
      sigma(ii,jj) = 0.5^abs(ii-jj);
   end
end
D = diag(unifrnd(0.5,2.5,p,1));
S = D^.5*sigma*D^.5; U = zeros(p,p);
[~,~,k] = utils.tri2sqind(p);
r = randperm(numel(k));
U(k(r(1:4))) = unifrnd(0,4,4,1)*max(diag(S));
U = U + U';
[~,da] = eig(S); [~,db] = eig(S+U);
d = abs(min([diag(da);diag(db)])) + 0.05;

x = mvnrnd(zeros(1,p),S+d*eye(p),n);
y = mvnrnd(zeros(1,p),S+U+d*(eye(p)),n);

DepTest2(x,y,'test','covdiff')

% Directly calling the test returns M, a matrix indicating where covariance 
% elements are significantly different (FWER controlled at alpha)
[pval,stat,M] = diff.covtest(x,y);
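
Two steps of the construction above can be checked in isolation with base MATLAB. The sketch below, with illustrative names, first shows that the double loop builds a symmetric Toeplitz matrix with entries 0.5^|i-j|, and then shows why adding `d*eye(p)` makes a symmetric matrix positive definite: every eigenvalue is shifted up by `d`. The small matrix `A` is a made-up example, not taken from highdim.

```matlab
% 1) The loop-built matrix equals a Toeplitz matrix with first row 0.5.^(0:p-1)
p = 50;
sigmaLoop = zeros(p,p);
for ii = 1:p
   for jj = 1:p
      sigmaLoop(ii,jj) = 0.5^abs(ii-jj);
   end
end
sigmaToep = toeplitz(0.5.^(0:p-1));
maxDiff = max(abs(sigmaLoop(:) - sigmaToep(:)));

% 2) Shifting by d = |smallest eigenvalue| + 0.05 guarantees positive definiteness
A = [1 2; 2 1];                  % symmetric but indefinite (eigenvalues -1 and 3)
d = abs(min(eig(A))) + 0.05;
B = A + d*eye(2);                % eigenvalues become 0.05 and 4.05
[~,notPD] = chol(B);             % notPD == 0 means B is positive definite

fprintf('max difference = %g, min eig after shift = %.2f\n', maxDiff, min(eig(B)));
```

A valid covariance argument for `mvnrnd` must be positive semi-definite, which is why the example applies the same shift to both `S` and `S+U`.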

Uniformity on hypersphere

% Non-uniform samples, antipodally distributed on the sphere
sigma = diag([1 5 1]);
x = (sigma*randn(50,3)')';

% Is projection onto unit hypersphere uniformly distributed?
UniSphereTest(x,'test','rayleigh') % Rayleigh test has no power here: antipodal symmetry makes the mean resultant ~0
UniSphereTest(x,'test','gine-ajne') % Weighted Gine-Ajne
UniSphereTest(x,'test','randproj') % random projection
UniSphereTest(x,'test','bingham') % Bingham
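
The behaviour of the Rayleigh test on these data can be reproduced by hand. The following is a sketch of the textbook Rayleigh statistic, p*n*||mean direction||^2, which is approximately chi-square with p degrees of freedom under uniformity; it is not necessarily the exact computation inside `UniSphereTest`. All names are illustrative, and the chi-square tail is computed with base MATLAB's `gammainc`.

```matlab
sigma = diag([1 5 1]);
x = (sigma*randn(50,3)')';
[n,p] = size(x);

% Project each sample onto the unit sphere
u = bsxfun(@rdivide, x, sqrt(sum(x.^2,2)));

% Mean resultant length: near zero here because u and -u are equally likely
Rbar = norm(mean(u,1));

% Rayleigh statistic and its asymptotic chi-square(p) upper-tail p-value
T = p*n*Rbar^2;
pRayleigh = gammainc(T/2, p/2, 'upper');

% Second moments per axis: each equals 1/p under uniformity,
% but the second axis dominates for these data
m2 = mean(u.^2,1);

fprintf('Rbar = %.3f, Rayleigh p = %.3f, second moments = [%.2f %.2f %.2f]\n', ...
   Rbar, pRayleigh, m2);
```

The first moment carries no signal for antipodally symmetric data, whereas the second moments depart clearly from 1/p, which is the kind of structure a second-moment test such as Bingham's targets.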

Contributions

Copyright (c) 2017 Brian Lau, see LICENSE

Please feel free to fork and contribute!
