
wittawatj / interpretable-test

License: MIT license
NeurIPS 2016. Linear-time interpretable nonparametric two-sample test.

Projects that are alternatives to, or similar to, interpretable-test

kernel-mod
NeurIPS 2018. Linear-time model comparison tests.
Stars: ✭ 17 (-70.69%)
Mutual labels:  kernel-methods
popmon
Monitor the stability of a Pandas or Spark dataframe ⚙︎
Stars: ✭ 434 (+648.28%)
Mutual labels:  statistical-tests
Jupyter-Notebooks-Statistic-Walk-Throughs-Using-R
Jupyter notebooks with examples of statistical methods and analyses using R.
Stars: ✭ 21 (-63.79%)
Mutual labels:  statistical-tests
hypothetical
Hypothesis and statistical testing in Python
Stars: ✭ 49 (-15.52%)
Mutual labels:  statistical-tests
tests-as-linear
Python port of "Common statistical tests are linear models" by Jonas Kristoffer Lindeløv.
Stars: ✭ 64 (+10.34%)
Mutual labels:  statistical-tests
URT
Fast Unit Root Tests and OLS regression in C++ with wrappers for R and Python
Stars: ✭ 70 (+20.69%)
Mutual labels:  statistical-tests
Shapley regressions
Statistical inference on machine learning or general non-parametric models
Stars: ✭ 37 (-36.21%)
Mutual labels:  statistical-tests
Awesome Graph Classification
A collection of important graph embedding, classification and representation learning papers with implementations.
Stars: ✭ 4,309 (+7329.31%)
Mutual labels:  kernel-methods
DSPKM
This is the page for the book Digital Signal Processing with Kernel Methods.
Stars: ✭ 32 (-44.83%)
Mutual labels:  kernel-methods
kafbox
A Matlab benchmarking toolbox for kernel adaptive filtering
Stars: ✭ 70 (+20.69%)
Mutual labels:  kernel-methods
KernelKnn
Kernel k Nearest Neighbors in R
Stars: ✭ 14 (-75.86%)
Mutual labels:  kernel-methods
frp
FRP: Fast Random Projections
Stars: ✭ 40 (-31.03%)
Mutual labels:  kernel-methods
supervised-random-projections
Python implementation of supervised PCA, supervised random projections, and their kernel counterparts.
Stars: ✭ 19 (-67.24%)
Mutual labels:  kernel-methods
kernel-ep
UAI 2015. Kernel-based just-in-time learning for expectation propagation
Stars: ✭ 16 (-72.41%)
Mutual labels:  kernel-methods
graphkit-learn
A python package for graph kernels, graph edit distances, and graph pre-image problem.
Stars: ✭ 87 (+50%)
Mutual labels:  kernel-methods

Interpretable Test

17 April 2018: We updated the code base to support both Python 3 and Python 2.7. Please contact Wittawat Jitkrittum if you find a bug.

The goal of this project is to learn a set of features that distinguish two given distributions P and Q, as observed through two samples. This task is formulated as a two-sample test problem. The features are chosen so as to maximize the distinguishability of the two distributions, by optimizing a lower bound on the power of a statistical test that uses these features. The result is a parsimonious and interpretable indication of how, and where locally, the two distributions differ (when the null hypothesis, i.e., P = Q, is rejected).

This repository contains a Python implementation of the Mean Embeddings (ME) test and the Smooth Characteristic Function (SCF) test, in which the features are automatically optimized, as described in our paper

Interpretable Distribution Features with Maximum Testing Power
Wittawat Jitkrittum, Zoltán Szabó, Kacper Chwialkowski, Arthur Gretton
NIPS, 2016
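
For intuition, the ME statistic at a fixed set of test locations can be sketched in a few lines of NumPy. This is an unregularized paraphrase of the statistic in the paper, for illustration only; the package's own implementation additionally handles regularization, the null distribution, and the optimization of the locations and kernel width.

    # Minimal NumPy sketch of the (unregularized) ME statistic, for intuition only.
    import numpy as np

    def me_statistic(X, Y, V, gwidth2):
        """X, Y: (n, d) samples from P and Q (same n). V: (J, d) test locations.
        gwidth2: squared Gaussian kernel width. Returns n * zbar' S^{-1} zbar."""
        def gauss_kernel(A, B):
            sq_dist = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
            return np.exp(-sq_dist / (2.0 * gwidth2))

        Z = gauss_kernel(X, V) - gauss_kernel(Y, V)               # (n, J) feature differences
        zbar = Z.mean(axis=0)
        S = np.cov(Z, rowvar=False) + 1e-8 * np.eye(Z.shape[1])   # small ridge for stability
        return X.shape[0] * zbar @ np.linalg.solve(S, zbar)

Under the null hypothesis P = Q, this statistic is asymptotically chi-squared with J degrees of freedom. The tests in this package choose the locations V and the kernel width on a training split so as to maximize a lower bound on test power, which is what makes the learned locations interpretable.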

How to install?

The package can be installed with the pip command.

pip install git+https://github.com/wittawatj/interpretable-test

Once installed, you should be able to do import freqopttest without any error.
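
A quick post-install sanity check is sketched below. The submodule names are assumptions based on the demo notebooks; a plain import freqopttest succeeding is the check that matters.

    # Quick post-install sanity check. The submodule names tst and data are
    # assumptions based on the demo notebooks.
    import freqopttest
    import freqopttest.tst as tst     # two-sample test implementations (assumed name)
    import freqopttest.data as data   # data containers used by the tests (assumed name)
    print('freqopttest imported from', freqopttest.__file__)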

Demo scripts

To get started, check demo_interpretable_test.ipynb, which will guide you through the package from the beginning. There are many other Jupyter notebooks in the ipynb folder; be sure to check them if you want to explore further.
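
As a rough preview of what the demo notebook covers, a typical ME-test workflow looks like the sketch below. The class and method names (TSTData, split_tr_te, MeanEmbeddingTest.optimize_locs_width, perform_test) and the option keys are assumptions here; treat demo_interpretable_test.ipynb as the authoritative reference.

    # Hedged sketch of an ME-test workflow; names and option keys are assumptions.
    # See demo_interpretable_test.ipynb for the actual API.
    import numpy as np
    import freqopttest.data as data
    import freqopttest.tst as tst

    # Toy problem: P = N(0, I), Q = N((1, 0, ..., 0), I) in 5 dimensions.
    n, d = 500, 5
    X = np.random.randn(n, d)
    Y = np.random.randn(n, d) + np.hstack([1.0, np.zeros(d - 1)])

    alpha = 0.01
    tst_data = data.TSTData(X, Y)                       # assumed container class
    tr, te = tst_data.split_tr_te(tr_proportion=0.5)    # train/test split (assumed method)

    # Optimize the test locations and Gaussian width on the training split.
    op = {'n_test_locs': 2, 'max_iter': 200, 'seed': 5}  # assumed option keys
    test_locs, gwidth, info = tst.MeanEmbeddingTest.optimize_locs_width(tr, alpha, **op)

    # Construct the test with the learned features and test on the held-out split.
    me_test = tst.MeanEmbeddingTest(test_locs, gwidth, alpha)
    result = me_test.perform_test(te)
    print(result)   # statistic, p-value, and whether H0: P = Q is rejected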

Reproduce experimental results

Each experiment is defined in its own Python file whose name starts with exXX, where XX is a number. All the experiment files are in the freqopttest/ex folder, and each is runnable with a command-line argument. For example, ex1_power_vs_n.py checks the test power of each testing algorithm as a function of the sample size n; the script takes a dataset name as its argument. See run_ex1.sh, a standalone Bash script, for an example of how to execute ex1_power_vs_n.py.

We used the independent-jobs package to parallelize our experiments over a Slurm cluster (the package is not needed if you only want to use the two-sample tests themselves). For example, in ex1_power_vs_n.py, a job is created for each combination of (dataset, algorithm, n, trial). If you do not use Slurm, you can change the line

engine = SlurmComputationEngine(batch_parameters)

to

engine = SerialComputationEngine()

which instructs the computation engine to use a plain for-loop on a single machine (this will take a long time). Other computation engines may also be supported; see the independent-jobs repository page.

For the real-data experiments, all the preprocessed data are included in freqopttest/data/ as Pickle files. An experiment script saves its results as many Pickle files in freqopttest/result/exXX/, where XX is the experiment number. To plot these results, see the experiment's corresponding Jupyter notebook in the ipynb/ folder; for example, for ex1_power_vs_n.py, see ipynb/ex1_results.ipynb.
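
If you prefer to inspect a saved result directly rather than through a notebook, each result is an ordinary Pickle file. The file name below is purely hypothetical, since the actual names encode the experiment settings.

    # Hypothetical example of loading one saved result file; real file names under
    # freqopttest/result/ex1/ encode the (dataset, algorithm, n, trial) combination.
    import pickle

    result_path = 'freqopttest/result/ex1/some_result_file.p'   # hypothetical name
    with open(result_path, 'rb') as f:
        result = pickle.load(f)
    print(type(result))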

Preprocessed NIPS text collection

We will soon add a link to the preprocessed collection of NIPS papers from 1988 to 2015 that we used in the paper. All the scripts we used will also be added. Stay tuned.

License

MIT license.

If you have questions or comments about anything regarding this work, please do not hesitate to contact Wittawat Jitkrittum.
