bank-of-england / Shapley_regressions

Licence: other
Statistical inference on machine learning or general non-parametric models

Shapley regressions code base (BoE SWP 784)

This repository provides the code, data and results used for Bank of England Staff Working Paper 784

"Shapley regressions: A framework for statistical inference on machine learning models"

by Andreas Joseph (March 2019).

The paper introduces a well-motivated and rigorous approach to address the black-box critique of machine learning models. Model interpretability is transferred to a multiple linear regression analysis - one of the most transparent and most widely used modelling techniques.

The output of machine learning models can now be presented as a regression table. The example below shows inference results for modelling changes in UK and US unemployment using quarterly macroeconomic time series. It compares several machine learning models (columns 1-3 for each country) with a linear regression (Reg column). As expected, all models learn similar variable dependencies, while machine learning models are generally more accurate (RMSE) and provide richer information, e.g. about non-linearity of the data generating process. Please see Table 4 in the paper for technical details.
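To illustrate how such a regression table can arise, the following is a minimal sketch and not the repository's code. It assumes that a Shapley regression amounts to regressing the target on the model's Shapley components on held-out data, with a one-sided test of each coefficient against zero. The synthetic data, the polynomial stand-in model, the brute-force `exact_shapley` helper and the normal approximation for p-values are all assumptions made for this example.

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): x0 enters linearly, x1 quadratically, x2 is noise.
X = rng.normal(size=(300, 3))
y = 1.5 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.5, size=300)
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

def design(Z):
    """Polynomial basis of the stand-in model: constant, x_k and x_k**2."""
    return np.column_stack([np.ones(len(Z)), Z, Z ** 2])

# Stand-in for a fitted ML model: least squares on the polynomial basis.
w, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)

def predict(Z):
    return design(Z) @ w

def exact_shapley(predict, X, background):
    """Exact Shapley values by enumerating coalitions (tiny problems only).

    Features outside a coalition take their values from the background
    rows, and the prediction is averaged over those rows.
    """
    n_obs, m = X.shape
    phi = np.zeros((n_obs, m))

    def value(x, coalition):
        Z = background.copy()
        cols = list(coalition)
        if cols:
            Z[:, cols] = x[cols]
        return predict(Z).mean()

    for i, x in enumerate(X):
        for k in range(m):
            others = [j for j in range(m) if j != k]
            for size in range(m):
                weight = (math.factorial(size) * math.factorial(m - size - 1)
                          / math.factorial(m))
                for S in itertools.combinations(others, size):
                    phi[i, k] += weight * (value(x, S + (k,)) - value(x, S))
    return phi

background = X_train[:50]
phi = exact_shapley(predict, X_test, background)

# Shapley regression on the test set: y = const + sum_k beta_k * phi_k + error.
A = np.column_stack([np.ones(len(y_test)), phi])
beta, *_ = np.linalg.lstsq(A, y_test, rcond=None)
resid = y_test - A @ beta
sigma2 = resid @ resid / (len(y_test) - A.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))

# One-sided p-values for H0: beta_k <= 0 (normal approximation).
p = np.array([0.5 * math.erfc(b / s / math.sqrt(2)) for b, s in zip(beta, se)])

for name, b, s, pv in zip(["const", "x0", "x1", "x2"], beta, se, p):
    print(f"{name:>5}  coef={b:6.3f}  se={s:6.3f}  p={pv:.3f}")
```

In this sketch, a coefficient close to one with a small p-value suggests that the model has learned a robust dependence on that variable, whereas the pure-noise input should come out with a large standard error. Enumerating coalitions scales as 2^m in the number of features, so real applications would rely on a dedicated library such as shap, which is listed among the dependencies.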

The material provided here allows all empirical and simulation results in the paper to be reproduced. It is not intended as a stand-alone package. However, parts of it may be transferred to other applications. No warranty is given. Please consult the licence file.

Should you have any queries or spot an issue, please email [email protected] or raise an Issue within the repository.

Link to paper: www.bankofengland.co.uk/working-paper/2019/shapley-regressions-a-framework-for-statistical-inference-on-machine-learning-models

Download of full results: https://www.dropbox.com/s/bkdjpbqrabgtwr4/SWP784_all_results.zip?dl=0

Code structure

- 1_macro_Shapley_regressions.py: UK and US macroeconomic time series analysis using 
	machine learning (ML) models and Shapley regressions for statistical inference (section 5.2 of paper).
- 2a_ML_inference_simulation.py: Simulation of polynomial data-generating processes and
	ML inference based on Shapley decompositions and reconstruction
	(suited for parallel/cloud processing, section 5.1 of paper).
- 2b_ML_inference_analysis.py: Collection of simulation results and graphical output (section 5.1 of paper). 
- ML_inference_aux.py: Auxiliary code for parts 1 and 2, application-specific inputs and 
	general functions (partly inherited from https://github.com/andi-jo/ML_projection_toolbox).
	shapley_coeffs() calculates Shapley share coefficients (SSC).
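
As a rough illustration of what a Shapley share coefficient could look like, the sketch below assumes that the SSC of a variable is its average share in the absolute Shapley decomposition, signed by the direction of its linear association with the input. The function `shapley_share_coefficients` is hypothetical and does not reproduce the signature or internals of `shapley_coeffs()` in ML_inference_aux.py. Taking the sign from a per-feature linear fit is also an assumption.

```python
import numpy as np

def shapley_share_coefficients(phi, X):
    """Signed mean share of |Shapley value| per feature (hypothetical helper).

    phi : (n_obs, n_features) Shapley decomposition of model predictions
    X   : (n_obs, n_features) inputs the decomposition was computed for
    """
    phi = np.asarray(phi, dtype=float)
    X = np.asarray(X, dtype=float)
    abs_phi = np.abs(phi)
    total = abs_phi.sum(axis=1, keepdims=True)
    # Row-wise shares; rows with an all-zero decomposition contribute zero.
    shares = np.divide(abs_phi, total, out=np.zeros_like(abs_phi), where=total > 0)
    # Assumption: the sign comes from the slope of a linear fit of phi_k on x_k.
    signs = np.array([np.sign(np.polyfit(X[:, k], phi[:, k], 1)[0])
                      for k in range(X.shape[1])])
    return signs * shares.mean(axis=0)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
# Toy decomposition (assumption): a linear model 2*x0 - x1 + 0.5*x2 with
# zero-mean inputs, for which the Shapley values are simply beta_k * x_k.
phi = X * np.array([2.0, -1.0, 0.5])
ssc = shapley_share_coefficients(phi, X)
print(np.round(ssc, 3))
```

By construction, the absolute values of these coefficients sum to one whenever every row has a non-zero decomposition, so they can be read as each variable's share of the explained model output.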

Instructions

- Parts 1 and 2 are independent of each other.
- Part 2b depends on 2a or on pre-computed results (SWP results are provided in
	ML_inf_joint_results_swp.pkl).
- The "main_dir" variable needs to be set in both parts.
- Options can be set at the beginning of parts 1 and 2 (a and b).
- Please consult the comments in the codes and docstrings for further documentation.
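
A minimal sketch of the setup step, assuming the repository has been cloned to a local folder and that the pre-computed results file sits directly inside it. The path and the `load_results` helper are placeholders for illustration, and the structure of the unpickled object is not documented here.

```python
import os
import pickle

# Assumption: placeholder location of the cloned repository; adjust to your setup.
main_dir = os.path.expanduser("~/Shapley_regressions")

# File name as given in the instructions; its location inside main_dir is an assumption.
results_path = os.path.join(main_dir, "ML_inf_joint_results_swp.pkl")

def load_results(path):
    """Return the unpickled results object, or None if the file is missing."""
    if not os.path.isfile(path):
        return None
    with open(path, "rb") as fh:
        return pickle.load(fh)

results = load_results(results_path)
print(type(results))
```

Pickle files should only be loaded from trusted sources, and files written with older library versions may need the pinned versions listed under "Dependencies & versions" to load correctly.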

Dependencies & versions

- python (3.6.8, Anaconda distribution has been used)
- numpy (1.15.4)
- scipy (1.2.0)
- pandas (0.24.1)
- sklearn (0.20.2)
- shap (0.28.3)
- statsmodels (0.9.0)
- matplotlib (3.0.2)
- patsy (0.5.1)
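
To compare an existing environment against the versions above, a small check along the following lines may help. It is a sketch that relies on each package exposing a `__version__` attribute, and the `check_versions` helper is hypothetical. Note that the import name `sklearn` corresponds to the scikit-learn distribution.

```python
import importlib

# Versions listed above; keys are import names.
PINNED = {
    "numpy": "1.15.4", "scipy": "1.2.0", "pandas": "0.24.1", "sklearn": "0.20.2",
    "shap": "0.28.3", "statsmodels": "0.9.0", "matplotlib": "3.0.2", "patsy": "0.5.1",
}

def check_versions(pinned):
    """Map each module name to (installed version or None, pinned version)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            module = importlib.import_module(name)
            installed = getattr(module, "__version__", None)
        except Exception:  # missing or broken installations both count as absent
            installed = None
        report[name] = (installed, wanted)
    return report

for name, (installed, wanted) in check_versions(PINNED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name:<12} installed={installed}  pinned={wanted}  {status}")
```

A differing version does not necessarily break the code, but exact reproduction of the published results is most likely with the pinned versions.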

Data & sources

Data description:

- Quarterly macroeconomic time series (UK: 1955Q1-2017Q4, US: 1965Q1-2017Q4).
- Series are either year-on-year (yoy) percentage changes or first differences (see Table 2 of the paper).
- For the analysis, series are standardised to have mean zero and standard deviation one.
- Raw data and standardised series are provided.
- Series names: GDP, labour productivity, broad money, private non-financial sector debt,
	unemployment rate, household gross disposable income, consumer price inflation,
	central bank's main policy rate, current account balance, effective exchange rate.
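
The transformations described above can be sketched as follows. This is an illustration rather than the repository's preprocessing code: the helper names are hypothetical, a lag of four quarters is assumed for the year-on-year change, and the population standard deviation (ddof=0) is an arbitrary choice, since the convention used for the published series is not stated here.

```python
import numpy as np

def yoy_pct_change(levels, periods_per_year=4):
    """Year-on-year percentage change of a series in levels (quarterly by default)."""
    levels = np.asarray(levels, dtype=float)
    return 100.0 * (levels[periods_per_year:] / levels[:-periods_per_year] - 1.0)

def first_difference(series):
    """Change from one period to the next."""
    return np.diff(np.asarray(series, dtype=float))

def standardise(series):
    """Rescale to mean zero and standard deviation one."""
    series = np.asarray(series, dtype=float)
    return (series - series.mean()) / series.std()

# Made-up numbers for illustration only.
gdp_levels = [100.0, 101.0, 102.0, 103.0, 104.0, 106.05, 107.1, 108.15]
policy_rate = [5.0, 5.25, 5.25, 4.75, 4.5]

gdp_yoy = yoy_pct_change(gdp_levels)
rate_diff = first_difference(policy_rate)
gdp_std = standardise(gdp_yoy)
print(gdp_yoy, rate_diff, gdp_std)
```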

Individual sources by ID:

- BOE: IUQLBEDR, XUQLBK82, IUQLBEDR, LPQAUYN.
- ONS: D7BT, UKEA,PGDP, PRDY, MGSX.
- BIS: US private sector debt: Q:US:P:A:M:XDC:A
       UK: ERI, GBP/USD (1955 only).
- OECD: US CPI, US M3, US GDP, US Unemployment, US CA.
- FRED: RNUSBIS, FEDFUNDS, PRS85006163, A229RX0.
- A Millennium of UK Data, Ryland Thomas (2017): private sector debt, 
	M4, labour productivity.