iancovert / sage

License: MIT
For calculating global feature importance using Shapley values.


Projects that are alternatives of or similar to sage

removal-explanations
A lightweight implementation of removal-based explanations for ML models.
Stars: ✭ 46 (-64.34%)
Mutual labels:  interpretability, shapley, explainability
Shap
A game theoretic approach to explain the output of any machine learning model.
Stars: ✭ 14,917 (+11463.57%)
Mutual labels:  interpretability, shapley, explainability
hierarchical-dnn-interpretations
Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)
Stars: ✭ 110 (-14.73%)
Mutual labels:  interpretability, explainability
Awesome Production Machine Learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Stars: ✭ 10,504 (+8042.64%)
Mutual labels:  interpretability, explainability
ProtoTree
ProtoTrees: Neural Prototype Trees for Interpretable Fine-grained Image Recognition, published at CVPR2021
Stars: ✭ 47 (-63.57%)
Mutual labels:  interpretability, explainability
Interpret
Fit interpretable models. Explain blackbox machine learning.
Stars: ✭ 4,352 (+3273.64%)
Mutual labels:  interpretability, explainability
deep-explanation-penalization
Code for using CDEP from the paper "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" https://arxiv.org/abs/1909.13584
Stars: ✭ 110 (-14.73%)
Mutual labels:  interpretability, explainability
zennit
Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.
Stars: ✭ 57 (-55.81%)
Mutual labels:  interpretability, explainability
thermostat
Collection of NLP model explanations and accompanying analysis tools
Stars: ✭ 126 (-2.33%)
Mutual labels:  interpretability, explainability
adaptive-wavelets
Adaptive, interpretable wavelets across domains (NeurIPS 2021)
Stars: ✭ 58 (-55.04%)
Mutual labels:  interpretability, explainability
ALPS 2021
XAI Tutorial for the Explainable AI track in the ALPS winter school 2021
Stars: ✭ 55 (-57.36%)
Mutual labels:  interpretability, explainability
mllp
The code of AAAI 2020 paper "Transparent Classification with Multilayer Logical Perceptrons and Random Binarization".
Stars: ✭ 15 (-88.37%)
Mutual labels:  interpretability, explainability
ArenaR
Data generator for Arena - interactive XAI dashboard
Stars: ✭ 28 (-78.29%)
Mutual labels:  interpretability, explainability
concept-based-xai
Library implementing state-of-the-art Concept-based and Disentanglement Learning methods for Explainable AI
Stars: ✭ 41 (-68.22%)
Mutual labels:  interpretability, explainability
Transformer-MM-Explainability
[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
Stars: ✭ 484 (+275.19%)
Mutual labels:  interpretability, explainability
contextual-ai
Contextual AI adds explainability to different stages of machine learning pipelines - data, training, and inference - thereby addressing the trust gap between such ML systems and their users. It does not refer to a specific algorithm or ML method — instead, it takes a human-centric view and approach to AI.
Stars: ✭ 81 (-37.21%)
Mutual labels:  explainability
mmn
Moore Machine Networks (MMN): Learning Finite-State Representations of Recurrent Policy Networks
Stars: ✭ 39 (-69.77%)
Mutual labels:  interpretability
responsible-ai-toolbox
This project provides responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.
Stars: ✭ 615 (+376.74%)
Mutual labels:  explainability
meg
Molecular Explanation Generator
Stars: ✭ 14 (-89.15%)
Mutual labels:  interpretability
partial dependence
Python package to visualize and cluster partial dependence.
Stars: ✭ 23 (-82.17%)
Mutual labels:  interpretability

SAGE

SAGE (Shapley Additive Global importancE) is a game-theoretic approach for understanding black-box machine learning models. It summarizes each feature's importance based on the predictive power it contributes, and it accounts for complex feature interactions using the Shapley value.

SAGE was introduced in this paper, but if you're new to using Shapley values you may want to start by reading this blog post.

Install

The easiest way to get started is to install the sage-importance package with pip:

pip install sage-importance

Alternatively, you can clone the repository and install the package in your Python environment as follows:

pip install .

Usage

SAGE is model-agnostic, so you can use it with any kind of machine learning model (linear models, GBMs, neural networks, etc.). All you need to do is set up an imputer to handle held-out features and then run a Shapley value estimator:

import sage

# Get data
x, y = ...
feature_names = ...

# Get model
model = ...

# Set up an imputer to handle missing features
imputer = sage.MarginalImputer(model, x[:512])

# Set up an estimator
estimator = sage.PermutationEstimator(imputer, 'mse')

# Calculate SAGE values
sage_values = estimator(x, y)
sage_values.plot(feature_names)

The result is a bar chart showing each feature's estimated SAGE value, with error bars reflecting the estimation uncertainty.
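For a concrete, self-contained sketch of the same workflow, the snippet below uses a scikit-learn model on the California housing dataset; the dataset and model are illustrative choices on our part, not part of the original example:

import sage
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Load a regression dataset (illustrative choice)
data = fetch_california_housing()
x_train, x_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0)

# Fit any model; SAGE is model-agnostic
model = GradientBoostingRegressor().fit(x_train, y_train)

# Set up a marginal imputer with a background sample of up to 512 rows
imputer = sage.MarginalImputer(model, x_train[:512])

# Permutation estimator with squared-error loss (a regression task)
estimator = sage.PermutationEstimator(imputer, 'mse')

# Estimate SAGE values on held-out data and plot them
sage_values = estimator(x_test, y_test)
sage_values.plot(data.feature_names)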

Our implementation supports several features to make Shapley value calculation more practical:

  • Uncertainty estimation: confidence intervals are provided for each feature's importance value.
  • Convergence detection: convergence is determined based on the size of the confidence intervals, and a progress bar displays the estimated time until convergence.
  • Model conversion: our back-end requires models to be converted into a consistent format, and this conversion step is performed automatically for XGBoost, CatBoost, LightGBM, sklearn and PyTorch models. If you're using a different kind of model, it needs to be wrapped in a callable function (see here for examples, and the sketch below).
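For models outside those frameworks, a thin wrapper that maps a NumPy array of inputs to predictions is enough. The sketch below is ours: model_fn and the custom_model.predict call are placeholders for whatever interface your model exposes.

import numpy as np
import sage

x, y = ...          # data, as in the snippet above
custom_model = ...  # any model with its own prediction interface

def model_fn(batch):
    # Map an array of shape (n_samples, n_features) to predictions;
    # replace custom_model.predict with your model's prediction call
    return np.asarray(custom_model.predict(batch))

# The callable is then used wherever a model is expected
imputer = sage.MarginalImputer(model_fn, x[:512])
estimator = sage.PermutationEstimator(imputer, 'mse')
sage_values = estimator(x, y)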

Examples

Check out the following notebooks to get started:

  • Bike: a simple example using XGBoost, shows how to calculate SAGE values and Shapley Effects (an alternative explanation when no labels are available)
  • Credit: generate explanations using a surrogate model to approximate the conditional distribution (using CatBoost)
  • Airbnb: calculate SAGE values with grouped features (using a PyTorch MLP)
  • Bank: a model monitoring example that uses SAGE to identify features that hurt the model's performance (using CatBoost)
  • MNIST: shows strategies to accelerate convergence for datasets with many features (feature grouping, different imputing setups)
  • Consistency: verifies that our various Shapley value estimators (see below) return the same results
  • Calibration: verifies that SAGE's confidence intervals are representative of the uncertainty across runs

If you want to replicate any experiments described in our paper, see this separate repository.

More details

This repository provides some flexibility in how explanations are generated; the main choices are described below.

1. Feature removal approach

The original SAGE paper proposes marginalizing out missing features using their conditional distribution. Since this is challenging to implement in practice, several approximations are available. The choices include:

  1. Use default values for missing features (see MNIST for an example, and the sketch after this list). This is a fast but low-quality approximation.
  2. Sample features from the marginal distribution (see Bike for an example). This approximation is discussed in the SAGE paper.
  3. Train a supervised surrogate model (see Credit for an example). This approach is described in this paper, and it can provide a better approximation than the other approaches. However, it requires training an additional model (typically a neural network).
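As a sketch of the first option, held-out features can be replaced with fixed default values. The snippet below assumes the package exposes a DefaultImputer that takes per-feature default values; treat the class name and its signature as assumptions and see the MNIST notebook for the exact usage.

import numpy as np
import sage

x, y = ...
model = ...

# Replace held-out features with zeros (e.g., for mean-centered data);
# DefaultImputer and its signature are assumed here, see the MNIST notebook
imputer = sage.DefaultImputer(model, np.zeros(x.shape[1]))
estimator = sage.PermutationEstimator(imputer, 'mse')
sage_values = estimator(x, y)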

2. Explanation type

Two types of explanations can be calculated, both based on Shapley values:

  1. SAGE. This approach quantifies each feature's role in improving the model's performance (the default explanation here).
  2. Shapley Effects. Described in this paper, this explanation method quantifies the model's sensitivity to each feature. Since Shapley Effects is a variation on SAGE (see details in this paper), our implementation generates this type of explanation when labels are not provided. See the Bike notebook for an example.
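Building on the Usage snippet, the difference between the two explanation types comes down to whether labels are passed to the estimator. This is a sketch based on the description above; see the Bike notebook for the exact call.

# SAGE values: quantify each feature's contribution to performance (labels required)
sage_values = estimator(x, y)

# Shapley Effects: quantify the model's sensitivity to each feature (no labels)
shapley_effects = estimator(x)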

3. Shapley value estimator

Shapley values are computationally costly to calculate exactly, so we implemented four different estimators:

  1. Permutation sampling. This is the approach described in the original paper (see PermutationEstimator).
  2. KernelSAGE. This is a linear regression-based estimator that is similar to KernelSHAP (see KernelEstimator). It is described in this paper, and the Bank notebook shows an example use-case.
  3. Iterated sampling. This is a variation on the permutation sampling approach where we calculate Shapley values for each feature sequentially (see IteratedEstimator). This permits faster convergence for features with low variance, but it can result in wider confidence intervals.
  4. Sign estimation. This method estimates SAGE values to a lower precision by focusing on their sign (i.e., whether they help or hurt performance). It is implemented in SignEstimator, and the Bank notebook shows an example.

The results from each approach should be identical (see Consistency), but there may be differences in convergence speed. Permutation sampling is a good approach to start with. KernelSAGE may converge a bit faster, but the uncertainty is spread more evenly among the features (rather than being higher for more important features).
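The estimators are intended to be interchangeable. The sketch below assumes the alternative estimators share the PermutationEstimator constructor signature; consult the linked notebooks for the exact arguments.

imputer = sage.MarginalImputer(model, x[:512])

# Any of the four estimators can be paired with the same imputer
estimator = sage.PermutationEstimator(imputer, 'mse')   # permutation sampling
# estimator = sage.KernelEstimator(imputer, 'mse')      # KernelSAGE
# estimator = sage.IteratedEstimator(imputer, 'mse')    # iterated sampling
# estimator = sage.SignEstimator(imputer, 'mse')        # sign estimation

sage_values = estimator(x, y)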

4. Grouped features

Rather than removing features individually, you can specify groups of features to be removed together. This will likely speed up convergence because there are fewer feature subsets to consider. See Airbnb for an example.
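A sketch of how grouping might be set up is shown below; the GroupedMarginalImputer class and the group format (lists of column indices) are assumptions on our part, so refer to the Airbnb notebook for the actual interface.

import sage

x, y = ...
model = ...

# Feature groups as lists of column indices (illustrative grouping)
groups = [[0, 1], [2, 3, 4], [5]]
group_names = ['location', 'size', 'amenities']

# GroupedMarginalImputer is assumed here; see the Airbnb notebook
imputer = sage.GroupedMarginalImputer(model, x[:512], groups)
estimator = sage.PermutationEstimator(imputer, 'mse')

sage_values = estimator(x, y)
sage_values.plot(group_names)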

Authors

Ian Covert

References

Ian Covert, Scott Lundberg, Su-In Lee. "Understanding Global Feature Contributions With Additive Importance Measures." NeurIPS 2020

Ian Covert, Scott Lundberg, Su-In Lee. "Explaining by Removing: A Unified Framework for Model Explanation." arXiv preprint arXiv:2011.14878

Ian Covert, Su-In Lee. "Improving KernelSHAP: Practical Shapley Value Estimation via Linear Regression." AISTATS 2021

Art Owen. "Sobol' Indices and Shapley Value." SIAM/ASA Journal on Uncertainty Quantification, 2014
