
suinleelab / path_explain

Licence: MIT license
A repository for explaining feature attributions and feature interactions in deep neural networks.



Path Explain

A repository for explaining feature importances and feature interactions in deep neural networks using path attribution methods.

This repository contains tools to interpret and explain machine learning models using Integrated Gradients and Expected Gradients. In addition, it contains code to explain interactions in deep networks using Integrated Hessians and Expected Hessians - methods that we introduced in our most recent paper: "Explaining Explanations: Axiomatic Feature Interactions for Deep Networks". If you use our work to explain your networks, please cite this paper.

@article{janizek2020explaining,
  author  = {Joseph D. Janizek and Pascal Sturmfels and Su-In Lee},
  title   = {Explaining Explanations: Axiomatic Feature Interactions for Deep Networks},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {104},
  pages   = {1-54},
  url     = {http://jmlr.org/papers/v22/20-1223.html}
}
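
For readers new to path attribution methods, here is a minimal NumPy sketch of the idea behind Integrated Gradients. It is not the implementation used in this package: the toy function `f`, its hand-written gradient `grad_f`, and the helper `integrated_gradients` are assumptions made purely for illustration. The sketch averages gradients along the straight line from a baseline to the input and scales the result by the difference between the input and the baseline.

```python
import numpy as np

def f(x):
    """Toy differentiable model (hypothetical): f(x) = x0 * x1 + x2 ** 2."""
    return x[0] * x[1] + x[2] ** 2

def grad_f(x):
    """Analytic gradient of the toy model."""
    return np.array([x[1], x[0], 2.0 * x[2]])

def integrated_gradients(x, baseline, num_samples=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    # Interpolation coefficients at the midpoints of num_samples equal bins
    alphas = (np.arange(num_samples) + 0.5) / num_samples
    delta = x - baseline
    # Gradient at each point on the straight path from baseline to x
    grads = np.stack([grad_f(baseline + a * delta) for a in alphas])
    return delta * grads.mean(axis=0)

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
attributions = integrated_gradients(x, baseline)

# Completeness: attributions should sum to f(x) - f(baseline)
gap = attributions.sum() - (f(x) - f(baseline))
```

For this toy function the attributions come out to approximately [1, 1, 9]. They sum to f(x) - f(baseline) = 11, which is the completeness property that path attribution methods satisfy.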

This repository contains two important directories: the path_explain directory, which contains the packages used to interpret and explain machine learning models, and the examples directory, which contains many examples using the path_explain module to explain different models on different data types.

Installation

The easiest way to install this package is by using pip:

pip install path-explain

Alternatively, you can clone this repository to re-run and explore the examples provided.

Compatibility

This package was written to support TensorFlow 2.0 (in eager execution mode) with Python 3. We have no current plans to support earlier versions of TensorFlow or Python.

API

Although we don't yet have formal API documentation, the underlying code does a good job of explaining the API. See the code for generating attributions and interactions to better understand what the arguments to these functions mean.

Examples

For a simple, quick example to get started using this repository, see the example_usage.ipynb notebook in the top-level directory of this repository. It gives an overview of the functionality provided by this repository. For more advanced examples, keep reading.

Tabular Data using Expected Gradients and Expected Hessians

Our repository can easily be adapted to explain attributions and interactions learned on tabular data.

# other import statements...
from path_explain import PathExplainerTF, scatter_plot, summary_plot

### Code to train a model would go here
x_train, y_train, x_test, y_test = dataset()  # placeholder for your own data-loading function
model = ...
model.fit(x_train, y_train, ...)
###

### Generating attributions using expected gradients
explainer = PathExplainerTF(model)
attributions = explainer.attributions(inputs=x_test,
                                      baseline=x_train,
                                      batch_size=100,
                                      num_samples=200,
                                      use_expectation=True,
                                      output_indices=0)
###

### Generating interactions using expected hessians
interactions = explainer.interactions(inputs=x_test,
                                      baseline=x_train,
                                      batch_size=100,
                                      num_samples=200,
                                      use_expectation=True,
                                      output_indices=0)
###
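
With `use_expectation=True`, the baseline is not a single reference point but a set of background samples (here `x_train`). The following is a rough NumPy sketch of that idea, Expected Gradients, on a toy function. It is an illustration under stated assumptions, not the code in this package: `f`, `grad_f`, `expected_gradients`, and the random `background` array are all hypothetical stand-ins.

```python
import numpy as np

def f(x):
    """Toy model applied row-wise (hypothetical): f(x) = x0 * x1 + x2 ** 2."""
    return x[:, 0] * x[:, 1] + x[:, 2] ** 2

def grad_f(x):
    """Analytic gradient of the toy model for each row of x."""
    return np.stack([x[:, 1], x[:, 0], 2.0 * x[:, 2]], axis=1)

def expected_gradients(x, background, num_samples, rng):
    """Monte Carlo estimate of Expected Gradients for a single input x."""
    # Draw one baseline and one interpolation coefficient per sample
    idx = rng.integers(0, len(background), size=num_samples)
    baselines = background[idx]
    alphas = rng.uniform(size=(num_samples, 1))
    delta = x[None, :] - baselines
    points = baselines + alphas * delta
    # Average (x - x') * gradient over all sampled (baseline, alpha) pairs
    return (delta * grad_f(points)).mean(axis=0)

rng = np.random.default_rng(0)
background = rng.normal(size=(500, 3))   # stands in for x_train
x = np.array([1.0, 2.0, 3.0])            # a single input to explain
attributions = expected_gradients(x, background, num_samples=20000, rng=rng)

# In expectation, attributions sum to f(x) minus the mean output on the background
gap = attributions.sum() - (f(x[None, :])[0] - f(background).mean())
```

Because this is a sampling estimate, `gap` is only approximately zero, and it shrinks as `num_samples` grows. That is the trade-off behind the `num_samples` argument in the calls above.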

Once we've generated attributions and interactions, we can use the provided plotting modules to help visualize them. First we plot a summary of the top features and their attribution values:

### First we need a list of strings denoting the name of each feature
feature_names = ...
###

summary_plot(attributions=attributions,
             feature_values=x_test,
             feature_names=feature_names,
             plot_top_k=10)

Heart Disease Summary Plot
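
If you want the ranking behind such a summary without drawing a plot, a common convention is to order features by their mean absolute attribution. The sketch below assumes that convention; we have not confirmed here that `summary_plot` orders features in exactly this way. The helper `top_k_features`, the toy attribution values, and the feature name 'age' are hypothetical.

```python
import numpy as np

def top_k_features(attributions, feature_names, k=10):
    """Rank features by mean absolute attribution over all samples."""
    importance = np.abs(attributions).mean(axis=0)
    order = np.argsort(importance)[::-1][:k]
    return [(feature_names[i], float(importance[i])) for i in order]

# Hypothetical attribution matrix: 4 samples, 3 features
toy_attributions = np.array([[ 0.1, -2.0,  0.5],
                             [ 0.2,  1.0, -0.5],
                             [-0.1,  3.0,  0.5],
                             [ 0.0, -2.0, -0.5]])
toy_names = ['age', 'max. achieved heart rate', 'is male']

ranking = top_k_features(toy_attributions, toy_names, k=2)
```

With these toy values, `ranking` lists 'max. achieved heart rate' first and 'is male' second.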

Second, we plot an interaction our model has learned between maximum achieved heart rate and gender:

scatter_plot(attributions=attributions,
             feature_values=x_test,
             feature_index='max. achieved heart rate',
             interactions=interactions,
             color_by='is male',
             feature_names=feature_names,
             scale_y_ind=True)

Interaction: Heart Rate and Gender

The model used to generate the above interactions is a two-layer neural network trained on the UCI Heart Disease Dataset. Interactions learned by this model were featured in our paper. To learn more about this particular model and the experimental setup, see the notebook used to train and explain the model.

Explaining an NLP model using Integrated Gradients and Integrated Hessians

As discussed in our paper, we can use Integrated Hessians to get interactions in language models. We explain a transformer from the HuggingFace Transformers Repository.

import numpy as np
import tensorflow as tf
import tensorflow_datasets
from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification, \
                         DistilBertConfig, glue_convert_examples_to_features, \
                         glue_processors

# This is a custom explainer to explain huggingface models
from path_explain import EmbeddingExplainerTF, text_plot, matrix_interaction_plot, bar_interaction_plot

# SST-2 is a binary sentiment classification task
num_labels = 2

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
config = DistilBertConfig.from_pretrained('distilbert-base-uncased', num_labels=num_labels)
model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', config=config)

### Some custom code to fine-tune the model on a sentiment analysis task...
max_length = 128
data, info = tensorflow_datasets.load('glue/sst-2', with_info=True)
train_dataset = glue_convert_examples_to_features(data['train'],
                                                  tokenizer,
                                                  max_length,
                                                  'sst-2')
valid_dataset = glue_convert_examples_to_features(data['validation'],
                                                  tokenizer,
                                                  max_length,
                                                  'sst-2')
...
### we won't include the whole fine-tuning code. See the HuggingFace repository for more.

### Here we define functions that represent two pieces of the model:
### embedding and prediction
def embedding_model(batch_ids):
    batch_embedding = model.distilbert.embeddings(batch_ids)
    return batch_embedding

def prediction_model(batch_embedding):
    # Note: this isn't exactly the right way to use the attention mask.
    # It should actually indicate which words are real words. This
    # makes the coding easier however, and the output is fairly similar,
    # so it suffices for this tutorial.
    attention_mask = tf.ones(batch_embedding.shape[:2])
    attention_mask = tf.cast(attention_mask, dtype=tf.float32)
    head_mask = [None] * model.distilbert.num_hidden_layers

    transformer_output = model.distilbert.transformer([batch_embedding, attention_mask, head_mask], training=False)[0]
    pooled_output = transformer_output[:, 0]
    pooled_output = model.pre_classifier(pooled_output)
    logits = model.classifier(pooled_output)
    return logits
###

### We need some data to explain
for batch in valid_dataset.take(1):
    batch_input = batch[0]

batch_ids = batch_input['input_ids']
batch_embedding = embedding_model(batch_ids)

baseline_ids = np.zeros((1, 128), dtype=np.int64)
baseline_embedding = embedding_model(baseline_ids)
###

### We are finally ready to explain our model
explainer = EmbeddingExplainerTF(prediction_model)
attributions = explainer.attributions(inputs=batch_embedding,
                                      baseline=baseline_embedding,
                                      batch_size=32,
                                      num_samples=256,
                                      use_expectation=False,
                                      output_indices=1)
###

### For interactions, the hessian is rather large so we use a very small batch size
interactions = explainer.interactions(inputs=batch_embedding,
                                      baseline=baseline_embedding,
                                      batch_size=1,
                                      num_samples=256,
                                      use_expectation=False,
                                      output_indices=1)
###
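
To make the quantity computed with `use_expectation=False` more concrete, here is a small NumPy sketch of Integrated Hessians on a toy function with a known gradient and Hessian. It follows our reading of the formula in the paper (Integrated Gradients applied to Integrated Gradients) and is not the implementation in this package. The names `grad_f`, `hess_f`, and `integrated_hessians` are hypothetical.

```python
import numpy as np

def grad_f(x):
    """Analytic gradient of the toy model f(x) = x0 * x1 + x2 ** 2."""
    return np.array([x[1], x[0], 2.0 * x[2]])

def hess_f(x):
    """Analytic Hessian of the toy model (constant for this function)."""
    return np.array([[0.0, 1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, 2.0]])

def integrated_hessians(x, baseline, num_samples=50):
    """Midpoint-rule approximation of the double path integral."""
    steps = (np.arange(num_samples) + 0.5) / num_samples
    delta = x - baseline
    d = len(x)
    second_order = np.zeros((d, d))
    first_order = np.zeros(d)
    for a in steps:
        for b in steps:
            point = baseline + a * b * delta
            second_order += a * b * hess_f(point)   # weighted Hessian term
            first_order += grad_f(point)            # extra term for i == j
    second_order /= num_samples ** 2
    first_order /= num_samples ** 2
    interactions = np.outer(delta, delta) * second_order
    interactions[np.diag_indices(d)] += delta * first_order
    return interactions

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
interactions = integrated_hessians(x, baseline)
row_sums = interactions.sum(axis=1)
```

For this toy function each row of `interactions` sums to the Integrated Gradients attribution of the corresponding feature, approximately [1, 1, 9]. The whole matrix sums to f(x) - f(baseline) = 11. The nested loop also shows why interactions cost more than attributions, which is the reason for the small batch size above.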

We can plot the learned attributions and interactions as follows. First we plot the attributions:

### First we need to decode the tokens from the batch ids.
batch_sentences = ...
### Doing so will depend on how you tokenized your model!

text_plot(batch_sentences[0],
          attributions[0],
          include_legend=True)

Showing feature attributions in text

Then we plot the interactions:

bar_interaction_plot(interactions[0],
                     batch_sentences[0],
                     top_k=5)

Showing feature interactions in text
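
If you want the strongest pairwise interactions as data rather than as a plot, one option is to rank the off-diagonal entries of the interaction matrix by absolute value. The sketch below assumes a square, symmetric matrix with one row and column per token. The helper `top_k_interactions` and the toy values are hypothetical, and `bar_interaction_plot` may select or order pairs differently.

```python
import numpy as np

def top_k_interactions(interaction_matrix, tokens, k=5):
    """Return the k token pairs with the largest absolute interaction."""
    # Upper triangle without the diagonal, so each pair is counted once
    rows, cols = np.triu_indices(len(tokens), k=1)
    values = interaction_matrix[rows, cols]
    order = np.argsort(np.abs(values))[::-1][:k]
    return [(tokens[rows[i]], tokens[cols[i]], float(values[i])) for i in order]

# Hypothetical tokens and interaction values
toy_tokens = ['not', 'very', 'good']
toy_matrix = np.array([[ 0.3, 0.1, -0.8],
                       [ 0.1, 0.2,  0.4],
                       [-0.8, 0.4,  0.5]])

pairs = top_k_interactions(toy_matrix, toy_tokens, k=2)
```

With these toy values, the strongest pair is ('not', 'good') with interaction -0.8, followed by ('very', 'good') with 0.4.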

If you would prefer to plot the full matrix of interactions instead of only the top interactions in a bar plot, our package also supports this. First we show the attributions:

text_plot(batch_sentences[1],
          attributions[1],
          include_legend=True)

Showing additional attributions

And then we show the full interaction matrix. Here we've zeroed out the diagonals so you can better see the off-diagonal terms.

matrix_interaction_plot(interactions[1],
                        batch_sentences[1])

Showing the full matrix of feature interactions
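
Zeroing the diagonal before plotting can be done with a few lines of NumPy. This is a sketch of one way to do it, assuming the interaction matrix for a sentence is a square array of shape (tokens, tokens). The helper name `zero_diagonal` is hypothetical and is not part of this package.

```python
import numpy as np

def zero_diagonal(interaction_matrix):
    """Return a copy of the matrix with its diagonal entries set to zero."""
    off_diagonal = np.array(interaction_matrix, dtype=float, copy=True)
    np.fill_diagonal(off_diagonal, 0.0)
    return off_diagonal

# Hypothetical 3-token interaction matrix
toy_matrix = np.array([[ 0.3, 0.1, -0.8],
                       [ 0.1, 0.2,  0.4],
                       [-0.8, 0.4,  0.5]])

masked = zero_diagonal(toy_matrix)
```

The helper works on a copy, so the original matrix keeps its diagonal terms for later use.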

This example - interpreting DistilBERT - was also featured in our paper. You can examine the setup more here. For more examples, see the examples directory in this repository.
