laura-rieger / deep-explanation-penalization

License: MIT
Code for using CDEP from the paper "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" https://arxiv.org/abs/1909.13584

Programming Languages

  • Jupyter Notebook
  • Python

Projects that are alternatives of or similar to deep-explanation-penalization

hierarchical-dnn-interpretations
Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)
Stars: ✭ 110 (+0%)
Mutual labels:  ml, interpretability, feature-importance, explainable-ai, explainability
responsible-ai-toolbox
This project provides responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.
Stars: ✭ 615 (+459.09%)
Mutual labels:  ml, fairness, explainable-ai, explainability, fairness-ml
ProtoTree
ProtoTrees: Neural Prototype Trees for Interpretable Fine-grained Image Recognition, published at CVPR 2021
Stars: ✭ 47 (-57.27%)
Mutual labels:  interpretability, interpretable-deep-learning, explainable-ai, explainability
zennit
Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.
Stars: ✭ 57 (-48.18%)
Mutual labels:  interpretability, explainable-ai, explainability
concept-based-xai
Library implementing state-of-the-art Concept-based and Disentanglement Learning methods for Explainable AI
Stars: ✭ 41 (-62.73%)
Mutual labels:  interpretability, explainable-ai, explainability
Interpret
Fit interpretable models. Explain blackbox machine learning.
Stars: ✭ 4,352 (+3856.36%)
Mutual labels:  interpretability, explainable-ai, explainability
Transformer-MM-Explainability
[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
Stars: ✭ 484 (+340%)
Mutual labels:  interpretability, explainable-ai, explainability
Awesome Machine Learning Interpretability
A curated list of awesome machine learning interpretability resources.
Stars: ✭ 2,404 (+2085.45%)
Mutual labels:  fairness, interpretability, interpretable-deep-learning
mllp
The code of AAAI 2020 paper "Transparent Classification with Multilayer Logical Perceptrons and Random Binarization".
Stars: ✭ 15 (-86.36%)
Mutual labels:  interpretability, explainable-ai, explainability
transformers-interpret
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
Stars: ✭ 861 (+682.73%)
Mutual labels:  interpretability, explainable-ai
Imodels
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
Stars: ✭ 194 (+76.36%)
Mutual labels:  ml, interpretability
mindsdb server
MindsDB server allows you to consume and expose MindsDB workflows through HTTP.
Stars: ✭ 3 (-97.27%)
Mutual labels:  ml, explainable-ai
Xai
XAI - An eXplainability toolbox for machine learning
Stars: ✭ 596 (+441.82%)
Mutual labels:  ml, interpretability
Mindsdb
Predictive AI layer for existing databases.
Stars: ✭ 4,199 (+3717.27%)
Mutual labels:  ml, explainable-ai
self critical vqa
Code for the NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering"
Stars: ✭ 39 (-64.55%)
Mutual labels:  interpretable-deep-learning, explainable-ai
yggdrasil-decision-forests
A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models.
Stars: ✭ 156 (+41.82%)
Mutual labels:  ml, interpretability
PolyphonicPianoTranscription
Recurrent Neural Network for generating piano MIDI-files from audio (MP3, WAV, etc.)
Stars: ✭ 146 (+32.73%)
Mutual labels:  convolutional-neural-network, recurrent-neural-network
thermostat
Collection of NLP model explanations and accompanying analysis tools
Stars: ✭ 126 (+14.55%)
Mutual labels:  interpretability, explainability
adaptive-wavelets
Adaptive, interpretable wavelets across domains (NeurIPS 2021)
Stars: ✭ 58 (-47.27%)
Mutual labels:  interpretability, explainability
ALPS 2021
XAI Tutorial for the Explainable AI track in the ALPS winter school 2021
Stars: ✭ 55 (-50%)
Mutual labels:  interpretability, explainability

Making interpretations useful (CDEP) 🔨

Regularizes interpretations (computed via contextual decomposition) to improve neural networks. Official code for "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" (ICML 2020 pdf).
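At a high level, CDEP augments the standard prediction loss with an explanation penalty: the contextual decomposition (CD) importance that the network assigns to features a user has marked as irrelevant is pushed toward zero. A rough sketch of the objective (the notation here is illustrative, not the paper's exact formulation; λ weights the penalty and CD_θ(x, mask) denotes the CD score of the masked features):

```latex
% CDEP objective (sketch): fit the labels while also penalizing the CD
% importance of features the user marked as irrelevant
\mathcal{L}(\theta) =
  \underbrace{\mathcal{L}_{\text{pred}}\big(f_\theta(x), y\big)}_{\text{prediction loss}}
  + \lambda \underbrace{\big\lVert \mathrm{CD}_\theta(x, \text{mask}) \big\rVert_1}_{\text{explanation penalty}}
```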

Note: this repo is actively maintained. For any questions please file an issue.

[Figure: overview of CDEP]

documentation

  • fully contained data/models/code for reproducing and experimenting with CDEP
  • the src folder contains the core code for running and penalizing contextual decomposition
  • in addition, we run experiments on 4 datasets, each of which is located in its own folder
    • notebooks in these folders show demos for the different kinds of data

examples

ISIC skin-cancer classification - using CDEP, we can train the network to ignore spurious patches present in the training set, improving test performance!

The segmentation maps of the patches can be downloaded here

ColorMNIST - penalizing the contributions of individual pixels allows us to teach a network to learn a digit's shape instead of its color, improving its test accuracy from 0.5% to 25.1%

Fixing text gender biases - CDEP can help prevent a model from learning spurious biases in a dataset, such as gendered words

using CDEP on your own data

using CDEP requires two steps:

  1. run CD/ACD on your model. Specifically, 3 things must be altered:
  • the pred_ims function must be replaced by a function you write using your own trained model. This function gets predictions from a model given a batch of examples.
  • the model must be replaced with your model
  • the current CD implementation doesn't support all network architectures. If you get an error inside cd.py, you may need to write a custom function that iterates through the layers of your network (see cd.py for examples)
  2. add CD scores to the loss function (see the notebooks and the sketch below)
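
To make these two steps concrete, here is a minimal, self-contained sketch for a small ReLU MLP. The CD propagation rules used below (relevant part of a ReLU taken as relu(rel), with biases assigned to the irrelevant part) are one common convention; the repo's cd.py covers more architectures and may split terms differently, so treat all names and signatures here as illustrative rather than the repo's actual API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cd_mlp(layers, x, mask):
    """Propagate CD (relevant, irrelevant) scores through Linear/ReLU
    layers; mask marks the input features whose contribution we track."""
    rel, irrel = x * mask, x * (1 - mask)
    for layer in layers:
        if isinstance(layer, nn.Linear):
            rel = F.linear(rel, layer.weight)                  # no bias
            irrel = F.linear(irrel, layer.weight, layer.bias)  # bias -> irrel
        elif isinstance(layer, nn.ReLU):
            # preserve rel' + irrel' == relu(rel + irrel)
            rel, irrel = torch.relu(rel), torch.relu(rel + irrel) - torch.relu(rel)
    return rel, irrel

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 1.0                                # explanation-penalty weight

x = torch.randn(8, 10)                   # toy batch
y = torch.randint(0, 2, (8,))
mask = torch.zeros(10)
mask[:3] = 1.0                           # pretend features 0-2 are spurious

rel, _ = cd_mlp(model, x, mask)          # CD importance of the masked features
loss = F.cross_entropy(model(x), y) + lam * rel.abs().mean()

opt.zero_grad()
loss.backward()
opt.step()
```

Minimizing the rel term drives the network's reliance on the masked features toward zero, while the cross-entropy term keeps it fitting the labels.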

related work

  • ACD (ICLR 2019 pdf, github) - extends CD to CNNs / arbitrary DNNs, and aggregates explanations into a hierarchy
  • PDR framework (PNAS 2019 pdf) - an overarching framework for guiding and framing interpretable machine learning
  • TRIM (ICLR 2020 workshop pdf, github) - using simple reparameterizations, allows for calculating disentangled importances to transformations of the input (e.g. assigning importances to different frequencies)
  • DAC (arXiv 2019 pdf, github) - finds disentangled interpretations for random forests

reference

  • feel free to use/share this code openly
  • if you find this code useful for your research, please cite the following:
@inproceedings{rieger2020interpretations,
  title={Interpretations are useful: penalizing explanations to align neural networks with prior knowledge},
  author={Rieger, Laura and Singh, Chandan and Murdoch, William and Yu, Bin},
  booktitle={International Conference on Machine Learning},
  pages={8116--8126},
  year={2020},
  organization={PMLR}
}