AndreasMadsen / Python Textualheatmap

License: MIT
Create interactive textual heat maps for Jupyter notebooks

Projects that are alternatives of or similar to Python Textualheatmap

Carnd Camera Calibration
Images and notebook for camera calibration
Stars: ✭ 155 (-0.64%)
Mutual labels:  jupyter-notebook
Keras Segmentation Deeplab V3.1
An awesome semantic segmentation model that runs in real time
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook
Programming With Data
🐍 Learn Python and Pandas from the ground up
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook
Ipystata
Enables the use of Stata together with Python via Jupyter (IPython) notebooks.
Stars: ✭ 154 (-1.28%)
Mutual labels:  jupyter-notebook
Fastai audio
[DEPRECATED] 🔊️ Audio with fastaiv1
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook
Altair notebooks
Tutorial and Examples Jupyter Notebooks for Altair
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook
Ml Training Advanced
Materials for the "Advanced Scikit-learn" class in the afternoon
Stars: ✭ 155 (-0.64%)
Mutual labels:  jupyter-notebook
Zigzag
Python library for identifying the peaks and valleys of a time series.
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook
Anomaliesinoptions
In this notebook we will explore a machine learning approach to find anomalies in stock options pricing.
Stars: ✭ 155 (-0.64%)
Mutual labels:  jupyter-notebook
Courseraml
I took Andrew Ng's Machine Learning course on Coursera and did the homework assignments... but, on my own in python because I love jupyter notebooks!
Stars: ✭ 1,911 (+1125%)
Mutual labels:  jupyter-notebook
Datagene
DataGene - Identify How Similar TS Datasets Are to One Another (by @firmai)
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook
Corus
Links to Russian corpora + Python functions for loading and parsing
Stars: ✭ 154 (-1.28%)
Mutual labels:  jupyter-notebook
Fairseq Zh En
NMT for Chinese-English using fairseq
Stars: ✭ 155 (-0.64%)
Mutual labels:  jupyter-notebook
Deep Q Learning
Stars: ✭ 155 (-0.64%)
Mutual labels:  jupyter-notebook
Pastas
🍝 Pastas is an open-source Python framework for the analysis of hydrological time series.
Stars: ✭ 155 (-0.64%)
Mutual labels:  jupyter-notebook
Py Quantmod
Powerful financial charting library based on R's Quantmod | http://py-quantmod.readthedocs.io/en/latest/
Stars: ✭ 155 (-0.64%)
Mutual labels:  jupyter-notebook
Coms4995 S18
COMS W4995 Applied Machine Learning - Spring 18
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook
Yolov4 Cloud Tutorial
This repository walks you through how to Build and Run YOLOv4 Object Detections with Darknet in the Cloud with Google Colab.
Stars: ✭ 153 (-1.92%)
Mutual labels:  jupyter-notebook
Nbpresent
next generation slides for Jupyter Notebooks
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook
Deep q learning
This is the Code for "Deep Q Learning - The Math of Intelligence #9" By Siraj Raval on Youtube
Stars: ✭ 156 (+0%)
Mutual labels:  jupyter-notebook

textualheatmap

Create interactive textual heatmaps for Jupyter notebooks.

I originally published this visualization method in my Distill paper https://distill.pub/2019/memorization-in-rnns/. In that context, it is used as a saliency map to show which parts of a sentence are used to predict the next word. However, the visualization method is more general-purpose than that and can be used for any kind of textual heatmap.

textualheatmap works with Python 3.6 or newer and is distributed under the MIT license.

Gif of saliency in RNN models

An end-to-end example of how to use the HuggingFace 🤗 Transformers Python module to create a textual saliency map showing how each masked token is predicted.

Open In Colab

Gif of saliency in BERT models
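
The Colab notebook above contains the complete end-to-end code. As a rough sketch of the idea (not the notebook's exact implementation), the heat values can be built by re-running the masked-language model with each input token masked out in turn and scoring it by how much the probability of the target token drops; the model name, the drop-based scoring, and the data layout below are illustrative assumptions.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM
from textualheatmap import TextualHeatmap

# Illustrative sketch only: occlusion-style saliency for a masked-language
# model, rendered with textualheatmap.
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased')
model.eval()

ids = tokenizer('the formal study of grammar', return_tensors='pt')['input_ids'][0]

def token_probs(input_ids):
    # probability distribution over the vocabulary at every position
    with torch.no_grad():
        return torch.softmax(model(input_ids.unsqueeze(0)).logits[0], dim=-1)

base = token_probs(ids)

data = []
for target in range(len(ids)):
    base_p = base[target, ids[target]].item()
    heat = []
    for j in range(len(ids)):
        occluded = ids.clone()
        occluded[j] = tokenizer.mask_token_id
        p = token_probs(occluded)[target, ids[target]].item()
        heat.append(max(base_p - p, 0.0))  # larger drop => more influence
    scale = max(heat) or 1.0
    data.append({
        'token': tokenizer.convert_ids_to_tokens(int(ids[target])),
        'meta': ['', '', ''],  # placeholder; only rendered when show_meta=True
        'heat': [h / scale for h in heat],
    })

heatmap = TextualHeatmap(facet_titles = ['BERT'])
heatmap.set_data([data])  # one facet, so one list of token dicts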

Install

pip install -U textualheatmap

API

Examples

Example of a sequential character model with metadata visible

Open In Colab

from textualheatmap import TextualHeatmap

data = [[
    # GRU data
    {"token":" ",
     "meta":["the","one","of"],
     "heat":[1,0,0,0,0,0,0,0,0]},
    {"token":"c",
     "meta":["can","called","century"],
     "heat":[1,0.22,0,0,0,0,0,0,0]},
    {"token":"o",
     "meta":["country","could","company"],
     "heat":[0.57,0.059,1,0,0,0,0,0,0]},
    {"token":"n",
     "meta":["control","considered","construction"],
     "heat":[1,0.20,0.11,0.84,0,0,0,0,0]},
    {"token":"t",
     "meta":["control","continued","continental"],
     "heat":[0.27,0.17,0.052,0.44,1,0,0,0,0]},
    {"token":"e",
     "meta":["context","content","contested"],
     "heat":[0.17,0.039,0.034,0.22,1,0.53,0,0,0]},
    {"token":"x",
     "meta":["context","contexts","contemporary"],
     "heat":[0.17,0.0044,0.021,0.17,1,0.90,0.48,0,0]},
    {"token":"t",
     "meta":["context","contexts","contentious"],
     "heat":[0.14,0.011,0.034,0.14,0.68,1,0.80,0.86,0]},
    {"token":" ",
     "meta":["of","and","the"],
     "heat":[0.014,0.0063,0.0044,0.011,0.034,0.10,0.32,0.28,1]},
    # ...
],[
    # LSTM data
    # ...
]]

heatmap = TextualHeatmap(
    width = 600,
    show_meta = True,
    facet_titles = ['GRU', 'LSTM']
)
# Set data and render plot, this can be called again to replace
# the data.
heatmap.set_data(data)
# Focus on the token with the given index. Especially useful when
# `interactive=False` is used in `TextualHeatmap`.
heatmap.highlight(159)

Shows saliency with predicted words as metadata
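
The constructor also accepts interactive=False, as mentioned in the comment above. A short variation of the same example (reusing the same data object), where hovering is disabled and highlight() is the only way to change the focused token:

# With interactive=False the plot does not react to hovering, so
# highlight() controls which token is focused (useful for static renderings).
static_heatmap = TextualHeatmap(
    width = 600,
    show_meta = True,
    facet_titles = ['GRU', 'LSTM'],
    interactive = False
)
static_heatmap.set_data(data)
static_heatmap.highlight(159)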

Example of a sequential character model without metadata

Open In Colab

When show_meta is not True, the meta part of the data object has no effect.

heatmap = TextualHeatmap(
    facet_titles = ['LSTM', 'GRU'],
    rotate_facet_titles = True
)
heatmap.set_data(data)
heatmap.highlight(159)

Shows saliency without metadata

Example of non-sequential-word model

Open In Colab

format = True can be set in the data object to indicate tokens that are not directly used by the model. This is useful when word or sub-word tokenization is used; a small helper sketch for interleaving such display-only tokens follows the example below.

data = [[
{'token': '[CLR]',
 'meta': ['', '', ''],
 'heat': [1, 0, 0, 0, 0, ...]},
{'token': ' ',
 'format': True},
{'token': 'context',
 'meta': ['today', 'and', 'thus'],
 'heat': [0.13, 0.40, 0.23, 1.0, 0.56, ...]},
{'token': ' ',
 'format': True},
{'token': 'the',
 'meta': ['##ual', 'the', '##ually'],
 'heat': [0.11, 1.0, 0.34, 0.58, 0.59, ...]},
{'token': ' ',
 'format': True},
{'token': 'formal',
 'meta': ['formal', 'academic', 'systematic'],
 'heat': [0.13, 0.74, 0.26, 0.35, 1.0, ...]},
{'token': ' ',
 'format': True},
{'token': 'study',
 'meta': ['##ization', 'study', '##ity'],
 'heat': [0.09, 0.27, 0.19, 1.0, 0.26, ...]}
]]

heatmap = TextualHeatmap(facet_titles = ['BERT'], show_meta=True)
heatmap.set_data(data)

Shows saliency in a BERT model, using sub-word tokenization
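
As a hedged sketch (the build_data helper below is hypothetical, not part of the library), one way to produce such a data list from a WordPiece-style tokenizer is to emit a display-only whitespace entry with format set to True before every token that starts a new word, so the rendered text reads naturally while only real model tokens carry heat values.

from textualheatmap import TextualHeatmap

# Hypothetical helper, for illustration only: interleave display-only
# whitespace entries (format=True) between words while keeping heat values
# on the real model tokens.
def build_data(tokens, heats):
    data = []
    for token, heat in zip(tokens, heats):
        is_subword = token.startswith('##')
        if data and not is_subword:
            # display-only whitespace; it carries no heat of its own
            data.append({'token': ' ', 'format': True})
        data.append({
            'token': token[2:] if is_subword else token,
            'heat': heat
        })
    return data

# Tiny made-up example: tokens as a BERT tokenizer would emit them, and one
# illustrative heat list per model token.
tokens = ['[CLS]', 'context', '##ual', 'heat', '##map']
heats = [[1, 0, 0, 0, 0],
         [0.2, 1, 0, 0, 0],
         [0.1, 0.9, 1, 0, 0],
         [0.1, 0.3, 0.2, 1, 0],
         [0.1, 0.2, 0.3, 0.8, 1]]

heatmap = TextualHeatmap(facet_titles = ['BERT'])
heatmap.set_data([build_data(tokens, heats)])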

Citation

If you use this in a publication, please cite my Distill publication where I first demonstrated this visualization method.

@article{madsen2019visualizing,
  author = {Madsen, Andreas},
  title = {Visualizing memorization in RNNs},
  journal = {Distill},
  year = {2019},
  note = {https://distill.pub/2019/memorization-in-rnns},
  doi = {10.23915/distill.00016}
}

Sponsor

Sponsored by NearForm Research.
