flekschas / jupyter-scatter

Licence: Apache-2.0 license
Interactive 2D scatter plot widget for Jupyter Lab and Notebook. Scales to millions of points!

jupyter-scatter

An interactive scatter plot widget for Jupyter Lab and Notebook
that can handle millions of points and supports view linking.


Why? Imagine trying to explore an embedding space of millions of data points. Besides plotting the space as a 2D scatter, the exploration typically involves three things: First, we want to interactively adjust the view (e.g., via panning & zooming) and the visual point encoding (e.g., the point color, opacity, or size). Second, we want to be able to select/highlight points. And third, we want to compare multiple embeddings (e.g., via animation, color, or point connections). The goal of jupyter-scatter is to support all three requirements and scale to millions of points.

How? Internally, jupyter-scatter uses regl-scatterplot for rendering and ipywidgets for linking the scatter plot to the IPython kernel.

Index

  1. Install
  2. Get Started
  3. API docs
  4. Examples
  5. Development

Install

pip install jupyter-scatter

If you are using JupyterLab <=2:

jupyter labextension install @jupyter-widgets/jupyterlab-manager jupyter-scatter

For a minimal working example, take a look at test-environment.

Get Started

To play with the following examples yourself, open notebooks/get-started.ipynb.

Simplest Example

In the simplest case, you can pass the x/y coordinates to the plot function as follows:

import jscatter
import numpy as np

x = np.random.rand(500)
y = np.random.rand(500)

jscatter.plot(x, y)

Simplest scatter plot example

Pandas Example

Say your data is stored in a Pandas dataframe like the following:

import pandas as pd

# Just some random float and int values
data = np.random.rand(500, 4)
df = pd.DataFrame(data, columns=['mass', 'speed', 'pval', 'group'])
# We'll convert the `group` column to strings and cast it to the `category`
# dtype so that it's recognized as categorical data. This will come in handy
# in the advanced example.
df['group'] = (
    df['group']
    .map(lambda c: chr(65 + round(c)), na_action=None)
    .astype('category')
)

   mass  speed  pval  group
0  0.13   0.27  0.51      G
1  0.87   0.93  0.80      B
2  0.10   0.25  0.25      F
3  0.03   0.90  0.01      G
4  0.19   0.78  0.65      D

You can then visualize this data by referencing column names:

jscatter.plot(data=df, x='mass', y='speed')

Pandas scatter plot example

Advanced example

Often you want to customize the visual encoding, such as the point color, size, and opacity.

jscatter.plot(
  data=df,
  x='mass',
  y='speed',
  size=8, # static encoding
  color_by='group', # data-driven encoding
  opacity_by='density', # view-driven encoding
)

Advanced scatter plot example

In the above example, we chose a static point size of 8. In contrast, the point color is data-driven and assigned based on the categorical group value. The point opacity is view-driven and defined dynamically by the number of points currently visible in the view.

Also notice how jscatter picks an appropriate default color map based on the data type used for color encoding. In this example, jscatter uses the colorblind-safe color map from Okabe and Ito because the data type is categorical and there are fewer than 9 categories.
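To make the rule above concrete, here is a minimal sketch of how such a default could be chosen. The helper name default_categorical_colors is hypothetical and this is not jscatter's code. The eight hex values are the Okabe and Ito palette, but the order in which they are assigned is an assumption, and since the text above does not say what happens with 9 or more categories, the sketch simply returns None in that case.

```python
# The eight colors of the Okabe-Ito palette. The order used here is an
# assumption for illustration; jscatter may assign them differently.
OKABE_ITO = [
    '#000000',  # black
    '#E69F00',  # orange
    '#56B4E9',  # sky blue
    '#009E73',  # bluish green
    '#F0E442',  # yellow
    '#0072B2',  # blue
    '#D55E00',  # vermillion
    '#CC79A7',  # reddish purple
]


def default_categorical_colors(categories):
    """Hypothetical helper: map each distinct category to an Okabe-Ito color.

    Returns None when there are 9 or more distinct categories, because the
    palette only has 8 colors and the fallback is not described here.
    """
    unique = sorted(set(categories))
    if len(unique) >= 9:
        return None
    return dict(zip(unique, OKABE_ITO))


print(default_categorical_colors(['B', 'A', 'B', 'C']))
```

A mapping like this could then be passed explicitly if you prefer not to rely on the default.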

Important: in order for jscatter to recognize categorical data, the dtype of the corresponding column needs to be category!

You can, of course, customize the color map and many other parameters of the visual encoding as shown next.

Functional API Example

The flat API can get overwhelming when you want to customize many properties. Therefore, jscatter provides a functional API that groups properties by type and exposes them via meaningfully named methods.

scatter = jscatter.Scatter(data=df, x='mass', y='speed')
scatter.selection(df.query('mass < 0.5').index)
scatter.color(by='mass', map='plasma', order='reverse')
scatter.opacity(by='density')
scatter.size(by='pval', map=[2, 4, 6, 8, 10])
scatter.height(480)
scatter.background('black')
scatter.show()

Functional API scatter plot example

When you update properties dynamically, i.e., after having called scatter.show(), the plot will update automatically. For instance, try calling scatter.xy('speed', 'mass') and you will see how the points are mirrored along the diagonal.

Moreover, all arguments are optional. If you specify arguments, the methods will act as setters and change the properties. If you call a method without any arguments it will act as a getter and return the property (or properties). For example, scatter.selection() will return the currently selected points.
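The getter/setter behavior described above can be sketched with a sentinel default value. The class MiniScatter below is a hypothetical stand-in written for illustration only; it is not jscatter's implementation, and returning self from setters (to allow chaining) is a design choice of this sketch.

```python
_UNSET = object()  # sentinel meaning "no argument was passed"


class MiniScatter:
    """Hypothetical stand-in for a plot object (not jscatter's code)."""

    def __init__(self):
        self._height = 240
        self._background = 'white'

    def height(self, value=_UNSET):
        if value is _UNSET:
            return self._height  # no argument: act as a getter
        self._height = value  # argument given: act as a setter
        return self  # returning self allows chaining in this sketch

    def background(self, color=_UNSET):
        if color is _UNSET:
            return self._background
        self._background = color
        return self


plot = MiniScatter()
plot.height(480).background('black')
print(plot.height(), plot.background())
```

A dedicated sentinel is used instead of None so that None itself remains a value that can be set.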

Finally, the scatter plot is interactive and supports two-way communication. Hence, if you select some points with the lasso tool and then call scatter.selection(), you will get the current selection.

Linking Scatter Plots

To explore multiple scatter plots with linked view, selection, and hover interactions, use jscatter.link().

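# Assumes `embeddings` is a DataFrame holding the 2D coordinates of each
# embedding and `config` is a dict of keyword arguments shared by all plots.
jscatter.link([
  jscatter.Scatter(data=embeddings, x='pcaX', y='pcaY', **config),
  jscatter.Scatter(data=embeddings, x='tsneX', y='tsneY', **config),
  jscatter.Scatter(data=embeddings, x='umapX', y='umapY', **config),
  jscatter.Scatter(data=embeddings, x='caeX', y='caeY', **config)
], rows=2)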

See notebooks/linking.ipynb for more details.

Visualize Millions of Data Points

With jupyter-scatter you can easily visualize and interactively explore datasets with millions of points.

In the following we're visualizing 5 million points generated with the Rössler attractor.

points = np.asarray(roesslerAttractor(5000000))
jscatter.plot(points[:,0], points[:,1], height=640)

See notebooks/examples.ipynb for more details.
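The snippet above calls roesslerAttractor, a helper that is not defined in this README. As a rough sketch of what such a generator might look like, the function below integrates the Rössler system (dx/dt = -y - z, dy/dt = x + a*y, dz/dt = b + z*(x - c)) with simple forward Euler steps. The name roessler_attractor, the parameter values a=0.2, b=0.2, c=5.7, the step size, and the start point are all assumptions chosen for illustration; the helper used in the notebook may differ.

```python
def roessler_attractor(num_points, a=0.2, b=0.2, c=5.7, dt=0.01,
                       start=(1.0, 1.0, 1.0)):
    """Return `num_points` (x, y, z) tuples along a Rössler trajectory.

    Hypothetical stand-in for the undefined `roesslerAttractor` helper.
    Uses forward Euler integration, which is simple but not very accurate.
    """
    x, y, z = start
    points = []
    for _ in range(num_points):
        # Right-hand side of the Rössler system
        dx = -y - z
        dy = x + a * y
        dz = b + z * (x - c)
        # Advance the state by one Euler step
        x, y, z = x + dx * dt, y + dy * dt, z + dz * dt
        points.append((x, y, z))
    return points


trajectory = roessler_attractor(1000)
print(len(trajectory))
```

With this stand-in, the call above would read points = np.asarray(roessler_attractor(5000000)). Because each step depends on the previous one, the loop runs sequentially, so generating 5 million points in pure Python takes a few seconds.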


Development

Setting up a development environment

Requirements:

Installation:

git clone https://github.com/flekschas/jupyter-scatter/ jscatter && cd jscatter
conda env create -f environment.yml && conda activate jscatter
pip install -e .

Enable the Notebook Extension:

jupyter nbextension install --py --symlink --sys-prefix jscatter
jupyter nbextension enable --py --sys-prefix jscatter

Note for developers: on Linux and macOS, the --symlink argument lets you modify the JavaScript code in place. This feature is not available on Windows.

Enable the Lab Extension:

jupyter labextension develop . --overwrite

After changing Python code: simply restart the kernel.

After changing JavaScript code: run cd js && npm run build and then reload the browser tab.

Setting up a test environment

Go to test-environment and follow the detailed instructions there.
