A scalable scatter plot extension for JupyterLab and Jupyter Notebook
IMPORTANT: THIS IS VERY EARLY WORK! THE API WILL LIKELY CHANGE. That said, you're more than welcome to give the extension a try and let me know what you think :) All feedback is welcome!
Why? Imagine trying to explore an embedding space of millions of data points. Besides plotting the space as a 2D scatter plot, the exploration typically involves three things. First, we want to interactively adjust the view (e.g., via panning and zooming) and the visual point encoding (e.g., the point color, opacity, or size). Second, we want to be able to select and highlight points. Third, we want to compare multiple embeddings (e.g., via animation, color, or point connections). The goal of jupyter-scatter is to support all three requirements and scale to millions of points.
How? Internally, jupyter-scatter uses regl-scatterplot for rendering and ipywidgets for linking the scatter plot to the IPython kernel.
# Install extension
pip install jupyter-scatter
# Activate extension in JupyterLab
jupyter labextension install jupyter-scatter
# Activate extension in Jupyter Notebook
jupyter nbextension install --py --sys-prefix jscatter
jupyter nbextension enable --py --sys-prefix jscatter
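To confirm that the Python package is importable from the kernel you plan to use, a quick check along the following lines may help. This is only a sketch: the helper name is_installed is made up for illustration, and the check says nothing about whether the Lab or Notebook front-end extension is activated.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


# The pip package is called jupyter-scatter, but it is imported as jscatter
if is_installed("jscatter"):
    print("jscatter is importable from this environment")
else:
    print("jscatter not found; run the pip install step in this environment")
```

If the check fails inside a notebook but works in a terminal, the kernel is probably running in a different environment than the one used for pip install.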
For a minimal working example, take a look at test-environment.
import jscatter
import numpy as np
# Let's generate some dummy data
points = np.random.rand(500, 2)
values = np.random.rand(500) # optional
categories = (np.random.rand(500) * 10).astype(int) # optional
# Let's plot the data
scatterplot = jscatter.plot(points, categories, values)
scatterplot.show()
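The dummy data above is unseeded, so every run produces a different plot. As a sketch of a reproducible variant that uses only NumPy, the arrays can be drawn from a seeded generator and checked for the shapes the example relies on: one (x, y) row per point, plus one value and one category per point. The seed and the variable n are arbitrary choices for illustration.

```python
import numpy as np

n = 500  # number of points (arbitrary)
rng = np.random.default_rng(seed=42)  # seeded for reproducibility

points = rng.random((n, 2))                    # x/y positions in [0, 1)
values = rng.random(n)                         # one continuous value per point
categories = (rng.random(n) * 10).astype(int)  # integer categories 0..9

# Sanity checks: every per-point array must line up with `points`
assert points.shape == (n, 2)
assert values.shape == (n,)
assert categories.shape == (n,)
assert categories.min() >= 0 and categories.max() <= 9
```

These arrays can be passed to jscatter.plot in the same way as the unseeded ones above.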
To adjust the scatter plot interactively let's pull up some options:
scatterplot.options()
Finally, to retrieve the current selection of points (or programmatically select points) you can work with:
scatterplot.selected_points
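Once a selection is available, it can be used to look up the underlying data. The sketch below assumes that the selection is a list of integer row indices into the original points array; that is an assumption, not something stated above, so a hard-coded list stands in for scatterplot.selected_points to keep the example runnable without the widget.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((500, 2))
categories = (rng.random(500) * 10).astype(int)

# Stand-in for `scatterplot.selected_points`.
# Assumption: the selection is a list of integer row indices.
selected = [3, 42, 128]

# NumPy integer-array indexing pulls out the matching rows
selected_xy = points[selected]
selected_categories = categories[selected]

print(selected_xy.shape)  # (3, 2)
```

If the selection turns out to have a different form (for example, a boolean mask), the indexing lines would need to be adapted accordingly.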
For a complete example, take a look at notebooks/example.ipynb
Coming soon!
Setting up a development environment
Requirements:
- Conda >= 4.8
Installation:
git clone https://github.com/flekschas/jupyter-scatter/ jscatter && cd jscatter
conda env create -f environment.yml && conda activate jscatter
pip install -e .
Enable the Notebook Extension:
jupyter nbextension install --py --symlink --sys-prefix jscatter
jupyter nbextension enable --py --sys-prefix jscatter
Enable the Lab Extension:
jupyter labextension develop --overwrite jscatter
After changing Python code: simply restart the kernel.
After changing JavaScript code: run cd js && npm run build and reload the browser tab.