itsjafer / jupyterlab-sparkmonitor

License: Apache-2.0
JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook

Spark Monitor - An extension for JupyterLab

This project was originally written by krishnan-r as a Google Summer of Code project for Jupyter Notebook. Check his website out here.

As a part of my internship as a Software Engineer at Yelp, I created this fork to update the extension to be compatible with JupyterLab - Yelp's choice for sharing and collaborating on notebooks.

About

SparkMonitor is an extension for JupyterLab that enables live monitoring of Apache Spark jobs spawned from a notebook. The extension provides several features for monitoring and debugging a Spark job from within the notebook interface itself.

Requirements

  • JupyterLab 3 or newer
  • pyspark 3.X.X or newer (for compatibility with older pyspark versions, use jupyterlab-sparkmonitor 3.X)

Features

  • Automatically displays a live monitoring tool below cells that run Spark jobs in a Jupyter notebook
  • A table of jobs and stages with progress bars
  • A timeline which shows jobs, stages, and tasks
  • A graph showing the number of active tasks and executor cores over time
  • A notebook server extension that proxies the Spark UI and displays it in an iframe popup for more details
  • For a detailed list of features, see the use case notebooks
  • Support for multiple SparkSessions (default port is 4040)
  • How it Works

Quick Start

To try the extension quickly, run the Docker image below. It has pyspark and several other related packages installed alongside the sparkmonitor extension.

docker run -it -p 8888:8888 itsjafer/sparkmonitor

Setting up the extension

pip install jupyterlab-sparkmonitor # install the extension

# set up ipython profile and add our kernel extension to it
ipython profile create --ipython-dir=.ipython
echo "c.InteractiveShellApp.extensions.append('sparkmonitor.kernelextension')" >> .ipython/profile_default/ipython_config.py

# run jupyter lab
IPYTHONDIR=.ipython jupyter lab --watch
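
The echo command above appends the extension line every time it is run, so repeating the setup leaves duplicate entries in ipython_config.py. As a minimal sketch, assuming the same profile layout as above, the append step could be made idempotent from Python. The helper name enable_kernel_extension is hypothetical and not part of the sparkmonitor package.

```python
from pathlib import Path

# The exact line the setup instructions append to ipython_config.py.
EXTENSION_LINE = "c.InteractiveShellApp.extensions.append('sparkmonitor.kernelextension')"


def enable_kernel_extension(config_path: Path) -> bool:
    """Append EXTENSION_LINE to config_path unless it is already present.

    Hypothetical helper for illustration. Returns True if the file was
    modified and False if the line was already there.
    """
    config_path.parent.mkdir(parents=True, exist_ok=True)
    existing = config_path.read_text() if config_path.exists() else ""

    # Compare whole lines so a commented-out copy does not count as enabled.
    if any(line.strip() == EXTENSION_LINE for line in existing.splitlines()):
        return False

    with config_path.open("a") as fh:
        # Make sure the new entry starts on its own line.
        if existing and not existing.endswith("\n"):
            fh.write("\n")
        fh.write(EXTENSION_LINE + "\n")
    return True
```

Calling enable_kernel_extension(Path(".ipython/profile_default/ipython_config.py")) after ipython profile create would then have the same effect as the echo line, and calling it again would leave the file unchanged.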

With the extension installed, a SparkConf object called conf will be usable from your notebooks. You can use it as follows:

from pyspark import SparkContext

# start the Spark context using the SparkConf the extension inserted
sc = SparkContext.getOrCreate(conf=conf)

# the monitor should appear below the cell and show 4 jobs
sc.parallelize(range(0, 100)).count()
sc.parallelize(range(0, 100)).count()
sc.parallelize(range(0, 100)).count()
sc.parallelize(range(0, 100)).count()

If you already have your own Spark configuration, set spark.extraListeners to sparkmonitor.listener.JupyterSparkMonitorListener and spark.driver.extraClassPath to the listener.jar file inside the installed sparkmonitor Python package (path/to/package/sparkmonitor/listener.jar):

from pyspark.sql import SparkSession

# register the sparkmonitor listener and put its jar on the driver classpath
# (adjust the jar path to match your own environment)
spark = (
    SparkSession.builder
    .config('spark.extraListeners', 'sparkmonitor.listener.JupyterSparkMonitorListener')
    .config('spark.driver.extraClassPath', 'venv/lib/python3.7/site-packages/sparkmonitor/listener.jar')
    .getOrCreate()
)

# should spawn 4 jobs in a monitor below the cell
spark.sparkContext.parallelize(range(0, 100)).count()
spark.sparkContext.parallelize(range(0, 100)).count()
spark.sparkContext.parallelize(range(0, 100)).count()
spark.sparkContext.parallelize(range(0, 100)).count()
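
The jar path in the example above depends on the virtualenv location and Python version, so it breaks when either changes. One possible approach is to resolve the path from the installed package at runtime. The sketch below assumes that listener.jar sits directly inside the sparkmonitor package directory, as the path above suggests. The helpers find_listener_jar and build_monitored_session are hypothetical names, not part of sparkmonitor.

```python
from importlib.util import find_spec
from pathlib import Path
from typing import Optional


def find_listener_jar(package: str = "sparkmonitor") -> Optional[Path]:
    """Return the path to listener.jar inside an installed package, or None.

    Hypothetical helper: assumes the jar is stored at the top level of the
    package directory.
    """
    spec = find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # package not installed, or not a package
    for location in spec.submodule_search_locations:
        jar = Path(location) / "listener.jar"
        if jar.is_file():
            return jar
    return None


def build_monitored_session():
    """Create a SparkSession wired to the sparkmonitor listener."""
    # Imported lazily so this module loads even where pyspark is absent.
    from pyspark.sql import SparkSession

    jar = find_listener_jar()
    if jar is None:
        raise RuntimeError("listener.jar was not found in the sparkmonitor package")
    return (
        SparkSession.builder
        .config("spark.extraListeners", "sparkmonitor.listener.JupyterSparkMonitorListener")
        .config("spark.driver.extraClassPath", str(jar))
        .getOrCreate()
    )
```

In a notebook, spark = build_monitored_session() would then replace the builder chain with the hard-coded path. Failing early with an error when the jar is missing is a deliberate choice here, since a wrong classpath otherwise tends to surface only later as a Spark start-up failure.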

Changelog

  • 1.0 - Initial Release
  • 2.0 - Migration to JupyterLab 2, support for multiple Spark sessions, and more accurate placement of monitors beneath the correct cell
  • 3.0 - Migration to JupyterLab 3 as a prebuilt extension
  • 4.0 - pyspark 3.X compatibility; no longer compatible with pyspark 2.X or older

Development

If you'd like to develop the extension:

make all # Clean the directory, build the extension, and run it locally