srstevenson / nb-clean

Licence: ISC licence
Clean Jupyter notebooks of outputs, metadata, and empty cells, with Git integration

nb-clean cleans Jupyter notebooks of cell execution counts, metadata, outputs, and (optionally) empty cells, preparing them for committing to version control. It provides both a Git filter and a pre-commit hook to automatically clean notebooks before they're staged, and can also be used with other version control systems, as a command line tool, and as a Python library. It can also check whether a notebook is clean, which is useful as a check in continuous integration pipelines.

⚠️ nb-clean 2.0.0 introduced a new command line interface to make cleaning notebooks in place easier. If you upgrade from a previous release, you'll need to migrate to the new interface as described under Migrating to nb-clean 2.

Installation

To install the latest release from PyPI, use pip:

python3 -m pip install nb-clean

nb-clean can also be installed with Conda:

conda install -c conda-forge nb-clean

In Python projects using Poetry or Pipenv for dependency management, add nb-clean as a development dependency with poetry add --dev nb-clean or pipenv install --dev nb-clean. nb-clean requires Python 3.7 or later.

Usage

Checking

You can check if a notebook is clean with:

nb-clean check notebook.ipynb

or by passing the notebook contents on standard input:

nb-clean check < notebook.ipynb

To also check for empty cells, add the -e/--remove-empty-cells flag. To ignore cell metadata, add the -m/--preserve-cell-metadata flag, optionally with a selection of metadata fields to ignore. To ignore cell outputs, add the -o/--preserve-cell-outputs flag.

nb-clean will exit with status code 0 if the notebook is clean, and status code 1 if it is not. nb-clean will also print details of cell execution counts, metadata, outputs, and empty cells it finds.
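To make the check semantics above concrete, here is a minimal sketch of the kind of inspection involved, written against a notebook that has already been parsed from JSON into a dict. The function names find_problems and exit_status and their parameters are hypothetical, chosen for illustration; this is not nb-clean's own implementation or Python API, and it ignores details such as selective metadata fields.

```python
"""Illustrative only: a simplified stand-in for the checks described above."""


def find_problems(notebook, check_empty_cells=False, ignore_outputs=False):
    """Return human-readable findings for a parsed .ipynb document (a dict)."""
    problems = []
    for index, cell in enumerate(notebook.get("cells", [])):
        # In the notebook format, "source" may be a string or a list of strings.
        if check_empty_cells and not "".join(cell.get("source", "")).strip():
            problems.append(f"cell {index}: empty cell")
        if cell.get("metadata"):
            problems.append(f"cell {index}: metadata")
        if cell.get("cell_type") == "code":
            if cell.get("execution_count") is not None:
                problems.append(f"cell {index}: execution count")
            if not ignore_outputs and cell.get("outputs"):
                problems.append(f"cell {index}: outputs")
    return problems


def exit_status(problems):
    """Mirror the documented convention: 0 when clean, 1 otherwise."""
    return 0 if not problems else 1
```

In a continuous integration job the real command's exit status serves the same purpose: a non-zero status from nb-clean check fails the step.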

Cleaning (interactive)

You can clean a Jupyter notebook with:

nb-clean clean notebook.ipynb

This cleans the notebook in place. You can also pass the notebook content on standard input, in which case the cleaned notebook is written to standard output:

nb-clean clean < original.ipynb > cleaned.ipynb

To also remove empty cells, add the -e/--remove-empty-cells flag. To preserve cell metadata, add the -m/--preserve-cell-metadata flag, optionally with a selection of metadata fields to preserve. To preserve cell outputs, add the -o/--preserve-cell-outputs flag.
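The effect of these flags can be sketched in a few lines of Python operating on a parsed notebook dict. The function clean_notebook and its parameter names are hypothetical and only mirror the flag names above; this is a simplified illustration of the described behaviour, not nb-clean's actual code or library interface.

```python
import copy


def clean_notebook(notebook, remove_empty_cells=False,
                   preserve_cell_metadata=None, preserve_cell_outputs=False):
    """Return a cleaned copy of a parsed .ipynb document (a dict).

    preserve_cell_metadata: None strips all cell metadata, an empty list
    keeps all of it, and a non-empty list keeps only the named fields.
    """
    cleaned = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    cells = cleaned.get("cells", [])
    if remove_empty_cells:
        # "source" may be a string or a list of strings.
        cells = [c for c in cells if "".join(c.get("source", "")).strip()]
    for cell in cells:
        metadata = cell.get("metadata", {})
        if preserve_cell_metadata is None:
            cell["metadata"] = {}
        elif preserve_cell_metadata:
            cell["metadata"] = {
                key: value for key, value in metadata.items()
                if key in preserve_cell_metadata
            }
        if cell.get("cell_type") == "code":
            cell["execution_count"] = None
            if not preserve_cell_outputs:
                cell["outputs"] = []
    cleaned["cells"] = cells
    return cleaned
```

Calling clean_notebook(nb, preserve_cell_metadata=["tags"]) corresponds loosely to passing -m tags on the command line: every metadata field except tags is dropped from each cell.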

Cleaning (Git filter)

To add a filter to an existing Git repository to automatically clean notebooks when they're staged, run the following from the working tree:

nb-clean add-filter

This will configure a filter to remove cell execution counts, metadata, and outputs. To also remove empty cells, use:

nb-clean add-filter --remove-empty-cells

To preserve cell metadata, for example the metadata required by tools such as papermill, use:

nb-clean add-filter --preserve-cell-metadata

To preserve only specific cell metadata, e.g., tags and special, use:

nb-clean add-filter --preserve-cell-metadata tags special

To preserve cell outputs, use:

nb-clean add-filter --preserve-cell-outputs

nb-clean will configure a filter in the Git repository in which it is run, and won't mutate your global or system Git configuration. To remove the filter, run:

nb-clean remove-filter

Cleaning (pre-commit hook)

nb-clean can also be used as a pre-commit hook. You may prefer this to the Git filter if your project already uses the pre-commit framework.

Note that the Git filter and pre-commit hook work differently, with different effects on your working directory. The pre-commit hook operates on the notebook on disk, cleaning the copy in your working directory. The Git filter cleans notebooks as they are added to the index, leaving the copy in your working directory dirty. This means cell outputs are still visible to you in your local Jupyter instance when using the Git filter, but not when using the pre-commit hook.

After installing pre-commit, add the nb-clean hook by adding the following snippet to .pre-commit-config.yaml in the root of your repository:

repos:
  - repo: https://github.com/srstevenson/nb-clean
    rev: "2.4.0"
    hooks:
      - id: nb-clean

You can pass additional arguments to nb-clean with an args array. The following example preserves only two specific metadata fields. Note that the final -- item in the args list is mandatory: --preserve-cell-metadata accepts an arbitrary number of field arguments, so -- is needed to separate them from the notebook filenames that pre-commit appends to the list of arguments.

repos:
  - repo: https://github.com/srstevenson/nb-clean
    rev: "2.4.0"
    hooks:
      - id: nb-clean
        args:
          - --remove-empty-cells
          - --preserve-cell-metadata
          - tags
          - slideshow
          - --
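Why the trailing -- matters can be demonstrated with a small stand-alone parser. The sketch below assumes an argparse-style command line in which the metadata option accepts zero or more values; the parser, its prog name demo-clean, and the filenames are hypothetical and are not taken from nb-clean's source.

```python
import argparse


def build_parser():
    # Hypothetical parser mimicking the option shapes described above;
    # it is not nb-clean's own code.
    parser = argparse.ArgumentParser(prog="demo-clean")
    parser.add_argument("-e", "--remove-empty-cells", action="store_true")
    parser.add_argument("-m", "--preserve-cell-metadata", nargs="*", default=None)
    parser.add_argument("paths", nargs="*")
    return parser


parser = build_parser()
hook_args = ["--remove-empty-cells", "--preserve-cell-metadata", "tags", "slideshow"]
appended_files = ["a.ipynb", "b.ipynb"]  # stand-ins for filenames pre-commit appends

# Without the separator, the filenames are swallowed as metadata field names
# and no notebook paths remain.
without_separator = parser.parse_args(hook_args + appended_files)

# With the separator, everything after "--" is treated as a positional path.
with_separator = parser.parse_args(hook_args + ["--"] + appended_files)

print(without_separator.preserve_cell_metadata, without_separator.paths)
print(with_separator.preserve_cell_metadata, with_separator.paths)
```

In the first call the greedy option absorbs a.ipynb and b.ipynb as if they were metadata fields; in the second, the field list stops at -- and both filenames are parsed as paths.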

Run pre-commit install to ensure the hook is installed, and pre-commit autoupdate to update the hook to the latest release of nb-clean.

Preserving all nbformat metadata

To ignore (when checking) or preserve (when cleaning) exactly the cell metadata fields defined in the nbformat documentation, use the following option: --preserve-cell-metadata collapsed scrolled deletable editable format name tags jupyter execution.

Migrating to nb-clean 2

The following table maps from the command line interface of nb-clean 1.6.0 to that of nb-clean 2.4.0.

| Description | nb-clean 1.6.0 | nb-clean 2.4.0 |
| --- | --- | --- |
| Clean notebook | nb-clean clean -i/--input notebook.ipynb \| sponge notebook.ipynb | nb-clean clean notebook.ipynb |
| Clean notebook (remove empty cells) | nb-clean clean -i/--input notebook.ipynb -e/--remove-empty | nb-clean clean notebook.ipynb -e/--remove-empty-cells |
| Clean notebook (preserve cell metadata) | nb-clean clean -i/--input notebook.ipynb -m/--preserve-metadata | nb-clean clean notebook.ipynb -m/--preserve-cell-metadata |
| Clean notebook (preserve tags and special cell metadata) | | nb-clean clean notebook.ipynb -m/--preserve-cell-metadata tags special |
| Clean notebook (preserve cell outputs) | | nb-clean clean notebook.ipynb -o/--preserve-cell-outputs |
| Check notebook | nb-clean check -i/--input notebook.ipynb | nb-clean check notebook.ipynb |
| Check notebook (also check for empty cells) | nb-clean check -i/--input notebook.ipynb -e/--remove-empty | nb-clean check notebook.ipynb -e/--remove-empty-cells |
| Check notebook (ignore cell metadata) | nb-clean check -i/--input notebook.ipynb -m/--preserve-metadata | nb-clean check notebook.ipynb -m/--preserve-cell-metadata |
| Check notebook (ignore tags and special cell metadata) | | nb-clean check notebook.ipynb -m/--preserve-cell-metadata tags special |
| Check notebook (ignore cell outputs) | | nb-clean check notebook.ipynb -o/--preserve-cell-outputs |
| Add Git filter to clean notebooks | nb-clean configure-git | nb-clean add-filter |
| Remove Git filter | nb-clean unconfigure-git | nb-clean remove-filter |

Copyright

Copyright © 2017-2022 Scott Stevenson.

nb-clean is distributed under the terms of the ISC licence.