Sheeba-Samuel / ProvBook

Licence: other
The provenance of a Jupyter Notebook

ProvBook: Provenance of the Notebook.

ProvBook is an extension of the Jupyter Notebook. It captures and displays the provenance of Jupyter Notebook executions, lets you download the notebook in a machine-readable format along with the provenance information, and lets you compare the input and output of each cell across different runs.

ProvBook provides three features:

  1. Provenance of Jupyter Notebook: Tracks and stores the provenance of a Jupyter Notebook execution.
  2. Machine-Readability of Jupyter Notebook: Provides the feature to download the notebooks in a machine-readable format.
  3. Diff of Jupyter Notebook Runs: Compares the input and results of different executions of a Jupyter Notebook code cell.

Demo

A video showing the installation and use of ProvBook with an example is available here.

Publication

ProvBook: Provenance-based Semantic Enrichment of Interactive Notebooks for Reproducibility, Sheeba Samuel and Birgitta König-Ries, The 17th International Semantic Web Conference (ISWC) 2018 Demo Track

A Provenance-based Semantic Approach to Support Understandability, Reproducibility, and Reuse of Scientific Experiments

Installation

Prerequisite: Jupyter Notebook

Install provbook with pip:

pip install provbook

After the installation, start the Jupyter notebook and you will see the ProvBook icons added in the toolbar as shown below.

Provenance of a code cell

ProvBook

Provenance of Jupyter Notebook

This module tracks the provenance of the Jupyter Notebook. It captures and stores the provenance of each cell execution over time. Every time the notebook is executed, the provenance of the execution is stored in the metadata of the cell. Every cell is extended with a provenance area with a slider, which shows the execution history of each code cell. The provenance information for a cell execution includes the start and end time of each execution, the total number of runs, the time it took to run the code cell, the source code, and the output obtained during that particular execution. For text cells, the provenance area shows the modified time and the source.

ProvBook icons are added in the toolbar for displaying the provenance of selected or all cells and the provenance difference of executions of cells.

ProvBook also adds a provenance menu in the Jupyter Notebook interface.

Provenance Menu

A user can toggle the provenance display for a selected cell from Cell -> Provenance -> Toggle visibility (selected). A user can clear the provenance data from the metadata of the notebook from Cell -> Provenance -> Clear (all).

Machine-Readability of Jupyter Notebook

This module lets the user download notebooks in a machine-readable format. It converts a notebook into RDF (Resource Description Framework) along with its provenance traces and execution environment attributes, which gives a semantic representation of the provenance of the notebook execution. The conversion is a command-line utility that takes a notebook as input and generates an RDF Turtle file. The RDF is generated using the REPRODUCE-ME ontology, which extends the W3C standard PROV-O and the P-Plan ontology. The RDF generated from a notebook can be converted back to a Jupyter Notebook. The notebook can also be downloaded as RDF from the Notebook interface.

Example usage of notebook_rdf from the command line

Convert your notebook to RDF

notebook_rdf your_notebook.ipynb

or

notebook_rdf --from notebook your_notebook.ipynb --to RDF

Convert your RDF to notebook

notebook_rdf notebook_rdf.ttl

or

notebook_rdf --from RDF notebook_rdf.ttl --to notebook

The notebook can also be downloaded as RDF from File -> Download as -> RDF (.ttl).
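The command-line calls above can also be scripted. The following is a minimal sketch, not part of ProvBook itself: the helper names build_command and convert are hypothetical, and the only flags used are the --from and --to options shown in the examples above. It assumes notebook_rdf is on the PATH after installation and returns None when it is not.

```python
import shutil
import subprocess


def build_command(input_path, source_format=None, target_format=None):
    """Build the argument list for notebook_rdf.

    Only the invocation forms shown in the README are produced:
    the short form (input file only) and the --from/--to form.
    """
    command = ["notebook_rdf"]
    if source_format and target_format:
        command += ["--from", source_format, input_path, "--to", target_format]
    else:
        command.append(input_path)
    return command


def convert(input_path, source_format=None, target_format=None):
    """Run notebook_rdf; return its exit code, or None if it is not installed."""
    command = build_command(input_path, source_format, target_format)
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, check=False).returncode


# Print the commands that would be run for both directions of the conversion.
print(build_command("your_notebook.ipynb", "notebook", "RDF"))
print(build_command("notebook_rdf.ttl", "RDF", "notebook"))
```

Keeping the command construction separate from the call to subprocess makes it possible to inspect or log the exact invocation before running it.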

Diff of Jupyter Notebook Runs

This module helps users compare the results of different executions of a Jupyter Notebook. The user is provided with a dropdown to select two executions based on their starting times. Users can also compare the original experimenter's execution with their own execution of the Jupyter Notebook. When the user selects two executions, the differences in the input and the output of these executions are shown side by side. Any difference in the input or output is highlighted so that the user can spot the change. This module is based on nbdime from Project Jupyter. It extends the nbdime tool and calls the nbdime API to show the difference between the provenance of each execution of a notebook code cell.
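To illustrate the idea of comparing two executions, here is a minimal sketch that uses Python's standard difflib module instead of nbdime, so it is not how ProvBook computes its diff. The two execution records and their field names (start_time, source, output) are hypothetical and exist only for this example.

```python
import difflib

# Hypothetical records of two executions of the same code cell.
run_a = {
    "start_time": "2019-01-01T10:00:00",
    "source": "x = 2\nprint(x * 10)",
    "output": "20",
}
run_b = {
    "start_time": "2019-01-02T09:30:00",
    "source": "x = 3\nprint(x * 10)",
    "output": "30",
}


def diff_field(first, second, field):
    """Return unified-diff lines for one field of two execution records."""
    return list(
        difflib.unified_diff(
            first[field].splitlines(),
            second[field].splitlines(),
            fromfile=f"{field} @ {first['start_time']}",
            tofile=f"{field} @ {second['start_time']}",
            lineterm="",
        )
    )


# Show the input diff and the output diff, labelled by start time.
for field in ("source", "output"):
    print("\n".join(diff_field(run_a, run_b, field)))
```

Labelling each side with the start time mirrors the way the dropdown identifies executions; an empty result means the two runs agree on that field.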

Internals

The provenance is stored in the metadata of the notebook. Every time a code cell is executed, a new entry 'provenance' is added to the metadata of the code cell. The entry records the start and end time of the execution together with the time it took to execute. The source and the output obtained from executing the cell are also added to the metadata, so that they can be shared with collaborators who want to verify the output. ProvBookDiff is based on nbdime, provided by the Jupyter Notebook development team.
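Because the provenance lives in the cell metadata, it can be read with nothing more than a JSON parser. The sketch below makes several assumptions: the 'provenance' key comes from the description above, but storing it as a list of entries and the field names inside each entry (start_time, end_time, source, outputs) are guesses made for illustration, as is the helper name collect_provenance. Check a notebook saved by ProvBook for the actual layout.

```python
import json

# Hypothetical notebook content. Only the "provenance" key is taken from the
# README; the list layout and the field names inside it are assumptions.
NOTEBOOK_JSON = """
{
  "cells": [
    {
      "cell_type": "code",
      "source": ["print(1 + 1)"],
      "metadata": {
        "provenance": [
          {"start_time": "2019-01-01T10:00:00",
           "end_time": "2019-01-01T10:00:01",
           "source": "print(1 + 1)",
           "outputs": ["2"]},
          {"start_time": "2019-01-02T09:30:00",
           "end_time": "2019-01-02T09:30:02",
           "source": "print(1 + 1)",
           "outputs": ["2"]}
        ]
      }
    },
    {"cell_type": "markdown", "source": ["# Title"], "metadata": {}}
  ],
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 2
}
"""


def collect_provenance(notebook):
    """Map each code cell index to its recorded provenance entries."""
    history = {}
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        entries = cell.get("metadata", {}).get("provenance", [])
        if entries:
            history[index] = entries
    return history


notebook = json.loads(NOTEBOOK_JSON)
history = collect_provenance(notebook)
for index, entries in history.items():
    print(f"cell {index}: {len(entries)} recorded run(s)")
```

To inspect a real notebook, replace NOTEBOOK_JSON with the contents of the .ipynb file, which is itself a JSON document.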
