All Projects → orcaman → nb_to_html

orcaman / nb_to_html

Licence: MIT License
An HTTP endpoint to create on-demand dynamic HTML reports powered by Jupyter Notebooks

Programming Languages

python
139335 projects - #7 most used programming language
Dockerfile
14818 projects

nb_to_html

nb_to_html is a web application that creates on-demand dynamic HTML reports powered by Jupyter Notebooks.

This is quite a powerful concept as it allows the data scientist to really quickly generate HTML reports that could be sent out as simple URLs.

It does so simply by receiving a path to Notebook to be run (which could be a public or private GitHub repo) and a collection of optional runtimes parameters for the Notebook. Once the notebook finishes execution, its contents is convered to HTML and rendered to the client.

example

Suppose you have a simple notebook that plots data for a stock symbol. In its cleared state, the notebook looks like this: https://github.com/orcaman/stocks_demo/blob/master/stocks_demo.ipynb

This is how you would get an HTML report of the execution result for different stock symbols:

AAPL (Apple): http://localhost:8080/nb_to_html?org=orcaman&repo=stocks_demo&nb_path=stocks_demo.ipynb&params={"target_stock_symbol":"AAPL"}

Alt text

MSFT (Microsoft): http://localhost:8080/nb_to_html?org=orcaman&repo=stocks_demo&nb_path=stocks_demo.ipynb&params={"target_stock_symbol":"MSFT"}

Alt text

For each request, nb_to_html downloads the notebook, sets its parameters, executes it, converts the results to HTML and renders it to the client.

Note the parameters in this case were:

  • org = orcaman (the GitHub organization/user)
  • repo = stocks_demo (repo name on GitHub)
  • nb_path = stocks_demo.ipynb (the full path to the notebook on GitHub)
  • params={"target_stock_symbol":"MSFT"} (the optional params argument containing the name of the variable to inject). Fo this to work, you must tag your notebook cell with "parameters" (see papermill docs)

under-the-hood

Under-the-hood, nb_to_html uses papermill to execute the notebook and nbconvert to convert the result to HTML.

development

In dev, run the app with docker:

docker build -t nb_to_html .
docker run --rm -ti -p 8080:8080 -e GITHUB_TOKEN -e GOOGLE_APPLICATION_CREDENTIALS_JSON -e PORT -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY -e S3_BUCKET -e REDIS_PASSWORD -e REDIS_HOST -e REDIS_PORT nb_to_html

adding dependencies

To add dependencies (e.g. a python library you need in your notebook), either pip install it via a notebook cell like so:

!pip install bokeh

Or, you could add it to the requirements.txt file, that already contains a few commonly used libraries (pandas, numpy, boto3, seaborn, etc.).

server-side caching

The server supports caching the results of notebook execution on redis. If you choose to set up the redis configuruation as explained below, the server will hash the contents of the notebook in combination with the runtime parameters and use this hash as a cache key to use a previous execution.

Env vars / server configuration

All configuration values are optional:

  • GITHUB_TOKEN (optional, a GitHub API access token to access private repos)
  • GOOGLE_APPLICATION_CREDENTIALS_JSON (optional, if your notebook talks to a GCP service like Google BigQuery). Setting this varaible with the contents of the GOOGLE_APPLICATION_CREDENTIALS file will allow you to read those credentials from notebooks without having to store the credentials anywhere else
  • S3_Bucket (optional, an S3 bucket name). If supplied, every report will be saved on the bucket under the following format: s3://<S3_BUCKET>//nb_to_html//////report_name_yyyy_mm_dd_hh:ss.html
  • AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY (optional, AWS credentials if S3 storage is desired)
  • REDIS_PASSWORD, REDIS_HOST and REDIS_PORT if redis caching is desired

And run the app using the docker run command above.

deployment

The application is deployed to AWS Farget. To create a new revision:

  1. Push to ECR:
aws ecr get-login-password --region {REGION} | docker login --username AWS --password-stdin {REPO}.dkr.ecr.{REGION}.amazonaws.com/nb_to_html

docker build -t nb_to_html .

docker tag nb_to_html:latest {REPO}.{REGION}.amazonaws.com/nb_to_html:latest

docker push {REPO}.{REGION}.amazonaws.com/nb_to_html:latest
  1. Create a new revision on Fargate: https://console.aws.amazon.com/ecs/home?region={REGION}#/clusters/default-nb-to-html-server/services (select the service and click on update)
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].