jupyterhub / Repo2docker Action

Licence: mit
GitHub Action for repo2docker

repo2docker GitHub Action

Trigger repo2docker to build a Jupyter-enabled Docker image from your GitHub repository and push this image to a Docker registry of your choice. The Action automatically attempts to build an environment from the configuration files found in your repository, in the manner described in the repo2docker documentation.

Read the full docs on repo2docker for more information: https://repo2docker.readthedocs.io

Images generated by this Action are automatically tagged with both latest and <SHA>, where <SHA> is the relevant commit SHA on GitHub. Both tags are pushed to the Docker registry specified by the user. If an image with the latest tag already exists in your registry, this Action attempts to pull that image as a cache to reduce unnecessary build steps.

What Can I Do With This Action?

  • Use repo2docker to pre-cache images for your own BinderHub cluster, or for mybinder.org.
    • You can use this Action to pre-cache Docker images to a Docker registry that you can reference in your repo. For example, if the file Dockerfile in the binder/ directory at the root of your repository has the following contents, Binder can start quickly by pulling an image you have already built:

      # This is the image that is built and pushed by this Action (replace this with your image name)
      FROM myorg/myimage:latest
      ...
      
  • Provide a way to Dockerize data science repositories with a Jupyter server enabled, which you can then deploy to VMs, serverless platforms, or other services that run Docker containers.
  • Maximize reproducibility by allowing authors, without any prior knowledge of Docker, to build and share containers.

API Reference

The Examples section is very helpful for understanding the inputs and outputs of this Action.

Mandatory Inputs

Exception: if the input parameter NO_PUSH is set to any value, these values become optional.

  • DOCKER_USERNAME: Docker registry username.
  • DOCKER_PASSWORD: Docker registry password or access token. If using DockerHub, we recommend using an access token instead of your password.

Optional Inputs

  • NOTEBOOK_USER: Username of the primary user in the image. If not specified, this is set to jovyan. NOTE: This value is also overridden with jovyan if the parameters BINDER_CACHE or MYBINDERORG_TAG are provided.
  • REPO_DIR: Path inside the image where contents of the repositories are copied to, and where all the build operations (such as postBuild) happen. Defaults to /home/<NOTEBOOK_USER> if not set.
  • IMAGE_NAME: Name of the image. Example - myusername/myContainer. If not supplied, this defaults to <DOCKER_USERNAME>/<GITHUB_REPOSITORY_NAME>.
  • DOCKER_REGISTRY: Name of the Docker registry. If not supplied, this defaults to DockerHub.
  • LATEST_TAG_OFF: Setting this variable to any value will prevent your image from being tagged with latest. Note that your image is always tagged with the GitHub commit SHA.
  • ADDITIONAL_TAG: An optional string that specifies the name of an additional tag you would like to apply to the image. Images are already tagged with the relevant GitHub commit SHA.
  • NO_PUSH: Setting this variable to any value will prevent any images from being pushed to a registry. Furthermore, verbose logging will be enabled in this mode. This is disabled by default.
  • BINDER_CACHE: Setting this variable to any value will add the file binder/Dockerfile that references the docker image that was pushed to the registry by this Action. You cannot use this option if the parameter NO_PUSH is set. This is disabled by default.
    • Note: This Action assumes you are not explicitly using Binder to build your dependencies (you are using this Action to build them instead). If a directory named binder containing files other than Dockerfile, or a directory named .binder/, is detected, this step will be aborted. This Action does not support caching images for Binder where dependencies are defined in binder/Dockerfile (if you are defining your dependencies this way, you probably don't need this Action).

      When this parameter is supplied, this Action will add/override binder/Dockerfile in the branch checked out in the Actions runner:

      ### DO NOT EDIT THIS FILE! This Is Automatically Generated And Will Be Overwritten ###
      FROM <IMAGE_NAME>
      
  • COMMIT_MSG: The commit message associated with specifying the BINDER_CACHE flag. If no value is specified, the default commit message of Update image tag will be entered.
  • MYBINDERORG_TAG: This is the Git branch, tag, or commit that you want mybinder.org to proactively build from your repo. This is useful if you wish to reduce startup time on mybinder.org. Your repository must be public for this to work, as mybinder.org only works with public repositories.
  • PUBLIC_REGISTRY_CHECK: Setting this variable to any value will validate that the image pushed to the registry is publicly visible.
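
Several of the optional inputs above do not appear in the Examples section. The following is a minimal sketch of how they might be combined, not a tested configuration. The user name analyst and the tag stable are hypothetical values, and the refs actions/checkout@v4 and @master are assumptions; substitute whichever refs you actually pin.

```yaml
name: Build Notebook Container
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@v4  # assumed version

    - name: build and push with custom tags
      uses: jupyterhub/repo2docker-action@master  # assumed ref
      with:
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        NOTEBOOK_USER: "analyst"   # hypothetical; REPO_DIR then defaults to /home/analyst
        LATEST_TAG_OFF: true       # any value disables the latest tag
        ADDITIONAL_TAG: "stable"   # hypothetical tag, applied in addition to the commit SHA tag
```

Because BINDER_CACHE and MYBINDERORG_TAG override NOTEBOOK_USER with jovyan, a custom user name like the one in this sketch only takes effect when neither of those parameters is set.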

Outputs

  • IMAGE_SHA_NAME: The name of the Docker image, which is tagged with the SHA.
  • PUSH_STATUS: This is false if NO_PUSH is provided, or true otherwise.
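
To read these outputs in a later step, the step that runs this Action needs an id. The sketch below assumes a hypothetical step id of r2d and the refs actions/checkout@v4 and @master; it only illustrates the standard GitHub Actions steps.<id>.outputs syntax and is not taken from this project's documentation.

```yaml
name: Build Notebook Container
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@v4  # assumed version

    - name: build and push image
      id: r2d  # hypothetical step id, needed to reference the outputs below
      uses: jupyterhub/repo2docker-action@master  # assumed ref
      with:
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}

    - name: report pushed image
      # Step outputs are strings, so compare against 'true' rather than a boolean.
      if: steps.r2d.outputs.PUSH_STATUS == 'true'
      env:
        IMAGE: ${{ steps.r2d.outputs.IMAGE_SHA_NAME }}
      run: echo "Pushed $IMAGE"
```

Passing the output through env, rather than interpolating it directly into the run script, keeps the value out of the shell command text.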

Examples

mybinder.org

A very popular use case for this Action is to cache builds for mybinder.org. If you desire to cache builds for mybinder.org, you must specify the argument MYBINDERORG_TAG. Some examples of doing this are below:

Cache builds on mybinder.org

Proactively build your environment on mybinder.org for any branch. Alternatively, you can use GitHub Actions to build an image for BinderHub generally, including mybinder.org.

name: Binder
on: [push]

jobs:
  Create-MyBinderOrg-Cache:
    runs-on: ubuntu-latest
    steps:
    - name: cache binder build on mybinder.org
      uses: jupyterhub/repo2docker-action@<ref>
      with:
        NO_PUSH: true
        MYBINDERORG_TAG: ${{ github.event.ref }} # This builds the container on mybinder.org with the branch that was pushed on.

Cache Builds On mybinder.org And Provide A Link

Same example as above, but also comment on a PR with a link to the binder environment. Commenting on the PR is optional, and is included here for informational purposes only. In this example the image will only be cached when the pull request is opened but not if the pull request is updated with subsequent commits.

name: Binder
on:
  pull_request:
    types: [opened, reopened]

jobs:
  Create-Binder-Badge:
    runs-on: ubuntu-latest
    steps:
    - name: cache binder build on mybinder.org
      uses: jupyterhub/repo2docker-action@<ref>
      with:
        NO_PUSH: true
        MYBINDERORG_TAG: ${{ github.event.pull_request.head.ref }}

    - name: comment on PR with Binder link
      uses: actions/github-script@<ref>
      with:
        github-token: ${{secrets.GITHUB_TOKEN}}
        script: |
          var BRANCH_NAME = process.env.BRANCH_NAME;
          github.issues.createComment({
            issue_number: context.issue.number,
            owner: context.repo.owner,
            repo: context.repo.repo,
            body: `[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/${context.repo.owner}/${context.repo.repo}/${BRANCH_NAME}) 👈 Launch a binder notebook on this branch`
          })
      env:
        BRANCH_NAME: ${{ github.event.pull_request.head.ref }}

Use GitHub Actions To Cache The Build For BinderHub

Instead of forcing mybinder.org to cache your builds, you can build a Docker image with GitHub Actions and push it to a Docker registry, so that any BinderHub instance, including mybinder.org, only has to pull the image. This can give you more control than triggering a build directly on mybinder.org as in the method above. In this example, you must supply the secrets DOCKER_USERNAME and DOCKER_PASSWORD so that Actions can push to DockerHub. Instead of your actual password, you can use an access token, which may be a more secure option.

In this case, we set BINDER_CACHE to true to enable this option. See the documentation for the parameter BINDER_CACHE in the Optional Inputs section for more information.

name: Test
on: push

jobs:
  binder:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout Code
      uses: actions/checkout@<ref>
      with:
        ref: ${{ github.event.pull_request.head.sha }}

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@<ref>
      with:
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        BINDER_CACHE: true
        PUBLIC_REGISTRY_CHECK: true

Push Repo2Docker Image To DockerHub

We recommend creating a personal access token and using it as DOCKER_PASSWORD instead of your DockerHub password.

name: Build Notebook Container
on: [push] # You may want to trigger this Action on other things than a push.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@<ref>

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@<ref>
      with:
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}

Push image to quay.io

DockerHub now has some pretty strong rate limits, so you might want to push to a different Docker registry. quay.io is a popular choice, and isn't tied to any particular cloud vendor.

  1. Log in to quay.io

  2. Create a new repository. This will determine the name of your image, and you will push / pull from it. Your image name will be quay.io/<username>/<repository-name>.

  3. Go to your account settings (under your name in the top right), and select the 'Robot Accounts' option on the left menu.

  4. Click 'Create Robot account', give it a memorable name (such as <hub-name>_image_builder) and click 'Create'

  5. In the next screen, select the repository you just created in step (2), and give the robot account Write permission to the repository.

  6. Once done, click the name of the robot account again. This will give you its username and password.

  7. Create these GitHub secrets for your repository with the credentials from the robot account:

    1. QUAY_USERNAME: user name of the robot account
    2. QUAY_PASSWORD: password of the robot account

  8. Use the following config for your GitHub Action.

    name: Build container image
    
    on: [push]
    
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
    
        - name: checkout files in repo
          uses: actions/checkout@<ref>
    
        - name: update jupyter dependencies with repo2docker
          uses: jupyterhub/repo2docker-action@<ref>
          with: # make sure username & password/token matches your registry
            DOCKER_USERNAME: ${{ secrets.QUAY_USERNAME }}
            DOCKER_PASSWORD: ${{ secrets.QUAY_PASSWORD }}
            DOCKER_REGISTRY: "quay.io"
            IMAGE_NAME: "<quay-username>/<repository-name>"
    
    

Push Image To A Registry Other Than DockerHub

name: Build Notebook Container
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@<ref>

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@<ref>
      with: # make sure username & password/token matches your registry
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        DOCKER_REGISTRY: "gcr.io"

Change Image Name

When you do not provide an image name, it defaults to DOCKER_USERNAME/GITHUB_REPOSITORY_NAME. For example, if the user hamelsmu ran this Action from this repo, the image would be named hamelsmu/repo2docker-action. If you want a different image name, provide the IMAGE_NAME parameter as illustrated below:

name: Build Notebook Container
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@<ref>

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@<ref>
      with:
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        IMAGE_NAME: "hamelsmu/my-awesome-image" # this overrides the image name

Test Image Build

You might want to test the image build without pushing to a registry, for example to test a pull request. You can do this by specifying any value for the NO_PUSH parameter:

name: Build Notebook Container
on: [pull_request]
jobs:
  build-image-without-pushing:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout PR
      uses: actions/checkout@<ref>
      with:
        ref: ${{ github.event.pull_request.head.sha }}

    - name: test build
      uses: jupyterhub/repo2docker-action@<ref>
      with:
        NO_PUSH: 'true'
        IMAGE_NAME: "hamelsmu/repo2docker-test"

When you specify a value for the NO_PUSH parameter, you can omit the otherwise mandatory parameters DOCKER_USERNAME and DOCKER_PASSWORD.

Contributing To repo2docker-action

See the Contributing Guide.
