Data Science Jupyter Lab

Purpose of this quickstarter

Provision a shared Jupyter Lab within OpenShift for rapid prototyping of data science applications, with access protected by OpenShift OAuth.

What files / architecture is generated?

.
├── Jenkinsfile
├── .pre-commit-config.yaml
├── docker
│   ├── Dockerfile
│   ├── jupyter_lab_config.json
│   ├── requirements.txt
│   └── run.sh
├── metadata.yml - Component metadata
└── release-manager.yml - Configuration file for the Release Manager

Frameworks used

  • Jupyter Lab (https://jupyter.org/)

Usage - how do you start after provisioning this quickstarter

The quickstarter sets up two pods in OpenShift. Traffic to the ds-jupyter-lab instance is routed through the OpenShift OAuth proxy instance, so access requires OpenShift credentials.
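
For a quick check of what was provisioned, the following sketch uses the OpenShift CLI; it assumes you are logged in with oc and have switched to the project that hosts the component, and the output will vary with your component name:

oc get pods    # should list the Jupyter Lab pod and the OAuth proxy pod
oc get routes  # shows the HTTPS route that passes through the OAuth proxy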

The directory /opt/app-root/src/work is created, where code can be organized using the installed git client.
Please consider mounting a persistent volume claim at this path so that your work survives pod restarts.
New Python requirements are specified in docker/requirements.txt.
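
The following is a minimal sketch for mounting a persistent volume claim at the work directory; it assumes the Jupyter Lab workload is a DeploymentConfig named ds-jupyter-lab and uses an example claim name and size - adjust the resource type, names and size to your setup:

# Attach (and create) a persistent volume claim mounted at the work directory.
# The resource name, claim name and claim size below are assumptions.
oc set volume dc/ds-jupyter-lab \
  --add \
  --name=work \
  --type=persistentVolumeClaim \
  --claim-name=ds-jupyter-lab-work \
  --claim-size=5Gi \
  --mount-path=/opt/app-root/src/work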

Setting up independent environments/kernels

You can set up independent IPython kernels, each based on its own Python virtual environment:

  • Open a new terminal session in your Jupyter Lab, then:

cd <PATH_WHERE_TO_LOCATE_NEW_VENV_OR_NONE>
python -m venv <YOUR NEW VENV NAME>                               # create the virtual environment
. <YOUR NEW VENV NAME>/bin/activate                               # activate it
pip install --upgrade pip ipykernel                               # upgrade pip and install ipykernel
python -m ipykernel install --user --name=<YOUR NEW VENV NAME>    # register the kernel with Jupyter
jupyter kernelspec list                                           # validate the installation

In a notebook, you can now select the new kernel by clicking the kernel name in the top-right corner, next to the kernel status indicator.

Metadata

The following are typical metadata values that can be used for components based on this quickstarter. Note that the OpenShift resources will be labeled based on this metadata.

name: jupyterlab
description: "JupyterLab is a web-based interactive development environment for Jupyter notebooks, code, and data."
supplier: https://jupyter.org/
version: 3.0.14
type: ods-service
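
To confirm that the labels derived from this metadata were applied, the following sketch lists common resource types together with their labels; which resources actually exist depends on what was provisioned for your component:

oc get dc,svc,route --show-labels   # prints each resource together with its labels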

How this quickstarter is built through Jenkins

The build pipeline is defined in the Jenkinsfile in the project root. The main stages of the pipeline are:

  1. Start OpenShift build

  2. Deploy image to OpenShift
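
For illustration only, these stages correspond roughly to the following manual oc commands; <COMPONENT_ID> is a placeholder, and the workload may be a Deployment rather than a DeploymentConfig in your setup:

oc start-build <COMPONENT_ID> --follow   # stage 1: start the OpenShift image build
oc rollout status dc/<COMPONENT_ID>      # stage 2: wait for the new image to be rolled out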

include::partial$secret-scanning-with-gitleaks.adoc[]

Builder agent used

Known limitations

Consider whether the OpenShift cluster can provide sufficient computing resources for your workloads.
Installing Node.js (>=12.0.0) may be required if you need specific JupyterLab extensions.
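
If an extension does require a Node.js build step, the following sketch shows how to check the Node.js version and install a source extension from a terminal session inside Jupyter Lab; <EXTENSION_NAME> is a placeholder:

node --version                                  # source extensions require Node.js >= 12.0.0
jupyter labextension install <EXTENSION_NAME>   # builds and installs the source extension
jupyter labextension list                       # verify that the extension is installed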