
Configure Storage Volumes for Kubeflow Notebook Servers

9 Jul 2021 9:42am
This tutorial is the latest installment in an explanatory series on Kubeflow, Google’s popular open source machine learning platform for Kubernetes. Check back each Friday for future installments.

In the last installment of this series, we created custom Docker images for provisioning Jupyter Notebook Servers that target the data preparation, training, and inference stages of a machine learning project.

Before we launch the fully customized environments for data scientists and ML developers, we need to make sure that the storage is configured to facilitate collaboration among teams.

This tutorial focuses on provisioning the storage backend for Jupyter Notebook Servers running in Kubeflow, the machine learning operations (MLOps) platform for Kubernetes.

There are two prerequisites for configuring storage for Kubeflow:

    1. A storage class that supports dynamic provisioning
    2. A storage backend with support for shared volumes

While the first prerequisite is met by most of the overlay storage choices available for Kubernetes, only a few support the second.

As discussed in earlier parts of this series, Kubeflow needs both shared and dedicated volumes to run MLOps pipelines. Shared volumes use the ReadWriteMany (RWX) access mode, which lets multiple pods read and write simultaneously. Dedicated volumes are traditional Kubernetes persistent volume claims (PVCs), typically using the ReadWriteOnce (RWO) access mode, each mounted in a single pod.

This tutorial shows how Portworx and the NFS client provisioner can be used to configure the storage volumes. For detailed instructions on deploying and configuring Kubeflow storage, refer to the DeepOps guide for NFS and Portworx. You can use either of these storage providers as the overlay storage backend for Notebook Servers.

Configure NFS-Based Shared Volumes for Kubeflow

When you install the NFS client provisioner through NVIDIA DeepOps, you get a storage class named nfs-client, which supports shared volumes.

Let’s create two shared PVCs for sharing the dataset and models.
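As a minimal sketch of the first claim, the manifest below requests a shared volume from the nfs-client storage class. The PVC name (datasets) and the requested size are assumptions chosen for illustration, not values from the original listing. The claim has to live in the same namespace as the Notebook Servers that mount it, so pass that namespace with kubectl apply -n when applying it.

```yaml
# Sketch only: name and size are hypothetical placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datasets                 # hypothetical PVC name
spec:
  storageClassName: nfs-client   # storage class created by the NFS client provisioner
  accessModes:
    - ReadWriteMany              # shared volume: many pods can mount it read-write
  resources:
    requests:
      storage: 10Gi              # placeholder size; adjust to the dataset
```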

This PVC will be used for performing the ETL operations and data pre-processing.

The next PVC stores the model artifacts.
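A claim for the model artifacts could look like the sketch below. It again targets the nfs-client storage class with the ReadWriteMany access mode; the name (models) and the size are assumptions for illustration.

```yaml
# Sketch only: name and size are hypothetical placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: models                   # hypothetical PVC name
spec:
  storageClassName: nfs-client   # same NFS-backed storage class as the dataset claim
  accessModes:
    - ReadWriteMany              # training and inference pods share the artifacts
  resources:
    requests:
      storage: 10Gi              # placeholder size; adjust to the expected artifacts
```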

Configure Portworx-Based PVCs for Kubeflow

Portworx supports shared volumes through its sharedv4 feature, which is enabled in the storage class. Let’s start by creating the storage class.
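A storage class along these lines enables sharedv4 volumes. This is a sketch under stated assumptions: the class name (portworx-sharedv4) and the replication factor are illustrative, and the provisioner shown is the in-tree Portworx plugin. Clusters that run Portworx through CSI would likely use pxd.portworx.com as the provisioner instead.

```yaml
# Sketch only: class name and replication factor are hypothetical placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: portworx-sharedv4                    # hypothetical storage class name
provisioner: kubernetes.io/portworx-volume   # in-tree plugin; CSI installs use pxd.portworx.com
parameters:
  repl: "1"                                  # number of replicas; placeholder value
  sharedv4: "true"                           # allow the volume to be mounted read-write by many pods
```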

Now, we can define the PVCs based on the above storage class.
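As a sketch, a shared claim for the dataset could look like the manifest below. It assumes the Portworx storage class is named portworx-sharedv4, a hypothetical name, so substitute the name of the class you created. The PVC name and size are also placeholders.

```yaml
# Sketch only: PVC name, storage class name, and size are hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datasets                        # hypothetical PVC name
spec:
  storageClassName: portworx-sharedv4   # assumed name of the sharedv4 storage class
  accessModes:
    - ReadWriteMany                     # shared volume for ETL and pre-processing pods
  resources:
    requests:
      storage: 10Gi                     # placeholder size
```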

The following YAML specification creates the PVC for storing the model artifacts:
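A hedged sketch of such a specification follows. As before, portworx-sharedv4 is an assumed storage class name, and the PVC name and size are placeholders rather than values from the original listing.

```yaml
# Sketch only: PVC name, storage class name, and size are hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: models                          # hypothetical PVC name
spec:
  storageClassName: portworx-sharedv4   # assumed name of the sharedv4 storage class
  accessModes:
    - ReadWriteMany                     # shared between training and inference pods
  resources:
    requests:
      storage: 10Gi                     # placeholder size
```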

Use the Kubeflow 1.3 Volumes UI to Configure PVCs

If the storage class already exists, we can also configure the PVCs through the new Volumes UI introduced in Kubeflow 1.3.

Notice how the dropdown list picks up the storage classes and gives us a choice of access modes.

Apart from the shared storage volumes, we also need a dedicated ReadWriteOnce (RWO) volume that will be mounted as the home directory of the Notebook Server. When the Notebook Servers are launched, the volume dashboard shows all the bound PVCs.
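For comparison with the shared claims, a dedicated home-directory volume differs mainly in its access mode. The sketch below is an assumption-laden illustration: the PVC name and size are hypothetical, and because storageClassName is omitted, the claim falls back to the cluster's default storage class.

```yaml
# Sketch only: name and size are hypothetical; uses the cluster's default storage class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: notebook-home        # hypothetical PVC name for one Notebook Server
spec:
  accessModes:
    - ReadWriteOnce          # dedicated volume, not shared across Notebook Servers
  resources:
    requests:
      storage: 5Gi           # placeholder size for the home directory
```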

With the custom Docker container images and storage volumes in place, we are all set to launch the Notebook Servers for data preparation, training, and inference.

In the next part of this series, we will configure the first Notebook Server that performs the ETL and data pre-processing job. Stay tuned!

The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Docker.
