
5 New Kubeflow 1.3 Features that Machine Learning Engineers Will Love

25 Jun 2021 8:00am

Google’s Kubeflow 1.3 is the latest release of the most popular open source machine learning platform for Kubernetes. It adds many new features and enhancements that make machine learning operations (MLOps) easier and more accessible.

Here are five features of Kubeflow 1.3 that make the platform better:

1. Simplified Installation

Compared to previous versions, Kubeflow 1.3 is much simpler to install. The shift from Ksonnet to Kustomize is evident: you can now deploy the entire platform with kubectl instead of a dedicated tool like kfctl.

If you have a Kubernetes cluster with a default storage class supporting dynamic provisioning, along with the Kustomize tool, installing Kubeflow is as simple as running a single command.
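As a rough sketch of what that single command involves, the Python below pipes the output of `kustomize build` into `kubectl apply -f -` and retries, because an apply can fail on the first pass while custom resource definitions are still being registered. It assumes a local checkout of the Kubeflow manifests with an all-in-one kustomization in a directory named `example`; that directory name and the helpers `run_pipeline` and `apply_kustomization` are illustrative assumptions, not part of the original article.

```python
import subprocess
import time


def run_pipeline(overlay_dir):
    """Run `kustomize build <dir> | kubectl apply -f -`; return True on success."""
    build = subprocess.run(["kustomize", "build", overlay_dir], capture_output=True)
    if build.returncode != 0:
        return False
    # Feed the rendered manifests to kubectl on stdin.
    apply = subprocess.run(["kubectl", "apply", "-f", "-"], input=build.stdout)
    return apply.returncode == 0


def apply_kustomization(overlay_dir, attempts=10, delay=10.0,
                        runner=run_pipeline, sleep=time.sleep):
    """Retry the apply until it succeeds; return the number of tries needed."""
    for attempt in range(1, attempts + 1):
        if runner(overlay_dir):
            return attempt
        if attempt < attempts:
            sleep(delay)  # give the cluster time to register new CRDs
    raise RuntimeError(f"apply did not succeed after {attempts} attempts")


# Usage (requires kustomize, kubectl, and cluster access):
#   apply_kustomization("example")  # "example" is an assumed directory name
```

The `runner` and `sleep` parameters exist only so the retry logic can be exercised without a cluster.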

After a few minutes, the dashboard should be ready. To access it, first run the following command to port-forward Istio’s ingress gateway to a local port.
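A minimal sketch of that port-forward, driven from Python: it assumes the gateway is exposed as a Service named `istio-ingressgateway` in the `istio-system` namespace on port 80, and that local port 8080 is free. These are common defaults for an Istio installation, but they are assumptions here, as are the helper names.

```python
import subprocess


def port_forward_args(local_port=8080, service="istio-ingressgateway",
                      namespace="istio-system", remote_port=80):
    """Build the argv for `kubectl port-forward` to the ingress gateway."""
    return [
        "kubectl", "port-forward",
        f"svc/{service}",
        "-n", namespace,
        f"{local_port}:{remote_port}",
    ]


def start_port_forward(**kwargs):
    """Start the port-forward in the background and return the process handle."""
    return subprocess.Popen(port_forward_args(**kwargs))


# Usage (requires kubectl and cluster access):
#   proc = start_port_forward()
#   ... open http://localhost:8080 in a browser ...
#   proc.terminate()
```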

Refer to my previous tutorials for installing Kubeflow 1.2 on a single GPU host or a hybrid cluster with CPU and GPU hosts. You can follow the same workflow to deploy the latest version of Kubeflow.

2. Support for Multiple Development Environments

With Kubeflow 1.3, you can launch a Notebook Server running Jupyter, Visual Studio Code, or RStudio. This gives data scientists and ML developers a choice of IDEs.

Each Kubeflow Notebook Server instance runs as a StatefulSet in Kubernetes. You can customize the image used for deploying the Notebook Server. Starting from an IDE-specific base image, you can write a Dockerfile that adds the libraries and modules you need for development. Then, based on the custom image, you can launch a Notebook Server with the complete environment and tools necessary for your data science experiment.
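To make the custom-image idea concrete, here is a hedged sketch that builds a Notebook custom resource as a Python dict and prints it as JSON, which `kubectl apply -f` also accepts. It assumes the `kubeflow.org/v1` Notebook resource with a pod template under `spec.template.spec`; the image name, namespace, and resource requests are hypothetical placeholders, and the dashboard normally creates this object for you.

```python
import json


def notebook_manifest(name, namespace, image, cpu="0.5", memory="1Gi"):
    """Return a Notebook custom resource as a plain dict."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "Notebook",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,  # the custom image built from your Dockerfile
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory}
                            },
                        }
                    ]
                }
            }
        },
    }


manifest = notebook_manifest(
    name="vscode-dev",
    namespace="my-profile",  # hypothetical user namespace
    image="registry.example.com/team/codeserver-custom:0.1",  # hypothetical image
)
print(json.dumps(manifest, indent=2))
```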

The above screenshot shows the Code Server powered by VS Code running within Kubeflow. While Jupyter Notebooks are the most popular option, having a familiar IDE for developing Python modules is helpful.

Here is a screenshot of RStudio running in Kubeflow:

3. Kubernetes Volume Management from Web UI

Storage and volume management are an important part of MLOps. Shared persistent volumes (ReadWriteMany, or RWX) and dedicated volumes (ReadWriteOnce, or RWO) enable data scientists to easily share datasets and models across multiple stages of the MLOps pipeline.

For a detailed discussion of choosing the right storage engine for Kubeflow, refer to my previous article.

Earlier versions of Kubeflow left volume management to Kubernetes administrators. Kubeflow 1.3 brings this capability into the web user interface, enabling data scientists and developers to create volumes themselves. This makes volume management an integral part of the platform, with no need to learn the underlying Kubernetes concepts.
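For readers curious about what such a volume looks like underneath, the sketch below builds a standard Kubernetes PersistentVolumeClaim as a Python dict, mapping the RWO and RWX shorthand to the access modes the API expects. It is an illustration of the Kubernetes object, not a description of how the Kubeflow UI is implemented; the claim names, namespace, and sizes are hypothetical.

```python
import json

# Shorthand used in the article mapped to Kubernetes access mode names.
ACCESS_MODES = {"RWO": "ReadWriteOnce", "RWX": "ReadWriteMany"}


def pvc_manifest(name, namespace, size, mode="RWO", storage_class=None):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    if mode not in ACCESS_MODES:
        raise ValueError(f"unknown access mode: {mode}")
    spec = {
        "accessModes": [ACCESS_MODES[mode]],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # Omit the field entirely to fall back to the cluster's default class.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


# A shared dataset volume and a dedicated workspace volume (hypothetical names).
shared = pvc_manifest("datasets", "my-profile", "50Gi", mode="RWX")
dedicated = pvc_manifest("workspace", "my-profile", "10Gi", mode="RWO")
print(json.dumps(shared, indent=2))
```

Note that a ReadWriteMany claim only binds if the storage class behind it supports shared access.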

Below is a screenshot of the volume management user interface in the Kubeflow dashboard:

4. TensorBoard Integration with Kubeflow

Kubeflow 1.3 has built-in support for TensorBoard, the metrics visualization tool for TensorFlow. For example, while training a model, have the training code write its metrics to a directory within the PVC, such as logs/fit; adding %tensorboard --logdir logs/fit to the Notebook then displays them inline.
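A small sketch of the training-side half, assuming a Keras model: the metrics are written by TensorFlow's `tf.keras.callbacks.TensorBoard` callback, and each run gets its own timestamped subdirectory under `logs/fit`. The helper names and the timestamp layout are illustrative choices, and TensorFlow is imported lazily so the path helper can be used on its own.

```python
from datetime import datetime
from pathlib import PurePosixPath


def make_log_dir(base="logs/fit", now=None):
    """Return a per-run log directory such as logs/fit/20210625-080000."""
    now = now or datetime.now()
    return str(PurePosixPath(base) / now.strftime("%Y%m%d-%H%M%S"))


def tensorboard_callback(log_dir):
    """Create the Keras callback that writes metrics to log_dir."""
    import tensorflow as tf  # imported here so the module loads without TensorFlow

    return tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)


# Usage inside a notebook cell (model, x, and y are your own objects):
#   model.fit(x, y, epochs=5, callbacks=[tensorboard_callback(make_log_dir())])
```

Keeping the base directory on the mounted volume is what lets a separately launched TensorBoard read the same files later.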

To visualize the metrics, create a new TensorBoard and point it to the same directory used within the training code in the Notebook. It is also possible to store the metrics in an object storage bucket. A bucket created in MinIO, the open source, S3-compatible object store, may be used for this purpose.

Below is a screenshot of TensorBoard’s integration with Kubeflow:

5. Multi-Model Serving with KFServing

KFServing, the model serving component of Kubeflow, is now optimized for serving multiple models simultaneously. In previous versions, KFServing created a microservice per model, each consuming at least 0.5 CPU and 0.5GB of memory per replica. This approach quickly exhausts the available cluster resources as the number of models and requests increases.

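With multi-model serving, multiple models can be loaded in one InferenceService, so the per-replica overhead is shared. For example, with ten models in an InferenceService running two replicas, each model’s average overhead drops to 0.1 CPU and 0.1GB memory. For GPU-based models deployed one per service, however, the number of GPUs required grows linearly with the number of models, which is not cost-efficient.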
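The arithmetic behind these figures can be sketched as follows. The sketch assumes a fixed overhead of 0.5 CPU and 0.5GB per InferenceService replica, as quoted above, and uses ten models with two replicas purely as an illustrative workload; the function name is hypothetical.

```python
import math

# Per-replica overhead quoted in the article.
OVERHEAD_CPU = 0.5
OVERHEAD_MEM_GB = 0.5


def per_model_overhead(models, replicas, multi_model):
    """Return the average (cpu, memory_gb) overhead attributable to each model."""
    if models < 1 or replicas < 1:
        raise ValueError("models and replicas must be positive")
    # One InferenceService per model, or a single shared one.
    services = 1 if multi_model else models
    total_cpu = services * replicas * OVERHEAD_CPU
    total_mem = services * replicas * OVERHEAD_MEM_GB
    return total_cpu / models, total_mem / models


single = per_model_overhead(models=10, replicas=2, multi_model=False)
shared = per_model_overhead(models=10, replicas=2, multi_model=True)
print("one service per model:", single)
print("multi-model serving:  ", shared)
```

With these inputs the per-model overhead falls from 1 CPU and 1GB to 0.1 CPU and 0.1GB, because the two replicas' overhead is divided among all ten models.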

KFServing now supports the NVIDIA Triton Inference Server, which can serve multiple models from the same GPU.

The multi-model serving feature helps overcome Kubernetes limitations such as the maximum number of pods per node and the maximum number of IP addresses per cluster. In addition, it makes better use of cluster resources through the new model scheduler and controller.

We will continue to explore Kubeflow features and functionality through several tutorials and guides. Stay tuned.