5 New Kubeflow 1.3 Features that Machine Learning Engineers Will Love
Google’s Kubeflow 1.3 is the latest release of the most popular open source machine learning platform for Kubernetes. It adds many new features and enhancements that make machine learning operations (MLOps) easier and more accessible.
Here are five features of Kubeflow 1.3 that make the platform better:
1. Simplified Installation
Compared to previous versions, Kubeflow 1.3 makes installation much simpler. The shift from Ksonnet to Kustomize becomes evident when you can deploy the entire platform with kubectl instead of a dedicated installation tool.
If you have a Kubernetes cluster with a default storage class supporting dynamic provisioning, along with the Kustomize tool, installing Kubeflow comes down to cloning the manifests repository and applying it in a retry loop. The loop is needed because some resources can only be applied once the resources they depend on have been created.
git clone https://github.com/kubeflow/manifests.git
cd manifests
while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
After a few minutes, the dashboard should be available. To reach it, first run the following command to forward Istio’s ingress gateway to local port 8080.
kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
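Before running the installation, it can help to confirm the prerequisites mentioned above. The following is a minimal sketch, not part of Kubeflow itself: the helper names `missing_tools`, `default_storage_classes`, and `preflight` are hypothetical, and the check relies on the standard Kubernetes annotation that marks a storage class as the default. It does not verify that the class actually supports dynamic provisioning.

```python
import json
import shutil
import subprocess

# Annotations Kubernetes uses to mark a StorageClass as the cluster default
# (the second one is the older beta form).
DEFAULT_ANNOTATIONS = (
    "storageclass.kubernetes.io/is-default-class",
    "storageclass.beta.kubernetes.io/is-default-class",
)


def missing_tools(tools=("kubectl", "kustomize")):
    """Return the CLI tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def default_storage_classes(sc_list):
    """Return the names of default StorageClasses.

    `sc_list` is the parsed JSON of `kubectl get storageclass -o json`.
    """
    names = []
    for item in sc_list.get("items", []):
        metadata = item.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        if any(annotations.get(key) == "true" for key in DEFAULT_ANNOTATIONS):
            names.append(metadata.get("name"))
    return names


def preflight():
    """Hypothetical helper: check tools and the default storage class."""
    missing = missing_tools()
    if missing:
        raise SystemExit("Missing tools: " + ", ".join(missing))
    result = subprocess.run(
        ["kubectl", "get", "storageclass", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    defaults = default_storage_classes(json.loads(result.stdout))
    if not defaults:
        raise SystemExit("No default storage class found")
    print("Default storage class:", ", ".join(defaults))
```

Calling `preflight()` on a machine with access to the cluster either prints the default storage class or exits with a message describing what is missing.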
2. Support for Multiple Development Environments
Each Kubeflow Notebook Server instance translates to a StatefulSet running in Kubernetes. You can customize the image used for deploying the Notebook Server. Starting from a base image specific to an IDE environment, you can write a Dockerfile that adds the libraries and modules you need for development. Then, based on the custom image, you can launch a Notebook Server with the complete environment and tools necessary for your data science experiment.
The above screenshot shows Code Server, powered by VS Code, running within Kubeflow. While Jupyter Notebooks are the most popular option, having a familiar IDE for developing Python modules is helpful.
Here is a screenshot of RStudio running in Kubeflow:
3. Kubernetes Volume Management from Web UI
Storage and volume management is an important part of MLOps. Shared persistent volumes (RWX, or ReadWriteMany) and dedicated volumes (RWO, or ReadWriteOnce) enable data scientists to easily share datasets and models across multiple stages of the MLOps pipeline.
For a detailed discussion of choosing the right storage engine for Kubeflow, refer to my previous article.
Earlier versions of Kubeflow left volume management to Kubernetes administrators. Kubeflow 1.3 brings this capability into the web user interface, enabling data scientists and developers to create volumes themselves. This makes volume management an integral part of the platform, without the need to learn Kubernetes concepts.
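For readers curious about what sits behind the form, a volume in Kubernetes is requested through a PersistentVolumeClaim. The sketch below builds such a manifest to show how the RWO and RWX choices map to Kubernetes access modes. It illustrates the Kubernetes object only and is not Kubeflow's own code: the function `pvc_manifest`, the volume name, and the namespace are all hypothetical placeholders.

```python
import json

# Short names used in this article mapped to Kubernetes access modes.
ACCESS_MODES = {"RWO": "ReadWriteOnce", "RWX": "ReadWriteMany"}


def pvc_manifest(name, namespace, size, mode="RWO", storage_class=None):
    """Build a PersistentVolumeClaim manifest as a plain dictionary."""
    if mode not in ACCESS_MODES:
        raise ValueError(f"mode must be one of {sorted(ACCESS_MODES)}")
    spec = {
        "accessModes": [ACCESS_MODES[mode]],
        "resources": {"requests": {"storage": size}},
    }
    # Omitting storageClassName lets the cluster's default class apply.
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


# Placeholder name and namespace: a 10Gi volume shared across pipeline stages.
shared = pvc_manifest("shared-datasets", "my-namespace", "10Gi", mode="RWX")
print(json.dumps(shared, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be saved to a file and applied with `kubectl apply -f`, although the web UI makes that step unnecessary.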
Below is the screenshot of the volume management user interface of the Kubeflow dashboard:
4. TensorBoard Integration with Kubeflow
Kubeflow 1.3 has built-in support for TensorBoard, the metrics visualization tool for TensorFlow. For example, while training a model, have the training code write its metrics to a log directory within the PVC, such as logs/fit, and add %tensorboard --logdir logs/fit to the Notebook to view them inline.
To visualize the metrics, create a new TensorBoard and point it to the same directory used within the training code in the Notebook. It is also possible to store the metrics in an object storage bucket. A bucket created in MinIO, the open source, S3 API compliant object storage software, may be used for this purpose.
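As a rough illustration of the training side, the sketch below writes metrics under logs/fit using the Keras TensorBoard callback. It assumes TensorFlow is installed in the Notebook image and that `model`, `x_train`, and `y_train` already exist. The helper names `make_log_dir` and `fit_with_tensorboard` are hypothetical, and the timestamped subdirectory is just one common way to keep runs separate.

```python
from datetime import datetime


def make_log_dir(base="logs/fit", now=None):
    """Return a timestamped run directory such as logs/fit/20210501-120000."""
    now = now or datetime.now()
    return f"{base.rstrip('/')}/{now.strftime('%Y%m%d-%H%M%S')}"


def fit_with_tensorboard(model, x_train, y_train, epochs=5, base="logs/fit"):
    """Train a compiled Keras model and log metrics for TensorBoard.

    TensorFlow is imported here so that make_log_dir stays usable
    even where TensorFlow is not installed.
    """
    import tensorflow as tf

    log_dir = make_log_dir(base)
    tensorboard_cb = tf.keras.callbacks.TensorBoard(
        log_dir=log_dir,
        histogram_freq=1,  # also record weight histograms every epoch
    )
    model.fit(x_train, y_train, epochs=epochs, callbacks=[tensorboard_cb])
    return log_dir
```

A TensorBoard created from the Kubeflow dashboard would then point at the base directory (logs/fit in this sketch) on the same PVC, so that every run stored beneath it shows up.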
Below is the screenshot of TensorBoard’s integration with Kubeflow:
5. Multi-Model Serving with KFServing
KFServing, the model serving component of Kubeflow, is now optimized for serving multiple models simultaneously. In previous versions, KFServing created a microservice per model, which consumed at least 0.5 CPU and 0.5 GB of memory per replica. This approach quickly consumes available cluster resources as the number of models and replicas grows.
With multi-model serving, multiple models can be loaded in one InferenceService, so that fixed cost is shared and each model’s average overhead can drop to around 0.1 CPU and 0.1 GB of memory. With one service per model, GPU-based models also require a number of GPUs that grows linearly with the number of models, which is not optimal.
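The arithmetic behind these figures can be sketched as follows. The per-replica overhead of 0.5 CPU and 0.5 GB comes from the text above, while the counts of ten models and two replicas per service are assumptions chosen only for illustration. The function `per_model_overhead` is a hypothetical helper.

```python
import math


def per_model_overhead(models, replicas, models_per_service=1,
                       cpu_per_replica=0.5, mem_gb_per_replica=0.5):
    """Return the average (CPU, memory in GB) overhead per model.

    Every InferenceService replica carries a fixed overhead, so packing
    more models into one service spreads that cost across them.
    """
    services = math.ceil(models / models_per_service)
    total_cpu = services * replicas * cpu_per_replica
    total_mem = services * replicas * mem_gb_per_replica
    return total_cpu / models, total_mem / models


# Assumed scenario: 10 models, 2 replicas per service.
single = per_model_overhead(models=10, replicas=2, models_per_service=1)
multi = per_model_overhead(models=10, replicas=2, models_per_service=10)
print("one model per service:", single)
print("ten models per service:", multi)
```

Under these assumptions, one service per model costs about 1 CPU and 1 GB per model, while loading all ten models into a single service brings the average down to about 0.1 CPU and 0.1 GB each.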
KFServing now supports the NVIDIA Triton Inference Server, which can share a single GPU among multiple models.
The multi-model serving feature helps overcome Kubernetes limitations such as the maximum number of pods per node and the maximum number of IP addresses per cluster. In addition, it makes better use of cluster resources through the new model scheduler and controller.
We will continue to explore Kubeflow features and functionality through several tutorials and guides. Stay tuned.