
Install and Configure MinIO as a Model Registry on RKE2

Nov 26th, 2021 6:00am

This tutorial is the latest part of a series in which we build an end-to-end stack to perform machine learning inference at the edge, built on the SUSE Rancher RKE2 Kubernetes distribution. It builds on our earlier tutorial, which focused on deploying the Nvidia GPU operator that forms the foundation of the stack. Refer to that tutorial if you plan to run Nvidia Triton Inference Server on a GPU-powered host. For CPU-based inference, you can follow the steps in this guide.

The Nvidia Triton Inference Server is a model server for AI inference that serves models from frameworks such as TensorFlow, Nvidia TensorRT, PyTorch, ONNX, and XGBoost. It relies on object storage services such as Amazon Web Services’ S3 and Google Cloud Storage for hosting the models.

Since we plan to run the inference at the edge, we will utilize MinIO as an S3-compatible object storage service.

By the end of this tutorial, we will have a fully configured MinIO object storage service running on an RKE2 cluster.

Step 1 – Install RKE2 on Ubuntu 20.04

SSH into the instance and create the file /etc/rancher/rke2/config.yaml with the below contents:
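A minimal sketch of that configuration is shown below. The tls-san values are placeholders for your own instance, and the CA file paths assume a default RKE2 install; verify them on your host.

# /etc/rancher/rke2/config.yaml -- a minimal sketch; replace the tls-san
# entries with your own values and verify the CA paths on your install.
write-kubeconfig-mode: "0644"
tls-san:
  - "rke2-host"       # hostname of the instance (placeholder)
  - "10.128.0.2"      # internal IP (placeholder)
  - "34.72.10.20"     # external IP (placeholder)
kube-controller-manager-arg:
  - "cluster-signing-cert-file=/var/lib/rancher/rke2/server/tls/server-ca.crt"
  - "cluster-signing-key-file=/var/lib/rancher/rke2/server/tls/server-ca.key"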

We pass parameters to the Kubernetes Controller Manager to enable cluster signing of certificates. MinIO relies on the internal Kubernetes TLS certificate management API to create signed TLS certificates.

The operator cannot complete initialization if the Kubernetes cluster is not configured to respond to the generated certificate signing request (CSR). Make sure the kube-controller-manager of RKE2 is configured with the parameters mentioned in the above configuration file.

Don’t forget to replace the tls-san section with the hostname, internal IP, and the external IP address of the GCE instance.

Download and run the install script for RKE2. Once it’s done, activate and enable the service to start at boot time.
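That looks like the following, using the standard RKE2 install script:

# Download and run the RKE2 install script
curl -sfL https://get.rke2.io | sudo sh -

# Enable the service to start at boot and start it now
sudo systemctl enable --now rke2-server.service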

Add the directory containing the Kubernetes binaries to the path, and run the kubectl command to check the server’s status.
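For a default RKE2 install, that amounts to the following:

# RKE2 places its binaries and kubeconfig under /var/lib/rancher and /etc/rancher
export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml

# Verify that the node is up and ready
kubectl get nodes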

Step 2 – Install the Local Path Provisioner

MinIO requires a storage class that supports dynamic provisioning. Since this tutorial is based on a single host, we will deploy the Local Path Provisioner from Rancher.

If you deploy MinIO on a multinode cluster, consider an overlay storage layer such as Longhorn or Portworx. When running on bare-metal servers with high-performance SSD and NVMe disks, MinIO recommends its own CSI driver, DirectCSI.


kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml

This creates a new storage class that the MinIO operator can use.

[Screenshot: command-line response after creating the storage class]

Step 3 – Deploy MinIO Operator

We are now ready to install the MinIO operator on the RKE2 cluster.

Run the below commands to download the binary and use it to initialize the operator.
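The commands look roughly like the following. The release version and asset name are placeholders; check the MinIO operator releases page on GitHub for the current ones.

# Download the kubectl-minio plugin (version and asset name are placeholders)
wget https://github.com/minio/operator/releases/download/v4.3.7/kubectl-minio_4.3.7_linux_amd64 -O kubectl-minio
chmod +x kubectl-minio
sudo mv kubectl-minio /usr/local/bin/

# Deploy the MinIO operator into the minio-operator namespace
kubectl minio init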

The command kubectl minio init deploys the operator in the minio-operator namespace.

It also creates a couple of services that expose the endpoints for the API and the console.

[Screenshot: the exposed API and console endpoints]

Running the command kubectl minio proxy -n minio-operator shows the JWT token and opens a tunnel to access the dashboard.

[Screenshot: the tenant GUI]

Step 4 – Configure the Model Registry as the Tenant

With the MinIO operator in place, we are ready to configure the tenant accounts. For this use case, the tenant account acts as the model registry for Triton Inference Server.

Run the below commands to create the namespace and deploy the tenant in it. Notice that we are pointing the tenant to the local-path storage class.
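A sketch of those commands follows; the server, volume, and capacity values are illustrative, so size them for your own workload.

# Create a namespace for the tenant
kubectl create namespace model-registry

# Deploy a tenant backed by the local-path storage class
# (the servers/volumes/capacity values here are illustrative)
kubectl minio tenant create model-registry \
  --servers 1 \
  --volumes 4 \
  --capacity 4Gi \
  --namespace model-registry \
  --storage-class local-path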

[Screenshot: creating the model registry tenant]

Make a note of the username and password. You will not be able to retrieve them later.

The MinIO service represents the object storage endpoint, while the Console service is meant for the dashboard.

Access the dashboard after enabling port-forwarding through kubectl.

kubectl port-forward service/model-registry-console -n model-registry 9443:9443

Enter the username and password shown during the creation of the tenant.

[Screenshot: the tenant creation console]

Step 5 – Configure the MinIO CLI to Access the Tenant

Download the MinIO client for your operating system. For macOS, run the following command:

brew install minio/stable/mc

Next, let’s patch the MinIO service to turn it from a ClusterIP service into a NodePort service. This will make it easier for us to access the endpoint.

kubectl -n model-registry patch svc minio -p '{"spec": {"type": "NodePort"}}'

We can now point the mc CLI to the endpoint.

mc alias set model-registry https://$HOSTIP:$NODEPORT admin 7c5c084d-9e8e-477b-9a2c-52bbf22db9af --api S3v4

Set the $HOSTIP and $NODEPORT variables appropriately.
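One way to populate them, assuming a single-node cluster:

# Internal IP of the (single) node
export HOSTIP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')

# NodePort assigned to the patched minio service
export NODEPORT=$(kubectl -n model-registry get svc minio -o jsonpath='{.spec.ports[0].nodePort}')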

Let’s create a bucket and list it with the client. Since the endpoint’s TLS certificate is not signed by a publicly trusted CA, we need to pass the --insecure switch to access the service.
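For example, with a hypothetical bucket named models:

# Create a bucket (the name "models" is just an example)
mc mb model-registry/models --insecure

# List the buckets in the tenant
mc ls model-registry --insecure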

[Screenshot: creating a bucket with the mc CLI]

You can also see the bucket in the console.

We are now ready to use MinIO running on RKE2 as the model registry for Nvidia Triton Inference Server.

[Screenshot: the bucket in the MinIO console]

In the final installment, I will walk you through the steps for deploying and configuring the model server backed by MinIO. Stay tuned.
