
Getting Started with GPUs in Google Kubernetes Engine

Apr 6th, 2018 8:24am

In the last part of this series, I introduced Nvidia-Docker to access GPUs from containers. In this tutorial, I will walk you through the steps involved in accessing GPUs from Kubernetes.

Google Kubernetes Engine (GKE) is one of the first hosted Kubernetes platforms to offer GPUs to customers. With Nvidia Tesla K80 and P100 GPUs, GKE makes it possible to run containerized machine learning, image processing, and financial modeling workloads at scale in the cloud. The feature is currently available in Beta in select regions of Google Cloud Platform.

As an advocate of Kubernetes, and a budding machine learning developer, I am very excited to see the availability of GPUs. This capability will bring highly scalable training and inferencing to machine learning jobs deployed on Kubernetes.

Assuming you have a valid GCP account, and the Google Cloud SDK configured on your development machine, you can launch a GPU-backed Kubernetes cluster.

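If the SDK is not yet configured, authenticating and setting a default project and zone looks like this (the project ID below is a placeholder; substitute your own):

gcloud auth login
# replace my-gpu-project with your GCP project ID
gcloud config set project my-gpu-project
gcloud config set compute/zone asia-east1-a
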
Let’s start by verifying the available accelerators and supported regions in the GCP cloud.

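The following query, part of the standard SDK, should do it:

gcloud compute accelerator-types list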

The output confirms the availability of Nvidia Tesla K80 and P100 GPU accelerators in a few regions.

We will now launch a GKE cluster in the asia-east1-a zone with two nodes. This is a regular cluster with no GPU nodes. After the cluster is provisioned, we will add a node pool with GPU-backed nodes.

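A minimal invocation would look something like this; the cluster name tns-gpu is just a placeholder:

# "tns-gpu" is an illustrative cluster name
gcloud container clusters create tns-gpu \
  --zone asia-east1-a \
  --num-nodes 2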

With the cluster in place, we will now create a node pool with GPU-specific nodes. A node pool is a subset of node instances within a cluster that all have the same configuration.

When we create a container cluster, the number and type of nodes that are specified become the default node pool. Then, we can add additional custom node pools of different sizes and types to the cluster. All nodes in any given node pool are identical to one another.

The following command creates a new node pool and adds it to the existing cluster. The advantage of this approach is that each node pool can be scaled separately. Though we are only adding a single node initially, we can easily expand or shrink the pool based on the workload.

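At the time of writing, GPU node pools were in beta, so the command lived under the beta group. A sketch, with an illustrative pool name and machine type:

# "gpu-pool" and the machine type are illustrative choices
gcloud beta container node-pools create gpu-pool \
  --cluster tns-gpu \
  --zone asia-east1-a \
  --machine-type n1-standard-2 \
  --num-nodes 1 \
  --accelerator type=nvidia-tesla-k80,count=1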

The command used above is loaded with switches. Notice the --accelerator switch, which specifies the type of GPU to use along with the number of GPUs per node. It is possible to attach more than one GPU to each node in the pool.

Now the cluster has an additional GPU-backed node.

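Listing the nodes confirms it:

kubectl get nodes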

When a GPU node is added to the cluster, GKE runs a plugin as a pod on that specific node.

Checking the pods in the kube-system namespace will confirm this.

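The standard listing command suffices:

kubectl get pods -n kube-system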

We also need to install the device driver as a DaemonSet that targets each GPU node in the cluster. Google provides a YAML file with the DaemonSet definition.

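Applying it is a single command. At the time of writing, the installer manifest for Container-Optimized OS nodes was published at the URL below; check Google's documentation for the current location:

kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml
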
The installation takes several minutes to complete. Once installed, the Nvidia GPU device plugin exposes Nvidia GPU capacity via Kubernetes APIs.


After a few minutes, the driver shows up in the kube-system namespace.

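Filtering the kube-system pods for the installer is a quick way to check:

kubectl get pods -n kube-system | grep nvidia
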
We are all set to run a GPU workload on the cluster. Let’s start by deploying an Ubuntu 16.04 image to check out the Nvidia configuration.

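The command was along these lines. Older kubectl releases accepted --limits and --env directly on kubectl run; the library path shown here, pointing at the directory where GKE mounts the drivers, is my reconstruction rather than the exact original:

# --limits requests one GPU; the env var is an assumed reconstruction
kubectl run cuda --image=ubuntu:16.04 \
  --limits="nvidia.com/gpu=1" \
  --env="LD_LIBRARY_PATH=/usr/local/nvidia/lib64" \
  -it -- /bin/bash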

The above command creates a deployment called cuda with the GPU limit set to 1. Depending on the number of GPUs attached to the nodes in the node pool, we can allocate GPU resources to the pod. The command also sets an environment variable so that the pod can find the Nvidia binaries and libraries.

If everything goes well, we should be inside the shell of the Ubuntu container.

Navigate to the /usr/local/nvidia/bin directory to run the customary nvidia-smi command.

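For example:

cd /usr/local/nvidia/bin
./nvidia-smi
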
Congratulations! You are all set to run massively parallelizable workloads on Kubernetes.

If you are running a bare pod instead of a deployment, use the following declaration with a nodeSelector to create the affinity. This ensures that the pod is always scheduled on a node with a GPU.

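A pod spec along these lines does the trick. The cloud.google.com/gke-accelerator label is applied by GKE to GPU nodes; the pod name and sleep command are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: cuda-pod
spec:
  containers:
  - name: cuda
    image: ubuntu:16.04
    command: ["sleep", "infinity"]
    resources:
      limits:
        nvidia.com/gpu: 1   # request one GPU for this container
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-k80   # schedule only on K80 nodes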

In the next part of this tutorial, we will create a machine learning training job to build a Caffe model on the GKE cluster. That’s an exciting use case to exploit the combined power of Kubernetes and GPUs. Stay tuned!

Feature image: A 3D visualization of a heart generated by GPUs from 2D MRI images, as demonstrated at the Nvidia GPU Technology Conference.
