
Tutorial: Deploying TensorFlow Models at the Edge with NVIDIA Jetson Nano and K3s

Aug 28th, 2020 9:07am

In this tutorial, we will explore the idea of running TensorFlow models as microservices at the edge. Jetson Nano, a powerful edge computing device, will run the K3s distribution from Rancher Labs. It can act as a single-node K3s cluster or join an existing K3s cluster as an agent.

For background, refer to my previous article on Jetson Nano and configuring it as an AI testbed.

For completeness, we will run a single-node K3s cluster on the Jetson Nano. If you want to turn it into an agent instead, follow the steps covered in one of the previous articles from the K3s series.

Step 1: Configure Docker Runtime

The Jetson platform from NVIDIA runs L4T (Linux for Tegra), a Linux distribution based on Ubuntu 18.04. The OS, along with the CUDA-X drivers and SDKs, is packaged into JetPack, a comprehensive software stack for the Jetson family of products such as Jetson Nano and Jetson Xavier.

Starting with JetPack 4.2, NVIDIA has introduced a container runtime with Docker integration. This custom runtime enables Docker containers to access the underlying GPUs available in the Jetson family.

Start by downloading the most recent version of JetPack and flash your Jetson Nano device with it.

Check the version of the Docker runtime with the command below:
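One way to do this with the standard Docker CLI (a minimal check; the output format varies by Docker version):

sudo docker info | grep -i runtime

The output should list nvidia among the available runtimes.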


Since Docker supports custom runtimes, we can use the standard Docker CLI with the --runtime nvidia switch to invoke NVIDIA’s container runtime.
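For example, running NVIDIA’s L4T base image with the custom runtime might look like this (the image tag is illustrative and depends on your JetPack/L4T release):

sudo docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.4.3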

Instead of passing the switch with every invocation, we can make nvidia the default runtime by adding the line "default-runtime": "nvidia" to the /etc/docker/daemon.json file.
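On a stock JetPack install, /etc/docker/daemon.json already registers the nvidia runtime, so the edited file typically looks like the following sketch (only the default-runtime line is new; the rest reflects the existing contents):

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}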


Make sure you restart the Docker service or reboot your system before proceeding.
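On JetPack, restarting the service is usually enough:

sudo systemctl restart docker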

Step 2: Install K3s on Jetson Nano

The default container runtime in K3s is containerd, an industry-standard container runtime. This means that Docker CE and K3s will not share the same configuration and images.

For AI workloads running in K3s, we need access to the GPU, which is exposed only through NVIDIA’s container runtime. In the previous step, we already configured Docker to use this custom runtime.

Fortunately, K3s has an option to use the existing Docker runtime instead of containerd. This is enabled by adding the --docker switch to the installation script.

Let’s go ahead and install K3s on the NVIDIA Jetson Nano, pointing it to the Docker runtime. We will also add a couple of other switches that make it easier to use the kubectl CLI with K3s.
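A plausible invocation using the standard get.k3s.io installer looks like this; the --write-kubeconfig-mode switch is one way to make kubectl usable without sudo, and the exact flags in your setup may differ:

curl -sfL https://get.k3s.io | sh -s - --docker --write-kubeconfig-mode 644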



Within a few minutes, K3s is up and running on our Jetson Nano.
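To verify, check that the node reports Ready:

kubectl get nodes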


Step 3: Run TensorFlow as a Kubernetes Pod on Jetson Nano

With the Kubernetes infrastructure available, we will try to run TensorFlow 2.x as a pod in our single node cluster powered by K3s.

NVIDIA has published a set of container images that are optimized for JetPack to run at the edge. They are available in the NVIDIA GPU Cloud (NGC) container registry.

Let’s pull the TensorFlow 2.2 container image for L4T from NGC.
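The image below is NVIDIA’s L4T TensorFlow image from NGC; the tag shown is an assumption matching JetPack 4.4 (L4T R32.4.3), so adjust it to your JetPack release:

sudo docker pull nvcr.io/nvidia/l4t-tensorflow:r32.4.3-tf2.2-py3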


Let’s see if TensorFlow can access the GPU available on Jetson Nano.
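Launch a Python shell inside the container (the --runtime nvidia switch is redundant once nvidia is the default runtime, but harmless; the image tag matches the one pulled above):

sudo docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-tensorflow:r32.4.3-tf2.2-py3 python3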


Within the Python shell, run the below code snippets to check the version and GPU access:
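A minimal check using the standard TensorFlow 2.x APIs:

import tensorflow as tf

# Print the TensorFlow version bundled in the container
print(tf.__version__)

# List the GPUs visible to TensorFlow; Jetson Nano should report one device
print(tf.config.list_physical_devices('GPU'))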




As we can see, GPU device 0 is visible to TensorFlow.

Now, it’s time to see if we can run this as a Kubernetes pod and still access the GPU.

Create a simple pod specification which will keep the TensorFlow 2.2 container running.
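A minimal pod specification along these lines should work; the pod name, file name and sleep command are illustrative choices, and the image tag should match the one pulled earlier:

# tensorflow-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: tensorflow
spec:
  containers:
  - name: tensorflow
    image: nvcr.io/nvidia/l4t-tensorflow:r32.4.3-tf2.2-py3
    # Keep the container alive so we can exec into it
    command: ["sleep", "infinity"]

Apply it with:

kubectl apply -f tensorflow-pod.yaml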




With the TensorFlow pod running, let’s access its shell and try the same commands.
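Assuming the pod name from the specification above:

kubectl exec -it tensorflow -- python3

Then repeat the version and GPU checks from the earlier Python session inside this shell.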


You should see that the GPU is available to TensorFlow.

Accessing the GPU from a K3s cluster through custom Docker runtime is a powerful mechanism to run AI at the edge in a cloud native environment. With TensorFlow running at the edge within Kubernetes, you can deploy deep learning models as microservices.

This enables many interesting use cases to bring the best of AI and IoT to Kubernetes infrastructure. In one of the upcoming tutorials, I will cover an end-to-end AI inference use case based on this platform. Stay tuned.

Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at http://mi2.live.

TNS owner Insight Partners is an investor in: Docker.