The rise of the industrial Internet of Things (IoT) and artificial intelligence (AI) is making edge computing significant for enterprises. Many industry verticals, such as manufacturing, healthcare, automotive, transportation, and aviation, are considering an investment in edge computing.
Edge computing is fast becoming the conduit between the devices that generate data and the public cloud that processes the data. In the context of machine learning and artificial intelligence, the public cloud is used for training the models and the edge is utilized for inferencing.
Cloud TPU: Accelerating ML Training in the Cloud
To accelerate ML training in the cloud, public cloud vendors such as AWS, Azure, and the Google Cloud Platform (GCP) offer GPU-backed virtual machines. Today, NVIDIA’s GPUs are the accelerators that ML researchers and data scientists most often choose for training models in the cloud.
GPUs are designed as general-purpose processors to perform compute-intensive calculations in parallel. Given how fast they do matrix addition and multiplication, GPUs are extensively used in scenarios like video rendering and high-performance computing (HPC). ML training and inferencing are among the most recent use cases enabled by GPUs. Even though GPUs are the preferred choice for ML training, they are not designed exclusively for it.
Machine learning has been at the core of many products and services at Google. To accelerate ML jobs, Google built a custom ASIC (Application Specific Integrated Circuit) called the Tensor Processing Unit (TPU). Unlike a general-purpose GPU, the TPU is built specifically for the massively parallel calculations required by neural networks. Google has been using TPUs for a variety of applications such as Search, Photos, and Translate. TPUs are designed from the ground up with the benefit of Google’s deep experience and leadership in machine learning.
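To make the "massively parallel calculations" concrete, here is a minimal sketch of the arithmetic inside a single fully connected neural network layer, written in plain Python. The function name `dense_layer` and the ReLU activation are my own illustrative choices, not anything specific to the TPU. The point is that every output value is an independent chain of multiply-and-add operations, which is the kind of work an accelerator can perform in parallel.

```python
from typing import List

Matrix = List[List[float]]


def dense_layer(inputs: List[float], weights: Matrix, biases: List[float]) -> List[float]:
    """Forward pass of one fully connected layer followed by ReLU.

    `weights` has one row per input and one column per output neuron.
    This is an illustrative sketch, not how any accelerator is programmed.
    """
    outputs = []
    for column, bias in enumerate(biases):
        # Multiply-accumulate: each output neuron is an independent dot product,
        # so all columns could be computed at the same time on parallel hardware.
        total = sum(x * row[column] for x, row in zip(inputs, weights)) + bias
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs
```

A real model repeats this pattern across millions of weights, which is why hardware built around matrix multiplication pays off.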
With its aggressive investment in the cloud, Google has been exposing some of the internal tools and platforms to GCP customers. Cloud TPU is one of those services that is available to GCP users.
Cloud TPU enables customers to run machine learning workloads on Google’s TPU accelerator hardware using TensorFlow. Cloud TPU is designed for maximum performance and flexibility to help researchers, developers, and businesses build TensorFlow compute clusters. High-level TensorFlow APIs help developers get models running on the Cloud TPU hardware. Google positions Cloud TPUs as delivering strong performance for training TensorFlow models at an affordable price, which makes them one of the more cost-effective ML accelerators available in the public cloud.
Edge TPU: Accelerating ML Inferencing at the Edge
Recently, Google announced the availability of the Edge TPU, a miniature version of the Cloud TPU designed for single-board computers and system-on-chip devices. Though an Edge TPU may be used for training ML models, it is designed for inferencing.
Google recommends using its cloud services such as AI Platform and AutoML Vision to train TensorFlow models and then converting them to run on the Edge TPU. TensorFlow Lite is a flavor of TensorFlow meant for mobile devices and low-powered environments. Existing TensorFlow models may be converted into the TensorFlow Lite format, which runs on iOS and Android. The same TensorFlow Lite model can be further optimized for the Edge TPU. Google has a web compiler and a command-line tool to transform TF Lite models for the Edge TPU.
Edge TPU modules have a small footprint, which makes them ideal for embedding in devices such as drones, cameras, and scanners.
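The two-step conversion described above can be sketched in Python as follows. This is a rough outline under a few assumptions: TensorFlow is installed on the machine doing the conversion, the trained model has been exported as a SavedModel, and the `edgetpu_compiler` command-line tool is on the path. The helper names and file paths are hypothetical. Note that the Edge TPU compiler expects an 8-bit quantized model; the quantization settings are left out here to keep the sketch short.

```python
from pathlib import Path
from typing import List


def convert_saved_model_to_tflite(saved_model_dir: str, tflite_path: str) -> Path:
    """Step 1: convert a TensorFlow SavedModel into a TensorFlow Lite file."""
    # Imported lazily so the rest of this module loads without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    # Assumption: quantization options would be configured on `converter` here.
    tflite_model = converter.convert()
    out = Path(tflite_path)
    out.write_bytes(tflite_model)
    return out


def edgetpu_compile_command(tflite_path: str, out_dir: str = ".") -> List[str]:
    """Step 2: build the command line for the Edge TPU compiler."""
    return ["edgetpu_compiler", "--out_dir", out_dir, tflite_path]


def expected_edgetpu_output(tflite_path: str, out_dir: str = ".") -> Path:
    """The compiler writes <name>_edgetpu.tflite into the output directory."""
    return Path(out_dir) / (Path(tflite_path).stem + "_edgetpu.tflite")
```

The command returned by `edgetpu_compile_command` can be passed to `subprocess.run` on a machine where the compiler is installed; the resulting `_edgetpu.tflite` file is what gets deployed to the device.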
Coral: The Platform to Program the Edge TPU
Before an Edge TPU is deployed in production, it needs to be programmed to run a specific TensorFlow model. To enable developers to easily program and debug models used for inferencing, Google has built a platform called Coral. It is a combination of hardware and software enabling developers to quickly prototype AI solutions.
Currently, there are two Coral devices available for developers to program AI at the edge – Coral Dev Board and USB Accelerator.
The Coral Dev Board is a single-board computer with a removable system-on-module (SOM) featuring the Edge TPU. In appearance, it closely resembles a Raspberry Pi. The device comes with an NXP i.MX 8M SoC, an integrated GC7000 Lite GPU, and of course an Edge TPU for ML acceleration. It has 1 GB of LPDDR4 RAM and 8 GB of eMMC storage. It runs Mendel Linux, a flavor of Debian optimized for devices.
The Coral Dev Board is a stand-alone device that can have its own camera module for running inference on images and videos in real time. It also has a set of GPIO pins compatible with Raspberry Pi headers. TensorFlow Lite models optimized for the Edge TPU can run natively on the device. The key use case of the Coral Dev Board is fast prototyping of edge solutions that use TensorFlow models for solving computer vision problems.
Google has another device that comes with an Edge TPU, called the USB Accelerator. Resembling a USB stick, this device may be plugged into a USB port of any Debian host, including a Raspberry Pi. It comes with a USB Type-C connector for better throughput and power. The host machine needs to run the Coral SDK to offload TF Lite models to the Edge TPU. Developers may use the USB Accelerator to prototype AI solutions on any existing Linux host.
The Coral Dev Board costs $150 while the USB Accelerator is available for $75. They can be ordered from one of the online retailers listed by Google.
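As a rough illustration of what offloading a model from the host looks like, here is a minimal Python sketch. It assumes the host has the TensorFlow Lite Python runtime (`tflite_runtime`) and the Edge TPU runtime library installed, which may not match the SDK setup on your machine. The helper names are hypothetical, and the input is assumed to be a NumPy array that already matches the shape and type the model expects.

```python
from typing import List, Sequence, Tuple

# Linux file name of the Edge TPU runtime library (assumed to be installed).
EDGETPU_SHARED_LIB = "libedgetpu.so.1"


def make_edgetpu_interpreter(model_path: str):
    """Load a compiled TF Lite model and delegate execution to the Edge TPU."""
    # Imported lazily so this module loads on hosts without the runtime.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    interpreter = Interpreter(
        model_path=model_path,
        experimental_delegates=[load_delegate(EDGETPU_SHARED_LIB)],
    )
    interpreter.allocate_tensors()
    return interpreter


def classify(interpreter, input_tensor) -> List[float]:
    """Run one inference and return the scores of the first output tensor."""
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    interpreter.set_tensor(input_index, input_tensor)
    interpreter.invoke()
    return interpreter.get_tensor(output_index)[0].tolist()


def top_k(scores: Sequence[float], k: int = 3) -> List[Tuple[int, float]]:
    """Return the k highest-scoring (class index, score) pairs."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

With the accelerator plugged in, calling `make_edgetpu_interpreter` on a compiled model file and passing the result to `classify` would run the model on the Edge TPU rather than on the host CPU; `top_k` then picks the most likely classes from the returned scores.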
In one of my upcoming articles, I will walk you through the workflow involved in optimizing a TensorFlow model for Edge TPU. Stay tuned.
Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at http://mi2.live