Intel is focusing on optimizing its processors for machine learning inference. Irrespective of where the models are trained, Intel wants the inferencing done on its infrastructure.
At the core of this strategy is the Myriad Vision Processing Unit (VPU), an AI-optimized chip for accelerating vision computing based on convolutional neural networks (CNNs). According to Intel, Myriad VPUs have a dedicated architecture for high-quality image processing, computer vision, and deep neural networks, making them suitable for the demanding mix of vision-centric tasks in modern smart devices.
While Myriad is a System-on-Chip (SoC), Intel has extended the same technology to the Movidius Neural Compute Stick (NCS). These devices look like USB sticks and can be easily attached to edge devices such as an Intel NUC or a Raspberry Pi.
Currently, there are two versions of NCS devices available in the market. NCS 2, the latest version, was launched in late 2018. The previous generation of NCS devices is still available and continues to address scenarios unique to it. This article takes a closer look at the hardware and software architecture of Intel Movidius NCS 1.
Intel NCS 1
The first generation of the Neural Compute Stick changed the face of ML model inferencing at the edge. It is based on the Intel Movidius Myriad 2 VPU, the same high-performance, always-on device found in millions of smart security cameras, gesture-controlled drones, industrial machine vision equipment, and other products. It comes with 12 Streaming Hybrid Architecture Vector Engine (SHAVE) cores that can run computations in parallel.
Intel NCS 1 can be attached to a PC running Ubuntu 16.04 or a Raspberry Pi running the Raspbian Stretch OS.
There are three steps involved in running deep learning models on edge devices powered by Intel NCS 1:
- Train the model on a GPU-based infrastructure using TensorFlow or Caffe
- Optimize the trained model as a compiled graph to run on Intel Movidius
- Load the graph onto the device for inferencing
The first step is typically performed in the cloud on GPU-backed deep learning platforms.
Once developers have evaluated the trained model for performance and accuracy, it needs to be compiled and optimized for Movidius. For this second step, Intel recommends using an Ubuntu 16.04 development machine connected to the NCS 1. The Neural Compute SDK (NCSDK), an open source project available on GitHub, comes with all the tools and libraries needed to generate graphs from TensorFlow and Caffe models.
Once an optimized graph is generated, it is programmatically loaded onto the NCS to accelerate inference. The Intel NCSDK comes with Python and C++ libraries for this step.
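To make the flow concrete, here is a minimal sketch of the NCSDK v1 Python API as documented for the first-generation stick. It assumes the SDK's mvnc module is installed, an NCS is attached, and that a compiled graph file already exists; the file names and the input_tensor variable are placeholders, so treat this as illustrative rather than a ready-to-run script.

```python
import numpy as np
from mvnc import mvncapi as mvnc  # installed by the NCSDK

# Find and open the first attached Neural Compute Stick
devices = mvnc.EnumerateDevices()
device = mvnc.Device(devices[0])
device.OpenDevice()

# Load the compiled graph file produced by the NCSDK's mvNCCompile tool,
# e.g.: mvNCCompile deploy.prototxt -w model.caffemodel -s 12 -o graph
with open('graph', 'rb') as f:
    graph_blob = f.read()
graph = device.AllocateGraph(graph_blob)

# input_tensor is a placeholder for a preprocessed image; the Myriad VPU
# expects half-precision input
input_tensor = np.zeros((224, 224, 3), dtype=np.float16)
graph.LoadTensor(input_tensor, 'user object')
output, user_obj = graph.GetResult()

# Release the graph and the device
graph.DeallocateGraph()
device.CloseDevice()
```

The same enumerate/open/allocate/load/get-result sequence is mirrored in the C++ API.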
The NCSDK can also be installed in API-only mode, a subset that excludes the tools required to optimize models. When using the NCS on edge devices such as the Raspberry Pi, we can configure the SDK in this mode, which includes just the libraries needed to load models onto the stick.
NCSDK V2, the most recent release, includes simplified tools and APIs. Unfortunately, it is not backward compatible with the previous version, but a shim for V1 makes it easy to port existing code to V2.
One of the best things Intel did with Movidius was to create the Model Zoo, a repository of pre-trained models that are fully optimized for Movidius and ready to use. For most scenarios, an existing model can be easily extended and customized.
Intel NCS 2
Launched last November, the Neural Compute Stick 2 is the latest incarnation of the device. It is powered by the Myriad X VPU, which comes with 16 SHAVE cores. Intel claims that it is at least eight times faster than the previous version.
Unlike NCS 1, this device doesn't have an exclusive set of tools and SDK. Instead, Intel has added NCS 2 support to the OpenVINO Toolkit, its software platform for optimizing and deploying ML models. The toolkit also targets other Intel hardware platforms, including Arria FPGA runtime environments.
The Intel Distribution of OpenVINO Toolkit supports the Ubuntu, CentOS, and Yocto Linux distributions, along with Microsoft Windows and the 32-bit Raspbian OS.
Intel is gradually transforming OpenVINO Toolkit into a unified platform for model optimization and inferencing.
Like the NCSDK, the OpenVINO Toolkit includes tools for generating an Intermediate Representation (IR) from TensorFlow, Caffe, and Apache MXNet models. It also supports the Open Neural Network Exchange (ONNX) format for importing and exporting deep learning models across multiple frameworks.
In the next part of this article, I will show you how to use Intel Neural Compute Stick with a Raspberry Pi. We will convert a Caffe model into a graph optimized for inference. Stay tuned.
Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar for a deep dive on accelerating machine learning inference with Intel Movidius.