How I Built an ‘AIoT’ Project with Intel AI Vision X Developer Kit and Arduino Yun

IoT and AI are two independent technologies that have a significant impact on multiple industry verticals. While IoT acts as the digital nervous system, AI becomes the brain that makes the decisions controlling the overall system. The powerful combination of AI and IoT brings us AIoT (Artificial Intelligence of Things), which delivers intelligent, connected systems capable of self-correction and self-healing.
During the last couple of years, AI has become extremely accessible to developers. From simple cognitive APIs to AutoML to the infrastructure required for training sophisticated deep learning algorithms, AI is not only accessible but also affordable. Industrial IoT is one of the key domains benefiting from the infusion of AI.
To demonstrate the evolving concept of AIoT, I chose the combination of the Intel AI Vision X Kit and the Arduino Yun. For running inference on the ML model, I used the Intel OpenVINO Toolkit. For an introduction to the Intel OpenVINO Toolkit, refer to my previous article.
A camera connected to the AI Vision X Kit acts as an intelligent image sensor that detects objects through OpenCV and the OpenVINO Toolkit. It publishes the labels of the detected objects to an MQTT topic to which the Arduino Yun devices are subscribed. When an object of interest is detected, the Yun takes action by changing the state of an actuator. This could be as simple as changing the color of an LED or controlling a relay.
When deployed at a toll gate, the color of the LED and the fee shown on the display change based on the vehicle type.
This scenario also demonstrates how the AI Vision X Kit is a powerful edge computing platform. In terms of hardware, the kit is powered by an Intel Atom x7-E3950 CPU, 8 GB RAM, and 64 GB eMMC storage. The best thing about the kit is that it comes with an embedded Intel Movidius Myriad X Vision Processing Unit (VPU) to accelerate AI models. The OpenVINO Toolkit is configured to talk to the VPU through the Inference Engine plugin.
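Before relying on VPU acceleration, it is worth confirming that OpenVINO can actually see the device. Here is a minimal sketch, assuming an OpenVINO release that ships the IECore Python API (2019 R1 or later); the exact device list will vary with your setup:

```python
# List the inference devices visible to OpenVINO; 'MYRIAD' indicates
# that the Movidius Myriad X VPU is available for offloading.
from openvino.inference_engine import IECore

ie = IECore()
print(ie.available_devices)  # e.g. ['CPU', 'MYRIAD']
```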
Being an x86 machine, the AI Vision X Kit runs a fully-fledged Ubuntu 18.04 installation on which the OpenVINO Toolkit is installed. I also installed the Mosquitto MQTT broker, which acts as the message bus connecting the Arduino Yun-based microcontrollers. All the devices are connected to a local WiFi access point that provides connectivity for the AIoT setup.
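Before wiring everything together, the message bus can be smoke-tested with a small paho-mqtt subscriber. This is only a sketch; the broker address 10.0.0.10 and the topic cam/infer match the values used in the code later in this article:

```python
# Minimal MQTT smoke test: watch the topic the camera will publish to.
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    # Print every label published by the smart camera
    print(msg.topic, msg.payload.decode())

client = mqtt.Client("monitor")
client.on_message = on_message
client.connect("10.0.0.10")   # Mosquitto broker on the AI Vision X Kit
client.subscribe("cam/infer")
client.loop_forever()
```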
Below is a list of items used in this project:
- Intel AI Vision X Kit (1 ea)
- Logitech C270 Webcam (1 ea)
- Arduino Yun Rev 2 Board (2 ea)
- Grove Shields for Arduino (2 ea)
- Grove Red LED (1 ea)
- Grove Green LED (1 ea)
- Grove 4-Digit Display (1 ea)
Since the Intel AI Vision X Kit acts as the edge computing device responsible for object detection, we run an OpenCV-based application, backed by the Intel OpenVINO Toolkit, that analyzes the feed from the connected camera.
The deep learning model used for object detection is based on the MobileNet SSD Caffe model. We will download the model file using the downloader utility provided by the OpenVINO Toolkit.
[Diagram: high-level architecture of the solution]
```bash
$ mkdir AIoT
$ cd AIoT
$ /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/downloader.py --name mobilenet-ssd -o model
```
You can find the MobileNet SSD V1 Caffe model in the ./model/object_detection/common/mobilenet-ssd/caffe/ directory.
The next step is to optimize this model with the OpenVINO Toolkit's Model Optimizer. Since the Myriad X VPU runs FP16 models, we ask the Model Optimizer for FP16 precision.
```bash
$ /opt/intel/openvino/deployment_tools/model_optimizer/mo_caffe.py \
    --input_model model/object_detection/common/mobilenet-ssd/caffe/mobilenet-ssd.caffemodel \
    --data_type FP16 \
    --output_dir model/FP16
```
The FP16 directory under ./model now contains the optimized model in OpenVINO's Intermediate Representation (IR) format, which can be used with the OpenVINO Toolkit's Inference Engine. You will find the below files generated by the Model Optimizer:
```bash
$ ls ./model/FP16/
mobilenet-ssd.bin  mobilenet-ssd.mapping  mobilenet-ssd.xml
```
We are now ready to utilize the optimized model with OpenCV.
```python
from imutils.video import VideoStream
from imutils.video import FPS
import numpy as np
import imutils
import time
import cv2
import paho.mqtt.client as mqtt

# IR files generated by the Model Optimizer
prototxt = "./model/FP16/mobilenet-ssd.xml"
model = "./model/FP16/mobilenet-ssd.bin"
conf = 0.5

font = cv2.FONT_HERSHEY_SIMPLEX
freq = cv2.getTickFrequency()
frame_rate_calc = 1

# Connect to the Mosquitto broker running on the kit
broker_address = "10.0.0.10"
client = mqtt.Client("smartcam")
client.connect(broker_address)

# PASCAL VOC class labels the MobileNet SSD model was trained on
LABELS = ["background", "aeroplane", "bicycle", "bird", "boat",
          "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
          "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
          "sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(LABELS), 3))

# Load the optimized model and delegate inferencing to the Myriad X VPU
net = cv2.dnn.readNet(prototxt, model)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)

vs = VideoStream(usePiCamera=False).start()
time.sleep(2.0)
fps = FPS().start()

while True:
    frame = vs.read()
    frame = imutils.resize(frame, width=400)
    t1 = cv2.getTickCount()
    (h, w) = frame.shape[:2]

    # MobileNet SSD expects a 300x300, scaled, mean-subtracted blob
    blob = cv2.dnn.blobFromImage(frame, 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()

    for i in np.arange(0, detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > conf:
            idx = int(detections[0, 0, i, 1])
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            label = LABELS[idx]
            print(label)
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                          COLORS[idx], 2)
            y = startY - 15 if startY - 15 > 15 else startY + 15

            # Publish the label when a vehicle of interest is detected
            if label == 'bus' or label == 'car':
                client.publish('cam/infer', label)
            else:
                client.publish('cam/infer', "none")

            cv2.putText(frame, label, (startX, y), font, 0.5,
                        COLORS[idx], 2)

    cv2.putText(frame, "FPS: {0:.2f}".format(frame_rate_calc),
                (30, 50), font, 1, (255, 255, 0), 2, cv2.LINE_AA)
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    t2 = cv2.getTickCount()
    time1 = (t2 - t1) / freq
    frame_rate_calc = 1 / time1

    if key == ord("q"):
        break
    fps.update()

cv2.destroyAllWindows()
vs.stop()
```
The below lines of code load the optimized model and delegate inferencing to the Movidius Myriad X VPU through the Inference Engine backend.
```python
net = cv2.dnn.readNet(prototxt, model)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)
```
When the detected object happens to be a car or a bus, a message with the actual label is published to the MQTT topic; any other detection publishes the value none.
```python
if label == 'bus' or label == 'car':
    client.publish('cam/infer', label)
else:
    client.publish('cam/infer', "none")
```
One of the Arduino Yun boards controls the LEDs based on the messages published to the MQTT topic.
```cpp
// Requires the Bridge libraries (bundled with the Yun core) and the
// arduino-mqtt (MQTTClient) library.
#include <Bridge.h>
#include <BridgeClient.h>
#include <MQTTClient.h>

BridgeClient net;
MQTTClient client;

unsigned long lastMillis = 0;
const int RED_LED = 2;
const int GREEN_LED = 4;

void connect() {
  Serial.print("connecting...");
  while (!client.connect("bulb")) {
    Serial.print(".");
    delay(1000);
  }
  Serial.println("\nconnected!");
  client.subscribe("cam/infer");
}

void messageReceived(String &topic, String &payload) {
  Serial.println(payload);
  if (payload == "bus") {
    digitalWrite(RED_LED, HIGH);
    digitalWrite(GREEN_LED, LOW);
  }
  if (payload == "car") {
    digitalWrite(GREEN_LED, HIGH);
    digitalWrite(RED_LED, LOW);
  }
  if (payload == "none") {
    digitalWrite(GREEN_LED, LOW);
    digitalWrite(RED_LED, LOW);
  }
}

void setup() {
  Bridge.begin();
  Serial.begin(115200);
  pinMode(GREEN_LED, OUTPUT);
  pinMode(RED_LED, OUTPUT);
  client.begin("10.0.0.10", net);  // MQTT broker on the AI Vision X Kit
  client.onMessage(messageReceived);
  connect();
}

void loop() {
  client.loop();
  if (!client.connected()) {
    connect();
  }
}
```
The other Arduino device shows the toll gate fee on the 4-digit, 7-segment display.
```cpp
// Requires the Bridge libraries, the arduino-mqtt (MQTTClient) library,
// and the Grove 4-Digit Display (TM1637) library.
#include <Bridge.h>
#include <BridgeClient.h>
#include <MQTTClient.h>
#include "TM1637.h"

const int CLK = 6;
const int DIO = 7;
TM1637 tm1637(CLK, DIO);

BridgeClient net;
MQTTClient client;

void connect() {
  Serial.print("connecting...");
  while (!client.connect("display")) {
    Serial.print(".");
    delay(1000);
  }
  Serial.println("\nconnected!");
  client.subscribe("cam/infer");
}

void messageReceived(String &topic, String &payload) {
  Serial.println(payload);
  tm1637.clearDisplay();
  // Show the toll fee for the detected vehicle type
  if (payload == "bus") {
    tm1637.displayNum(100);
  }
  if (payload == "car") {
    tm1637.displayNum(50);
  }
  if (payload == "none") {
    tm1637.displayNum(0);
  }
}

void setup() {
  Bridge.begin();
  Serial.begin(115200);
  tm1637.init();
  tm1637.set(BRIGHT_TYPICAL);
  client.begin("10.0.0.10", net);  // MQTT broker on the AI Vision X Kit
  client.onMessage(messageReceived);
  connect();
}

void loop() {
  client.loop();
  if (!client.connected()) {
    connect();
  }
}
```
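Before pointing the camera at real traffic, both sketches can be exercised by publishing test labels manually. Below is a minimal sketch using paho-mqtt against the same broker and topic; the two-second delay is an arbitrary choice to make the LED and display changes visible:

```python
# Drive the LED and display sketches without the camera by publishing
# the labels the Arduino boards react to.
import time
import paho.mqtt.client as mqtt

client = mqtt.Client("tester")
client.connect("10.0.0.10")  # same Mosquitto broker as the camera uses
client.loop_start()          # background network loop

for label in ["car", "bus", "none"]:
    client.publish("cam/infer", label)
    time.sleep(2)            # give the actuators time to react

client.loop_stop()
```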
This tutorial brought AI and IoT together to demonstrate a real-world AIoT scenario.
Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at http://mi2.live.
Feature image by SplitShire from Pixabay.