How I Built an AWS DeepLens Clone with the Horned Sungem AI Camera
When I first saw the demo of AWS DeepLens at re:Invent 2017, I was excited by the possibilities of a smart camera. Exactly a year later, I came across Horned Sungem, a tiny camera powered by Intel’s Myriad Vision Processing Unit (VPU).
With the combination of a Raspberry Pi Zero and Horned Sungem camera, I could quickly build an AWS DeepLens clone. I ported one of my original DeepLens demos that automatically recognizes the vehicle type and calculates the fee at a toll gate. Since there is no display attached, I used an Amazon Echo to provide the conversational user experience.
In the most recent version of the demo based on Horned Sungem, I conveniently replaced Amazon Echo with a Google Home device to provide the same experience. I also added a button to send out a WhatsApp message with the vehicle type and toll fee to the user.
Here is the list of hardware devices I used for the demo:
- Raspberry Pi Zero W
- Grove Pi Zero Hat
- Grove LEDs
- Grove Push Button
- Horned Sungem Smart Camera
- Google Home
- Power Banks
Meet Horned Sungem, the AI Camera
Horned Sungem (HS) is a Chinese company specializing in AI. Its AI camera is built for developers, students, hobbyists, and enthusiasts who want to create their own AI applications with ease.
The device has a USB-C connector that can be plugged into a Raspberry Pi or any other computing device such as a PC or Mac. It has native support for the Raspberry Pi Camera connected through the CSI interface. The camera comes with two ports — USB-C and micro-USB. The USB-C type connector is used for connecting the camera to the computing device while the micro-USB is for external power. When connected to a low-power device like a Raspberry Pi Zero, the camera doesn’t get enough power from the host. An external power source helps HS to work at its peak performance.
What I love most about HS is the Python SDK, which runs on ARM or Intel host computing devices. The SDK dramatically simplifies implementing machine learning models for inference. Fully trained models for object detection and face recognition based on popular datasets such as CIFAR-10 and PASCAL-VOC are just a method call away.
All the developer has to do is to import the HS API and call a method by passing the model name. The output comes back with the label of the detected object.
I used the object detection model trained with the PASCAL-VOC dataset to identify buses and cars. For more details on using the SDK, refer to the HS SDK on GitHub.
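import cv2
import memcache
from hsapi import ObjectDetector

# Create the object detector from the HS API
net = ObjectDetector(zoom = True, verbose = 2)
# Open camera device 0 as the video source
video_capture = cv2.VideoCapture(0)
# Memcached client used to share the detection result with the webhook
vehicle = memcache.Client(['127.0.0.1:11211'], debug=0)

# Grab one frame and run inference on it
_, img = video_capture.read()
result = net.run(img)
label = net.labels[result]
# b_state is set elsewhere in the complete source
if (b_state == 1):
    img = net.plot(result)
    cv2.imshow("Toll Gate", img)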
Dear IoT, Meet AI
As soon as a car or a bus is detected, an LED glows, providing a visual hint of the vehicle type. To achieve this, I connected a Grove Pi Zero Hat to the Raspberry Pi Zero W, which gives me access to the Grove family of sensors and actuators. Red and green Grove LEDs are connected to the digital pins to represent the type of vehicle. A push button is connected to send out a WhatsApp message.
Based on the detected vehicle type, I control the LEDs. The code is pretty straightforward.
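A minimal sketch of that LED logic follows. It is not the code from the demo: the pin numbers, the color assigned to each vehicle, and the `write` callback are assumptions made for illustration. On the Pi, `write` could be a function such as `grovepi.digitalWrite`; here the caller supplies it, so the logic can run anywhere.

```python
# Hypothetical wiring: digital ports on the Grove hat (assumed, not from the demo)
GREEN_LED_PIN = 3
RED_LED_PIN = 4

# Assumed mapping of detected label to the LED that should glow
LED_FOR_LABEL = {"car": GREEN_LED_PIN, "bus": RED_LED_PIN}


def led_states(label):
    """Return {pin: 0 or 1} for a detected label; all LEDs off if unknown."""
    active = LED_FOR_LABEL.get(label)
    return {pin: int(pin == active) for pin in (GREEN_LED_PIN, RED_LED_PIN)}


def apply_leds(label, write):
    """Drive every LED by calling write(pin, value) in pin order."""
    for pin, value in sorted(led_states(label).items()):
        write(pin, value)


if __name__ == "__main__":
    # Record the writes instead of touching hardware
    calls = []
    apply_leds("bus", lambda pin, value: calls.append((pin, value)))
    print(calls)
```

Keeping the label-to-pin mapping in one dictionary means adding another vehicle type only requires one more entry and one more LED.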
Adding Voice to Smart Camera with Google Assistant
The fascinating part of the demo is connecting Google Home to the camera. I used Dialogflow to create an Action that talks to the camera.
To facilitate the communication between the Raspberry Pi Zero and Dialogflow, I pointed the fulfillment webhook of the Intent to an ngrok endpoint running on the Pi. Through a tunnel, ngrok exposed a Flask REST endpoint to Dialogflow. The beauty of this model is that everything runs on the Pi without the need for an external web service.
Since the Raspberry Pi Zero is a constrained device, I didn’t want to install a database to store the outcome of object detection. Instead, the object detection code writes its output to Memcached, and the Flask code reads it when the webhook is invoked. Memcached served as a handy in-memory store for sharing data between two independent Python programs: object detection and the webhook.
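The hand-off between the two programs can be sketched as follows. This is an illustration under stated assumptions, not the demo's code: the toll fees, the prompt wording, and the helper names are hypothetical, and `InMemoryStore` is a stand-in that mimics only the `set` and `get` calls of `memcache.Client` so the sketch runs without a Memcached server. The `Prompt` key matches the one the demo reads when sending the WhatsApp message.

```python
# Hypothetical toll fees per vehicle type (not from the demo)
TOLL_FEES = {"car": 5, "bus": 12}


class InMemoryStore:
    """Stand-in exposing the set/get subset of memcache.Client."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        # Like memcache.Client.get, return None for a missing key
        return self._data.get(key)


def publish_detection(store, label):
    """Detector side: write the prompt for a recognized vehicle."""
    fee = TOLL_FEES.get(label)
    if fee is None:
        return False  # ignore labels that are not tolled vehicles
    store.set("Prompt", "Detected a %s. The toll fee is %d." % (label, fee))
    return True


def read_prompt(store):
    """Webhook side: read the latest prompt, with a fallback if none exists."""
    return store.get("Prompt") or "No vehicle has been detected yet."


if __name__ == "__main__":
    store = InMemoryStore()
    publish_detection(store, "bus")
    print(read_prompt(store))
```

On the Pi, both programs would pass `memcache.Client(['127.0.0.1:11211'], debug=0)` in place of `InMemoryStore()`, as the demo's code does.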
Below is the code for implementing the Dialogflow webhook.
from __future__ import print_function
from future.standard_library import install_aliases
install_aliases()
from urllib.parse import urlparse, urlencode
from urllib.request import urlopen, Request
from urllib.error import HTTPError
import json
import os
import memcache
from flask import Flask
from flask import request
from flask import make_response

app = Flask(__name__)

# The route path and handler name are assumptions; the excerpt omits them
@app.route('/webhook', methods=['POST'])
def webhook():
    req = request.get_json(silent=True, force=True)
    res = processRequest(req)
    res = json.dumps(res, indent=4)
    r = make_response(res)
    r.headers['Content-Type'] = 'application/json'
    return r

def processRequest(req):
    res = makeWebhookResult()
    return res

def makeWebhookResult():
    # The response text fields are omitted in this excerpt
    return {
        "source": "Smart Toll Gate demo with Horned Sungem"
    }

if __name__ == '__main__':
    port = int(os.getenv('PORT', 8080))
    print("Starting app on port %d" % port)
    # Memcached client for reading the latest detection result
    shared = memcache.Client(['127.0.0.1:11211'], debug=0)
    app.run(debug=False, port=port, host='0.0.0.0', threaded=True)
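To make the shape of the webhook reply concrete, here is a small sketch of building the response body. It assumes the Dialogflow v2 webhook format, where `fulfillmentText` carries the text to be spoken; the demo may have used a different API version, and the function name and fallback sentence are hypothetical.

```python
import json


def build_webhook_response(prompt):
    """Return the JSON body for a Dialogflow v2-style webhook reply."""
    # Fall back to a fixed sentence when nothing has been detected yet
    text = prompt or "I have not seen a vehicle yet."
    body = {
        "fulfillmentText": text,
        "source": "Smart Toll Gate demo with Horned Sungem",
    }
    return json.dumps(body, indent=4)


if __name__ == "__main__":
    print(build_webhook_response("Detected a car. The toll fee is 5."))
```

In the Flask handler, the `prompt` argument would be the value read from the shared in-memory store, which is `None` until the detector has written something.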
Sending WhatsApp Messages through Twilio API
When one of the LEDs is lit, the push button may be pressed to send out a WhatsApp message. I tapped into Twilio’s recently added WhatsApp messaging support to implement this. Using the WhatsApp API is not very different from using the classic SMS API of the Twilio SDK.
The code below sends the message via Twilio:
from twilio.rest import Client
# Twilio credentials (left blank here)
account_sid = ""
auth_token = ""
client = Client(account_sid, auth_token)
message = client.messages.create(to="whatsapp:", from_="whatsapp:SRC_PHONE_NO", body=vehicle.get("Prompt"))
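The main visible difference from an SMS call is the `whatsapp:` prefix on both addresses. The sketch below shows one way to build those arguments; the helper names and the phone numbers are hypothetical, and the E.164 check is an added safeguard, not something taken from the demo.

```python
import re

# E.164: a plus sign followed by up to 15 digits, the first one non-zero
E164 = re.compile(r"^\+[1-9]\d{1,14}$")


def whatsapp_address(number):
    """Prefix an E.164 phone number with the WhatsApp channel name."""
    if not E164.match(number):
        raise ValueError("expected an E.164 number such as +15555550100")
    return "whatsapp:" + number


def message_params(sender, recipient, body):
    """Build keyword arguments for client.messages.create()."""
    return {
        "from_": whatsapp_address(sender),
        "to": whatsapp_address(recipient),
        "body": body,
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only
    print(message_params("+15555550100", "+15555550101", "Toll fee: 5"))
```

With a real client, the call would then be `client.messages.create(**message_params(sender, recipient, body))`.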
In fewer than 150 lines of code, I implemented an end-to-end smart camera application that emulates AWS DeepLens.
You can access the complete source code from this GitHub Gist.