Where the Cloud Won’t Work: Machine Learning for the Industrial Internet of Things
A quiet race is going on to set up the infrastructure needed for the industrial Internet of Things (IoT). It is generally agreed that the cloud model won’t work to manage sensor data in real time, so instead hardware and network providers are rushing to evolve their technologies and sign up industrial customers to pilot and early implementation initiatives in edge processing.
Stage one of the race is well underway, with the current focus on enabling edge processing on hardware gateways located in the field (factories, workplaces, cities, farms and buildings). To do that, many are leveraging Dockerized containers (and Moore’s Law) to do more powerful data processing. Once this infrastructure has a little more robustness behind it, introducing machine learning (ML) at the edge will spark a second wave of the race.
Edge Processing in the Industrial Internet of Things
In an IoT architecture, sensors collect data in real time, at intervals ranging from seconds down to microseconds.
In manufacturing and mining, sensors might be measuring vibration and noise levels of machinery and platforms, or monitoring for gas or water leaks. In precision agriculture, sensors are taking temperature, wind velocity, soil moisture and humidity readings. In smart city implementations, sensors are tallying pedestrian movements under public lighting, checking parking bay vacancies, and tracking traffic flow.
All of these scenarios involve the sensors taking multiple readings per minute or even per second, building up a huge pile of mostly useless data points with occasional actionable metrics. For the most part, the sensors are confirming no change in readings or changes within a normal range. But occasionally they may indicate that a platform is vibrating at an unsafe level, that a water leak is occurring, that the soil is dry, or that a parking bay has just been vacated. When these sensor readings occur, the sensor needs to create an alert and take an action: turn off a piece of equipment, set off an alarm, turn on a sprinkler system, or update a GPS map.
Ajit Jaokar, an IoT data scientist and convener of the Implementing Enterprise AI course, has written that the sheer volume of sensors in a typical industrial use case means that pinging the cloud with data readings is useless. “A typical Formula One car carries 150-300 sensors … The current Airbus A350 model has close to 6,000 sensors and generates 2.5 Tb of data per day. A city (for example, the Smart city of Santander in Spain) includes a network comprising more than 25,000 sensors,” Jaokar pointed out.
Instead, edge processing (also sometimes referred to as “fog computing”) allows sensors to send readings to a hardware gateway with some processing power that acts as a hub for a cluster of sensors, and is able to determine if any data received is worth acting on.
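As a rough illustration of that triage, the sketch below shows a gateway sorting each reading into "act", "forward" or "drop". The class name, thresholds and policy are hypothetical and chosen only for illustration; they are not taken from any vendor's product.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: str
    value: float


class GatewayFilter:
    """Hypothetical gateway policy: act on unsafe values, forward real changes, drop the rest."""

    def __init__(self, low, high, min_change):
        self.low = low                # lower bound of the normal range (assumed)
        self.high = high              # upper bound of the normal range (assumed)
        self.min_change = min_change  # smallest change considered worth reporting
        self._last_kept = {}          # sensor_id -> last value that was not dropped

    def triage(self, reading):
        # Out-of-range values need an immediate local response.
        if not (self.low <= reading.value <= self.high):
            self._last_kept[reading.sensor_id] = reading.value
            return "act"
        previous = self._last_kept.get(reading.sensor_id)
        # In-range values are kept only if they moved enough since the last kept value.
        if previous is None or abs(reading.value - previous) >= self.min_change:
            self._last_kept[reading.sensor_id] = reading.value
            return "forward"
        # Everything else just confirms "no change" and is discarded.
        return "drop"


if __name__ == "__main__":
    gateway = GatewayFilter(low=0.0, high=5.0, min_change=0.5)
    for value in [1.0, 1.1, 1.2, 1.6, 7.5]:
        print(value, gateway.triage(Reading("vibration-1", value)))
```

In this sketch only the out-of-range reading triggers a local action, and most of the in-range readings never leave the gateway.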
The Time (for Edge Processing) is Nigh
While the term ‘edge processing’ has been promoted for around a year now, the truth is that it was only in the final quarter of last year that industry shifted gears and began to take edge processing seriously.
Aaron Allsbrook, Chief Technology Officer at IoT platform ClearBlade, said that the market only really shifted around edge processing as late as October or November last year.
“The introduction of Amazon’s Greengrass product was announced and that helped qualify the market. It always helps when one of the bigger players helps define the space,” explained Allsbrook. At ClearBlade, he said most of his enterprise customers had already completed some form of IoT pilot, had assessed the platforms on the market, and were coming to ClearBlade to start implementing new rollouts based on what they had learned during their pilot stages. He said in one case, for example, a customer with factories and warehouses was now looking at the edge as a solution for their sensor infrastructure, knowing they couldn’t do everything in the cloud.
“The model that most people are doing now involves binding agents that run on gateway hardware,” said Allsbrook. “But the stack that we are using in our IoT platform leverages messaging and writing APIs. We are bringing down datastores, the security model, the API layer, and messaging brokerage and can execute that at the gateway layer.”
Allsbrook argues that many are thinking about edge processing too narrowly: just connect sensors to a gateway and have some processing power there to decide what data to store in the cloud and what data is more urgent and needs immediate action. He says that is too short-sighted for industrial infrastructure.
“People are trying to connect up factories and connecting core processes to it. They have to be able to self-contain it. If you don’t bring the whole stack, you can’t execute. You need to have all the rules nearby, and you want to get under 10 milliseconds latency. You can’t put enough security there, you need to have a full device profile before any data rolls off the gateway.”
Docker at the Edge
Larger competitors and startups are competing head to head to build out the IoT edge processing infrastructure. In a model a little reminiscent of how banks need to partner with financial technology startups to be competitive in a digital landscape, large network providers are partnering with newer hardware makers and beginning to eye up machine learning startups for the next wave.
Foghorn Systems, for example, provides a software platform direct to customers as well as partnering with GE Predix, Microsoft, Cisco and Dell, amongst others. “90 percent of the data at the edge comes and goes unless it is acted on immediately,” says David King, CEO of Foghorn Systems. “Basically, there are two paradigms in edge processing. The first is to get data into the cloud as soon as possible and then do all the analytics on that. The second approach is to put a layer of compute right next to the machines.”
King says the increased capability of putting processing power on small hardware units means that Foghorn can offer two products. The first allows a small complex event processing engine to sit on gateway hardware. This runs an application inside a Dockerized container installed via firmware directly on the gateway device. He says this could be run on a wind turbine or a drill, for example, offering low-cost compute power directly at the sensor location.
King says about 70 percent of his customers are taking this approach. “They create a Linux Docker application, sometimes in Java, but mostly people are using C++. Then sensor brokers bring the data in, there is a processing layer that has a high-speed data bus, and then a complex event processing engine, and an SDK that can run apps. It is a complete edge stack, there is a lot of real-time local processing, and then also intelligent data is cached to the cloud.”
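To give a sense of what a complex event processing rule does, here is a minimal sketch, assuming a single hypothetical rule that fires only when several readings exceed a limit within a short time window. It is not Foghorn's engine or SDK; the class, function and parameter names are invented for illustration.

```python
from collections import deque


class SustainedThresholdRule:
    """Fires when at least `count` readings within the last `window_s` seconds exceed `limit`."""

    def __init__(self, limit, count, window_s):
        self.limit = limit
        self.count = count
        self.window_s = window_s
        self._hits = deque()  # timestamps of readings that exceeded the limit

    def on_reading(self, timestamp, value):
        # Forget hits that have aged out of the time window.
        while self._hits and timestamp - self._hits[0] > self.window_s:
            self._hits.popleft()
        if value > self.limit:
            self._hits.append(timestamp)
        return len(self._hits) >= self.count


def run_engine(stream, rule):
    """Feeds (timestamp, value) pairs to the rule and returns the timestamps at which it fired."""
    return [ts for ts, value in stream if rule.on_reading(ts, value)]


if __name__ == "__main__":
    rule = SustainedThresholdRule(limit=4.0, count=3, window_s=1.0)
    stream = [(0.0, 4.5), (0.2, 3.0), (0.4, 4.8), (0.6, 5.1), (2.0, 4.9)]
    print(run_engine(stream, rule))
```

The point of a rule like this is that a single spike is ignored while a sustained condition raises an event, which is the kind of decision that has to be made locally when response times are measured in milliseconds.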
King says a micro-version of that is used in the other 30 percent of cases. The micro-version runs on 256MB hardware and can be put into an existing gateway. He says this approach is suited to outdoor environments or machinery that simply does not have the space to bolt on a separate gateway, such as automotive machinery.
Cisco has released their IOx product as a virtual machine that can run on a router. Jock Reed, IoT Developer Evangelist at Cisco DevNet, says that IOx is predominantly used to enable Docker containers to be run at the edge. “You can develop a Docker app for the data you are trying to get,” said Reed. “That way, when you are developing your app you can deploy it to the IOx host on the router.” IOx gives the router more compute capacity. It is used to speed up compute processing in industrial use cases, and can continue operating when offline, a key consideration for many IoT industrial implementations that may only want to connect to the cloud once a day to do a data dump, but want to be able to still compute and track data across a sensor architecture in real-time throughout the rest of the day.
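The offline pattern described here is often called store-and-forward. A minimal sketch follows, assuming a hypothetical `upload` callable that raises `ConnectionError` while the gateway is offline; it does not use any IOx or Cisco API.

```python
import json
from collections import deque


class StoreAndForwardBuffer:
    """Keeps readings locally and uploads them in one batch when a connection is available."""

    def __init__(self, max_records):
        # Bounded queue: once full, the oldest record is discarded to make room.
        self._records = deque(maxlen=max_records)

    def record(self, sensor_id, timestamp, value):
        self._records.append({"sensor": sensor_id, "ts": timestamp, "value": value})

    def flush(self, upload):
        """Attempts the data dump; returns how many records were sent."""
        if not self._records:
            return 0
        batch = list(self._records)
        try:
            upload(json.dumps(batch))  # `upload` is a stand-in for the real cloud client
        except ConnectionError:
            return 0  # still offline: keep everything for the next attempt
        self._records.clear()
        return len(batch)

    def __len__(self):
        return len(self._records)


if __name__ == "__main__":
    buffer = StoreAndForwardBuffer(max_records=1000)
    buffer.record("bay-12", 1, 0.0)
    buffer.record("bay-12", 2, 1.0)
    print(buffer.flush(print))  # here print() plays the role of the uploader
```

Local rules can keep running against incoming readings all day, while the buffer is flushed on whatever schedule suits the deployment, such as once a day.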
Reed says IOx with FogDirector is a rapidly evolving platform, already capable of doing orchestration of all edge devices and acting as the DevOps layer for edge processing gateways. Cisco DevNet provides and is continuing to build remote labs that offer developers and partners the ability to learn and experiment with the edge devices and the fog orchestration layer.
Introducing ML to the Edge
Some startups like VIMOC Technologies are already introducing machine learning at the edge, while others say this need is yet to fully develop amongst industrial and enterprise customers. Most are just setting up their infrastructure at present and working to get data collection flowing alongside complex event processing that can track which data to store, which to act on immediately, and which to discard.
Tarik Hammadou, CEO and co-founder at VIMOC Technologies, has built both hardware (VIMOC’s neuBox, which includes both sensors and a compute layer) and a hardware-agnostic software platform that operates at the cloud level, where applications can be built and connected via API to sensors and gateways.
“From an architecture point of view, edge computing is very simple,” said Hammadou. “But the key is how efficiently you distribute processing between different nodes at the edge and doing complex tasks at the edge. Cooperation between nodes and the cloud is the main challenge we took on over the last three years.”
VIMOC’s sensors and platform have been taken up by parking garages to optimize parking spaces, and Hammadou has already introduced deep learning algorithms on the gateway to better understand the sensor readings being collected.
“We are using deep learning and vehicle classification to optimize parking. To accurately do visual intelligence, you need a vehicle recognition system. Our deep learning recognizes with a high level of accuracy when a vehicle enters the parking garage and as we move forward in our roadmap, we will be able to give garages information about the types of vehicles that are going into their garages.”
Allsbrook says that the bulk of his customers are not yet ready to experiment with machine learning at the edge. “We can learn from machine learning algorithms processing data stored in the cloud, and then execute what we learn from that at the edge, but our customers aren’t there yet. They move the data from the edge to the cloud for that sort of analytics.”
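The split Allsbrook describes, learning in the cloud and executing at the edge, can be sketched in a few lines. The example below is only an assumption-laden stand-in: a mean and standard deviation play the role of the "model", a z-score cut-off of 3.0 is an arbitrary choice, and all function and class names are hypothetical.

```python
import json
import statistics


def train_in_cloud(history):
    """Cloud side: boil historical readings down to a small, shippable model."""
    return json.dumps({
        "mean": statistics.fmean(history),
        "stdev": statistics.stdev(history),  # needs at least two readings
        "max_z": 3.0,                        # assumed cut-off for "unusual"
    })


class EdgeScorer:
    """Edge side: applies the downloaded model to each new reading, with no cloud round trip."""

    def __init__(self, model_json):
        model = json.loads(model_json)
        self.mean = model["mean"]
        self.stdev = model["stdev"]
        self.max_z = model["max_z"]

    def is_anomalous(self, value):
        if self.stdev == 0:
            # Degenerate model (constant history): any deviation counts as unusual.
            return value != self.mean
        return abs(value - self.mean) / self.stdev > self.max_z


if __name__ == "__main__":
    model_json = train_in_cloud([10.0, 10.2, 9.8, 10.1, 9.9])
    scorer = EdgeScorer(model_json)
    print(scorer.is_anomalous(10.3), scorer.is_anomalous(12.0))
```

A real deployment would swap the statistics for whatever model the cloud analytics produce; what carries over is the shape of the exchange, in which a small set of learned parameters travels down to the gateway while bulk data travels up.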
For the most part, Allsbrook said, industry is still learning what data its sensors need to collect and how to structure that data, and is still choosing which algorithms will be useful for understanding the volume of data that is collected.
At Cisco, Reed and Developer Evangelist Amanda Whaley say they are making sure Cisco’s IOx and other IoT initiatives can partner with open source approaches to data analysis. “We are looking at PNDA, a big data analytics project that is part of the Linux Foundation, Zeus, and individual partnerships with Watson and other machine learning vendors,” said Whaley.
King at Foghorn says that there are a “tremendous number of use cases” in outdoor industries, smart traffic management, parking, and traffic flow for retail, but that adding ML to edge processing is still “two or three years out.” “What we are seeing is that there are layers of machine learning,” he said. “At Foghorn, we are processing at the millisecond at the edge. That is a different type of predictive analytics than what you do from the cloud. The edge is about doing real-time responses, so ML might be used for optimizing the way the lights work or knowing when to capture real-time video and audio feeds.”
2017 in Edge Processing
Further advances in both the technology and machine learning sides of edge processing will speed up this year. While some say that hardware has limited new potential to innovate and that it will be the ML processing that sees the bulk of the industry growth, the tech models that Cisco, Foghorn, ClearBlade and VIMOC are implementing suggest there is still plenty to learn in how best to configure the Industrial IoT.
Industry appears increasingly ready to invest in implementation rollouts instead of just pilot projects, and this may help define and consolidate the edge processing market this year. What’s more, only as enterprises start to feed data into training models in the cloud will they begin to understand what ML processing can be done at the edge.