Architecting for Industrial IoT Workloads: A Blueprint
The manufacturing landscape has evolved beyond recognition since the Industrial Revolution. Back in the day, those with the most manpower or the largest fleet of machines owned the lion’s share of the market. Fast forward to today, and the center of attention in manufacturing is no longer the machines but the data they produce.
The emergence of the Internet of Things (IoT), Big Data analytics and cloud computing has shifted the manufacturing paradigm from labor-intensive processes to data-driven automated factories. Businesses that harness data to understand, operationalize and control machines gain a sharp competitive edge in the market with continuous innovation, efficiency and cost reduction.
As a helpful nudge in that direction, let’s look at a reference architecture for Industrial Internet of Things (IIoT) workloads that can be efficiently implemented in manufacturing plants. But first, let’s get on the same page about what IIoT involves.
Industry 4.0 and IIoT
While the term IoT encompasses a wide range of applications across various industries, IIoT specifically focuses on integrating smart devices and advanced analytics in the industrial sector to improve efficiency, productivity and overall operations. For example, in the manufacturing industry, IIoT sensors are commonly used to capture data on machine performance to predict equipment failures and automate quality control in real time.
The impact of IIoT cannot be overstated, and the evidence in its favor continues to strengthen. One McKinsey study, for instance, revealed that IoT-powered predictive maintenance can reduce downtime by 45% and cut costs by up to 30%. Consequently, IIoT continues to rapidly transform manufacturing from a machine-focused industry to a data-focused one. This shift rests largely on three pillars:
- Connectivity: IIoT involves connecting industrial devices, equipment and systems to a network infrastructure, allowing them to communicate with each other. This connectivity enables the collection and sharing of data in real time.
- Data collection and analysis: IIoT devices generate and collect vast amounts of data. This data can be analyzed using advanced analytics tools to extract valuable insights, optimize processes and make data-driven decisions.
- Automation and control: IIoT enables automation by connecting sensors, actuators and other devices to control systems. This can lead to more efficient and precise control of industrial processes.
A Reference Architecture for Industrial IoT
IIoT is a broad domain comprising several subdomains, so manufacturers implementing an IIoT architecture need a clear overview of the landscape to control the long-term cost and complexity of their IIoT projects.
To that end, we present the following reference architecture for IIoT applications that use streaming data for swift, data-driven decision-making.
The following table summarizes the key components of this solution and their responsibilities.
|Component|Responsibility|
|---|---|
|Programmable logic controller (PLC) devices and IoT sensors on machines|Emit telemetry data|
|Redpanda|Telemetry data ingestion and event-driven workflows|
|Apache Flink cluster|Stateful stream processing and streaming ETL (extract, transform, load)|
|Machine learning models|Predictive analysis of telemetry data|
|Time series database|Equipment monitoring and running diagnostics|
|Workflow engine|Triggers automated business workflows|
|Data lake and warehouses|Keep cold industrial data that can be used for experimentation and process optimization|
|Line of business applications|Internal business systems such as inventory, supply chain, etc.|
Before exploring each component in detail, let’s introduce Redpanda, which serves as a hub connecting the data flow across the different components.
Redpanda as the Central Data Hub
Redpanda is a simple, powerful, and cost-efficient streaming data platform that’s fully compatible with Apache Kafka APIs while eliminating the usual Kafka complexity. Designed to be an “easy button for streaming data,” Redpanda is free from external dependencies (like the JVM or ZooKeeper) and comes with a human-friendly command-line interface (CLI) and a rich web user interface (UI) that greatly simplifies working with real-time data.
So why use Redpanda in an IIoT architecture? Collecting high-volume data streams in one central location lets downstream applications consume them efficiently, without point-to-point integration channels.
As a central data hub for these data streams, Redpanda enables scalable real-time data ingestion from machines and provides durable data retention until downstream applications consume it. It also decouples data producers from consumers, allowing them to scale and evolve independently.
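To make the decoupling concrete, here is a minimal sketch of the idea behind a retained, append-only topic. It is an in-memory illustration only, not Redpanda's or Kafka's API: the `TopicLog` class, the group names and the sample records are all hypothetical.

```python
from collections import defaultdict


class TopicLog:
    """In-memory stand-in for a single-partition topic (illustration only)."""

    def __init__(self):
        self._records = []                # append-only list of records
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def produce(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return the next unread records for a group and advance its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for rpm in (1200, 1210, 1195):
    log.produce({"machine_id": "press-7", "rpm": rpm})

# Two independent consumer groups read the same retained data at their own pace.
dashboard_batch = log.poll("dashboard", max_records=2)
archiver_batch = log.poll("archiver")
dashboard_rest = log.poll("dashboard")
```

Because each group tracks its own offset, a slow archiver never holds back a fast dashboard, and the producer needs no knowledge of either.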
Redpanda’s lean, cost-efficient design consumes a third of the resources of JVM-based alternatives, such as Kafka. This lean infrastructure footprint is particularly useful for manufacturing plants that need to deploy real-time streaming data solutions within resource-constrained environments, like edge devices. In addition, Redpanda’s Tiered Storage offloads older data into streamlined cloud object stores, like Amazon S3, significantly lowering telemetry data retention costs.
Now that we understand the heart of our architecture, let’s move on to how the surrounding components contribute to the three pillars of IIoT.
Connectivity and Communication
The first step in an IIoT-enabled environment is to establish communication interfaces with the machinery. In this step, there are two primary goals: read data from machines (telemetry) and write data to machines (control and automation).
Machines in a manufacturing plant can have legacy or proprietary communication interfaces as well as modern IoT sensors. Most industrial machines today are operated by programmable logic controllers (PLCs). A PLC is an industrial computer ruggedized and adapted to control manufacturing processes (such as assembly lines, machines and robotic devices) or any activity requiring high reliability, ease of programming and process fault diagnosis.
However, PLCs provide limited connectivity interfaces with the external world over protocols like HTTP and MQTT, restricting external data reads (for telemetry) and writes (for control and automation). Apache PLC4X bridges this gap by providing a set of API abstractions over legacy and proprietary PLC protocols.
PLC4X is an open-source universal protocol adapter for IIoT appliances that enables communication over protocols including, but not limited to, Siemens S7, Modbus, Allen-Bradley, Beckhoff ADS, OPC UA, Emerson, Profinet, BACnet and EtherNet/IP. The biggest advantage of PLC4X is that it provides a Kafka Connect connector. This allows applications to read from and write to PLC devices as if using databases over JDBC.
Aside from PLCs, modern machines are also equipped with IoT sensors that communicate via the MQTT protocol, making it possible to use MQTT sink and source connectors for data exchange.
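One practical detail when bridging MQTT into a Kafka-compatible platform is naming: MQTT topics use "/" as a level separator, while Kafka-style topic names allow only ASCII letters, digits, ".", "_" and "-", up to 249 characters. The sketch below shows one possible mapping. The function name, the `telemetry` prefix and the sample topics are assumptions for illustration; a real connector typically exposes its own topic-mapping configuration.

```python
import re

# Characters outside this set are not legal in Kafka-style topic names.
_ILLEGAL = re.compile(r"[^a-zA-Z0-9._-]")
MAX_TOPIC_LENGTH = 249


def mqtt_to_stream_topic(mqtt_topic, prefix="telemetry"):
    """Map an MQTT topic path to a Kafka-compatible topic name (hypothetical scheme)."""
    levels = [level for level in mqtt_topic.split("/") if level]
    if not levels:
        raise ValueError("empty MQTT topic")
    # Replace every illegal character (spaces, wildcards, etc.) with an underscore.
    cleaned = [_ILLEGAL.sub("_", level) for level in levels]
    name = ".".join([prefix, *cleaned])
    if len(name) > MAX_TOPIC_LENGTH:
        raise ValueError(f"topic name too long: {len(name)} characters")
    return name


topic = mqtt_to_stream_topic("plant1/line 2/press7/temperature")
```

Note that this mapping is lossy: two MQTT topics that differ only in replaced characters would land on the same stream topic, so a production scheme would need to guard against collisions.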
Data Collection and Analysis
Regardless of the communication mechanism used above, data collected from machines are ingested into Redpanda in real time for downstream consumption. This telemetry data carries a rich set of information, such as:
- Operational metrics – Runtime duration (the total time the machine has been in operation), start and stop times
- Performance metrics – Speed and revolutions per minute (RPM), throughput and efficiency
- Health metrics – Temperature, pressure, vibration and noise levels
- Resource utilization – Energy consumption, material usage
- Faults and alarms – Error codes and warning messages
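As a rough illustration, the categories above could be carried in a single event record like the one sketched here. The field names, units and the choice of JSON are assumptions made for the example, not a schema prescribed by Redpanda or any device vendor.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class TelemetryEvent:
    """Hypothetical telemetry record; every metric is optional."""
    machine_id: str
    timestamp_ms: int
    runtime_s: Optional[float] = None       # operational metric
    rpm: Optional[float] = None             # performance metric
    temperature_c: Optional[float] = None   # health metric
    vibration_mm_s: Optional[float] = None  # health metric
    energy_kwh: Optional[float] = None      # resource utilization
    error_code: Optional[str] = None        # faults and alarms

    def key(self) -> bytes:
        # Keying by machine keeps each machine's events in order within a partition.
        return self.machine_id.encode("utf-8")

    def value(self) -> bytes:
        # Omit metrics that were not reported to keep payloads small.
        payload = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(payload, sort_keys=True).encode("utf-8")


event = TelemetryEvent(
    machine_id="press-7",
    timestamp_ms=1_700_000_000_000,
    rpm=1200.0,
    temperature_c=71.5,
)
```

The `key()` and `value()` byte strings are what a Kafka-compatible producer client would send as the record key and value.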
After ingestion into Redpanda, telemetry data feeds must be cleansed and normalized to produce the output formats that downstream applications expect. This involves event filtering, protocol transformation, enrichment and aggregation for analytics. A stateful stream processor, like Flink, can be employed for this purpose, as it can read from Redpanda through its Kafka connector.
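The filtering, normalization and aggregation steps can be sketched in plain Python. This is a batch stand-in for what a stream processor such as Flink would do incrementally with managed state; the field names (`temp_c`, `temp_f`, `ts_ms`) and the one-minute window are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

WINDOW_MS = 60_000  # assumed one-minute tumbling window


def normalize(raw):
    """Return a cleaned reading, or None if the record should be filtered out."""
    if raw.get("machine_id") is None or raw.get("ts_ms") is None:
        return None
    if "temp_f" in raw:
        # Convert Fahrenheit readings to Celsius so all outputs share one unit.
        temp_c = (raw["temp_f"] - 32) * 5 / 9
    elif "temp_c" in raw:
        temp_c = raw["temp_c"]
    else:
        return None
    return {"machine_id": raw["machine_id"], "ts_ms": raw["ts_ms"], "temp_c": temp_c}


def average_per_window(raw_events):
    """Average temperature per machine per tumbling window."""
    buckets = defaultdict(list)
    for raw in raw_events:
        reading = normalize(raw)
        if reading is None:
            continue
        window_start = reading["ts_ms"] - reading["ts_ms"] % WINDOW_MS
        buckets[(reading["machine_id"], window_start)].append(reading["temp_c"])
    return {key: round(mean(values), 2) for key, values in sorted(buckets.items())}


raw_events = [
    {"machine_id": "press-7", "ts_ms": 1_000, "temp_c": 70.0},
    {"machine_id": "press-7", "ts_ms": 30_000, "temp_f": 161.6},  # 72.0 in Celsius
    {"machine_id": "press-7", "ts_ms": 61_000, "temp_c": 75.0},
    {"machine_id": None, "ts_ms": 62_000, "temp_c": 99.0},        # filtered out
]
averages = average_per_window(raw_events)
```

A real streaming job would also have to deal with late and out-of-order events, which this sketch ignores.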
Processed data can then be ingested into a time series database, such as InfluxDB or Prometheus, for time series processing, visualization and interactive analysis. That includes real-time analytics use cases like:
- Remote equipment monitoring: Monitoring telemetry data to detect and respond to faults or abnormalities in real time. This minimizes the impact of failures by triggering alerts and allowing rapid intervention.
- Predictive maintenance: Analyzing real-time telemetry data to detect anomalies or patterns indicating potential equipment failures. This enables proactive maintenance, reduces downtime and extends the lifespan of machinery.
- Energy optimization: Continuous monitoring of energy consumption based on real-time telemetry data to identify opportunities for energy savings and optimize resource utilization.
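As one simple way to flag abnormal readings for monitoring or predictive maintenance, the sketch below compares each new value against a rolling baseline using a z-score. The class name, window size and threshold are illustrative assumptions; production systems often rely on trained ML models instead.

```python
from collections import deque
from statistics import mean, pstdev


class VibrationMonitor:
    """Flags readings that deviate strongly from recent history (rolling z-score)."""

    def __init__(self, window=20, z_threshold=3.0, min_samples=5):
        self.history = deque(maxlen=window)
        self.z_threshold = z_threshold
        self.min_samples = min_samples

    def observe(self, value):
        """Return True if the value is anomalous relative to the rolling baseline."""
        is_anomaly = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            # With zero variance no z-score can be computed, so nothing is flagged.
            if sigma > 0 and abs(value - mu) / sigma > self.z_threshold:
                is_anomaly = True
        if not is_anomaly:
            self.history.append(value)  # keep outliers out of the baseline
        return is_anomaly


monitor = VibrationMonitor()
readings = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 9.5, 2.0]
alerts = [r for r in readings if monitor.observe(r)]
```

In this run only the 9.5 reading is flagged; each alert could then be published to a topic that notifies operators or opens a maintenance ticket.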
At the same time, telemetry data feeds can be routed to destinations like data warehouses and data lakes for offline use cases, like regulatory reporting, ad-hoc exploration and machine learning workloads. These use cases include but are not limited to using historical telemetry data for:
- Training machine learning (ML) models for predictive maintenance or anomaly detection.
- Root cause analysis to determine why failures occurred.
- Historical performance analysis to identify trends, patterns and performance benchmarks.
- Regulatory reporting to generate reports for compliance with industry regulations.
Sink connectors deployed in Kafka Connect can route the telemetry data ingested into Redpanda to these destinations. Redpanda Cloud provides built-in sink connectors for destinations like Amazon Web Services (AWS) S3, Google Cloud Storage, Google BigQuery, Snowflake and many more.
Automation and Control
Automation and remote control of machine operations increase the efficiency of a factory floor by eliminating operations that require manual human intervention.
IIoT devices can publish events to Redpanda topics, triggering automated responses and control actions in real time. Events published to a control topic in Redpanda can trigger business processes deployed in stateful workflow engines, such as Camunda, jBPM and Activiti.
This enables seamless flow of data between IIoT devices and line of business applications to implement use cases, such as:
- Remote control of machines: Machines deployed in hazardous work environments can be remotely controlled and monitored through connected IIoT devices. This keeps people away from potential harm.
- Factory floor automation: Production processes can be scheduled, controlled and monitored to reduce manual intervention. For example, when the raw material usage reaches a certain threshold, purchase orders can be automatically sent for inventory replenishment.
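The replenishment example can be sketched as a small rule that turns material-usage events into a purchase-order event. Here the threshold is modeled as remaining stock falling to a reorder point; the `MaterialStock` type, the event shape and the quantities are assumptions for illustration, and in practice the resulting event would be published to a control topic for a workflow engine to act on.

```python
from dataclasses import dataclass


@dataclass
class MaterialStock:
    """Hypothetical stock state for one raw material."""
    material: str
    on_hand_kg: float
    reorder_point_kg: float
    reorder_quantity_kg: float
    order_pending: bool = False


def apply_usage(stock, used_kg):
    """Deduct usage; return a purchase-order event if a reorder is due, else None."""
    stock.on_hand_kg = max(stock.on_hand_kg - used_kg, 0.0)
    if stock.on_hand_kg <= stock.reorder_point_kg and not stock.order_pending:
        stock.order_pending = True  # avoid sending duplicate orders
        return {
            "type": "purchase_order_requested",
            "material": stock.material,
            "quantity_kg": stock.reorder_quantity_kg,
        }
    return None


stock = MaterialStock(
    "steel-coil", on_hand_kg=500.0, reorder_point_kg=200.0, reorder_quantity_kg=1000.0
)
events = [apply_usage(stock, used) for used in (150.0, 200.0, 50.0)]
```

The `order_pending` flag makes the rule idempotent: further usage below the reorder point does not trigger a second order until the flag is cleared, for example when the delivery is received.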
From Reference Architecture to Robust Industrial System
Manufacturers that have IIoT systems in place or are planning to adopt IIoT can use this reference architecture as a blueprint to push innovation, adaptability and continuous improvement in the industrial landscape. However, taking it from paper to practice depends on your streaming data setup and the resources available to run it.
Redpanda provides a powerful and easily scalable platform for handling real-time telemetry data generated by IIoT devices, and is available as a fully managed cloud service or a self-hosted platform.