Using Data Fabric and Kubernetes in Edge Computing
KubeCon + CloudNativeCon sponsored this post, in anticipation of KubeCon + CloudNativeCon EU, in Amsterdam, Aug. 13-16.
A daunting challenge of edge computing is dealing with situations like this: petabytes of data in motion on thousands of clusters in different geographical locations. That description may suggest large industrial use cases with IoT sensor data, but those are not the only places where edge matters. Edge computing is turning out to be important in a wide range of industries: banks, online service providers, health providers, telecommunications companies, utilities and carmakers all use it.
This post presents the needs that motivate edge computing, so that readers can decide whether or not an edge design may prove useful for them. It also examines the challenges of edge — including challenges that often get overlooked — and provides some example solutions that have worked well in real-world situations.
Why Use Edge Computing?
Edge computing is an attractive, and often required, architectural option when computation needs to happen near the source of the data. The usual reasons are limited tolerance for latency, the risk of network failures, regulatory requirements, or simply that the volume of data produced at the edge exceeds what can be transferred economically to a central site.
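The question of whether data can be moved economically comes down to simple arithmetic. As a rough sketch (the function names and the figures in the note below are illustrative assumptions, not measurements from any deployment), the time needed to ship a day's worth of edge data over a given link can be estimated like this:

```python
def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours needed to move data_gb gigabytes over a link of link_mbps megabits/s.

    Uses decimal units and ignores protocol overhead, so it is a lower bound.
    """
    bits = data_gb * 8 * 1000**3             # gigabytes -> bits
    seconds = bits / (link_mbps * 1000**2)   # megabits/s -> bits/s
    return seconds / 3600


def can_ship_to_core(daily_gb: float, link_mbps: float,
                     budget_hours: float = 24.0) -> bool:
    """True if one day's data can be moved to the core within budget_hours."""
    return transfer_hours(daily_gb, link_mbps) <= budget_hours
```

Under these assumptions, 450 GB per day over a 100 Mbit/s link takes about 10 hours and fits within a day. At 2,000 GB per day the same link would need more than 44 hours, so the data would have to be filtered or aggregated at the edge before it is sent.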
The expanding adoption of edge computing is happening because of several factors:
- It is easier than ever to acquire data, because sensors cost less and come in greater variety. Almost everything produced today has more sensors built in, along with the ability to communicate data.
- Network infrastructure is making it cheaper to move bits to the edge.
- It is becoming easier to manage distributed computations in both core and edge clusters, thanks to advances in Kubernetes.
- Ruggedized hardware is available that can do the necessary filtering and pre-processing of data, and remain operational even in the harsh conditions that may occur at the edge.
Surprisingly, one of the big challenges in edge computing isn’t the computing part. It is the almost universal need for communication between edge and core. Metrics and diagnostic data need to move back to core computing centers and, in some cases, data or models need to move to the edge.
Architects and implementers often assume that communication with the core will be easy to handle. In fact, it is often not.
What Makes Edge Work? Communication with the Core
The following real-world use case illustrates how challenges in edge computing can be addressed, including communication with the core. A few years ago, we had a customer who had built a video streaming system. Edge computing was needed so that systems providing the video content were near end-users, to minimize latency and maximize uptime. The system our customer had built was working fine, but they had run into trouble building telemetric systems to carry data about the system’s health and customer video consumption back to the core — data necessary to alert the core support team if something went wrong, and for billing.
This particular problem was very specific, but its shape (edge computing plus telemetrics from the edge to the core) is general to many other industries. In this example, as is common, telemetrics had been put off to the end of the project, and the difficulty had been substantially underestimated. That was where we came in.
What We Did
To address this typical edge problem of getting data back to the core, we added a distributed data fabric and used its message transport capabilities to create a simple and reliable solution.
Our goal was to support data acquisition from dozens of miniature edge data centers and transport the resulting data reliably and with reasonably low latency to the core data center for analysis and billing. Temporary network partitions should not affect data integrity, and data should be transferred as soon as such partitions were healed. The system needed to minimize downtime from physical failures as well. Figure 1 illustrates how the data fabric was used to meet these requirements.
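The partition requirement above is essentially store-and-forward delivery. The sketch below illustrates that idea only; it is not the data fabric's actual mechanism. `EdgeBuffer` and the `send` callable are hypothetical names, and the queue is held in memory, whereas a real system would keep it on durable storage.

```python
from collections import deque
from typing import Callable, Deque


class EdgeBuffer:
    """Queue messages locally and forward them in order when the link is up."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        # `send` is a hypothetical transport function that raises
        # ConnectionError while the edge is partitioned from the core.
        self._send = send
        self._pending: Deque[bytes] = deque()

    def publish(self, message: bytes) -> None:
        self._pending.append(message)  # queue first, then try to deliver
        self.flush()

    def flush(self) -> int:
        """Try to forward pending messages; return how many were delivered."""
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break                    # still partitioned: keep the message
            self._pending.popleft()      # remove only after a successful send
            delivered += 1
        return delivered

    def __len__(self) -> int:
        return len(self._pending)
```

Because a message is removed only after `send` succeeds, nothing is lost during a partition, and calling `flush()` once the link returns delivers the backlog in its original order.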
Previously, our customer’s developers had tried several telemetrics implementations unsuccessfully and were substantially behind schedule as a result. Our data-fabric-based design, in contrast, allowed a very simple transfer program to insert messages into a message stream at the edge. The data fabric then handled all data motion from edge to core. Topics in the message stream recorded the data center name, source machine name, and sensor name or event type, so that all data from all edge centers could be merged into a single message stream while still allowing analytics on any subset of the data. The data fabric handled security of data at rest and in motion.
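The topic scheme described above can be sketched as follows. This is a minimal illustration, not the customer's code: the `/` separator, the `Topic` class, the `select` helper and the sample names are all assumptions made for the example.

```python
from typing import Any, Iterable, List, NamedTuple, Optional, Tuple


class Topic(NamedTuple):
    """Topic name built from data center, source machine and event type."""
    data_center: str
    machine: str
    event: str

    def __str__(self) -> str:
        return "/".join(self)

    @classmethod
    def parse(cls, name: str) -> "Topic":
        data_center, machine, event = name.split("/")
        return cls(data_center, machine, event)


def select(messages: Iterable[Tuple[str, Any]],
           data_center: Optional[str] = None,
           machine: Optional[str] = None,
           event: Optional[str] = None) -> List[Tuple[str, Any]]:
    """Return (topic, payload) pairs matching every field that was given."""
    wanted = (data_center, machine, event)
    return [
        (name, payload)
        for name, payload in messages
        if all(w is None or w == field
               for w, field in zip(wanted, Topic.parse(name)))
    ]


# One merged stream carrying data from several (hypothetical) edge centers.
stream = [
    (str(Topic("dc-east", "host-1", "cpu_temp")), 61),
    (str(Topic("dc-east", "host-2", "playback_start")), 1),
    (str(Topic("dc-west", "host-1", "cpu_temp")), 58),
]
```

With this layout, `select(stream, event="cpu_temp")` pulls one sensor across every edge center, while `select(stream, data_center="dc-east")` narrows the same merged stream to a single site.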
This design was very easy to implement and operate, quickly got the customer’s developers back on schedule, and has been very reliable over a period of several years. A particular benefit of this final data fabric design is a high degree of separation of concerns. Data acquisition at the edge, for instance, is independent of the central programs that process the data. The geography of the data fabric is specified entirely by an administrator, who can now focus on configuring data motion and access control without being concerned about data content.
This separation of concerns means that programs that run at the edge and in the core can be much simpler, focusing on a single problem. This advantage applies across a wide range of use cases.
What Would We Do Differently Today (or not)?
If we were to architect this edge solution today, we would still use the data fabric to move data — keeping the benefits of having a data fabric to deal with all aspects of data security, data motion, replication and fault tolerance with high availability. But today we would also benefit enormously from a fully cloud native implementation for the central software, by using Kubernetes for container orchestration. Five years ago, container orchestration was quite primitive compared to where Kubernetes is today.
A data fabric, with the right capabilities, provides data access and state persistence for containerized applications running under Kubernetes.
A data persistence layer to complement Kubernetes’ role in orchestrating computation is essential to get the full power of cloud native computing.
Data will out-live the containers that process it.
On the edge clusters, however, we would probably change very little, at least not today. Tomorrow might be a different story.
Remaining Challenges: What’s Coming?
We are on the cusp of having edge container orchestration that is seriously useful. This makes right now a brilliant moment for edge computing. Having real edge orchestration makes it feasible to do more advanced processing on the edge systems than we were able to justify in our original design, and makes provisioning of edge clusters easier.
The Edge as a Destination
This use case highlights the common edge problem of data ingress to the core. But what about the egress of data to the edge? We need egress to deploy AI/machine learning models to the edge, to push status information or reporting data to the edge, and to update software running at the edge. A data fabric can make all of these fairly trivial while also minimizing the security risks posed by accessing repositories over the open internet.
Having data move both ways enables us to truly “act locally, but learn globally.”
What About Security?
Security in edge systems is crucially important, because they have an enormous threat surface. At a minimum, it should be impossible to forge new edge clusters, to impersonate existing ones, or to eavesdrop on data moving from the edge to the core. Ideally, this level of security is based on some sort of silicon root of trust that extends all the way down to the lowest hardware level. It should also extend upwards, so that all OS and container images are validated before execution, and data is protected both at rest and in transit without any need for specialized design of applications.
A secure container execution framework running on a secure hardware platform, together with a secure-by-default data fabric, can meet these needs. But anything short of that is likely to be significantly less secure.
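One small ingredient of the "no impersonation" requirement is authenticating every message that claims to come from an edge cluster. The sketch below uses Python's standard `hmac` module and assumes a secret key provisioned per cluster; that key arrangement and the function names are assumptions for illustration. It is not a substitute for the hardware root of trust or the secure-by-default data fabric described above, and it does not address eavesdropping.

```python
import hashlib
import hmac


def sign(cluster_key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag binding the message to the cluster's key."""
    return hmac.new(cluster_key, message, hashlib.sha256).digest()


def verify(cluster_key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag at the core, using a constant-time comparison."""
    return hmac.compare_digest(sign(cluster_key, message), tag)
```

A message signed with one cluster's key fails verification under any other key, and any change to the message after signing also invalidates the tag.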
Key Takeaways
- Far more industries need edge computing than many people think.
- Edge computing is not just about computing or running models at the edge; pulling metrics and operational data back to the core is an almost ubiquitous and commonly overlooked requirement.
- A unified data fabric from edge to core handles the problem of reliable data motion to and from the edge.
- Kubernetes already provides huge benefits for orchestrating containerized computation at the core, and the ability to use Kubernetes at the edge is rapidly maturing.