Managing stateful workloads in Kubernetes is fundamentally different from deploying and scaling stateless microservices. In a previous article that discussed different approaches to running stateful workloads in Kubernetes, I explained Container Attached Storage (CAS) as one of the emerging choices for managing stateful workloads. CAS is fast becoming the preferred choice for running durable, fault-tolerant stateful applications in Kubernetes.
The OpenEBS project, a Cloud Native Computing Foundation (CNCF) Sandbox project, brings CAS to the Kubernetes platform. It can be easily deployed in clusters running on-premises, in managed Kubernetes clusters in the public cloud, and even in air-gapped clusters running in isolated environments.
This article introduces the architecture of OpenEBS and the options it provides for configuring storage backends for cloud native applications.
What is Container Attached Storage?
Storage in Kubernetes is typically maintained outside of the cluster environment. Whether it is a shared filesystem such as NFS or GlusterFS, or block storage such as Amazon EBS, Azure Disks, or GCE PDs, storage is exposed as an external resource. In many cases, storage is tightly integrated with individual nodes as an OS kernel module. Even Persistent Volumes (PVs) are tightly coupled with the underlying modules, making them monolithic, legacy resources.
CAS enables Kubernetes users to treat storage entities as microservices. CAS has two elements: the control plane and the data plane. The control plane is deployed as a set of Custom Resource Definitions (CRDs) that deal with the low-level storage entities. The data plane runs as a collection of Pods close to the workload and is responsible for the actual I/O transactions, translating them into read and write operations.
The clean separation between the control plane and the data plane delivers the same advantages as running microservices on Kubernetes. This architecture decouples persistence from the underlying storage entities, increasing the portability of workloads. It also brings scale-out capabilities to storage, enabling admins and operators to dynamically expand volumes along with the workload. Finally, CAS ensures that the data (PV) and compute (Pod) are always co-located in a hyper-converged mode to deliver the best throughput and fault tolerance.
OpenEBS is a well-architected solution based on the principles of CAS. Let’s take a closer look at the architecture.
OpenEBS, like other cloud native platforms such as the Istio service mesh, follows the best practices of cloud native design and architecture. It has a control plane, a data plane, and tooling that integrates with standard command-line tools such as kubectl and Helm.
The OpenEBS control plane lives close to the storage infrastructure. It manages the lifecycle of storage volumes coming from SAN or block storage attached to each node of the cluster. The control plane is directly responsible for provisioning volumes, initiating snapshots, making clones, creating storage policies, enforcing storage policies, and exporting the volume metrics to external systems such as Prometheus. An OpenEBS storage administrator deals with the control plane to manage cluster-wide storage operations.
The data plane runs close to the workload and sits in the volume I/O path. It runs in user space while managing the lifecycle of PVs and PVCs, and it provides a choice of storage engines with varying capabilities.
OpenEBS Control Plane
The OpenEBS control plane is exposed to the outside world through an API server, which runs as a Pod exposing a REST API for managing resources such as volumes and policies. A YAML file containing the declaration is first submitted to this API server, which initiates the workflow. The OpenEBS API server then calls the Kubernetes API server to schedule volume Pods in the data plane.
The provisioner component of the control plane implements dynamic provisioning through the standard Kubernetes external storage plugin. When an application creates a PVC based on an existing Storage Class, the OpenEBS provisioner creates a PV based on the primitives declared in the Storage Class and binds it to the PVC.
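As a sketch of that flow, a Storage Class and PVC pair might look like the following. The class name is illustrative, and the provisioner string varies by storage engine, so check the OpenEBS documentation for the exact value your engine expects:

```yaml
# Illustrative StorageClass backed by an OpenEBS provisioner.
# The provisioner name below is an example; it differs per engine.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-standard
provisioner: openebs.io/provisioner-iscsi
---
# A PVC referencing the class above; the OpenEBS provisioner
# dynamically creates a matching PV and binds it to this claim.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim
spec:
  storageClassName: openebs-standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

Once the claim is bound, any Pod that mounts `demo-claim` transparently consumes OpenEBS-managed storage.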
The Node Device Manager (NDM) is an important component of the OpenEBS control plane. NDM runs as a DaemonSet, so every node in the Kubernetes cluster runs an NDM Pod that discovers new block storage devices; when a device matches the configured filters, NDM reports it to the NDM operator, which registers it as a block device resource. NDM acts as the conduit between the control plane and the physical disks attached to each node. It maintains the inventory of registered block storage devices in the etcd database, the single source of truth for the cluster.
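NDM's device filters are typically configured through a ConfigMap. The fragment below is a hedged sketch, modeled on the `openebs-ndm-config` ConfigMap shipped with OpenEBS; the exact keys and defaults may differ between releases, so verify against your installed version:

```yaml
# Sketch of an NDM filter configuration (keys modeled on the
# openebs-ndm-config ConfigMap; verify against your OpenEBS release).
apiVersion: v1
kind: ConfigMap
metadata:
  name: openebs-ndm-config
  namespace: openebs
data:
  node-disk-manager.config: |
    filterconfigs:
      - key: os-disk-exclude-filter   # skip the disk hosting the OS
        name: os disk exclude filter
        state: true
      - key: path-filter              # include/exclude devices by path
        name: path filter
        state: true
        include: ""
        exclude: "/dev/loop,/dev/fd0,/dev/sr0"
```

Devices that pass the filters surface as block device resources that the control plane can then pool and provision from.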
OpenEBS Data Plane
If the control plane stays close to the physical storage and the Kubernetes master, the data plane lives close to the workloads and applications running on the nodes. It runs the storage engine that is exposed to the Pods.
A storage engine is the data plane component in the I/O path of a persistent volume. In the OpenEBS architecture, users can choose different storage engines for different workloads based on their characteristics and configuration policy. For example, a high-IOPS, highly available database workload can use a different storage engine than a read-heavy, shared CMS workload.
The OpenEBS data plane offers three choices of storage engines: cStor, Jiva, and Local PV.
cStor is the preferred storage engine of OpenEBS. It is a lightweight, feature-rich storage engine meant for HA workloads such as databases. It provides enterprise-grade features including synchronous data replication, snapshots, clones, thin provisioning, high data resiliency, data consistency, and on-demand expansion of capacity or performance. cStor's synchronous replication delivers high availability even to stateful Kubernetes Deployments that run just a single application replica. When a stateful application requires highly available data, cStor is configured with three storage replicas, and data is written synchronously to all three. Because the data exists on multiple replicas, terminating a Pod and scheduling a new one on a different node does not result in data loss.
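For instance, a cStor-backed Storage Class that requests three synchronous replicas can be expressed roughly as follows. The pool name `cstor-disk-pool` is a placeholder for a StoragePoolClaim you would have created earlier, and the annotation syntax should be checked against the OpenEBS version in use:

```yaml
# Sketch of a cStor StorageClass with three synchronous replicas.
# "cstor-disk-pool" is a hypothetical StoragePoolClaim name.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-cstor-replicated
  annotations:
    openebs.io/cas-type: cstor
    cas.openebs.io/config: |
      - name: StoragePoolClaim
        value: "cstor-disk-pool"
      - name: ReplicaCount
        value: "3"
provisioner: openebs.io/provisioner-iscsi
```

Every PVC created against this class gets a volume whose writes land on three replicas before being acknowledged.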
Jiva was the first storage engine, included in the early versions of OpenEBS. It is the simplest of the available choices and runs entirely in user space with standard block storage capabilities such as synchronous replication. Jiva is ideal for smaller applications running on nodes that may not have the option to add additional block storage devices. However, it is not suitable for mission-critical workloads that demand high performance or advanced storage capabilities.
OpenEBS’ third storage engine is Local Persistent Volume (Local PV). A Local PV represents a disk or host path directly attached to a single Kubernetes node. The Local PV engine creates persistent volumes out of these local disks or host paths on the worker nodes, letting Kubernetes workloads consume high-performance local storage through the familiar volume APIs. Cloud native applications that do not require advanced storage features such as replication, snapshots, or clones can rely on Local PV. For example, a StatefulSet that handles replication and HA by itself can use an OpenEBS Local PV.
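Such a StatefulSet can consume a Local PV through a Storage Class like the sketch below, modeled on the default `openebs-hostpath` class; the class name and base path are illustrative:

```yaml
# Sketch of a hostpath-backed Local PV StorageClass.
# The name and BasePath values are illustrative.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-hostpath-demo
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: "hostpath"
      - name: BasePath
        value: "/var/openebs/local"
provisioner: openebs.io/local
# Delay binding until a Pod is scheduled, so the PV is carved out
# on the same node that runs the consuming workload.
volumeBindingMode: WaitForFirstConsumer
```

The `WaitForFirstConsumer` binding mode matters for local storage: it guarantees the volume is provisioned on whichever node the scheduler picks for the Pod.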
In addition to the above storage engines, the OpenEBS community is building additional engines that are currently in alpha. For instance, the forthcoming MayaStor engine, a low-latency data engine written in Rust, is a few months away and targets applications that require near-disk performance as well as API access to block storage. A variant of Local PV called ZFS Local PV has also been gaining adoption; it addresses limitations of Local PV by adding RAID functionality along with local snapshot and clone support.
Refer to the OpenEBS documentation for a detailed comparison and preferred use cases for each of the storage engines.
OpenEBS extends the benefits of software-defined storage to cloud native environments through the container attached storage approach. It represents a modern way of dealing with storage in the context of microservices and cloud native applications.
In the next part, I will walk you through the steps involved in configuring and deploying OpenEBS in Amazon EKS. Stay tuned!
Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at http://mi2.live.
The Cloud Native Computing Foundation is a sponsor of The New Stack.