3 Reasons to Bring Stateful Applications to Kubernetes
There are cloud native purists who believe that stateful applications do not belong in containers. Stateful applications — applications that retain persistent data from transaction to transaction — break the “processes” factor of the twelve-factor app methodology, which states that apps should be executed as “one or more stateless processes.” This factor led many applications to be refactored to move the responsibility of remembering state to the client. For example, an e-commerce shopping cart app can use session cookies to store a shopper’s purchases until it is time to transact.
However, the 12-factor framework also states that “any data that needs to persist must be stored in a stateful backing service, typically a database.” So, when this shopping cart app needs to verify inventory before proceeding, it has to pull that data from a database somewhere.
Even though containers weren’t originally designed for databases, data analytics and data processing applications, these stateful application components can now be deployed into Kubernetes environments. And, while containers are still more likely to house stateless apps, there are three major reasons we’re seeing a rise in stateful apps in containers.
1. Everyone Benefits from Agility and Portability
Software developers were the first group to rapidly adopt containers as a way to accelerate microservice application development. Packaging microservices in containers made it easier to work on applications in a local environment and rapidly iterate on code. In contrast to legacy monolithic applications, developers had a way to push code changes more frequently and deliver more features — without long delays for compiling and building applications. With Kubernetes as a standard orchestration tool, developers could also ship those applications to different environments without worrying about compatibility issues and differences in infrastructure.
Today, containers and Kubernetes projects are being initiated by both developers and IT operations teams. In addition to developer agility, operations and SRE teams recognize the benefits of Kubernetes, including:
- Greater resiliency: Containerized applications can be rapidly restarted to resolve issues. If there are any software or hardware failures that affect a node, applications are simply restarted on a different node.
- Reduced problem resolution time: The immutability of containerized applications makes it simple to patch and update applications, or to roll them back to a previous working version.
- Improved automation: Kubernetes supports a declarative model, which allows it to scale more effectively with reproducible results. Built-in self-healing and API-driven interfaces allow for easier implementation of Blue-Green deployments.
- Greater portability: With Kubernetes now a widely adopted standard, applications are truly portable across different infrastructure, with a common set of APIs across clouds and on-premises environments.
While stateless microservices made up the majority of early Kubernetes projects, all of the benefits listed above apply across application types. Developers still need to iterate on database designs, and operations teams still want easy ways to update and roll back data processing applications and quickly recover from issues. As a result, over the last year we’ve seen a rapid increase in tools and solutions that support stateful applications in Kubernetes, which in turn has encouraged more enterprises to containerize their stateful applications. In fact, a recent survey by 451 Research found that a majority of enterprises (55%) report that stateful applications make up more than half of their containerized applications. That share is expected to grow as more and more stateful applications are containerized.
2. Storage in Kubernetes Is Improving
The initial release of Kubernetes had limited support for complex stateful applications, but the Kubernetes community has been rapidly innovating in this area. Here’s a look at some of the key innovations that have made stateful applications possible, both within the Kubernetes framework and through extensions to Kubernetes.
From the beginning, Kubernetes supported persistent storage through the PersistentVolume (PV) and PersistentVolumeClaim (PVC) APIs. A PersistentVolume is a storage volume with a lifecycle independent of any individual Pod that uses it. These volumes are created by an administrator of the system and can be backed by a variety of storage systems, including Amazon EBS, NFS and Ceph.
A PersistentVolumeClaim (PVC) is a user’s request for storage. The request includes the size of the volume and the access mode required — ReadWriteOnce (read-write, mounted by a single node), ReadWriteMany (read-write by many nodes), or ReadOnlyMany (read-only by many nodes).
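As a minimal sketch of the two objects working together (the names, NFS server and sizes are illustrative, not from any real environment), an administrator-provisioned PV and a matching claim might look like:

```yaml
# Administrator-created PersistentVolume backed by NFS (illustrative values)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-reports
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce            # read-write, mountable by a single node
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.com    # hypothetical NFS server
    path: /exports/reports
---
# User-created PersistentVolumeClaim requesting storage of that shape;
# Kubernetes binds it to a PV that satisfies the request
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: reports-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

A Pod then references the claim by name in its `volumes` section; it never needs to know which backing system the administrator chose.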
Every PV is backed by some storage system. In the early days of Kubernetes, the interface to different storage infrastructure was handled through volume plugins. Different volume plugins were created to support different storage solutions, including each of the major public clouds, iSCSI and NFS. But the original architecture required these “in-tree” plugins to be checked into the core Kubernetes project — each with its own unique requirements. In 2015, Diamanti contributed the FlexVolume plugin, which enabled third-party storage providers to present volumes to Kubernetes in a consistent way. This influenced the creation of the Container Storage Interface (CSI) in more recent years, allowing new storage solutions from different vendors to enter the market.
In most enterprise environments, different applications require different storage characteristics for price and performance reasons. In 2017, the StorageClass object became generally available in Kubernetes. A StorageClass provides a way for administrators to describe the “classes” of storage they offer and present different options to developers. Along with this concept came dynamic provisioning, where a new volume is created on demand to satisfy a PVC, rather than an administrator pre-provisioning a pool of PVs for the system to match claims against. This gave users more flexibility to align applications with the best-suited storage type.
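As a sketch of how this looks in practice (the class name and EBS parameters are illustrative assumptions), an administrator might publish a “fast” tier, and a developer’s claim simply names the class — the volume is then provisioned on demand:

```yaml
# StorageClass describing a "fast" tier of storage (illustrative values)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/aws-ebs   # in-tree AWS EBS provisioner
parameters:
  type: gp2
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
# A PVC referencing the class; a matching EBS volume is created on demand
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  storageClassName: fast
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```

The developer never touches the provisioner details; changing tiers is a one-line edit to `storageClassName`.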
In the early days, while volumes could persist independently of Pods, it was still quite difficult to reattach storage volumes to Pods that were restarted on different nodes in the cluster. In 2016, Kubernetes introduced the alpha concept of “PetSets,” which was renamed StatefulSets and reached general availability in 2017. A StatefulSet is a workload API object that maintains a sticky identity for each of its Pods and binds each Pod to its persistent volumes, so that a volume can be reattached to a Pod restarted on a different node. This development is very important for maintaining state within a cluster: an application like a database can now survive a Pod being shut down.
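A minimal StatefulSet sketch illustrates the sticky identity (the database image, sizes and names are assumptions for illustration): each replica gets a stable name (`db-0`, `db-1`, …) and its own PVC, which is reattached wherever that Pod is rescheduled.

```yaml
# Minimal StatefulSet sketch (illustrative values): stable Pod identity
# plus a per-Pod volume claim that survives rescheduling
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db              # headless Service providing stable network identity
  replicas: 2
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:13
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:        # one PVC per Pod (data-db-0, data-db-1, ...)
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
```

If `db-1` is evicted and restarted on another node, it comes back as `db-1` with its original claim `data-db-1` reattached — which is exactly what a database needs to survive a Pod shutdown.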
Container Storage Interface (CSI)
As discussed earlier, volume plugins did not scale with a growing storage ecosystem, so CSI was created to provide a common storage interface for Kubernetes. CSI became generally available in December 2018, giving third-party storage providers the ability to write plugins that interoperate with Kubernetes without touching the core code. This has initiated the latest wave of innovation, as commercial vendors are able to introduce more advanced functionality to support production deployments.
3. Stateful Applications in Kubernetes Are Now Production-Ready
Stateless and stateful applications have very different requirements for “production readiness” – the most important being how state and data are protected and preserved.
In the case of stateless applications, whatever issue arises — whether related to the node, the Pod, the network, or a hardware failure — Kubernetes simply stops the application and restarts it somewhere else. This resolves a large share of common problems, and it is possible because every containerized application is backed by an immutable image and declarative YAML files, typically stored in an artifact repository like Docker Hub, Artifactory or Harbor. As long as these files are intact, the same application can be restarted on a different node in the cluster. Because the application has no state, it can even be started in an entirely different cluster in a different location, as long as that cluster has access to the same files. It doesn’t rely on any pre-existing data.
This is a very powerful benefit of Kubernetes: it allows stateless applications to be highly resilient and portable across different clusters and different infrastructures.
However, when you consider stateful applications like databases or AI/ML applications, this becomes much more complicated. Besides ensuring that the artifact repository is intact, we now have to ensure that the data itself is highly available and resilient. This requires thinking about all the different types of failure modes that may occur and having a full set of data services to address each type of failure mode.
Like traditional data center environments, these applications need integrated backup and restore capabilities, as well as volume snapshots, to survive the occasional disk failure or node failure. However, many organizations also want to protect against a rack failure, so the ability to stretch clusters across different availability zones is important. This can be accomplished via synchronous mirroring, whereby data is automatically replicated across nodes in a single stretched cluster. Finally, enterprises like banks and medical facilities also want to have site resiliency, which means having the ability to also send data to another location — through asynchronous replication and disaster recovery services.
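For the snapshot piece, the CSI ecosystem exposes a declarative API. As a sketch (the class and claim names are illustrative, and a CSI driver with snapshot support must already be installed in the cluster), taking and restoring a snapshot of a PVC looks like:

```yaml
# Snapshot an existing PVC via the CSI snapshot API (illustrative names)
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: db-data-snap
spec:
  volumeSnapshotClassName: csi-snapclass   # assumed snapshot class
  source:
    persistentVolumeClaimName: db-data     # assumed existing claim
---
# Restore: a new PVC whose contents are populated from the snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data-restore
spec:
  dataSource:
    name: db-data-snap
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```

Because snapshots are ordinary API objects, they can be scheduled and managed by backup tooling just like any other Kubernetes resource.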
Many of the complex stateful applications being deployed to Kubernetes are also I/O intensive. Data processing applications like Splunk or Elasticsearch, messaging applications like Kafka, and the aforementioned databases and AI/ML workloads, all put immense pressure on the system. To deliver production-level performance for these applications, enterprises also want to consider the performance of their chosen storage system. Low latency storage with guaranteed Quality of Service can often improve application performance and even reduce costs by being more efficient. For example, Splunk customers can ingest more data and gather more real-time insights with a more efficient storage system.
CSI and other developments in the Kubernetes storage ecosystem have sparked a renaissance, with more advanced cloud native storage solutions being introduced to deliver these capabilities. Whether by providing data protection and resiliency comparable to today’s virtualized environments, or by delivering high-performance, low-latency storage natively in Kubernetes, the options available to enterprises now make it possible to support even the most complex stateful applications in Kubernetes — with the same performance and resiliency they have in traditional environments, plus the added benefits of agility and portability.
The Next Challenge for Stateful Applications
So does this mean the job is done? Are stateful applications on par with stateless applications in Kubernetes? Not quite yet, but the gap is closing.
As mentioned earlier, a stateless application can be very easily restarted in a different cluster — it may even run in a different cloud, as long as the artifacts are available. This is still a challenge for stateful applications, where the data in the volume would need to be ported to a different cluster as well. Diamanti is addressing this challenge with Diamanti Spektra 3.0, which allows data to be replicated to other Kubernetes clusters — including cloud-based clusters.
It’s an exciting time to be in this space and there’s less reason to hold back from containerizing stateful applications today than ever before. Kubernetes is not just for the purists anymore.