Running Stateful Applications in Kubernetes: Storage Provisioning and Allocation
To appreciate how Kubernetes manages storage pools that provide persistence to applications, we need to understand the architecture and the workflow related to application deployment.
Kubernetes is used in various roles: by developers, system administrators, operations, and DevOps teams. Each of these personas, if you will, interacts with the infrastructure in a distinct way. The system administration team is responsible for configuring the physical infrastructure for running the Kubernetes cluster. The operations team maintains the cluster through patching, upgrading, and scaling. DevOps teams use Kubernetes to configure CI/CD, monitoring, logging, rolling upgrades, and canary deployments. Developers consume the API and the resources exposed by the Kubernetes infrastructure. They are never expected to have visibility into the underlying physical infrastructure that runs the master and the nodes.
Developers “ask” for the resources they need to run their applications through a declarative mechanism, typically described in YAML or JSON. The Kubernetes master is responsible for ensuring that the appropriate resources are selected as requested by the developers. But before it can do that, the administrators will need to provision the required compute, storage, and networking capacity.
For example, a developer may ask Kubernetes to schedule a pod backed by SSD storage and powered by a certain number of cores and a certain amount of memory. Assuming the infrastructure has the capacity, the Kubernetes master honors the request by choosing the right node(s) to run the pod.
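As a sketch, such a request might look like the following pod manifest. The pod name, container image, and resource figures here are illustrative, not drawn from the article:

```yaml
# Hypothetical pod spec: the developer declares what the
# application needs, and the scheduler picks a suitable node.
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        cpu: "2"        # two CPU cores
        memory: 4Gi     # four gibibytes of memory
```

Note that the manifest says nothing about which physical server will run the pod; that decision belongs entirely to the Kubernetes master.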
To understand this concept, let’s look at the relationship between a pod and a node. Nodes are pre-provisioned servers configured by the administrators and operations teams. Developers create pods that utilize the compute resources exposed by the nodes.
This architecture of Kubernetes enables clean separation of concerns among developers, administrators, and operations.
Kubernetes Resource Provisioning and Allocation
Let’s now expand the analogy of a pod and node to storage. Before developers can start using the storage, administrators need to provision persistent volumes. Unlike volumes, persistent volumes are not associated with any specific pod or container when they are created. They are pre-provisioned storage resources that can be used by developers during the creation of a pod. Once persistent volumes (PersistentVolume) are provisioned by administrators, developers create a claim (PersistentVolumeClaim) to start consuming the storage resources exposed as persistent volumes.
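As a minimal sketch of this two-step workflow, an administrator might pre-provision a PV backed by an NFS share, and a developer might then claim it. The names, the NFS server address, and the sizes below are illustrative assumptions:

```yaml
# Administrator's side: a pre-provisioned 10Gi NFS-backed volume.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    server: 10.0.0.5        # illustrative NFS server
    path: /exports/data
---
# Developer's side: a claim asking for 10Gi of read-write storage.
# Kubernetes binds it to a matching PV such as pv0001.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

The claim never names a specific PV; the developer only describes the storage needed, mirroring how a pod spec never names a specific node.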
This is very similar to the relationship between the node and pods. Before deploying pods, developers assume that the nodes are provisioned and available. Similarly, before creating a claim, developers assume the availability of persistent volumes. Claims and persistent volumes are to storage what pods and nodes are to compute.
Source: Steve Watt, Red Hat
It’s also important to understand the difference between Kubernetes volumes and persistent volumes. Volumes are similar to Docker volumes, in which containers request persistence at runtime and the host provides it. Volumes can use implicit host-based storage or can explicitly request persistence backed by external block storage devices and distributed file systems. But with volumes, the developer does not expect the storage resource to be pre-provisioned before use. With persistent volumes and claims, there is strict enforcement of resource utilization, dictated by the policy defined during the creation of the resources.
Traditional Kubernetes volumes can take advantage of persistent volumes. Pods can access storage by using the claim as a volume. Claims must exist in the same namespace as the pod using the claim. The volume is then mounted to the host and into the pod.
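A minimal example of a pod consuming a claim as a volume might look like the following. The pod name, image, mount path, and claim name are illustrative:

```yaml
# Hypothetical pod that mounts a PVC as a regular volume.
# The PVC (here "myclaim") must exist in the same namespace.
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /var/www/html   # where the storage appears in the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: myclaim         # references the claim, not the PV
```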
Terminology, Concepts, and Lifecycle
A PersistentVolume (PV) is a networked storage resource in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource.
PVs are volume plugins like Volumes, but they have a distinct lifecycle that is independent of any individual pod that consumes the PV.
A PersistentVolumeClaim (PVC) is a request for storage by a developer. It is similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory). Claims can request a specific storage size and access modes such as read-only and read-write.
PVCs and PVs can be matched through the concept of labels and selectors. During the creation of a PV, administrators can create labels with attributes. PVCs can use selectors to ensure that they are always bound to the matching PVs with matching labels.
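Assuming, for instance, that the administrator labels PVs with a `tier` attribute, a claim could target them with a selector. The label key and values below are hypothetical:

```yaml
# PV labeled by the administrator to mark its storage tier.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-ssd-0001
  labels:
    tier: ssd
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    server: 10.0.0.5      # illustrative backend
    path: /exports/ssd
---
# PVC with a selector: it will only bind to PVs labeled tier=ssd.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ssd-claim
spec:
  selector:
    matchLabels:
      tier: ssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
```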
PersistentVolume types are implemented as plugins. Kubernetes supports popular backends and distributed file systems including Amazon EBS, GCE Persistent Disks, Cinder, Azure File System, NFS, iSCSI, Gluster, and Ceph among other types.
A StorageClass enables administrators to create multiple tiers or classes of storage they offer. Different classes map to different levels of quality-of-service, backup policies, and arbitrary policies as defined by the administrators.
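For instance, an administrator running on Google Compute Engine might define a class backed by SSD persistent disks. The class name and parameters below are illustrative:

```yaml
# A hypothetical "fast" tier using GCE SSD persistent disks.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
```

A second class, say `standard`, could point the same provisioner at magnetic disks, giving developers a simple menu of storage tiers to choose from.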
There are five phases in the life cycle of a persistent volume:
During the provisioning phase, an administrator creates a PV from an existing physical storage pool. Provisioning supports static and dynamic modes. When a developer requests a claim, Kubernetes looks for an existing PV that matches the requirement. This is called static provisioning. When none of the static PVs created by the administrator matches a developer’s claim, the cluster may try to dynamically provision a volume for that PVC. This mode of provisioning is handled dynamically by Kubernetes, provided that the physical resources are available.
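Assuming the administrator has defined a StorageClass named `fast` (a hypothetical name), a developer triggers dynamic provisioning simply by naming that class in the claim; no pre-created PV is required:

```yaml
# Claim requesting dynamic provisioning from the "fast" class.
# Kubernetes creates a matching PV on the fly if resources allow.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim
spec:
  storageClassName: fast
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```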
Binding is the phase where a claim gets bound to a specific persistent volume. By analyzing the PVC, Kubernetes finds a matching PV and associates the two.
After the PVC is bound to a PV, pods will start using claims as volumes. This is when the developer defines the mode (read or write) for accessing the volume.
When an application is done using the volume, developers can delete the PVC object through the API, releasing the claim. This step initiates the reclamation process. Until the claim is deleted, the volume is not freed up for others to use.
The last phase is reclamation, which is governed by a policy defined by the administrators. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Persistent volumes can either be retained, recycled, or deleted. Depending on the storage backend, an appropriate action is taken on the persistent volume. In the case of block storage, the volume may be deleted by invoking the cloud-specific API. In scenarios where the volume was created from distributed storage, a simple scrub command may be issued.
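The policy is set as a field on the PV itself. As an illustrative sketch, an administrator who wants released volumes kept around for manual inspection and cleanup might write:

```yaml
# PV whose data is retained (not deleted or scrubbed) after
# its claim is released; names and paths are illustrative.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0002
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.5
    path: /exports/archive
```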
In the next article in the series, we will take a look at Pet Sets, an evolving concept in Kubernetes for running highly available stateful workloads.