Storage Considerations for Container Migration

One of the benefits of moving to Kubernetes is that your applications can run in a highly scalable environment. You can add pods quickly when you suddenly need more capacity, and remove them when demand drops.
Historically, when you terminate a container in a stateless application, everything within it is destroyed as its resources are released. But what if you are running a stateful application? In that case, dedicated storage that outlives the container, such as storage for transaction history, is required.
To support stateful applications, Kubernetes offers volumes. When planning and executing a migration to a Kubernetes environment, it is important to understand the technical details of how to use and manage volumes within your deployment.
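To ground the terminology, here is a minimal sketch of how a volume is declared and mounted in a Pod spec; the names and image are illustrative. An emptyDir volume like this one is ephemeral and disappears with the Pod, which is exactly why stateful applications need the persistent volume types discussed below.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-pod            # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.25      # placeholder image
      volumeMounts:
        - name: scratch
          mountPath: /var/cache/app
  volumes:
    - name: scratch
      emptyDir: {}           # ephemeral: deleted along with the Pod
```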
The first adopters of Kubernetes used it to deploy stateless applications. Originally, containers were built to be stateless, as this suited their portable, flexible nature. But as containers came into more widespread use, people began containerizing existing stateful apps, that is, redesigning and repackaging them to run in containers. This gave them the flexibility and speed of containers, combined with the persistent storage and context of stateful applications.
For example, many organizations now deploy Kubernetes clusters to their on-premises data centers, to public and private cloud providers, and to hybrid environments. Kubernetes has also become popular in edge computing, because deploying a lightweight application at the edge lets providers keep the cloud's convenience while improving performance by moving processing closer to the consumer.
As the ways of deploying Kubernetes have multiplied, so has the need for persistent transaction storage. Combining ephemeral containers with persistent storage is a complex problem, and although various solutions have been proposed, tested, and refined, several challenges remain.
One challenge is that the requirements of persistent data stores vary by application. Some storage is meant for long-term use and can tolerate slower access; for example, long-term storage used to archive scientific research results, where access time is not critical. Other storage needs fast response times; for example, real-time retail customer database transactions.
Finally, there is also storage that requires a high degree of resiliency; for example, disaster recovery storage for a financial institution. These varying requirements have created the need for a generic storage abstraction in Kubernetes that makes each of these storage types accessible without introducing so much complexity that it becomes unworkable.
At the core of the persistent storage solution are PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs). A PV defines a unit of storage within the cluster; like the rest of the Kubernetes ecosystem, it is an API object with configurable characteristics. A PVC is a request for access to a defined class of storage. Cluster administrators can define PVs in advance, making them available as static resources within the cluster, and Kubernetes binds each claim to a matching volume. If a PVC requests a type of storage that isn't available, the cluster may dynamically provision a persistent volume to meet the need.
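The following manifests are a minimal, illustrative sketch of the static-provisioning flow: an administrator defines a PV, an application team submits a PVC, and a Pod mounts the claim. The names, sizes, image, and hostPath backend are placeholders (hostPath is suitable only for single-node test clusters).

```yaml
# A cluster administrator defines a PersistentVolume (static provisioning).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo                      # illustrative name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:                          # placeholder backend; test clusters only
    path: /mnt/data
---
# An application team requests storage with a PersistentVolumeClaim.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 10Gi
---
# A Pod mounts the claim; Kubernetes binds the claim to a matching PV.
apiVersion: v1
kind: Pod
metadata:
  name: stateful-app
spec:
  containers:
    - name: db
      image: postgres:16             # placeholder image
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc-demo
```

Because the Pod references only the claim, the application does not need to know which PV, or which underlying storage system, ultimately backs it.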
More recently, Kubernetes added support for the Container Storage Interface (CSI), which builds on the PV/PVC model. CSI is a standard, extensible interface through which storage vendors write plugins that connect Kubernetes cluster resources to block and file storage systems. Using CSI drivers, Kubernetes can access AWS EFS, AWS EBS, Azure Disk, Azure File, IBM Block Storage, and other similar resources.
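With a CSI driver installed, dynamic provisioning is typically configured through a StorageClass that names the driver as its provisioner. The sketch below assumes the AWS EBS CSI driver (ebs.csi.aws.com) is present in the cluster; the class name and parameters are illustrative.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                   # illustrative name
provisioner: ebs.csi.aws.com       # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3                        # EBS volume type
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```

A PVC that specifies storageClassName: fast-ssd then causes the driver to create a matching EBS volume on demand, rather than waiting for an administrator to pre-provision one.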
Let’s explore how to meet these needs.
High Availability
When we speak about availability, we're talking about having storage resources ready when we need them, which means storage that remains accessible even if some of the infrastructure that supports it fails. We're also considering data persistence and disaster recovery, along with backup processes and procedures that allow you to recover from more extensive outages and disasters.
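Two Kubernetes-level building blocks that support this are the reclaim policy, which protects data if a claim is deleted, and volume snapshots, which can feed a backup routine. The sketch below assumes a CSI driver with snapshot support and the external-snapshotter CRDs installed; all names are illustrative.

```yaml
# Retain the underlying volume even if its claim is deleted accidentally.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: resilient                   # illustrative name
provisioner: ebs.csi.aws.com        # assumes a CSI driver that supports snapshots
reclaimPolicy: Retain
---
# A point-in-time snapshot of a claim, usable as part of a backup routine.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: orders-db-nightly           # illustrative name
spec:
  volumeSnapshotClassName: csi-snapclass    # assumes this snapshot class exists
  source:
    persistentVolumeClaimName: orders-db-pvc
```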
Scalability That Ensures Transfer Speed
Transfer speed is closely related to scalability. As an application is scaled up, so is the quantity of data it produces and consumes. If your storage cannot scale alongside your application, it will quickly become the bottleneck that keeps you from reaching your full potential and providing the highest quality user experience.
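One practical piece of this in Kubernetes is volume expansion. Many CSI drivers allow a claim to grow in place if its StorageClass permits it; the class name and provisioner below are illustrative assumptions.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable                 # illustrative name
provisioner: ebs.csi.aws.com       # assumes a CSI driver that supports expansion
allowVolumeExpansion: true         # lets PVCs using this class be resized upward
```

With that in place, growing a volume is a matter of editing the claim's spec.resources.requests.storage to a larger value; the driver and filesystem handle the expansion (shrinking is not supported).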
Data Security
The security of your data is perhaps the most important consideration of all, but security is a double-edged sword. You want authorized users and workloads to be able to access the resources they need efficiently while simultaneously preventing unauthorized access. Identity management systems can be incredibly complex, and they need to strike a balance between being granular enough to secure each component of an extensive system and being straightforward enough to be easy to implement and manage.
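Within the cluster itself, Kubernetes RBAC is the usual way to scope who can see or modify storage objects. The Role and RoleBinding below are a hedged sketch; the namespace, group, and names are illustrative, not prescribed.

```yaml
# Grant read-only access to claims in a single namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pvc-reader                  # illustrative name
  namespace: payments               # illustrative namespace
rules:
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: audit-team-pvc-reader
  namespace: payments
subjects:
  - kind: Group
    name: audit-team                # illustrative group
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pvc-reader
  apiGroup: rbac.authorization.k8s.io
```

Many CSI drivers also let you require encryption at rest through StorageClass parameters (for example, encrypted: "true" with the AWS EBS driver), so that every dynamically provisioned volume is encrypted by default.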
The characteristics of a best-in-class storage provider are similar no matter where your storage resides, and cloud-native storage is no different. Your storage solution needs to be available when your application requires it, able to transfer data at the speed your application needs, and secured against potential threats from both inside and outside your organization.