Maintaining Data Resiliency in the Age of Kubernetes
New approaches to applications call for new approaches to data management. Ten years ago, many organizations could stay on top of their data with traditional databases, data warehouses and data recovery methodologies. Today, data is being tightly coupled with applications through Linux containers. Increasingly, those containers are being orchestrated through Kubernetes, which, according to the latest annual survey by the Cloud Native Computing Foundation (CNCF), is being used in production by 83% of respondents.
Does that mean that if you’re using Kubernetes, you need to rethink your approach to data resilience and recovery? Not quite. In fact, you can build on existing best practices and lessons learned to keep your data readily available and protected in case of an adverse event, even in a Kubernetes environment.
Let’s look at why that is and four things you should do to optimize data resilience in your Kubernetes environment.
Out with the New, in with the Old
Kubernetes delivers new levels of efficiency through microservices and automation. It enables you to get your software to market faster and with greater agility, giving you a competitive advantage.
Kubernetes also relies on persistent storage volumes to save and share data between containers. Persistent storage addresses the short-lived nature of containers by keeping data on volumes that exist independently of any one container, so the data remains intact even after the container has wound down. That might make you think you need to throw out everything you know about disaster recovery and business continuity. But the truth is, the data resilience knowledge and capabilities of your IT managers, data storage experts, backup administrators and application developers still apply. You simply need to add Kubernetes to your data resilience portfolio alongside the cloud, bare-metal servers and whatever other environments you support.
In fact, a key advantage of open source technologies like Kubernetes is that they’re designed to support changing needs without you having to reinvent the wheel when a new technology comes along. What’s more, many of the latest container storage platforms include built-in data resilience functionality for cloud native workloads. These capabilities enable you to extend existing data protection solutions and infrastructure for these workloads across hybrid and multicloud environments.
4 Data Resiliency Best Practices
Once you recognize that Kubernetes doesn’t require you to completely rethink data resilience, the question becomes: What essential best practices for data availability do you need?
Whether or not you’re using Kubernetes, optimizing data resilience starts with understanding your organization’s unique business-continuity needs. It continues with investing in the level of data protection and disaster recovery appropriate to each workload. That investment involves four levels of data protection:
- Snapshots: Point-in-time snapshots are your first line of defense. They’re useful prior to upgrades, for example, because they enable you to quickly and easily restore systems to a prior state.
- Backup and restore: Strategies such as backing up data to the cloud or taking advantage of cloud providers’ object storage services enable you to access backup data quickly when you need it. Effective data protection APIs should let you correctly restore data and applications that run in container pods.
- Disaster recovery: Disaster recovery is necessary whether you face a relatively small mishap, such as losing a rack in a data center, or a major event such as an earthquake. Typically, you need to replicate data in geographically separated locations. Note that stateful applications require more sophisticated disaster recovery than stateless applications.
- Business continuity: The most mature level of data protection can also be the most costly. Investment in business continuity is appropriate for mission-critical applications, such as those that handle customer order entry or financial transactions. There are two options to consider: synchronous mirroring and asynchronous replication.
Synchronous mirroring is appropriate for the most critical data you absolutely can’t lose. With this approach, each time data is written to a local disk, it’s also written to a remote disk. If confirmation of the change isn’t received from the remote site within a tight time tolerance, the system retransmits the change or declares a failure. Because every write waits on the remote site, this approach depends on a fast, low-latency link, typically a fiber-optic network.
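To make the write path concrete, here is a minimal Python sketch of the synchronous pattern described above. It is an illustration only, not how any particular storage platform implements mirroring: the names SyncMirror, RemoteTimeout, MirrorFailure and make_flaky_remote are hypothetical, in-memory dictionaries stand in for disks, and a simulated timeout stands in for a missed confirmation.

```python
class RemoteTimeout(Exception):
    """Hypothetical: the remote site did not confirm a write in time."""


class MirrorFailure(Exception):
    """Hypothetical: the mirror gave up after exhausting its retries."""


class SyncMirror:
    """Acknowledges a write only after the remote copy confirms it."""

    def __init__(self, remote_write, max_attempts=3):
        self.local = {}                   # stands in for the local disk
        self.remote_write = remote_write  # callable that writes to the remote site
        self.max_attempts = max_attempts

    def write(self, key, value):
        # Write locally, then block until the remote site confirms.
        self.local[key] = value
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.remote_write(key, value)
                return attempt            # number of attempts it took
            except RemoteTimeout:
                continue                  # no confirmation: retransmit
        # Still no confirmation after every attempt: declare a failure.
        raise MirrorFailure(f"remote site never confirmed {key!r}")


def make_flaky_remote(store, failures):
    """Build a simulated remote site that times out `failures` times first."""
    remaining = [failures]

    def remote_write(key, value):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RemoteTimeout(key)
        store[key] = value

    return remote_write
```

For example, with a simulated remote that misses one confirmation, SyncMirror(make_flaky_remote(remote_disk, 1)).write("order-1", "paid") succeeds on the second attempt and leaves both copies identical. The caller never sees a successful write that exists at only one site, which is what makes the approach suitable for data you can’t lose.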
Asynchronous replication is better suited to important but less-critical workloads. In this case, the system writes data to local storage first, then replicates it at predetermined intervals to a remote site.
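The asynchronous pattern can be sketched in the same hedged way. In this illustration, the AsyncReplicator class and its method names are hypothetical, dictionaries again stand in for storage, and an explicit replicate() call stands in for whatever scheduler would run at the predetermined interval.

```python
class AsyncReplicator:
    """Acknowledges writes locally and ships them to the remote site later."""

    def __init__(self):
        self.local = {}     # stands in for local storage
        self.remote = {}    # stands in for the remote site
        self._pending = {}  # changes made since the last replication cycle

    def write(self, key, value):
        # The write completes as soon as local storage has it.
        self.local[key] = value
        self._pending[key] = value

    def replicate(self):
        # One replication cycle: ship everything that changed since the last one.
        shipped = len(self._pending)
        self.remote.update(self._pending)
        self._pending.clear()
        return shipped

    def unreplicated(self):
        # Changes that would be lost if the local site failed right now.
        return dict(self._pending)
```

Writes return immediately, so the application never waits on the remote site. The trade-off is visible in unreplicated(): anything written since the last cycle exists only locally until the next one runs, which is why this approach fits important but less-critical workloads.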
Kubernetes can help your organization become more nimble and competitive. Even better, it doesn’t require you to overhaul your data resilience strategy. But you do need to make sure you have the right level of resilience for every application in your containerized environment. Fortunately, you probably already know how to do that.