5 Ways Data Protection for Kubernetes Is Different
How does Kubernetes data protection differ from more traditional data protection offerings, given that those products already handle virtualization and cloud scenarios as well as on-premises environments?
Here are five salient differences between such products — differences that go beyond data protection, highlighting some of the fundamental strengths of cloud native computing generally.
Difference No. 1: Data Protection Centering on Metadata vs. Data Protection Centering on Data
At the heart of data protection lies backup and restore functionality. There is more to data protection than these two capabilities, but without backup and restore, there is no protection whatsoever.
In traditional environments, which for the purposes of this article include virtualization and cloud as well as various on-premises environments, backup and restore are focused on persistent data and the storage that contains it.
Kubernetes data protection, in contrast, focuses on metadata as well as the underlying data.
Kubernetes is essentially a declarative, configuration-based container orchestration platform. Protecting those configurations and other metadata, including resource definitions, Helm charts and other files, is central to the Kubernetes data protection challenge.
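To make the metadata-plus-data idea concrete, here is a minimal Python sketch. It is illustrative only: the BackupBundle class, the build_bundle function and the use of an "app" label to identify an application are assumptions made for this example, not the API of any particular product. Resource definitions are represented as plain dictionaries shaped like Kubernetes manifests.

```python
from dataclasses import dataclass, field


@dataclass
class BackupBundle:
    """Hypothetical container pairing an application's metadata with its data."""
    app: str
    manifests: list = field(default_factory=list)         # resource definitions (metadata)
    volume_snapshots: list = field(default_factory=list)  # names of claims whose data needs a snapshot


def build_bundle(app, resources):
    """Collect every resource labeled with `app`, separating metadata from data."""
    bundle = BackupBundle(app=app)
    for res in resources:
        labels = res.get("metadata", {}).get("labels", {})
        if labels.get("app") != app:  # assumption: apps are identified by an "app" label
            continue
        # The definition itself is always captured as metadata.
        bundle.manifests.append(res)
        # A PersistentVolumeClaim additionally points at data that must be snapshotted.
        if res.get("kind") == "PersistentVolumeClaim":
            bundle.volume_snapshots.append(res["metadata"]["name"])
    return bundle


resources = [
    {"kind": "Deployment", "metadata": {"name": "shop-web", "labels": {"app": "shop"}}},
    {"kind": "ConfigMap", "metadata": {"name": "shop-config", "labels": {"app": "shop"}}},
    {"kind": "PersistentVolumeClaim", "metadata": {"name": "shop-db-data", "labels": {"app": "shop"}}},
    {"kind": "Deployment", "metadata": {"name": "blog-web", "labels": {"app": "blog"}}},
]

bundle = build_bundle("shop", resources)
print(len(bundle.manifests), bundle.volume_snapshots)
```

In this sketch a traditional tool would see only the one volume, whereas the bundle also carries the three definitions needed to recreate the application around it.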
Difference No. 2: Dynamic Policies for Auto-Discovered Applications vs. Static Policies for Predefined Applications
Setting up a traditional data protection application consists of establishing a set of backup and restore policies that apply to the various resources the organization wishes to protect. Such policies generally center on snapshots and backup schedules.
In Kubernetes, applications and their microservices components are inherently ephemeral — scaling up and down at a moment’s notice, occasionally appearing and disappearing altogether.
A Kubernetes data protection product must therefore auto-discover applications on the fly to know what data and metadata to protect. The policies that drive such protection must correspondingly be dynamic as well.
Dynamic policies exist at an abstraction layer above static ones, and the underlying technology must interpret them in real time in order to apply them properly in each situation.
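One way to picture a dynamic policy is as a label selector that is re-evaluated against whatever applications currently exist, rather than a fixed list of targets. The following Python sketch assumes that model; the Policy class, the resolve function and the label names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical backup policy that targets applications by label, not by name."""
    name: str
    selector: dict        # labels an application must carry to be covered
    frequency_hours: int


def matches(selector, labels):
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def resolve(policies, discovered_apps):
    """Map each currently discovered app to the names of the policies covering it."""
    return {
        app: [p.name for p in policies if matches(p.selector, labels)]
        for app, labels in discovered_apps.items()
    }


policies = [
    Policy("prod-hourly", {"env": "prod"}, 1),
    Policy("db-daily", {"tier": "database"}, 24),
]

# First discovery pass: only one application exists.
apps = {"shop": {"env": "prod", "tier": "web"}}
first = resolve(policies, apps)

# A new application appears; the same policies cover it without being edited.
apps["orders-db"] = {"env": "prod", "tier": "database"}
second = resolve(policies, apps)
print(first, second)
```

The policies never change between the two passes; only the set of discovered applications does, which is the sense in which the policy sits one abstraction layer above a static target list.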
Difference No. 3: Dynamic, Policy-Driven Automation vs. Static, Manually Configured Automation
When the Kubernetes environment interprets and applies policies, what it’s really doing is automating workflows that those policies specify.
Traditional data protection technologies also feature policy-driven automation, but those automations are as static as the policies themselves.
Kubernetes thus requires a rethink of what automation means — instead of a simple flowchart of “do this, make a decision, and do that” logic, cloud native automation is inherently dynamic, with logic that might change from moment to moment.
This revamped notion of automation applies to data protection as well as other Kubernetes automation scenarios.
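A common way to express this kind of automation is a reconciliation step that compares the desired state with the observed state and derives the actions needed at that moment. The Python sketch below assumes that pattern; the reconcile function and the "schedule" and "cancel" action names are hypothetical.

```python
def reconcile(covered_apps, scheduled_jobs):
    """Return the actions needed so that scheduled backup jobs match covered apps."""
    covered = set(covered_apps)
    scheduled = set(scheduled_jobs)
    # Apps that policies now cover but that have no backup job yet.
    actions = [("schedule", app) for app in sorted(covered - scheduled)]
    # Jobs left over for apps that no longer exist or are no longer covered.
    actions += [("cancel", app) for app in sorted(scheduled - covered)]
    return actions


# The application set changed since the last pass: "orders-db" appeared, "legacy" went away.
actions = reconcile({"shop", "orders-db"}, {"shop", "legacy"})
print(actions)

# Once the actions have been applied, a second pass finds nothing left to do.
settled = reconcile({"shop", "orders-db"}, {"shop", "orders-db"})
print(settled)
```

Unlike a fixed flowchart, the list of actions is recomputed on every pass, so the workflow follows the environment as it changes.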
Difference No. 4: Application-Specific vs. Volume-Specific Data Protection
Because traditional data protection centers on data and storage, operators logically focus on backing up and restoring databases and storage volumes.
Volumes, in fact, are the common denominator for all traditional data protection, since backing up and restoring them means backing up and restoring anything stored or installed on them, including databases, files or application components.
Kubernetes, in contrast, maintains a comprehensive, declarative abstraction of the entire persistence tier.
Kubernetes application components are stateless by design, given their ephemerality and the stringent performance requirements that apply to such applications.
Nevertheless, Kubernetes applications must typically maintain state without affecting these core characteristics. The platform overcomes this challenge via configuration-based abstractions.
This abstraction-based state management means that Kubernetes data protection cannot take place at the volume layer. It must take place at the application layer instead because only applications know what data they need and when. Details about storage have been fully abstracted away.
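As a rough illustration of working at the application layer, the Python sketch below starts from workload definitions and follows their volume references to find the data each application depends on. The claims_by_app function and the "app" label are assumptions for this example; the nested fields mirror the way a pod template references a PersistentVolumeClaim by claimName.

```python
def claims_by_app(workloads):
    """Group PersistentVolumeClaim names by the application whose pods mount them."""
    grouped = {}
    for workload in workloads:
        app = workload["metadata"]["labels"]["app"]  # assumption: "app" label names the application
        volumes = workload["spec"]["template"]["spec"].get("volumes", [])
        for volume in volumes:
            pvc = volume.get("persistentVolumeClaim")
            if pvc:  # skip ephemeral volumes such as emptyDir
                grouped.setdefault(app, []).append(pvc["claimName"])
    return grouped


workloads = [
    {
        "kind": "Deployment",
        "metadata": {"name": "orders-db", "labels": {"app": "orders"}},
        "spec": {"template": {"spec": {"volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": "orders-data"}},
            {"name": "tmp", "emptyDir": {}},
        ]}}},
    },
    {
        "kind": "Deployment",
        "metadata": {"name": "orders-api", "labels": {"app": "orders"}},
        "spec": {"template": {"spec": {"volumes": [
            {"name": "uploads", "persistentVolumeClaim": {"claimName": "orders-uploads"}},
        ]}}},
    },
    {
        "kind": "Deployment",
        "metadata": {"name": "shop-web", "labels": {"app": "shop"}},
        "spec": {"template": {"spec": {}}},
    },
]

grouped = claims_by_app(workloads)
print(grouped)
```

The unit of protection that falls out of this is "the orders application and its two claims", not a list of anonymous volumes.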
Difference No. 5: Application Recovery vs. Data Recovery
The most important principle of data protection is that your backups are only as good as your ability to recover from them.
While data recovery makes up most of the traditional recovery story, Kubernetes recovery involves recovering the data, resources and configuration that together make up a running application.
Such recovery involves the automated orchestration of several dynamic, policy-driven tasks. That is a tricky proposition, because the goal of such recovery isn’t simply to avoid data loss, but to keep applications running in production while minimizing any adverse impact on their users.
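One small piece of that orchestration is deciding the order in which backed-up resources are recreated, so that a workload is not started before its configuration and storage exist. The Python sketch below shows one plausible ordering; RESTORE_ORDER and restore_plan are hypothetical, and a real product may sequence things differently.

```python
# Assumed ordering for this sketch: containers for other objects first,
# then configuration, then storage claims, then networking, then workloads.
RESTORE_ORDER = [
    "Namespace",
    "Secret",
    "ConfigMap",
    "PersistentVolumeClaim",
    "Service",
    "StatefulSet",
    "Deployment",
]


def restore_plan(manifests):
    """Sort backed-up manifests so dependencies are recreated before their users."""
    rank = {kind: index for index, kind in enumerate(RESTORE_ORDER)}
    unknown = len(RESTORE_ORDER)  # kinds not listed above are restored last
    return sorted(
        manifests,
        key=lambda m: (rank.get(m["kind"], unknown), m["metadata"]["name"]),
    )


backup = [
    {"kind": "Deployment", "metadata": {"name": "shop-web"}},
    {"kind": "PersistentVolumeClaim", "metadata": {"name": "shop-data"}},
    {"kind": "ConfigMap", "metadata": {"name": "shop-config"}},
    {"kind": "Namespace", "metadata": {"name": "shop"}},
]

plan = restore_plan(backup)
print([m["kind"] for m in plan])
```

Restoring the volume data itself would slot in between the claim and the workload steps; the sketch covers only the ordering of the definitions.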
Once you understand the full complexity of such automations, it becomes clear why Veeam acquired Kasten. Delivering on the full Kubernetes data protection value proposition, as the Kasten K10 data management platform does, is no simple task.
The Intellyx Take
Highlighting the differences between traditional and Kubernetes data protection inevitably highlights the differences between traditional and cloud native computing.
Cloud native infrastructure requires a comprehensive declarative abstraction layer that hides the details of storage and data and enables stateless application behavior while still managing state.
An important benefit of this cloud native approach is a clear distinction between control and data planes — the configurations that drive the behavior of applications vs. those that concern the movement of data.
Data protection must work at both layers: first, by moving data as part of the backup and restore processes, and second, by taking part in the automated orchestrations that drive application behavior.