Specialization Is Kubernetes’ Next Frontier

13 Jul 2020 11:44am, by Niraj Tolia
Niraj Tolia is the CEO and co-founder at Kasten and is interested in all things Kubernetes. He has played multiple roles in the past, including the Senior Director of Engineering for Dell EMC's CloudBoost family of products and the VP of Engineering and Chief Architect at Maginatics (acquired by EMC). Niraj received his Ph.D., MS, and BS in Computer Engineering from Carnegie Mellon University.

You wouldn’t harness a mule to pull a heavy-duty tractor. In the same way, you should not expect to use technologies built for legacy systems to support today’s dynamic Kubernetes environments.

Kubernetes has quickly become the enterprise platform of choice for all sorts of cloud native applications and infrastructure. There are many good reasons for its popularity. Cloud native technologies, Kubernetes in particular, are central to the development and delivery of advanced applications and services that fuel today’s digital economy.

In addition, because these applications and services frequently depend on both relational and non-relational (NoSQL) databases, many developers also rely on Kubernetes for its storage capabilities. Kubernetes takes away a lot of the pain of ensuring high availability and scalability of applications and services, but these benefits, unfortunately, do not extend to data. In Kubernetes environments, data management must be a critical priority, but legacy data management technologies, which cover operations like backup and restore, disaster recovery and application migration, cannot keep up with the inherent agility, scalability and performance of cloud native systems.

Because of rapid application growth and increased production deployments at scale, many enterprises have to maintain a focus not just on the application development lifecycle, but also on “Day 2” operations and the challenges of running applications and services in production. These include data management, security, and observability. Although Kubernetes’ capacity for data replication and portability can enhance a system’s reliability, it doesn’t protect developers and operators against infrastructure failures, data corruption, or data loss. In fact, if a coding error accidentally deletes a database, that error will be faithfully replicated along with everything around it, leading to further data loss. Without a separate and appropriate data management system in place, an enterprise’s high-value data, and the applications and services that rely on it, can remain exposed, creating widespread business risk.
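The difference between replication and backup can be illustrated with a small sketch. It is a deliberately simplified in-memory model, not a real storage system: the class `ReplicatedStore` and its methods `snapshot` and `restore` are hypothetical names chosen for illustration and do not correspond to any Kubernetes or vendor API.

```python
import copy


class ReplicatedStore:
    """Toy model (hypothetical): every write or delete is applied to all replicas."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]

    def put(self, key, value):
        for replica in self.replicas:
            replica[key] = value

    def delete(self, key):
        # A mistaken delete is replicated just as faithfully as a valid write.
        for replica in self.replicas:
            replica.pop(key, None)

    def snapshot(self):
        # Independent point-in-time copy, kept outside the replica set.
        return copy.deepcopy(self.replicas[0])

    def restore(self, backup):
        for i in range(len(self.replicas)):
            self.replicas[i] = copy.deepcopy(backup)


store = ReplicatedStore()
store.put("orders", ["order-1", "order-2"])
backup = store.snapshot()

store.delete("orders")  # simulated coding error
lost_everywhere = all("orders" not in r for r in store.replicas)

store.restore(backup)
recovered = all(r.get("orders") == ["order-1", "order-2"] for r in store.replicas)
```

In this model, replication alone leaves no copy of the deleted data on any replica; only the separately held snapshot allows recovery.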

These dangers are more than theoretical. A recent study by the Cloud Native Computing Foundation revealed that more than 40% of its survey respondents were using Kubernetes for storage, and 55% of the remaining respondents were planning to do so. There are far more at-risk workloads and datasets running on Kubernetes today than most IT folks realize, and as that trend continues, opportunities for the introduction of risk multiply.

Of course, the temptation to delegate backup and restore responsibilities to legacy tools built for legacy infrastructure is understandable: they have worked in the past, they are immediately available, and they require no additional investment. However, experience shows that when data is lost, the lengthy recovery time associated with error-prone legacy methods of data reconstruction adds significant cost and risk to an already stressful situation.

To protect business-critical data in cloud native, Kubernetes-orchestrated environments, two things are needed: a Kubernetes-native data management strategy, one specialized to work within scalable, agile and dynamic environments, and a comprehensive, redundant and separate copy of the full dataset maintained in an independent location. Kubernetes uses its own placement policy to distribute application components across all servers for fault tolerance and performance. Further, different applications are often co-located on the same server. As a result, traditional data management systems that operate at the server level fall short: they cannot capture the state of just one particular application without pulling in unrelated applications.
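A minimal sketch of why server-level capture pulls in unrelated applications when components are co-located. The pod names, node names and helper functions here are all hypothetical, and the cluster inventory is a hard-coded list rather than data read from a real cluster.

```python
# Hypothetical inventory: each pod carries an app label and the node it was placed on.
PODS = [
    {"name": "shop-web-1", "app": "shop", "node": "node-a"},
    {"name": "shop-db-0", "app": "shop", "node": "node-b"},
    {"name": "blog-web-1", "app": "blog", "node": "node-a"},
    {"name": "blog-db-0", "app": "blog", "node": "node-c"},
]


def nodes_hosting(pods, app):
    """Set of nodes that run at least one component of the given application."""
    return {p["node"] for p in pods if p["app"] == app}


def capture_by_node(pods, nodes):
    """Infrastructure-centric capture: everything found on the given servers."""
    return sorted(p["name"] for p in pods if p["node"] in nodes)


def capture_by_app(pods, app):
    """Application-centric capture: only components that belong to the app."""
    return sorted(p["name"] for p in pods if p["app"] == app)


shop_nodes = nodes_hosting(PODS, "shop")
node_level = capture_by_node(PODS, shop_nodes)  # also contains a blog component
app_level = capture_by_app(PODS, "shop")        # contains shop components only
```

Because `blog-web-1` shares `node-a` with a shop component, the server-level capture includes it, while the application-level capture does not.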

Accordingly, the workloads Kubernetes orchestrates are constantly being rebalanced, so their IP addresses keep changing. A backup solution needs to understand and follow that pattern. But most legacy solutions operate on the assumption that the definitions and addresses of applications and data are fixed and stable. A Kubernetes-native backup solution, on the other hand, is capable of discovering and capturing dynamic applications and all their associated content wherever they are.
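The contrast between fixed addresses and application identity can be sketched as follows. This is an illustrative model only: the pod records, IP addresses and the `reschedule` helper are hypothetical, and rebalancing is simulated by assigning new addresses to an in-memory list.

```python
def reschedule(pods, new_ips):
    """Return a copy of the pod list with new IPs applied (simulates rebalancing)."""
    return [dict(p, ip=new_ips.get(p["name"], p["ip"])) for p in pods]


def find_by_ip(pods, ips):
    """Legacy-style lookup: match workloads against a recorded set of addresses."""
    return sorted(p["name"] for p in pods if p["ip"] in ips)


def find_by_label(pods, app):
    """Application-centric lookup: match workloads by their app label."""
    return sorted(p["name"] for p in pods if p["labels"].get("app") == app)


before = [
    {"name": "shop-db-0", "ip": "10.0.1.5", "labels": {"app": "shop"}},
    {"name": "shop-web-1", "ip": "10.0.2.7", "labels": {"app": "shop"}},
]
# What a backup job that assumes fixed addresses would have recorded.
static_targets = {p["ip"] for p in before}

after = reschedule(before, {"shop-db-0": "10.0.3.9", "shop-web-1": "10.0.4.2"})

stale_result = find_by_ip(after, static_targets)  # recorded addresses no longer match
label_result = find_by_label(after, "shop")       # identity follows the application
```

After the simulated rebalancing, the address-based lookup finds nothing, while the label-based lookup still locates both components.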

Kubernetes, at its core, is application-centric rather than infrastructure-focused. Because of this, it lends itself to high-velocity application development that supports the speed of digital innovation. As such, the DevOps philosophy adopted in parallel with Kubernetes cedes control over both infrastructure and deployments to the developer. This immense power, however, comes with the risk that even a simple configuration error, if left unchecked, could delete critical data and lead to business disruption.

This is why ensuring the continuity of Day 2 operations is so critical. To safeguard against these risks, appropriate backup is necessary. Systems that can automatically discover new as well as changed applications and do so without obliging developers to change either their work processes or their tools are a major value add. In Kubernetes environments, this is possible through native APIs that support security protocols, like authentication and authorization, and CI/CD and workflow integrations.
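One way to picture automatic discovery is as a comparison between two scans of the cluster. The sketch below assumes each scan yields a mapping from application name to some version marker; the function `diff_inventory`, the application names and the version strings are hypothetical, and in a real system the inventory would come from the Kubernetes API rather than hard-coded dictionaries.

```python
def diff_inventory(previous, current):
    """Compare two {app_name: version} mappings and report what needs attention."""
    new = sorted(name for name in current if name not in previous)
    changed = sorted(
        name for name in current
        if name in previous and current[name] != previous[name]
    )
    removed = sorted(name for name in previous if name not in current)
    return {"new": new, "changed": changed, "removed": removed}


# Hypothetical results of two consecutive scans.
last_scan = {"shop": "v12", "blog": "v3"}
this_scan = {"shop": "v13", "blog": "v3", "analytics": "v1"}

report = diff_inventory(last_scan, this_scan)
```

Here the newly deployed `analytics` application and the updated `shop` application are both flagged without any change to how developers deploy them.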

All of this reflects the fact that Kubernetes is a fundamentally different kind of computing platform. It is more complex than previous systems, unfamiliar to many IT people who grew up in conventional data center operations, and its administrative responsibilities are distributed in unfamiliar ways. On top of the traditional causes of data loss in a cloud environment, Kubernetes increases the risk of accidental data loss.

At the same time, however, there is a quickly growing ecosystem of technologies that enable Kubernetes environments to function as they do — delivering agility, performance and scalability — to help users gain the full value from their deployments. As a user, you must look at the specialized, Kubernetes-native solutions that are evolving rapidly alongside cloud native adoption or run the risk of leaving precious investments exposed.

The Cloud Native Computing Foundation is a sponsor of The New Stack.

Feature image by David Mark from Pixabay.
