Kubernetes and the Challenge of Adding Persistent Storage
Kubernetes adoption is exploding, but hype aside, Kubernetes remains very new and has a long way to go before it might become an integral part of most IT infrastructures.
In the meantime, many, if not most, enterprises and IT shops are just looking to get their feet wet as they enter this brave new world of Kubernetes. And before developers can begin to do their work on the platform, the admins, operations teams and/or DevOps must lay the groundwork to add traditional yet vital data management components to the mix. Persistent storage is a good example: it is a necessary component of a Kubernetes deployment, but it is not always easy to implement.
Adding Persistent to Stateless
A key demand enterprises should have is that developers should be able to store data in Kubernetes clusters without having to worry about how persistent storage is working under the hood.
“Application developers and DevOps teams aren’t data management people. And just as I don’t want my DBA to play app developer, I really don’t want my app developers playing DBA,” John L. Myers, an analyst for Enterprise Management Associates (EMA), said. “I want to give the app developers the opportunity to access a persistent data management layer via API or not have to create one from scratch every time we deploy a containerized application.”
However, Kubernetes’ ephemeral structure makes it less than ideal for stateful workloads that depend on persistent volumes.
“Containers have never been designed to be persistent, but enterprises need the persistence to change loads, keep state, use more hardware for different loads, etc.,” Holger Mueller, an analyst for Constellation Research, said. “It’s kind of a design flaw of containers for the enterprise world, as everything of importance for the enterprise needs to be persistent.”
Still, Kubernetes offers a wide variety of options to set up persistent storage. Kubernetes, for example, provides a way to provision data both statically and dynamically on volumes mounted to the container orchestration platform, Suzy Visvanathan, director, product management, for MapR, said. “This gives a flexibility to users to consume storage as and when they require,” Visvanathan said.
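As a rough illustration of the two provisioning modes Visvanathan describes, the Python sketch below builds the Kubernetes objects involved and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. Every name here (`demo-pv`, `demo-static-claim`, `demo-dynamic-claim`, the `fast-ssd` storage class and the `/mnt/demo-data` path) is hypothetical, and the dynamic claim assumes an administrator has already defined a StorageClass called `fast-ssd`.

```python
import json


def make_pv(name, size, host_path):
    """Static provisioning: an admin creates the volume ahead of time."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "",  # no class: only class-less claims bind
            "hostPath": {"path": host_path},  # hypothetical path, demo only
        },
    }


def make_pvc(name, size, storage_class):
    """A claim is how an application asks for storage in either mode."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Static: the claim uses an empty class, so it can only bind to a
# pre-created volume such as demo-pv.
static_pv = make_pv("demo-pv", "5Gi", "/mnt/demo-data")
static_claim = make_pvc("demo-static-claim", "5Gi", "")

# Dynamic: the claim names a StorageClass (assumed to exist) and the
# cluster creates a matching volume on demand.
dynamic_claim = make_pvc("demo-dynamic-claim", "5Gi", "fast-ssd")

manifest = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [static_pv, static_claim, dynamic_claim],
}
print(json.dumps(manifest, indent=2))
```

In both cases the developer only writes the claim; whether a volume already exists or is created on request is the administrator's concern.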
However, the FlexVolume plugin that Kubernetes offers so external vendors can integrate their storage has had its issues. One such problem, before the introduction of the Container Storage Interface (CSI) model, was plugin dependencies, Visvanathan said. With Apache Mesos, Red Hat OpenShift, Docker, and cloud solutions such as the Amazon Elastic Container Service for Kubernetes (EKS) and Google Kubernetes Engine (GKE), orchestration layers now inherently integrate with Kubernetes internally.
“The CSI model has gone a long way in making it easier to integrate external storage solutions with Kubernetes,” Visvanathan said.
Many enterprises will invariably seek to extend Kubernetes to accommodate a larger number of users who share the container resources. However, scaling up the persistent storage component can pose challenges. Legacy storage solutions and their configurations, for example, are also not necessarily the best fit.
“Kubernetes enables easy and rapid scale-up of containers in production. The next challenge is to provide a data platform that scales along with it,” MapR’s Visvanathan said. “Legacy storage options are simply unable to keep up with that scale. Retrofitting old-style security in Kubernetes further exacerbates the situation.”
New persistent storage alternatives are typically single-node solutions, and if clustered, still do not provide a global addressable namespace or hybrid/multi-cloud mobility, MapR’s Visvanathan noted.
“Kubernetes is used for cloud-style elasticity and the ability to choose the cloud model of your choice, including on-premise, hybrid or multi-cloud deployments,” Visvanathan said. “Persistent storage needs to deliver on the same goals.”
“Scaling storage can be disruptive and can result in a big, and often unanticipated, incremental expense,” Roth said. “The primary challenges with persistent storage for Kubernetes are mainly rooted in scaling. A do-it-yourself approach to container storage infrastructure can be disruptive and expensive.”
Ultimately, organizations will need to balance storage use across increasing sets of servers. Resources available for sharing storage across the Kubernetes deployment include the Network File System (NFS) or an open source clustered file system, such as Ceph or GlusterFS, Roth said.
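To make the shared-storage idea concrete, here is a minimal Python sketch of an NFS-backed volume that several pods mount at once, printed as JSON for `kubectl apply -f`. The NFS server address, export path, object names and the `busybox` worker pods are all hypothetical placeholders rather than anything described by the people quoted here. The `ReadWriteMany` access mode is what lets pods on different servers share the same volume.

```python
import json

NFS_SERVER = "nfs.example.internal"  # hypothetical NFS server
NFS_EXPORT = "/exports/shared"       # hypothetical export path

shared_pv = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "shared-nfs-pv"},
    "spec": {
        "capacity": {"storage": "50Gi"},
        "accessModes": ["ReadWriteMany"],  # many nodes may mount read-write
        "storageClassName": "",
        "nfs": {"server": NFS_SERVER, "path": NFS_EXPORT},
    },
}

shared_pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "shared-nfs-claim"},
    "spec": {
        "accessModes": ["ReadWriteMany"],
        "storageClassName": "",
        "volumeName": "shared-nfs-pv",  # bind to the volume defined above
        "resources": {"requests": {"storage": "50Gi"}},
    },
}


def make_pod(name):
    """A placeholder pod that mounts the shared claim at /data."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "worker",
                "image": "busybox",
                "command": ["sh", "-c", "sleep 3600"],
                "volumeMounts": [{"name": "shared", "mountPath": "/data"}],
            }],
            "volumes": [{
                "name": "shared",
                "persistentVolumeClaim": {"claimName": "shared-nfs-claim"},
            }],
        },
    }


pods = [make_pod(f"worker-{i}") for i in range(2)]
manifest = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [shared_pv, shared_pvc] + pods,
}
print(json.dumps(manifest, indent=2))
```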
Despite the challenges mentioned above, Kubernetes does offer a wide breadth of volume options for storage integration. Improvements are also continually added. Kubernetes, for example, offers a way to provision data both statically and dynamically on volumes mounted to Kubernetes, Visvanathan said. “This gives a flexibility to users to consume storage as and when they require.”
Indeed, Kubernetes’ volume storage driver options can facilitate the use of a great many different mechanisms for persistent storage, said Ash Wilson, strategic engineering specialist for CloudPassage. Wilson added that some volume drivers are specific to an IaaS (infrastructure as a service) platform, while others are agnostic to the underlying cloud infrastructure.
However, caveats exist. “These volume storage driver options are great, but the time to properly evaluate and determine which best fulfill the application’s requirements may extend the time spent in the architectural process,” Wilson said. “The tasks associated with securing persistent storage in Kubernetes are oftentimes volume driver-specific, so security concerns and requirements must be taken into consideration during the architectural process.”
Going the Bare Metal Way
Kubernetes’ virtualization structure also lends itself particularly well to bare metal server deployments when setting up persistent storage. This is because containers virtualize operating systems, as opposed to virtualizing the hardware underneath the operating systems, MapR’s Visvanathan said.
“Containers, by their very nature, are an ideal way to deploy applications on bare metal, while Kubernetes simplifies the creation and management of containers,” Visvanathan said. “It’s a natural extension of this concept that the persistent storage for Kubernetes is also best served on bare metal. Virtualized servers don’t provide any additional benefits once containers are used for elasticity, so they are simply an additional layer with no added value.”
The dynamic and stateless nature of Kubernetes workloads has long posed difficulties when determining user access privileges to applications running in Kubernetes. In fact, the main security issue with storage for Kubernetes is maintaining access control and authorization privileges to ensure that only the relevant containers and services have access to sensitive data, Rani Osnat, vice president of marketing for Aqua Security, said.
“There are several ways to achieve this, but they must be tied to a security policy that has application context and ties privileges to a specific service,” Osnat said.
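One of those ways, sketched here purely as an assumption and not as the approach Osnat or Aqua Security recommends, is to use Kubernetes' built-in role-based access control (RBAC) to tie a single claim to a single service identity. The Python sketch below builds a ServiceAccount, a Role that can read only one named claim, and a RoleBinding joining the two. The namespace `payments`, the account `billing-svc` and the claim `billing-data` are hypothetical. Note that RBAC governs access to the claim object through the Kubernetes API; it does not by itself police reads and writes on the mounted volume.

```python
import json

NAMESPACE = "payments"     # hypothetical namespace
SERVICE = "billing-svc"    # hypothetical service identity
CLAIM = "billing-data"     # hypothetical claim holding sensitive data

service_account = {
    "apiVersion": "v1",
    "kind": "ServiceAccount",
    "metadata": {"name": SERVICE, "namespace": NAMESPACE},
}

# The Role grants read access to exactly one claim and nothing else.
role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "read-billing-data", "namespace": NAMESPACE},
    "rules": [{
        "apiGroups": [""],
        "resources": ["persistentvolumeclaims"],
        "resourceNames": [CLAIM],
        "verbs": ["get"],
    }],
}

# The RoleBinding ties that privilege to the one service account.
role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "billing-reads-its-data", "namespace": NAMESPACE},
    "subjects": [{
        "kind": "ServiceAccount",
        "name": SERVICE,
        "namespace": NAMESPACE,
    }],
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "Role",
        "name": role["metadata"]["name"],
    },
}

manifest = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [service_account, role, role_binding],
}
print(json.dumps(manifest, indent=2))
```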
The tools available for securing persistent storage in Kubernetes should also continue to evolve, so the gaps that exist today are likely to narrow over time.
“Ephemeral, short-lived workloads can present challenges in capturing the appropriate security information, especially for audit and compliance purposes. This requires an evolution in tooling to address without slowing down the agile process,” CloudPassage’s Wilson said. “Similar problems exist for tracking, auditing, and securing storage mechanisms — tools must evolve to address security needs without slowing down the application delivery process.”