The Growth of State in Kubernetes
I first worked with Ondat (then StorageOS) back in January 2017 and, at the time, I thought that Kubernetes storage was going to be a solved problem within 18 months. In hindsight, that was a little optimistic. We are now five years on, and the framework I set out for looking at cloud native storage issues (presented here at KubeCon in Austin) continues to be widely used — a sign that many organizations are still grappling with how to deliver persistence in Kubernetes.
I recently took on the role of an advisory board member at Ondat. What drew me to Ondat was the vision that users could take the benefits containers and Kubernetes delivered for stateless applications and realize all of these for their stateful workloads. Given the central role of stateful applications in most business solutions, these improvements in reliability, scalability, automation and more agile development are significant. For me, this was, and still is, a massive opportunity.
To Stateful Kubernetes
Kubernetes and containers were built around the idea of stateless workloads, so running anything stateful breaks one of their fundamental underlying assumptions. Like Kubernetes itself, I started out at Google, and at Google persistent storage is a solved problem: there are no worries about running storage in a cloud native environment.
But Google mostly does not use traditional SQL storage — they mostly roll their own distributed databases and key-value stores. By contrast, most companies running stateful applications will be using standard, relational databases. To consider a new platform like Kubernetes for stateful applications, these users need to run popular databases like Postgres or MySQL at scale and with high availability.
In my recent role as VP of Ecosystem at CNCF, I spoke to Kubernetes end users daily, and it was clear that persistent storage was still an issue for many of them. Where organizations use relational databases to underpin stateful applications running on Kubernetes, a significant proportion still rely on managed database services such as Amazon’s RDS, or they run the databases outside of Kubernetes entirely.
Even with production-ready Container Storage Interface (CSI) drivers, users may be hesitant to deploy critical applications in Kubernetes. These workarounds fail to capture many of the benefits and much of the potential of true Kubernetes-native stateful development. Instead of effectively engineering a safe way for data to live within ephemeral nodes and containers, users are pulling storage back outside of Kubernetes. This separates and duplicates the task of ensuring resilience for compute and data. It places major ceilings on database and application performance. Perhaps most significantly, it limits the ability of Kubernetes, specifically the scheduler, to effectively deliver core compute features around workload efficiency and high availability. The nodes where external storage is attached become “pets” not “cattle” (apologies to any vegans, but it is still the best metaphor).
With managed databases in particular, taking the safe and easy route to deliver stateful applications has even greater implications. Cost is the most obvious. When I wrote my “10 Trends and Predictions for Cloud Native in 2021,” the rise of FinOps was an easy choice. If anything, the importance of cloud cost management has grown more than I expected.
At its most basic, FinOps is about managing and reducing cloud costs, which makes managed database services such as RDS an obvious target. But FinOps is evolving beyond simple cost-cutting into a broader discipline: leading practitioners are increasingly exploring how cloud native environments can be leveraged more effectively to optimize overall IT costs, with storage becoming one of the critical elements.
Organizations need to maintain complete control of their storage. Installing a new database in Kubernetes is now straightforward, but the Day 2 operations are not well understood. Ongoing database and storage maintenance, upgrades, rollbacks and more can all be complex. What seem like straightforward architectural decisions can significantly affect cost, resilience and scalability further down the line.
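To illustrate how low the Day 1 barrier has become, here is a minimal sketch of a single-replica Postgres StatefulSet with a persistent volume claim. All names, sizes and the image tag are illustrative assumptions, not a production recipe — this kind of manifest gets a database running in minutes while leaving upgrades, backups, failover and rollbacks entirely unaddressed:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres            # illustrative name
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14
          env:
            - name: POSTGRES_PASSWORD
              value: changeme        # in practice, reference a Secret
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:              # Kubernetes provisions one PV per replica
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Day 2 is where the hard questions live: what happens to that volume when its node fails, how a version upgrade is tested and rolled back, and how backups and restores are automated — none of which this manifest answers.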
In many cases, cloud storage services and managed database services are an appropriate solution, but it’s worth considering the trade-offs and impact on storage control and lock-in upfront.
Looking Toward Application Portability
Another of my 2021 predictions was the maturing of cross-cloud and cloud portability, an issue I recently explored in more depth with IBM’s Mo Haghighi. Application portability is vital for any organization to negotiate effectively with cloud providers and, therefore, essential to FinOps. Storage lock-in, especially managed database services, severely impacts any organization’s ability to move applications between clouds. While many of the end-users I have spoken with aspire to run multicloud for resilience and use cloud-specific services, multicloud storage is still a challenge.
You can deliver multicloud stateful applications. You can deliver storage that resides and operates safely within Kubernetes. There is a strong need for the industry to evolve better tools and especially best practices, but the solution in both cases is a Kubernetes-native data layer that keeps you in control of your storage.
Sign up for the Ondat tech preview to learn how Ondat can help you scale persistent workloads on Kubernetes.