Databases and Kubernetes: Adopting a Distributed Mindset
Cloud native systems are, by definition, distributed — but to run databases securely and effectively on them, what’s needed is not only purpose-fit technology but a change of mindset, according to this podcast episode’s guests.
In this episode of the New Stack Makers podcast, Jim Walker, principal product evangelist, and Michelle Gienow, senior technical content manager (and a former New Stack reporter), both of Cockroach Labs, discussed how distributed systems create new challenges for databases, the paradigm shift that’s needed to run databases effectively on Kubernetes, and the results of a new survey of Kubernetes users.
The podcast was hosted by Heather Joslyn, features editor of The New Stack.
Essentially, databases are used in three ways in the current cloud native ecosystem, according to Walker. First, there’s the “lift and shift” variety, in which a database is run in a container as a single service, a use case that aligns most closely with legacy systems.
Next, there’s “move and improve,” an approach that has grown out of the legacy past. These solutions take a layer of a legacy database system and rework it along cloud native principles. Most typically, the storage layer is distributed, and data sharding might be automated.
Finally, there’s what Walker called “rearchitect and rebuild,” an approach followed by Cockroach Labs and a few other companies. It’s based on the concept of rearchitecting a database system from the ground up to be cloud native.
“One of the biggest challenges is that not everybody thinks distributed yet,” Walker said. “These distributed principles and a distributed mindset are what’s helping people understand the true value of cloud native.”
The challenge of cultural change, which hinders so many organizations as they undergo a cloud native transformation, also applies to databases, echoed Gienow.
“A lot of people are excited about the new tech, but then they bring their assumptions about how things work and how they’ve always worked to this new technology,” she said. “And they are trying to work with it like this, tightly coupled to the linear way, and it just fails. Or if it doesn’t fail, they have to fight against their own process in order to use it.”
In such a situation, she added, “They’re not capturing the benefits they came for. They’re not capturing savings or speed, things aren’t faster. So I think there’s a lot of frustration until you can kind of cross that event horizon and just get it.”
Kubernetes (K8s) wasn’t originally set up for stateful workloads, the data-rich applications that save information from previous transactions. And yet a survey of more than 200 IT professionals, released in August by Cockroach Labs and Red Hat, found that 59% of them run both stateless and stateful applications on their Kubernetes-based systems.
Innovations like StatefulSets made running stateful workloads on K8s easier — but many teams still struggle, according to the survey. Forty-six percent of respondents named stateful workloads as their primary challenge in effectively architecting for and deploying on Kubernetes, making it the most commonly cited concern.
Walker, however, is optimistic that continued innovation will make life easier for developers who need to run data-intensive applications on K8s: “As the Kubernetes community has matured, I think things are gonna become a lot easier … we need to simplify all this stuff.”
That does not mean, our guests said, that one database solution will rule them all. Walker dismissed the idea of consolidation, calling it “preposterous.”
“Yes, logically, maybe it makes sense to put everything into a single data lake and deal with it. Physically, it makes no sense,” he said. “Think about an analytical query that’s going to be running across multiple, different regions across the entire planet — do you really want to make that happen? You’re going to get killed on the egress cost alone between the different cloud providers. What if you did want to run that thing across different cloud providers?”
Added Gienow, “The whole idea of consolidation is directly counter to the whole notion of cloud native: small autonomous teams running small loosely coupled workloads. Having Kubernetes means that I might need Postgres for my microservice. And you might want MongoDB and Redis. Some of our use cases require performance. Some we just need for 20 minutes and they can just evaporate.
“Nothing’s going to do all of these things, ideally, well, and so we do have the capability to apply the right tools to the right jobs, and then just have this data mesh layer that just handles it for us.”