Pancake Podcast: Cassandra and the Kubernetes Data Plane
What is the role that the data plane plays in a Kubernetes ecosystem? This was the theme for our latest (virtual) pancake breakfast and panel discussion, sponsored by DataStax, the keeper of the open source Cassandra database, captured in this latest episode of The New Stack Analysts podcast.
Last month, DataStax released a Kubernetes operator so that the NoSQL database can be more easily installed, managed, and updated on Kubernetes container-based infrastructure.
The panelists for this discussion:
- Kathryn Erickson, DataStax senior director of partnerships.
- Janakiram MSV, principal analyst of Janakiram & Associates.
- Aaron Ploetz, Target NoSQL lead engineer.
- Sam Ramji, DataStax chief strategy officer.
Alex Williams, publisher of The New Stack, served as moderator for this panel, with the help of TNS managing editor Joab Jackson.
In 2015, Ramji worked at Google and oversaw business development around its then newly open-sourced project, Kubernetes, which was based on the company's internal container orchestrator, Borg. Borg gives Google a single control plane for dynamically managing all of its many containerized workloads, and its scale-out database, Spanner, offers the same for the data plane.
“The marriage of those two things made compute and data so universally addressable, so easy to access, that you could do just about anything that you could imagine,” Ramji explained.
While Kubernetes provides a Borg-like universal control plane for its users, an equivalent data plane has not been universally established. It is a role that Cassandra could fill.
“Cassandra has been around for over a decade, and it’s one of the proven battle-tested distributed databases. A lot of customers use it for structured, unstructured data. And the unique ability of Cassandra is really scaling out,” MSV pointed out.
That ability to scale out quickly matters in practice, Ploetz said.
“I remember the days when scaling out a cluster meant submitting like a request so that we could go buy a server from Dell or several servers from Dell, and having to go and defend that decision. Large enterprises move at a glaciated scale,” Ploetz said. “Using something like Kubernetes allows you to just say, ‘I need three more nodes.’ OK, boom, boom, boom. And there you go. It’s that agility, it’s that ability to scale quickly, that’s the big benefit there.”
This is why the operator was the final piece of the puzzle, according to Erickson, because it gives Kubernetes an instruction set for scaling up and scaling down the database according to its needs.
“When you add a Cassandra node, it looks for a seed and it says, ‘I’m ready to join the cluster.’ On Kubernetes, you’re using DNS names. So what we do with the management sidecar is, when we want to add a node, we just resolve the hostname, the IP address of a seed, and just make sure that we’re orchestrating these things to happen in the right order. So we’re okay if the IP addresses change; we just do a quick check upon startup to make sure that we know that,” she said.
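The startup check Erickson describes can be pictured roughly as follows. This is only an illustrative sketch, not the operator's or the sidecar's actual code: the function names `resolve_seed` and `seed_changed` are hypothetical, the port value assumes Cassandra's default inter-node port of 7000, and the injectable `resolver` argument is an assumption added so the logic can be exercised without a live cluster.

```python
import socket


def resolve_seed(hostname, port=7000, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a seed hostname resolves to.

    Hypothetical helper: port 7000 is assumed to be the inter-node port.
    """
    infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({info[4][0] for info in infos})


def seed_changed(hostname, last_known_ips, resolver=socket.getaddrinfo):
    """Re-resolve the seed at startup and report whether its addresses moved.

    Returns (current_ips, changed) so the caller can update its record
    before asking the new node to join the cluster.
    """
    current = resolve_seed(hostname, resolver=resolver)
    return current, set(current) != set(last_known_ips)
```

A caller would keep the last addresses it saw for a seed's DNS name, call `seed_changed` with them on startup, and only then proceed with the join. Because the name rather than the address is the stable identifier, a rescheduled seed that comes back with a new IP is picked up by the next lookup.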