CockroachDB Distributed Database Prepped for Kubernetes Deployments

12 Nov 2020 3:00am

The CockroachDB cloud native, distributed SQL database may be coming to a Kubernetes cluster near you. The recently released version 20.2 packages the database system for Kubernetes with a custom Kubernetes Operator, making it, in Cockroach Labs’ words, “the only database architected from the ground up to deliver on the core distributed principles of Kubernetes.”

This latest version also offers spatial data features, stronger Java and Ruby support, and generally improved performance.

CockroachDB’s cloud native origins underlie everything about it, noted Jim Walker, vice president of product marketing at Cockroach Labs. The database arose from the same ideas as those behind Google’s Spanner database, which in turn arose alongside Borg, Google’s ancestor of Kubernetes.

“There’s very few kinds of databases that were built to be distributed. There’s kind of this impedance mismatch of distributed systems and people trying to deploy legacy databases on these kinds of orchestrated distributed environments. Legacy is just simply not built for this world,” said Walker. “Transactions become extremely complex when you have multiple nodes, they weren’t built to scale, and by the way, pods are ephemeral, so they need to be able to automatically survive things.”

Even distributed databases, explained Walker, may not be “built directly for pods and orchestrated environments such as Kubernetes” and can present a high level of complexity for those looking to use them that way. He offers this as context for the new Kubernetes Operator, which is meant to simplify deployment configuration and enable no-downtime rolling upgrades in production. The Operator, he explained, enables extra functionality rather than just the ability to install the database.

“Most databases need to build an operator because they aren’t a direct fit for Kubernetes. A traditional legacy database needs to figure out how to actually deploy with certificates. How do I manage state? How do I survive failure of a node? If I’m running MySQL and a shard dies, how do I get that back? That’s difficult to do and people will typically need to do operators to build out the core capability of resilience and scale,” said Walker. “We don’t need that, actually. We scale naturally with Kubernetes. Our operator is helping with deployment, some of the security configurations, helping with the size of a cluster, or helping with the management of it.”

The addition of geo-spatial data types and libraries to the open source version of CockroachDB, meanwhile, makes it “the first cloud native distributed database to include these capabilities,” according to a company statement. Walker says that this addition, alongside the geo-distributed nature of CockroachDB itself, should offer the potential for innovation.

“Spatial data is unique. It’s not simple stuff,” he said, noting that they had re-architected and rebuilt everything in Go to make the change.

As for performance, the company says that the latest version improves its TPC-C benchmark performance by 40% over last year, now handling up to 140,000 warehouses with a maximum throughput of 1.7 million transactions per minute (tpmC).

Other new features in CockroachDB 20.2 include a new storage engine called Pebble; security improvements to logging, Role-Based Access Control (RBAC), and certificate management; and support for numerous Java and Ruby tools. Among these, Walker emphasized Materialized Views as a feature of note, since they offer a way to speed up particular queries. This latest version also brings some of the basic distributed Backup and Restore capabilities that had previously been reserved for the Enterprise version to CockroachDB Core, the free and open source version of CockroachDB.
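
For readers unfamiliar with the idea, a materialized view stores the result of a query so that later reads can skip the recomputation, at the cost of the stored result going stale until it is refreshed. The sketch below is only an illustration of that trade-off, not CockroachDB code: it uses Python's built-in `sqlite3` module, which has no native materialized views, so an ordinary table filled from a query stands in for one. The `orders` and `order_totals` tables and the helper functions are hypothetical names invented for the example.

```python
import sqlite3

# Stand-in database. SQLite has no native materialized views, so the "view"
# here is a plain table populated from a query (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, amount INTEGER);
    INSERT INTO orders VALUES ('east', 10), ('east', 5), ('west', 7);
""")

# The (hypothetical) aggregation whose result we want to keep precomputed.
SUMMARY_QUERY = "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"


def refresh_summary(conn):
    """Recompute and store the query result, as a manual refresh would."""
    conn.execute("DROP TABLE IF EXISTS order_totals")
    conn.execute(f"CREATE TABLE order_totals AS {SUMMARY_QUERY}")


def read_summary(conn):
    """Read the stored rows without re-running the aggregation."""
    cursor = conn.execute("SELECT region, total FROM order_totals ORDER BY region")
    return cursor.fetchall()


refresh_summary(conn)
before = read_summary(conn)

# New data arrives; the stored result does not change until it is refreshed.
conn.execute("INSERT INTO orders VALUES ('west', 3)")
stale = read_summary(conn)

refresh_summary(conn)
after = read_summary(conn)

print(before, stale, after)
```

Reads against `order_totals` stay cheap however expensive the underlying aggregation is, which is the speed-up the feature is meant to offer. The stale read in the middle shows why such views need to be refreshed.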

Cockroach Labs is a sponsor of The New Stack.
