With an eye toward serving internet companies with growing global customer bases, the New York-based Cockroach Labs has launched a managed service for its CockroachDB distributed SQL database, CockroachCloud.
The service tackles the difficulty of running a global database while maintaining the low latencies needed to serve end-users immediately, said Cockroach Labs’ CEO and co-founder Spencer Kimball in an interview with The New Stack.
Initially, CockroachCloud, now in beta release, is available on the Google Cloud Platform and Amazon Web Services, through each cloud’s respective marketplace. When the company starts charging for the service, pricing will be based on the number of nodes used, but over time it will move to usage-based pricing.
Distributed, continuously-replicating databases can certainly add resiliency to a system — you don’t want to keep all data in one location, on one provider or even in one region, in case a large-scale outage should happen. And the traditional method of asynchronous active/passive replication can lose data when something goes awry. But when serving a global user base, simply replicating a data set, even continuously, across a few geographically distributed data centers is not sufficient, Kimball argued.
“In many cases, you don’t want to put a user’s data everywhere,” he said. Different countries have different data protection laws, for instance. More importantly, you want to put the data close to the user, to minimize latency. Someone in London whose data resides in the U.S., for instance, will experience long wait times as the data makes its way across the ocean.
“There is tension between all these things you have to balance: You want to spread the data out, but at the same time, you want to keep the data close to the user,” Kimball said.
Rather than simply replicating all data across all the nodes, CockroachDB does consensus geo-replication: Partitions are based on geographic locations. When a new user is added, they are added to the partition closest to their preferred location. The primary copy is kept at the closest instance of the database to the customer, and is only then replicated across other locations. If a user visits a far-off location, then their data can be transferred over to the nearest node there, Kimball explained.
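As a rough illustration of the geo-partitioning Kimball describes, CockroachDB lets a table be partitioned by a location column and each partition pinned to nodes in a given region via zone configurations. The sketch below is a minimal, hypothetical example — the table, column names and region labels are invented for illustration, and the exact region identifiers depend on the deployment:

```sql
-- Hypothetical users table, partitioned by the user's home region.
-- The partition column must be part of the primary key.
CREATE TABLE users (
    region STRING NOT NULL,
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    email STRING,
    PRIMARY KEY (region, id)
) PARTITION BY LIST (region) (
    PARTITION us_east VALUES IN ('us-east1'),
    PARTITION eu_west VALUES IN ('europe-west1')
);

-- Pin each partition's replicas to nodes in the matching region,
-- so a London user's primary copy lives on nearby nodes.
ALTER PARTITION eu_west OF TABLE users
    CONFIGURE ZONE USING constraints = '[+region=europe-west1]';
ALTER PARTITION us_east OF TABLE users
    CONFIGURE ZONE USING constraints = '[+region=us-east1]';
```

With a setup along these lines, a new row inserted with `region = 'europe-west1'` lands in the `eu_west` partition and is served from nodes constrained to that region, while consensus replication keeps the copies consistent.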
“Cockroach is able to do all that transparently,” Kimball said. “We kept that complexity manageable.”
The new service itself is designed to take out all the management headaches of running distributed databases, giving users the ability to scale a distributed transactional database and continuously replicate it across the globe. Managing distributed applications such as databases is inherently more difficult, especially at a global scale.
“The amount of automation required to bring up a global cluster and make it work well is something we spent a year doing now. You wouldn’t want a company [that is not a service provider] to have to do the same thing. Their time would not be well-spent doing such a thing.”
In its research, Gartner has found that cloud database systems are now the norm for most enterprise workloads, with in-house deployments increasingly being relegated to “legacy status.” The firm estimated that the overall database management systems market grew 18.4% from 2017 to 2018, with cloud DBMSs accounting for 68% of that growth.
In the cloud space, Cockroach will be competing with Google’s own Spanner, though it offers the ability to easily span a database across multiple cloud providers. Almost all companies, because of differing requirements or even acquisitions, will inevitably become multicloud users, Kimball argued.
The company unveiled the new service Wednesday at its initial ESCAPE/19 multicloud conference, held in New York.