DASH: Four Properties of Kubernetes-Native Databases
Cloud native application architectures help developers deliver amazing experiences to their customers around the world. They do this by taking advantage of billions in cloud provider investments, which provide nearly unlimited and on-demand resources spread across hundreds of data centers globally. Kubernetes — the Google-built open source container orchestration system — is quickly becoming a ubiquitous tool for deploying, running, and scaling these applications. Kubernetes simplifies and supercharges application delivery if (and this is a big “if”) those applications are architected to take advantage of the resources available in cloud environments.
The venerable Twelve-Factor App guide documented a good framework for thinking about cloud native architectures, but the outlined design patterns really only covered the portions of the application that are stateless, meaning their output depends only on their input; nothing that happens in between is stored. Unfortunately, the traditional relational databases that give life — or state — to any application were not architectured to take full advantage of the resources available in the cloud.
In the 12-Factor App, these databases were relegated to the nebulous “backing services” section, where they existed outside of the cloud-native guidelines. Likewise, while some Information Technology (IT) teams have ported databases to containers and orchestrated them to simplify deployment, they have not been able to use the full power of Kubernetes to take advantage of the resiliency and scale benefits you’d expect from a purpose-built cloud-native system.
To draw a parallel with stateless servers running in Kubernetes, if one database process goes down, the application should be none the wiser.
What if we designed a relational database that more closely resembled the horizontally scalable stateless application servers that Kubernetes was built to run? If we wanted to do that while keeping the data integrity guarantees of SQL, this would mean exploring the frontiers of computer science research to navigate around previously insurmountable implementation barriers — but the results would be worth it. Some of these next generation, “NewSQL” databases work natively with Kubernetes, and in doing so finally, bring end-to-end cloud-native stacks within reach of IT teams.
In this post, I’ll cover four factors that support Kubernetes-native operations for databases: Disposability, API symmetry, Shared-nothing, and Horizontal Scaling. Let’s call these factors DASH. To make these concepts concrete, I’ll use examples of my favorite DASH database, CockroachDB, but you should keep in mind that these concepts extend, at varying degrees, to other NewSQL databases.
Disposability: Losing Things Should Be a Non-Event
Disposability is the ability of a database to handle processes stopping, starting, or crashing with little-to-no notice. In the cloud, these disruptions can be caused by a variety of random — but inevitable — events including disk failures, network partitions, or entire virtual machines going offline. Disposability is important because the Kubernetes Scheduler uses a set of rules for determining which pods (that is, small groupings of containers that are always scheduled as a unit) run on which machines. Once pods are scheduled, they remain on those machines until some sort of disruption occurs due to voluntary (i.e. scaling in or upgrading) or involuntary (e.g. hardware failure or operating system kernel panic) factors.
When these disruptions occur, Kubernetes may reschedule pods to more suitable nodes; for databases that support Disposability, this rescheduling should be transparent to the client. Additionally, behind the scenes, the new and existing processes need to be able to dynamically reconfigure themselves as network identities change. To draw a parallel with stateless servers running in Kubernetes, if one database process goes down, the application should be none the wiser.
Disruptions like these are a significant problem for legacy relational databases because they typically have a single machine powering them at any given time. For production deployments, these databases may send updates asynchronously to a second instance that will take the lead in the event the primary machine goes down. However, in practice, the process of actually failing over is difficult to do well. Github recently described the perils of executing a MySQL failover during a major outage, which resulted in out of date and inconsistent data. If the systems powering legacy relational databases fail, users will likely know about it.
Both NewSQL and NoSQL databases typically have the Disposability property, as they were designed to thrive in ephemeral cloud environments where virtual machines could be restarted or rescheduled at a moment’s notice. Databases with this factor should be able to survive node failures with no data loss and no downtime. For example, CockroachDB is able to survive machine loss by maintaining three consistent replicas of any piece of data among the nodes in a cluster; writes are confirmed to clients as soon as the majority of replicas acknowledge the action. In the event that a machine containing a given replica is lost, CockroachDB can still serve consistent reads and writes, while simultaneously creating a third replica of that data elsewhere in the cluster to ensure it can survive future machine failures.
Keep in mind that disposability isn’t just about surviving individual machine failures. DASH databases should be able to extend the concept of disposability to entire data centers or even data center regions. This type of real-world failure should be a non-event; these capabilities would help avoid issues like the ones encountered by Wells Fargo, where “smoke in the data center” resulted in a global outage.
API Symmetry: Every Server Should Provide the Same Answer to the Same Question
Kubernetes uses Services to allow clients to address a group of identical processes as a whole through a convenient DNS entry. This way applications don’t need to know about the many instances that power a frontend; there could be one backing server, or there could be hundreds. Services are essentially defined as rules that say any pods with a given collection of labels should receive requests sent to this service (assuming their health checks and readiness probes say it is OK to send traffic).
Why does Kubernetes take this approach? By decoupling the pod from the address of the service with which it is associated, we can scale without disrupting existing application instances. This is possible because of API symmetry, meaning the pods included in the group all have the same API and provide consistent responses, regardless of which instance is chosen by the Kubernetes Service. For stateless services, this is simple: The logic in each service is the same, so sending a request to any service at a given time will always yield the same result. Note that for a database to have API symmetry, the underlying data must also be strongly consistent. If you get different answers depending on which node you are routed to, the abstraction provided by Kubernetes is broken, and that leads to bug-causing complexity for application developers.
This is actually an area where traditional relational databases perform better than NoSQL systems. Though this might initially seem counterintuitive, this is because when a database has a single node, its API is symmetric (albeit at the cost of disposability) — in other words, you’ll always get the same result when sending queries to the service. However, when asynchronous replication comes into play (either for high availability or to support performance improvements like read replicas), the API symmetry is violated, since the master becomes the source of truth where the replicas would be slightly out of sync.
In the NewSQL world, techniques like consensus replication allow databases to provide API symmetry without sacrificing disposability. Going back to our CockroachDB example: any CockroachDB node can handle any request. If the node that receives a load-balanced request happens to have the data it needs locally, it will respond immediately. If not, rather than sending an error back to the client (or providing a stale result), that node will become a gateway and will forward the request to the appropriate nodes behind the scenes to get the correct answer. From the requester’s point of view, every node is exactly the same. When you combine database API symmetry with Kubernetes Service objects, you can create what is effectively a single logical database with the consistency guarantees of a single-machine system, despite there actually being dozens or even hundreds of nodes working behind the scenes.
Shared-Nothing: Eliminate Single Points of Failure
The cloud shouldn’t have a maintenance window — true cloud-native services should have the capability to be always on. In the Disposability section, we talked about the option Kubernetes offers to reschedule nodes at a moment’s notice in the event of a planned or unplanned disruption event. Shared-nothing databases take this a step further. This property says that a database should be able to operate without any centralized coordinator or single point of failure. In the stateless world, this goes hand in hand with disposability; when state is involved, this concept becomes an additional consideration.
Relational databases are notorious for having single points of failure. This extends even to modern RDBMS systems, like Amazon Aurora. Even some NewSQL databases rely on special coordinators to keep track of all the bookkeeping required to build a globally-distributed system. What this means is that you can have architectures that can survive certain workers being disposed, but if you take down the coordinator process, if the entire system doesn’t go offline in many cases the configuration state will be frozen and certain types of critical operations — potentially those required to debug the issue — will fail.
Cloud native databases should be able to survive in a world where any node can fail, not just “any node except for our special master node that coordinates everything.” For example, CockroachDB has no master process — this is what gives it its eponymous survivability characteristics. Although each CockroachDB database server is stateful, it only relies on the state for which it is responsible (though it does cache some knowledge that it has gleaned about the cluster through communicating with other nodes via a peer-to-peer network). CockroachDB nodes do not rely on any authoritative source to say what they should be doing at any given point in time. Shared-nothing architectures for stateful systems allow both ultra-high availability and ease of operations.
Interestingly enough, Kubernetes itself is not a shared-nothing system. It has a single-region control plane that, if destroyed, will compromise the cluster. Operators can create NewSQL database clusters that survive more than Kubernetes’ control plane nodes by spanning Kubernetes clusters across regions or even cloud providers.
Horizontal Scaling: Scale-Out Rather Than Up
The last factor is horizontal scalability. Similar to the way this term was used in the 12-Factor App, this means if you want more throughput, you simply add more processes. Kubernetes Controllers and Schedulers combine to make horizontal scaling an easy, declarative process: You simply say how many instances you want, and the Kubernetes system diligently adds pods to meet your request. When combined with the Services described above, this creates the impression of a single database that magically got twice as powerful but with costs that scale linearly rather than exponentially.
While horizontal scaling is a fundamental benefit of many NoSQL systems, traditional relational databases do not do this well. They rely instead on sharding — with or without the help of systems like Vitess — to accomplish this use case. What this means for traditional RDBMS systems in cloud-native environments is that if you need more power, you have to buy a more expensive machine and incur downtime, or you have to dramatically increase your operational overhead by splitting your database into many pieces that cannot easily talk to each other. For teams that rely on vertical scaling, this means there is a natural limit to how powerful a relational database can get — at the end of the day, it is limited to what can be powered by a single server.
NewSQL databases take a page from NoSQL systems and scale by adding more machines. For example, a single Kubernetes command can scale out CockroachDB by provisioning new resources and spinning up additional pods with no downtime; the Kubernetes load balancer will recognize the new database capacity and automatically start routing requests among the new instances. Each node can independently process requests while also taking part in helping other nodes when it comes to completing tasks like processing complex queries by breaking them up into smaller bits of work that can be completed in parallel. Kubernetes has features to help scale out stateful services by doing things like providing pods with predictable network identities to facilitate service discovery among the cluster instances.
A Bridge to the Cloud for Relational Databases
DASH properties are the necessary preconditions to having a truly cloud native architecture.
- Disposability ensures your stateful systems can survive when ephemeral cloud resources cease to exist.
- API symmetry allows distributed databases to always provide the up-to-date answer, no matter which process is handling the client request.
- Shared nothing properties enable your database to make forward progress without any centralized master or coordinator.
- Horizontal scalability allows the relational database to take advantage of the unlimited and on-demand resources available in the cloud.
When coupled with Kubernetes, DASH databases give IT teams an automated relational database that operates as an always-on, elastic data layer that adds the missing cloud native foundation to their stacks.