About Etcd, the Distributed Key-Value Store Used For Kubernetes, Google’s Cluster Container Manager
I have spent many years developing distributed systems. Before my time here at CoreOS, I was one of the first hires at Heroku where I worked on research, development and distributed systems engineering. In 2011, with my Heroku collegue Keith Rarick, we wrote Doozer, an inspiration for etcd.
Etcd is an open-source distributed key-value store that serves as the backbone of distributed systems by providing a canonical hub for cluster coordination and state management – the systems source of truth. While etcd was built specifically for clusters running CoreOS, etcd works on a variety of operating systems including OS X, Linux, and BSD.
Today there is a lot of news about Kubernetes, the open source container cluster manager from Google, which happens to be built on top of etcd. Kubernetes leverages the etcd distributed key-value store. It takes care of storing and replicating data used by Kubernetes across the entire cluster, and thanks to the Raft consensus algorithm, etcd can recover from hardware failure and network partitions. In addition to Kubernetes, Cloud Foundry also uses etcd as their distributed key-value store.
Clusters are usually built from a large collection of machines with the ability to run any workload at any given time. In order for a cluster to perform at high levels of efficiency, we need to distribute workloads appropriately across all machines in the cluster. Then clusters need a way of coordinating with each other.
For example a job scheduler needs to notify a machine that it has work to do. Once that work has been completed machines may need to communicate that fact to some other component in the cluster. A distributed system needs a reliable coordination mechanism, so it’s important that this communication happen in a timely and reliable manner to keep everything running smoothly. In essence something has to manage the state of the cluster – the source of truth.
This is where etcd comes in.
Managing cluster state across any distributed system is where the hard problems live. Clusters often suffer from network partitions and nasty race conditions that need to be resolved somewhere. We avoid handling this complexity in our applications or at the individual machine level because maintenance would become a nightmare, so we handle it higher up the stack, where etcd can really make a difference.
Etcd is written in Go and uses the Raft protocol. Raft is a protocol for multiple nodes to maintain identical logs of state changing commands, and any node in a raft node may be treated as the master, and it will coordinate with the others to agree on which order state changes happen in.
Supporting Distributed Systems
Etcd was designed to be the backbone of any distributed system which is why projects like Google Kubernetes, Cloud Foundry and Fleet, rely on etcd. With etcd you can easily manage cluster coordination and state management. If you want to try Kubernetes out on CoreOS, head over to our blog to see an example.