Kube-Node: Let Your Kubernetes Cluster Auto-Manage Its Nodes
As Michelle Noorali put it in her keynote address at KubeCon Europe in March of this year: the Kubernetes open source container orchestration engine is still hard for developers. In theory, developers are crazy about Kubernetes and container technologies, because these let them write an application once and then run it anywhere without having to worry about the underlying infrastructure. In reality, however, they still depend on operations for many tasks, which (understandably) dampens their enthusiasm about the disruptive potential of these technologies.
One major downside for developers is that Kubernetes is not able to auto-manage and auto-scale its own machines. As a consequence, operations must get involved every time a worker node is deployed or deleted. Admittedly, there are many node deployment solutions, including Terraform, Chef or Puppet, that make life much easier for ops. However, all of them require domain-specific knowledge; a generic approach that works across platforms without ops intervention does not exist.
The main problem with Kubernetes node deployment and management is that the node (or machine) lifecycle is not aligned, in a Kubernetes-native way, with the general Kubernetes resource lifecycle. With the current node lifecycle, the node resource is created only after a machine has joined the cluster. The general Kubernetes lifecycle works exactly the other way around: the resource is created first, and the thing it describes (here, the machine) is brought up afterwards.
We figured that this unsatisfying status quo needed to be challenged. To solve this issue and enable generic node management, we decided to launch kube-node as a community project. Our objective is to develop a native node integration for Kubernetes, similar to the PersistentVolumes system.
The PersistentVolumes system separates the details of how storage is provided from how it is consumed, and it offers a higher-level API that is isolated from any particular cloud environment. This abstraction makes volumes’ lifetime independent of their consumers, so they can be dynamically allocated and managed.
How We Set Up Kube-Node
Similar to the setup of PersistentVolumes, kube-node is an abstracted higher-level system where:
- Admins define configurations
- Devs can scale clusters with a simple kubectl create node -f node1.yaml
- Kubernetes controls the lifecycle of a node
To do this, we introduce two new API resources, NodeClass and NodeSet, complemented by two controllers: the NodeController and the NodeSetController.
- NodeSet ensures that a specified number of nodes is running at any one time. Very similar to the ReplicaSet, a NodeSet makes sure that a node or a homogeneous set of nodes is always up and available. Each NodeSet refers to a NodeClass, where the details are described in node templates.
- NodeClass lets administrators set configurations for the nodes they offer. A NodeClass contains cloud-provider-specific and OS-specific details, including (possibly) credentials, the machine type and provisioning data. Additionally, the cluster administrator might determine quality-of-service levels or arbitrary policies. Kubernetes itself is unopinionated about what a NodeClass represents.
- NodeSetController watches NodeSets and is responsible for creating and deleting nodes. In the current reference implementations, this means that it either creates node resources or synchronizes with Google Container Engine (GKE) NodePools.
- NodeController watches for node objects and provisions the machine at the cloud provider. After the machine joins the cluster, the kubelet updates the node resource. Likewise, the NodeController deletes the machine at the provider when the node object is deleted.
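To make the division of labor between the two controllers concrete, here is a minimal in-memory sketch in Python. It is not the kube-node implementation: the field names, the `FakeCloud` class and the `<nodeset>-<index>` naming scheme are assumptions made purely for illustration. It only shows the ordering described above, where node objects come first and machines follow.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class NodeClass:
    name: str
    provider: str       # hypothetical field: which cloud to use
    machine_type: str   # hypothetical field: size of the machine


@dataclass
class NodeSet:
    name: str
    node_class: str     # name of the NodeClass this set refers to
    replicas: int       # assumed to be non-negative


@dataclass
class Node:
    name: str
    node_set: str
    node_class: str
    ready: bool = False  # set once the machine has joined the cluster


class FakeCloud:
    """In-memory stand-in for a cloud provider API (hypothetical)."""

    def __init__(self) -> None:
        self.machines: Dict[str, str] = {}  # node name -> machine type

    def create_machine(self, name: str, machine_type: str) -> None:
        self.machines[name] = machine_type

    def delete_machine(self, name: str) -> None:
        self.machines.pop(name, None)


def reconcile_node_set(node_set: NodeSet, nodes: Dict[str, Node]) -> None:
    """NodeSetController step: make the node objects match `replicas`."""
    owned = sorted(n for n, node in nodes.items() if node.node_set == node_set.name)
    # Scale down: delete surplus node objects.
    for name in owned[node_set.replicas:]:
        del nodes[name]
    # Scale up: create missing node objects under unused names.
    count = min(len(owned), node_set.replicas)
    index = 0
    while count < node_set.replicas:
        name = f"{node_set.name}-{index}"
        index += 1
        if name not in nodes:
            nodes[name] = Node(name, node_set.name, node_set.node_class)
            count += 1


def reconcile_machines(nodes: Dict[str, Node], classes: Dict[str, NodeClass],
                       cloud: FakeCloud) -> None:
    """NodeController step: make the provider's machines match the node objects."""
    for name, node in nodes.items():
        if name not in cloud.machines:
            cloud.create_machine(name, classes[node.node_class].machine_type)
    for name in list(cloud.machines):
        if name not in nodes:
            cloud.delete_machine(name)


def kubelet_join(nodes: Dict[str, Node], cloud: FakeCloud) -> None:
    """Stand-in for the kubelet updating the node resource after joining."""
    for name in cloud.machines:
        if name in nodes:
            nodes[name].ready = True


if __name__ == "__main__":
    classes = {"small": NodeClass("small", "example-cloud", "2cpu-4gb")}
    workers = NodeSet("workers", "small", replicas=3)
    nodes, cloud = {}, FakeCloud()
    reconcile_node_set(workers, nodes)         # node objects are created first
    reconcile_machines(nodes, classes, cloud)  # machines are provisioned second
    kubelet_join(nodes, cloud)                 # node resources get updated last
    print(sorted(nodes), sorted(cloud.machines))
```

Lowering `replicas` and running both reconcile steps again removes the surplus node objects first and then the machines behind them, which mirrors the deletion path described for the NodeController.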
Summary and Roadmap
kube-node is a community project to enable generic node management for Kubernetes. The objective is to provide developers with a simple way to scale clusters without operations intervention and regardless of the underlying infrastructure. It includes CustomResourceDefinition-based types, client-go-based clients and reference implementations of the NodeController and the NodeSetController.
The first reference implementation of the NodeSetController works together with the NodeController; the second one integrates NodeSets with GKE NodePools. Together they show that the concepts are flexible enough to cover different use cases. The first reference implementation of the NodeController, kube-machine, reuses parts of the docker-machine library to launch machines on many different cloud providers.
As a next step, we plan to integrate NodeSet with the Kubernetes auto-scaler, so that the NodeSetController can automatically create node objects based on the replicas specified in the NodeSet, with that number controlled by the load on the cluster. With this integration, Kubernetes will be able to generically auto-scale its nodes on different platforms. In addition, we will set up more implementations of the NodeController, including, for example, Terraform, Python or an Amazon Web Services implementation which utilizes instance groups.
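The planned auto-scaler integration is not specified in detail here, so the following Python sketch is only one hypothetical way such a loop could pick the replica count of a NodeSet. The `ScalingPolicy` type, the use of pending pods and idle nodes as load signals, and all parameter names are assumptions for illustration, not part of kube-node or of the Kubernetes auto-scaler.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    min_nodes: int
    max_nodes: int
    pods_per_node: int  # assumed average number of pods one node can host


def desired_replicas(current: int, pending_pods: int, idle_nodes: int,
                     policy: ScalingPolicy) -> int:
    """Return the replica count a NodeSet should be set to.

    Adds enough nodes to host the pending pods, removes idle nodes,
    and clamps the result to the policy bounds.
    """
    extra = -(-pending_pods // policy.pods_per_node)  # ceiling division
    wanted = current + extra - idle_nodes
    return max(policy.min_nodes, min(policy.max_nodes, wanted))


if __name__ == "__main__":
    policy = ScalingPolicy(min_nodes=1, max_nodes=5, pods_per_node=10)
    print(desired_replicas(current=2, pending_pods=1, idle_nodes=0, policy=policy))
```

Writing the returned value into the NodeSet's replica count would be the only change the auto-scaler has to make; creating or deleting the node objects and machines stays with the controllers.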
At KubeCon 2017, Loodse’s Sebastian Scheele, along with SysEleven’s Simon Pearce, will discuss how to create high-availability clusters by running Kubernetes on Kubernetes.