A node is the workhorse of the Kubernetes cluster, responsible for running containerized workloads; additional components for logging, monitoring and service discovery; and optional add-ons. Its purpose is to expose compute, networking and storage resources to applications.
As we explained in our previous article on Kubernetes architecture, head nodes typically run the control plane responsible for scheduling and managing the life cycle of workloads. The worker nodes run applications. The collection of head nodes and worker nodes becomes a cluster.
Each Kubernetes node includes a container runtime, such as Docker, plus an agent (kubelet) that communicates with the head. A node may be a virtual machine (VM) running in a cloud or a bare metal server inside the data center.
The container runtime is responsible for managing the life cycle of each container running in the node. After a pod is scheduled on the node, the runtime pulls the images specified by the pod from the registry. When a pod is terminated, the runtime kills the containers that belong to the pod. Kubernetes may communicate with any Open Container Initiative (OCI)-compliant container runtime, including Docker and CRI-O.
The OCI is an open governance project that defines a runtime specification and an image specification, with the goal of standardizing container runtimes and image formats.
The kubelet is the Kubernetes agent whose responsibility is to interact with the container runtime to perform operations such as starting, stopping and maintaining containers.
Each kubelet also monitors the state of the pods on its node. When a pod does not meet the desired state as defined by the deployment, the kubelet may restart it on the same node. The node’s status is transmitted to the head every few seconds via heartbeat messages. If the head detects a node failure, the replication controller observes this state change and schedules the pods on other healthy nodes.
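The heartbeat mechanism can be sketched as a small Python model. This is an illustrative simulation rather than Kubernetes code: the `NodeMonitor` class, the 40-second grace period and the node names are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeMonitor:
    """Toy model of a control plane tracking node heartbeats (hypothetical)."""
    grace_period: float = 40.0  # assumed timeout in seconds, for illustration
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, node: str, now: float) -> None:
        # Record the time of the latest heartbeat received from this node.
        self.last_seen[node] = now

    def unhealthy_nodes(self, now: float) -> list:
        # A node counts as failed once its last heartbeat is older than
        # the grace period.
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.grace_period)


monitor = NodeMonitor()
monitor.heartbeat("node-a", now=0.0)
monitor.heartbeat("node-b", now=0.0)
monitor.heartbeat("node-a", now=30.0)      # node-b stops reporting
print(monitor.unhealthy_nodes(now=50.0))   # ['node-b']
```

In this sketch, the pods of any node returned by `unhealthy_nodes` would be the candidates for rescheduling on healthy nodes.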
The kube-proxy component is implemented as a network proxy and a load balancer. It routes each incoming request to an appropriate pod based on the IP address and port number of the service the request targets. It also takes advantage of OS-specific networking capabilities by manipulating the policies and rules defined through iptables. Each kube-proxy component may be integrated with network layers such as Calico and Flannel.
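The idea of mapping a service and port to one of several backing pods can be illustrated with a toy routing table. The `ServiceRouter` class, the addresses and the round-robin selection are assumptions made for simplicity; the real kube-proxy programs kernel-level rules, and its selection strategy depends on the mode it runs in.

```python
import itertools


class ServiceRouter:
    """Toy stand-in for a service-to-pod routing table (hypothetical)."""

    def __init__(self):
        self._backends = {}

    def register(self, service, port, pod_addresses):
        # Cycle endlessly through the pod addresses backing this service.
        self._backends[(service, port)] = itertools.cycle(list(pod_addresses))

    def route(self, service, port):
        # Pick the next pod for a request addressed to the service and port.
        return next(self._backends[(service, port)])


router = ServiceRouter()
router.register("web", 80, ["10.0.0.5:8080", "10.0.0.6:8080"])
print([router.route("web", 80) for _ in range(3)])
```

Callers address only the service; which pod answers is decided by the routing layer, so pods can come and go without clients noticing.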
The orchestrator makes frequent use of logging as a means for gathering resource usage and performance metrics for containers on each node, such as CPU, memory, file and network usage. The Cloud Native Computing Foundation hosts a software component that provides a unified logging layer for use with Kubernetes or other orchestrators, called Fluentd. This component generates metrics that the Kubernetes head controller needs in order to keep track of available cluster resources, as well as the health of the overall infrastructure.
Kubernetes supports additional services in the form of add-ons. These optional services, such as the dashboard, are deployed like other applications, but are integrated with other core components on the node, such as the logging layer and kube-proxy. For example, the dashboard add-on pulls the metrics from the kubelet to display rich visualizations of resource utilization. The DNS add-on, based on kube-dns or CoreDNS, augments kube-proxy through name resolution.
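To make the role of the DNS add-on concrete, here is a minimal sketch of name resolution for services. It assumes the conventional `<service>.<namespace>.svc.<cluster-domain>` naming scheme with `cluster.local` as the cluster domain; the `ClusterDNS` class, the service name and the IP address are hypothetical.

```python
def service_fqdn(service, namespace="default", cluster_domain="cluster.local"):
    """Build the fully qualified DNS name conventionally given to a service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


class ClusterDNS:
    """Toy name server mapping service names to virtual IPs (hypothetical)."""

    def __init__(self):
        self._records = {}

    def add_service(self, service, namespace, cluster_ip):
        # Register the service under its fully qualified name.
        self._records[service_fqdn(service, namespace)] = cluster_ip

    def resolve(self, fqdn):
        # Return None when no service is registered under that name.
        return self._records.get(fqdn)


dns = ClusterDNS()
dns.add_service("web", "shop", "10.96.0.12")
print(dns.resolve("web.shop.svc.cluster.local"))  # 10.96.0.12
```

Once a name resolves to the service's address, routing the request to a pod is left to kube-proxy.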
Workloads are Containerized Applications
While the control plane and the worker nodes form the core cluster infrastructure, the workloads are the containerized applications deployed in Kubernetes.
After developing and testing a microservice, the developers package it as a container, which is then deployed inside a pod, the smallest unit of deployment. A set of containers belonging to the same application is grouped, packaged, deployed and managed within Kubernetes.
Kubernetes exposes primitives for deployment, while constantly scaling, discovering, and monitoring the health of these microservices. Namespaces are typically used to logically separate one application from another. They act as a logical cluster by providing a well-defined boundary and scope for all resources and services belonging to an application.
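The scoping role of a namespace can be sketched with a small object store in which names only need to be unique within a namespace. The `Cluster` class and the namespace and object names are hypothetical, chosen for illustration.

```python
class Cluster:
    """Toy object store in which names are scoped by namespace (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def create(self, namespace, kind, name, spec=None):
        key = (namespace, kind, name)
        # The same name may be reused in another namespace, but not within one.
        if key in self._objects:
            raise ValueError(f"{kind} {name!r} already exists in {namespace!r}")
        self._objects[key] = spec or {}

    def names(self, namespace, kind):
        # List only the objects of this kind that belong to the namespace.
        return sorted(name for (ns, k, name) in self._objects
                      if ns == namespace and k == kind)


cluster = Cluster()
cluster.create("shop", "Deployment", "web")
cluster.create("blog", "Deployment", "web")   # no clash: different namespace
print(cluster.names("shop", "Deployment"))    # ['web']
```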
Within a namespace, the following Kubernetes primitives are deployed:
A pod is the basic execution unit of a Kubernetes application. It is the smallest and simplest unit in the Kubernetes object model, and also the smallest schedulable item in a Kubernetes application. If Kubernetes is an operating system, a pod represents a set of processes running on the cluster, where each process may be mapped to a container.
The pod serves as the core unit of workload management for Kubernetes, acting as the logical boundary for containers sharing the same execution context and resources. Grouping related containers into pods makes up for the configurational challenges introduced when containerization replaced first-generation virtualization, by making it possible to run multiple dependent processes together.
Each pod is a collection of one or more containers that use interprocess communication (IPC) for communication, and that may share the storage and networking stack. In scenarios where containers need to be coupled and co-located (for instance, a web server container and a cache container), they may easily be packaged in a single pod. A pod may be scaled out either manually, or through a policy defined by a feature called Horizontal Pod Autoscaling (HPA). Through this method, the number of pods that are a part of the deployment is adjusted in proportion to an observed metric, such as CPU utilization, relative to a target value.
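Horizontal Pod Autoscaling can be illustrated with the proportional rule described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The sketch below is a simplification that ignores tolerances and stabilization windows; the function name and the default bounds are assumptions.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling rule, clamped to assumed min/max bounds."""
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    return max(min_replicas, min(max_replicas, raw))


# Four pods averaging 90% CPU against a 60% target scale out to six.
print(desired_replicas(4, current_metric=90, target_metric=60))  # 6
```

When the observed metric falls below the target, the same rule scales the deployment back in, down to the configured minimum.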
Pods enable a functional separation between development and deployment. While developers focus on their code, operators can concentrate on the broader picture of deciding which related containers may be stitched together into a functional unit. The result is a high degree of portability, since a pod is just a manifest of one or more container images managed together.
In Kubernetes, controllers augment pods by adding capabilities such as desired configuration state and runtime characteristics.
A deployment brings declarative updates to pods. It guarantees that the desired state is always maintained by tracking the health of the pods participating in the deployment. Each deployment manages a ReplicaSet, which maintains a stable set of replica pods running at any given time, as defined by the desired state.
Deployments bring PaaS-like capabilities to pods through scaling, deployment history and rollback features. When a deployment is configured with a minimum replica count of two, Kubernetes ensures that at least two pods are always running, which brings fault tolerance. Even when deploying the pod with just one replica, it is highly recommended to use a deployment controller instead of a plain vanilla pod specification.
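The way a controller keeps the running pods in line with the desired replica count can be sketched as a reconciliation loop. This is a toy model, not the actual controller: the `ReplicaSetModel` class and the pod naming are hypothetical.

```python
import itertools


class ReplicaSetModel:
    """Toy reconciliation loop for a fixed replica count (hypothetical)."""

    def __init__(self, name, replicas):
        self.name = name
        self.replicas = replicas      # desired state
        self.pods = set()             # observed state
        self._ids = itertools.count()

    def reconcile(self):
        # Create pods until the observed count reaches the desired count.
        while len(self.pods) < self.replicas:
            self.pods.add(f"{self.name}-{next(self._ids)}")
        # Remove arbitrary surplus pods if the desired count was lowered.
        while len(self.pods) > self.replicas:
            self.pods.pop()


rs = ReplicaSetModel("web", replicas=2)
rs.reconcile()
rs.pods.discard("web-0")   # simulate a pod failure
rs.reconcile()             # a replacement pod is created
print(sorted(rs.pods))     # ['web-1', 'web-2']
```

The same loop explains why a deployment is preferable even for a single replica: a pod created on its own has nothing watching it and recreating it after a failure.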
A StatefulSet is similar to a deployment, but is meant for pods that need persistence, a well-defined identifier and a guaranteed order of creation. For workloads such as database clusters, a StatefulSet controller will create a highly available set of pods in a given order that have a predictable naming convention. Stateful workloads that need to be highly available, such as Cassandra, Kafka, ZooKeeper, and SQL Server, are deployed as StatefulSets in Kubernetes.
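The predictable naming and ordering can be shown in a few lines. Each pod receives the set's name followed by an ordinal index; pods are created in ascending order and, by default, terminated in reverse. The function names and the `cassandra` example are illustrative assumptions.

```python
def creation_order(name, replicas):
    """Pod names in the order an ordered, stable set would create them."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def termination_order(name, replicas):
    """Pods are removed in reverse ordinal order when scaling down."""
    return list(reversed(creation_order(name, replicas)))


print(creation_order("cassandra", 3))
# ['cassandra-0', 'cassandra-1', 'cassandra-2']
```

Because a replacement pod reuses the same name, peers and persistent storage can keep referring to it by a stable identity.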
To force a pod to run on every node of the cluster, a DaemonSet controller can be used. Since Kubernetes automatically schedules a DaemonSet's pod on newly provisioned worker nodes, it becomes an ideal candidate to configure and prepare the nodes for the workload. For example, if an existing network file system (NFS) or Gluster file share has to be mounted on the nodes before deploying the workload, it is recommended to package and deploy the pod as a DaemonSet. Monitoring agents are also good candidates for a DaemonSet, which ensures that each node runs the monitoring agent.
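The one-pod-per-node behavior can be sketched as a function that recomputes the placement whenever the set of nodes changes. The function name, the `monitoring-agent` prefix and the node names are hypothetical.

```python
def place_daemon_pods(nodes, existing=None, prefix="monitoring-agent"):
    """Return a node -> pod mapping with exactly one daemon pod per node."""
    existing = existing or {}
    # Keep pods that are already running and create one for each node that
    # lacks it; nodes no longer in the cluster drop out of the mapping.
    return {node: existing.get(node, f"{prefix}-{node}") for node in nodes}


placement = place_daemon_pods(["node-a", "node-b"])
# A newly provisioned node receives its own copy on the next pass.
placement = place_daemon_pods(["node-a", "node-b", "node-c"], placement)
print(placement)
```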
For batch processing and scheduling jobs, pods can be packaged for a run-to-completion job or a cron job. A job creates one or more pods and ensures that a specified number of them successfully terminate. Pods configured for run to completion execute the job and exit, while a cron job will run a job based on the schedule defined in the crontab format.
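The run-to-completion behavior of a job can be sketched as a loop that keeps launching work until the required number of runs succeed. This is a simplified model: `run_job`, its `max_failures` cutoff and the simulated outcomes are assumptions, and each call to `task` stands in for one pod.

```python
def run_job(task, completions=1, max_failures=3):
    """Run task until enough runs succeed or too many fail (toy model)."""
    succeeded = failed = 0
    while succeeded < completions and failed <= max_failures:
        # Each call models one pod running to completion.
        if task():
            succeeded += 1
        else:
            failed += 1
    return succeeded, failed


outcomes = iter([False, True, True])   # first pod fails, the next two succeed
print(run_job(lambda: next(outcomes), completions=2))  # (2, 1)
```

A cron job would wrap the same logic in a schedule, starting a new job each time the crontab expression matches.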
Controllers define the life cycle of pods based on the workload characteristics and their execution context.
Now that we understand the basics of the Kubernetes control plane and how applications run on Kubernetes, it’s time to talk about service discovery to better understand how production workloads run in Kubernetes.
The Cloud Native Computing Foundation is a sponsor of The New Stack.