How Does Service Discovery Work in Kubernetes?

The services model in Kubernetes provides the most basic, but most important, aspect of microservices: discovery. Understanding service discovery is key to understanding how an application runs on Kubernetes.
The previous article in this series covered the basics of nodes and pods. As a reminder, a node is the workhorse of the Kubernetes cluster, responsible for running containerized workloads; additional components for logging, monitoring and service discovery; and optional add-ons. Pods, by contrast, are the smallest and simplest unit in the Kubernetes object model and the smallest schedulable item in a Kubernetes application.
Any API object in Kubernetes, including a node or a pod, may have key-value pairs associated with it — additional metadata for identifying and grouping objects sharing a common attribute or property. Kubernetes refers to these key-value pairs as labels and annotations. Service discovery takes advantage of the labels and selectors to associate a service with a set of pods.
A single pod or a ReplicaSet may be exposed to internal or external clients via services, which associate a set of pods with a specific criterion. Any pod whose labels match the selector defined in the service manifest will automatically be discovered by the service. This architecture provides a flexible, loosely coupled mechanism for service discovery.
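The matching rule described above can be sketched in a few lines of Python. This is only an illustration of equality-based selection, not how Kubernetes implements it; the pod names, labels and the helper functions `matches` and `select_pods` are hypothetical.

```python
from typing import Dict, List

Labels = Dict[str, str]


def matches(selector: Labels, labels: Labels) -> bool:
    """Return True when every key-value pair in the selector appears in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(selector: Labels, pods: Dict[str, Labels]) -> List[str]:
    """Return the names of pods whose labels satisfy the selector, sorted by name."""
    return sorted(name for name, labels in pods.items() if matches(selector, labels))


# Hypothetical pods and labels, for illustration only.
pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "web-2": {"app": "web", "tier": "frontend"},
    "db-1": {"app": "db", "tier": "backend"},
}

service_selector = {"app": "web"}
print(select_pods(service_selector, pods))  # ['web-1', 'web-2']
```

A pod may carry more labels than the selector names; it still matches as long as every pair in the selector is present.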
When a pod is created, it is assigned an IP address accessible only within the cluster. But there is no guarantee that the pod’s IP address will remain the same throughout its life cycle. Kubernetes may relocate or re-instantiate pods at runtime, resulting in a new IP address for the pod.
To compensate for this uncertainty, services ensure that traffic is always routed to the appropriate pod within the cluster, regardless of the node on which it is scheduled. Each service exposes an IP address, and may also expose a DNS endpoint, both of which remain stable for the lifetime of the service. Internal or external consumers that need to communicate with a set of pods will use the service’s IP address or, more commonly, its DNS endpoint. In this way, the service acts as the glue for connecting pods with other pods.
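To make the idea of a stable front for changing pods concrete, here is a minimal Python sketch. The `Service` class, the service name, and all IP addresses are hypothetical; the DNS name follows the conventional `<service>.<namespace>.svc.<cluster-domain>` form, assuming the default cluster domain `cluster.local`.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Service:
    """Toy model of a service: a fixed address in front of changing pod IPs."""

    name: str
    namespace: str
    cluster_ip: str  # stays the same while the service exists
    endpoints: Dict[str, str] = field(default_factory=dict)  # pod name -> current pod IP

    def dns_name(self, cluster_domain: str = "cluster.local") -> str:
        """Build the conventional in-cluster DNS name for this service."""
        return f"{self.name}.{self.namespace}.svc.{cluster_domain}"

    def update_endpoint(self, pod: str, ip: str) -> None:
        """Record the latest IP for a pod, replacing any earlier one."""
        self.endpoints[pod] = ip


svc = Service(name="orders", namespace="default", cluster_ip="10.96.0.15")
svc.update_endpoint("orders-abc", "10.244.1.7")
# The pod is rescheduled and comes back with a new IP address.
svc.update_endpoint("orders-abc", "10.244.2.9")

print(svc.cluster_ip)   # 10.96.0.15
print(svc.dns_name())   # orders.default.svc.cluster.local
print(svc.endpoints)    # {'orders-abc': '10.244.2.9'}
```

Clients in this sketch only ever see `cluster_ip` or the DNS name; the endpoint table changes behind them.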
A deployment relies upon labels and selectors to determine which pods participate in a scaling operation. Likewise, any pod whose labels match the selector defined by a service will be exposed at that service’s endpoint. The service then provides basic load balancing by routing traffic across matching pods.
A selector is a kind of criterion used to query Kubernetes objects that match a label value. This powerful technique enables loose coupling of objects: new objects whose labels match a selector are picked up without any change to the selector itself. Labels and selectors form the primary grouping mechanism in Kubernetes for identifying components to which an operation applies.
At runtime, pods may be scaled by means of ReplicaSets, ensuring that every deployment always runs the desired number of pods. Each ReplicaSet maintains a predefined set of pods at all times. When a scaling operation is initiated by a deployment, new pods created by that operation will begin receiving traffic as soon as they are ready.
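As a rough illustration of spreading requests across matching pods, the sketch below uses round-robin selection. Round-robin is an assumption made for simplicity; the actual distribution strategy depends on how the cluster's proxy is configured. The IP addresses and the `round_robin` helper are hypothetical.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(endpoints: List[str]) -> Iterator[str]:
    """Yield endpoints in order, starting over after the last one."""
    return cycle(endpoints)


# Hypothetical IP addresses of pods that match the service's selector.
backend_ips = ["10.244.1.7", "10.244.2.9", "10.244.3.4"]

picker = round_robin(backend_ips)
chosen = [next(picker) for _ in range(5)]
print(chosen)
# ['10.244.1.7', '10.244.2.9', '10.244.3.4', '10.244.1.7', '10.244.2.9']
```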
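The way a ReplicaSet holds the pod count at a desired number can be pictured as a reconciliation step that compares desired and actual state. The following Python sketch is a simplification under stated assumptions: the `reconcile` function and the sequential pod names are hypothetical (real pod names carry generated suffixes), and it models only the count, not scheduling or readiness.

```python
from typing import List


def reconcile(desired: int, running: List[str], prefix: str = "web") -> List[str]:
    """Return the pod list after one pass that adds or removes pods to reach `desired`."""
    pods = list(running)
    next_id = 0
    # Scale up: create pods with unused names until the desired count is reached.
    while len(pods) < desired:
        name = f"{prefix}-{next_id}"
        if name not in pods:
            pods.append(name)
        next_id += 1
    # Scale down: keep only the first `desired` pods.
    return pods[:desired]


print(reconcile(3, ["web-0"]))                    # ['web-0', 'web-1', 'web-2']
print(reconcile(1, ["web-0", "web-1", "web-2"]))  # ['web-0']
```

Running the same step again with an unchanged desired count returns the same list, which is the property that lets such a loop run continuously.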
Kubernetes provides three schemes to expose services:
- ClusterIP: Meant for pods to communicate with each other within the cluster. For example, a database pod exposed through a ClusterIP-based service becomes available to the webserver pods.
- NodePort: Used to expose a service on the same port across all the nodes of a cluster. An internal routing mechanism ensures that the request is forwarded to the appropriate pods on each node. This is typically used for services with external consumers.
- LoadBalancer: The LoadBalancer type extends the NodePort service by provisioning an external load balancer, which typically operates at Layer 4 (L4); Layer 7 (L7) routing is usually handled by an ingress controller instead. This scheme is often used with clusters running in public cloud environments that support automated provisioning of software-defined load balancers.
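The three schemes differ mainly in the `type` field of the service definition. The Python sketch below builds a Service manifest as a dictionary and prints it as JSON, a format Kubernetes accepts alongside YAML. The `build_service` helper, the service name, labels and port numbers are hypothetical, and the sketch covers only the three types listed above.

```python
import json
from typing import Any, Dict

SERVICE_TYPES = ("ClusterIP", "NodePort", "LoadBalancer")


def build_service(
    name: str,
    selector: Dict[str, str],
    port: int,
    target_port: int,
    service_type: str = "ClusterIP",
) -> Dict[str, Any]:
    """Return a Service manifest as a plain dictionary."""
    if service_type not in SERVICE_TYPES:
        raise ValueError(f"unsupported service type: {service_type}")
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            "selector": selector,  # pods carrying these labels back the service
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


# Hypothetical web service exposed on every node.
manifest = build_service("web", {"app": "web"}, 80, 8080, service_type="NodePort")
print(json.dumps(manifest, indent=2))
```

Changing `service_type` to `"ClusterIP"` or `"LoadBalancer"` leaves the selector and ports untouched, which reflects how the exposure scheme is independent of which pods the service targets.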
When multiple services need to share the same load balancer or an external endpoint, an ingress controller is recommended. It manages external access to the services in a cluster, typically over HTTP, by providing load balancing, Secure Sockets Layer (SSL) termination and name-based virtual hosting.
Ingress is becoming increasingly popular for running production workloads in Kubernetes. It lets multiple microservices of the same application share a single endpoint, which is exposed by a load balancer, API gateway or an application delivery controller (ADC).
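Name-based virtual hosting, one of the ingress features mentioned above, amounts to choosing a backend service from the request's host name. The Python sketch below shows that lookup in its simplest form; the host names, service names and the `route` function are hypothetical, and real ingress controllers also match on paths and handle TLS.

```python
from typing import Dict, Optional


def route(host: str, rules: Dict[str, str], default_backend: Optional[str] = None) -> Optional[str]:
    """Return the service mapped to the host, or the default backend if none matches."""
    return rules.get(host.lower(), default_backend)


# Hypothetical host-to-service rules sharing one external endpoint.
rules = {
    "shop.example.com": "storefront",
    "api.example.com": "orders-api",
}

print(route("shop.example.com", rules))                    # storefront
print(route("API.example.com", rules))                     # orders-api
print(route("unknown.example.com", rules, "fallback"))     # fallback
```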
Our next article in this series covers networking and storage in Kubernetes.