Kubernetes Probes (and Why They Matter for Autoscaling)
Validating the health of our workloads — applications running on Kubernetes — is crucial to their success. To manage workload health, we rely on telemetry information and diagnostics, which are often captured via system and application components and then sent to a monitoring tool. Once transferred, this data is shared with system administrators, DevOps teams and site reliability engineers (SREs) in the form of metrics that help determine where we must take action.
One method of collecting telemetry information is to use a probe: a diagnostic check in which a health probe is sent from a load balancer to a defined endpoint, such as a web-server farm, to validate that the application is available and running. If there’s no response from the endpoint, the probe has failed, and the load balancer (in this case) bypasses the endpoint instead of sending users to a potentially failing website.
We can use Kubernetes probes to perform these kinds of checks within Kubernetes. Probes are performed periodically on our containers by the kubelet, the primary node agent that runs on every node in the cluster. Kubernetes probes allow us to validate the state of the pods running within our cluster. In addition to validating our workloads’ health and functionality, we can use Kubernetes probes to monitor and gather information about other events affecting containers, such as autoscaling.
This article will explain the different types of probes and the importance of each. We’ll discuss how they work and, in particular, how they support autoscaling. Then we’ll highlight why it’s essential to find the correct settings for probes and why experimentation is key to optimizing probe settings.
Using Kubernetes Probes Effectively
There are many factors that contribute to — and numerous benefits that are associated with — the effective use of Kubernetes probes. Let’s explore what Kubernetes probes are, highlight their benefits, and discuss ways to get the most out of them.
Different Kinds of Kubernetes Probes
Before we explore how to use Kubernetes probes efficiently, we must familiarize ourselves with the three kinds of Kubernetes probes: startup, readiness and liveness.
In a runtime sequence, the flow of probe usage is as follows:
The startup probe is the first to start and tells the kubelet that the application within the container has successfully started. The other two probes will be disabled until the startup probe is in a successful state.
One example where startup probes are helpful is when monitoring slow-starting containers. If the liveness probe monitored these containers instead, they might be terminated prematurely because the application would not yet appear to have successfully started.
The readiness probe informs the Kubernetes cluster that the container is ready to accept requests, such as one that allows a user to connect to a web application. If the readiness probe is in a failed state, traffic is no longer routed to the pod’s IP address; the pod is removed from the corresponding service’s endpoints.
The readiness probe helps confirm that the application running within the container is ready to be used, but it can’t always guarantee this. Imagine, for example, the case of a deadlock, where the application process keeps running but isn’t servicing requests anymore. Depending on what the check tests, the readiness probe might not detect the unserviced requests because the process still appears to be running fine. This situation reflects a case where it’s essential to use readiness and liveness probes together.
You might also wonder if both liveness and readiness probes are always needed. The answer is that it depends on the nature of the containerized application. Since the readiness probe is mainly used to confirm that the container is ready to accept network requests, we could omit it if our application doesn’t rely on network requests, for example if it’s running an internal process within the container, which doesn’t require network interaction.
The liveness probe confirms whether the container is running. If the probe reports a failure, the kubelet picks up this signal and kills the container process. Typically the container is then restarted, unless its restart policy is configured differently.
But even if the liveness probe confirms that the container is running, it doesn’t guarantee that the container’s application is running. The pod might be ready, but that doesn’t mean the application can serve requests.
Imagine a web application presenting an HTTP 503 error page because it can’t connect to the backend database from which it retrieves information. From the liveness probe’s perspective, the container is running because the web server process is up and serving pages. However, the application is not in a successful state because it can’t reach the database.
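One way to narrow this gap is to point the two probes at endpoints with different depths of checking: a shallow endpoint for liveness and a deeper one, which exercises the database connection, for readiness. A minimal sketch, assuming hypothetical /livez and /readyz endpoints on port 8080:

```yaml
containers:
- name: web-app                # hypothetical container name
  livenessProbe:
    httpGet:
      path: /livez             # shallow check: returns 200 while the process is up
      port: 8080
  readinessProbe:
    httpGet:
      path: /readyz            # deep check: returns 503 if the database is unreachable
      port: 8080
```

With this split, a lost database connection takes the pod out of the service’s endpoints without triggering restarts that wouldn’t fix the underlying problem.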
Kubernetes Probes and Autoscaling
As mentioned, Kubernetes probes do more than help us understand our application’s health. They also support well-planned, effective autoscaling based on health metrics.
Horizontal pod autoscaling adds pods automatically to support an expanding application workload, typically when there is increased demand for CPU, memory or other key resources, and it automatically stops and removes unnecessary pods whenever demand decreases. In a similar response to expanding or shrinking compute needs, vertical pod autoscaling reconfigures pods with larger or smaller resource requests.
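As a reference point, a horizontal pod autoscaler is itself configured declaratively. A minimal sketch (the deployment name and utilization target are illustrative, not recommendations):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: probe-app-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: probe-app            # hypothetical deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70 # add pods when average CPU use exceeds 70%
```

Newly added replicas only receive traffic once their readiness probes succeed, which is where probe configuration and autoscaling meet.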
Probes aren’t required for Kubernetes autoscaling. However, their proper use can inform the autoscaling process and validate that affected containers have, in fact, started or shut down, helping autoscaling events complete faster and more efficiently. The correct use and configuration of startup, liveness and readiness probes is crucial here. Why? If probes aren’t returning successful responses within a reasonable period of time, additional pods might be added or removed to meet apparent demand, when in fact they aren’t needed once the probes return successfully and mark the first set of pods as ready.
Defining Parameters and Probe Configuration
Kubernetes provides several parameters for startup, readiness and liveness probes that can be adjusted to fine-tune the probe configuration.
Each parameter has a default value, but specific situations in our environment may require different values. For example, a probe using default values might not provide specific-enough information to understand why an application is starting slowly. Alternatively, default values might capture and generate too much information, making it difficult to draw helpful conclusions.
We can configure all of the parameters outlined below, each of which is valid for all three Kubernetes probe types.
timeoutSeconds reflects the number of seconds after which the probe times out. The default is one second, meaning the container has one second to respond to the probe request. A slower container might not have enough time to respond, which can result in the container being marked as failed and terminated. This parameter’s value may need to be increased (perhaps to several seconds) if we want to wait longer than usual before marking an unresponsive container as failed.
periodSeconds reflects how often (in seconds) to perform the probe check. As with the timeoutSeconds parameter, it’s essential to configure periodSeconds accurately. If the probe check occurs too frequently, it might saturate the application workload; if it doesn’t occur frequently enough, we might not find out that an application is failing.
failureThreshold reflects the number of failed probes tolerated before the container is flagged as failed. The default threshold is three, meaning a container would be flagged as failed when it misses three consecutive probes, assuming periodSeconds is configured with its default value.
initialDelaySeconds reflects the time it takes for the probes to start signaling after the container has started successfully. The default is zero, meaning a probe would send signals immediately after a successful container startup.
In cases of slower-starting containers or applications, it might be preferable to set this delay to a higher value. Initially, Kubernetes only offered readiness and liveness probes. While this generally worked fine, there were situations where the probes generated errors because the application was not yet ready, even though the container was running fine. This was also why the startup probe was introduced: to validate that the container was starting up without immediately checking on the health of the application itself.
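For a slow-starting container, the startup probe’s failureThreshold multiplied by periodSeconds defines the total startup budget before the kubelet gives up on the container. A sketch with a hypothetical endpoint and illustrative values:

```yaml
startupProbe:
  httpGet:
    path: /health              # hypothetical endpoint
    port: 8080
  periodSeconds: 10
  failureThreshold: 30         # up to 30 x 10 = 300 seconds to start
```

Until this probe succeeds, the liveness and readiness probes stay disabled, so the application can’t be killed simply for being slow to start.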
successThreshold reflects the number of consecutive positive probe signals required to consider the container in a successful state. The default is one, meaning a single positive signal from the probe designates the container status as successful. If we don’t want to rely on just a single pulse from the probe to confirm the healthy state of the container, we can raise this value. Note, however, that Kubernetes requires successThreshold to be 1 for liveness and startup probes, so higher values apply only to readiness probes.
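Bringing these parameters together, a tuned readiness probe might look like the following (the endpoint and values are illustrative, not recommendations):

```yaml
readinessProbe:
  httpGet:
    path: /health              # hypothetical endpoint
    port: 8080
  initialDelaySeconds: 10      # wait 10 seconds after container start before probing
  periodSeconds: 5             # probe every 5 seconds
  timeoutSeconds: 3            # allow 3 seconds for a response
  failureThreshold: 3          # mark unready after 3 consecutive failures
  successThreshold: 2          # require 2 consecutive successes to mark ready
```

A successThreshold greater than 1 is only valid for readiness probes, which is why this example uses one.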
The Value of Probe Experimentation
Given the complexity of managing and running Kubernetes containers, it can be challenging to determine the “absolutely correct” values for the aforementioned parameters. We must understand how fast our application starts up and how it behaves under load to decide what the various probe settings should be in order to meet SLAs or SLOs. The combination of parameters and settings, and their interactions with each other, has to be considered, which means that probe tuning isn’t an exact science.
That’s where experimentation comes into the picture. Typically performed in a test environment, probe experimentation allows us to validate different parameter settings and understand how they might affect the behavior of the Kubernetes pods. It also helps us understand the overall health of our containers, applications and clusters.
By using the probe experimentation process to run multiple tests across different scenarios, we can improve the accuracy of our probe parameter settings.
Configuring Probes in Kubernetes
Probes and their corresponding parameters are all configured via a Kubernetes YAML file, similar to how other Kubernetes resources are deployed.
Here’s an example of the container section of such a YAML file (the image name is a placeholder):

containers:
- name: probe-app
  image: example/probe-app:latest  # placeholder image
  ports:
  - containerPort: 8080
  readinessProbe:
    httpGet:
      path: /health
      port: 8080
    initialDelaySeconds: 5
    timeoutSeconds: 1
  livenessProbe:
    httpGet:
      path: /health
      port: 8080
    initialDelaySeconds: 15
    timeoutSeconds: 1
In this example, the container listens on port 8080 and has a /health endpoint built into the web application that’s used for the health check. It helps to validate the application’s readiness and liveness state.
The readiness probe sends an HTTP GET request to this endpoint, with a 5-second initial delay and a 1-second timeout. The container is considered ready if the endpoint returns a successful response (HTTP 200) within the given time.
The liveness probe works similarly, but it’s used to check whether the container is still running and responding to requests. In this example, it has a 15-second initial delay and a 1-second timeout. If the liveness probe fails, Kubernetes will restart the container to try to recover it.
Both the readiness and liveness probes help ensure that our application runs properly and can handle requests from other parts of our Kubernetes cluster.
The Value of Machine-Learning-Based Experimentation
There’s no golden rule when selecting the correct parameters or their values for these probe settings. We could start tuning and testing different probe values using a manual approach, validating the impact on container behavior and the autoscaling aspect of running containers based on the probes.
However, this manual approach would be a tedious, time-consuming and potentially expensive process. It’s also likely that we couldn’t collect enough data points to make accurate configuration decisions within a reasonable amount of time.
The key to making these confident configuration decisions is to leverage automation and ML-based experimentation tooling to enable us to gather enough supporting data points about how the various configuration combinations affect and enable desired probe behavior. Tools such as StormForge Optimize Pro shorten the experimentation time by orders of magnitude while collecting many more data points that help us see how to converge on the behavior we want.
In this article, we explored the important role of health checks for validating containers and how Kubernetes probes enable us to perform these checks. They include the startup probe, used to validate the starting-up sequence of a container workload, and the readiness and liveness probes that periodically perform diagnostic tests to help us understand the health of our running containers and applications.
To effectively use Kubernetes probes, we need to experiment with different probe parameters and conduct multiple tests that consider real-world application or container environment conditions, including peak load, slow start and downsizing. The more accurately we can tune our probe parameters, the better our probe checks, autoscaling capabilities and overall Kubernetes application performance will be. However, manual experimentation is tedious and time consuming. With today’s complex application environments, leveraging automation and ML-based experimentation tools will help us find the right probe settings much faster and with greater accuracy.