Scale Applications in Kubernetes with Kubectl and the Horizontal Pod Autoscaler

A tutorial on scaling applications and services in Kubernetes with Kubectl and the Horizontal Pod Autoscaler.
Oct 22nd, 2019 8:44am by
Matt Zand
Matt Zand is the founder of High School Technology Services, DC Web Makers and Coding Bootcamps. He has written extensively on advanced topics on web design, mobile app development and blockchain. He is a senior editor at Touchstone Words where he writes and reviews coding and technology articles. He is also a senior instructor and developer living in Washington DC.

Kubernetes has transformed the way software is developed. As a portable, extensible, open source platform for managing containerized workloads and services that facilitates both declarative configuration and automation, Kubernetes has proven itself to be a dominant player for managing complex microservices. Its popularity stems from the fact that it meets the following needs: businesses want to grow and pay less, DevOps teams want a stable platform that can run applications at scale, and developers want reliable and reproducible flows to write, test and debug code. Here is a good article to learn more about Kubernetes evolution and architecture.

However, have you given some thought to how to run such a powerful container orchestration platform while using only the resources that you actually need? The key to optimal resource utilization is knowing which applications need to be scaled and when. In this article, we discuss the most popular ways of scaling Kubernetes workloads, focusing on two tools: kubectl and the Horizontal Pod Autoscaler.

I. Kubectl in Kubernetes

The usual way to interact with Kubernetes on a daily basis is through a command-line tool called kubectl. Kubectl is primarily used to communicate with the Kubernetes API server to create, update and delete workloads within Kubernetes. Here we provide an overview of some common commands that make a good starting point for managing Kubernetes.

The majority of common kubectl commands name a specific operation or action to perform, such as create or delete. This usually involves interpreting a file (either YAML or JSON) that describes an object within Kubernetes (a Pod, Service, resource, etc.). These files serve as templates, as well as ongoing documentation of the environment, and help retain Kubernetes’ focus on declarative configuration. The operations given on the command line are passed to the API server which, in turn, communicates with the backend services within Kubernetes as necessary. To install kubectl, use the commands in the following table:



Linux:
    stable=$(curl -s [URL elided])
    curl -LO [URL elided]${stable}/bin/linux/amd64/kubectl
    chmod +x ./kubectl
    sudo mv ./kubectl /usr/local/bin/kubectl

macOS:
    brew install kubernetes-cli

Note: The best version of kubectl for Windows will change over time as new versions are released. To find the best current binary, follow this link and adjust the above URL as necessary.

Kubectl Syntax

Kubectl commands follow the general form kubectl [command] [TYPE] [NAME] [flags], where:

  • Command: The operation you want to perform (create, delete, etc.)
  • Type: The resource type you are performing the command against (Pod, Service, etc.)
  • Name: The case-sensitive name of the object. If you don’t specify a name, you get information about all of the resources your command matches (all Pods, for example)
  • Flags: These are optional but are useful when looking for specific resources. For example, --namespace allows you to specify a particular namespace to perform an operation in
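
As a minimal sketch of how the four parts fit together, the helper below assembles one command in the order command, type, name, flags. The function name show_pod, the Pod name web and the namespace staging are illustrative assumptions, not names used elsewhere in this article.

```shell
# General form: kubectl [command] [TYPE] [NAME] [flags]
# show_pod is a hypothetical helper; its arguments are placeholders.
show_pod() {
  pod_name="$1"               # NAME: the case-sensitive object name
  namespace="${2:-default}"   # value for the optional --namespace flag
  # command=get, TYPE=pod, NAME=$pod_name, flags=--namespace and -o wide
  kubectl get pod "$pod_name" --namespace "$namespace" -o wide
}
```

Calling show_pod web staging runs kubectl get pod web --namespace staging -o wide. Leaving out the second argument falls back to the default namespace.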

Kubectl Operations

Use the following set of examples to help you familiarize yourself with running the commonly used kubectl operations:
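
One possible walk-through of these operations is sketched below, wrapped in a function so that nothing runs until you call it. The function name kubectl_tour and its arguments (a manifest file, a Deployment name and a Pod name) are assumptions made for illustration; substitute the objects in your own cluster.

```shell
# Hypothetical tour of common kubectl operations.
# $1 = manifest file, $2 = Deployment name, $3 = Pod name (all placeholders).
kubectl_tour() {
  manifest="$1"; deployment="$2"; pod="$3"
  kubectl apply -f "$manifest"                # create or update objects from a file
  kubectl get pods                            # list Pods in the current namespace
  kubectl describe deployment "$deployment"   # detailed state and recent events
  kubectl logs "$pod"                         # print the Pod's container logs
  kubectl scale deployment "$deployment" --replicas=3   # manually set the replica count
  kubectl delete -f "$manifest"               # remove the objects defined in the file
}
```

For example, kubectl_tour app.yaml web web-abc applies app.yaml, inspects the resulting objects, scales the web Deployment to three replicas by hand and finally deletes everything the file defined.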

These are the common operations used in kubectl. For more details, see the official kubectl guide. Here is a good article for reading more on similar Kubernetes topics.

II. Horizontal Pod Autoscaler

The Horizontal Pod Autoscaler (HPA) is a Kubernetes feature that allows you to configure your cluster to automatically scale the services it is running up or down. The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The controller periodically adjusts the number of replicas in a replication controller or deployment so that the observed average CPU utilization matches the target specified by the user.
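
A minimal sketch of creating such a resource from the command line uses kubectl autoscale. The helper name create_cpu_hpa and the example values are assumptions for illustration, and the sketch assumes the Deployment's containers declare CPU requests so that utilization can be computed.

```shell
# Hypothetical helper: create an HPA for an existing Deployment.
# $1 = Deployment name, $2 = target average CPU utilization (percent),
# $3 = minimum replicas, $4 = maximum replicas.
create_cpu_hpa() {
  deployment="$1"; target_cpu="$2"; min="$3"; max="$4"
  kubectl autoscale deployment "$deployment" \
    --cpu-percent="$target_cpu" --min="$min" --max="$max"
}
```

For instance, create_cpu_hpa web 50 2 10 asks the controller to keep average CPU utilization near 50% while holding the web Deployment between 2 and 10 replicas.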

The HPA is implemented as a control loop, with a period controlled by the controller manager’s --horizontal-pod-autoscaler-sync-period flag (with a default value of 30 seconds in older releases and 15 seconds in more recent ones).

During each period, the controller manager queries the resource utilization against the metrics specified in each HPA definition. The controller manager obtains the metrics from either the resource metrics API (for per-pod resource metrics), or the custom metrics API (for all other metrics).

  • For per-pod resource metrics (like CPU), the controller fetches the metrics from the resource metrics API for each pod targeted by the HPA. Then, if a target utilization value is set, the controller calculates the utilization value as a percentage of the equivalent resource request on the containers in each pod. If a target raw value is set, the raw metric values are used directly. The controller then takes the mean of the utilization or the raw value (depending on the type of target specified) across all targeted pods, and produces a ratio used to scale the number of desired replicas.
  • For per-pod custom metrics, the controller functions similarly to per-pod resource metrics, except that it works with raw values, not utilization values.
  • For object metrics, a single metric is fetched (which describes the object in question), and compared to the target value, to produce a ratio as above.
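
The ratio described above can be sketched with integer shell arithmetic. The Kubernetes documentation gives the rule as desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]; the function names below are assumptions, and the real controller also handles missing metrics and not-yet-ready Pods, which this sketch ignores.

```shell
# Utilization of one Pod as a percentage of its resource request.
# Both arguments use the same unit (for example CPU millicores).
# Integer division truncates any fractional percent.
utilization_percent() {
  usage="$1"; request="$2"
  echo $(( usage * 100 / request ))
}

# desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
# Adding (targetMetric - 1) before dividing rounds the result up.
desired_replicas() {
  current_replicas="$1"; current_metric="$2"; target_metric="$3"
  echo $(( (current_replicas * current_metric + target_metric - 1) / target_metric ))
}
```

With 4 replicas averaging 100% utilization against a 50% target, desired_replicas 4 100 50 prints 8. With 3 replicas at 90% against the same target, the exact result is 5.4, which rounds up to 6.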

The HorizontalPodAutoscaler controller can fetch metrics in two different ways: direct Heapster access, and REST client access. When using direct Heapster access, the HorizontalPodAutoscaler queries Heapster directly through the API server’s service proxy subresource. Heapster needs to be deployed on the cluster and running in the kube-system namespace. Note that Heapster has since been deprecated in favor of metrics-server, so REST client access through the metrics APIs is the option to expect on current clusters.

The HPA workflow consists of the following four steps:

  1. HPA continuously checks the metric values you configured during setup, at the sync interval (30 seconds by default in older releases)
  2. HPA attempts to increase the number of pods if the specified threshold is met
  3. HPA updates the number of replicas inside the deployment or replication controller
  4. The deployment or replication controller then rolls out any additional pods that are needed
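
One way to observe these steps on a running cluster is sketched below. The helper name inspect_hpa and its two arguments are assumptions for illustration; the kubectl subcommands themselves are standard.

```shell
# Hypothetical helper: show what the HPA sees and what it has changed.
# $1 = HPA name, $2 = name of the Deployment it targets.
inspect_hpa() {
  hpa_name="$1"; deployment="$2"
  kubectl get hpa "$hpa_name"            # current vs. target metric, min/max, replicas
  kubectl describe hpa "$hpa_name"       # conditions and recent scaling events
  kubectl get deployment "$deployment"   # replica count written by the HPA
}
```

Running inspect_hpa web web before and after generating load shows the replica count on the Deployment following the value the HPA computed.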

Consider the following as you roll out HPA:

  • The default HPA check interval is 30 seconds (15 seconds in more recent Kubernetes releases). This can be configured through the --horizontal-pod-autoscaler-sync-period flag of the controller manager.
  • Default HPA relative metrics tolerance is 10%.
  • HPA waits for three minutes after the last scale-up event to allow metrics to stabilize. This can also be configured through the --horizontal-pod-autoscaler-upscale-delay flag.
  • HPA waits for five minutes from the last scale-down event to avoid autoscaler thrashing. This is configurable through the --horizontal-pod-autoscaler-downscale-delay flag. More recent releases replace these delay flags with --horizontal-pod-autoscaler-downscale-stabilization.
  • HPA works best with deployment objects or Pod metrics as opposed to replication controllers. It does not work with a rolling update that directly manipulates replication controllers; it depends on the deployment object to manage the size of the underlying replica sets when you do a deployment.
  • When using HPA with custom metrics such as Pod metrics or object metrics, you can use multiple metrics simultaneously to determine whether it is time to scale up or down. Bear in mind that Kubernetes evaluates each metric in turn and uses the largest replica count that any of them proposes. Check out https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale for more examples.
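
Two of the points above, the 10% tolerance and the handling of multiple metrics, can be sketched in shell arithmetic. The function names are assumptions, and the sketch simplifies the real controller: it only checks whether a metric is close enough to its target to skip scaling, and picks the largest of several proposed replica counts.

```shell
# Succeeds (exit status 0) when the metric is within tolerance of its target,
# that is, when |current - target| / target <= tolerance_percent / 100.
# $3 is optional and defaults to the 10% tolerance mentioned above.
within_tolerance() {
  current="$1"; target="$2"; tolerance_percent="${3:-10}"
  diff=$(( current - target ))
  [ "$diff" -lt 0 ] && diff=$(( -diff ))
  [ $(( diff * 100 )) -le $(( target * tolerance_percent )) ]
}

# Each metric proposes a replica count; the largest proposal is used.
largest_proposal() {
  largest=0
  for proposal in "$@"; do
    [ "$proposal" -gt "$largest" ] && largest="$proposal"
  done
  echo "$largest"
}
```

With a 50% target, within_tolerance 54 50 succeeds, so no scaling would happen, while within_tolerance 56 50 fails and a new replica count would be computed. If three metrics propose 3, 7 and 5 replicas, largest_proposal 3 7 5 prints 7.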


In this article, we discussed two main tools for scaling Kubernetes applications. We saw how to install kubectl and use operations such as apply, get, delete, describe and logs. We also reviewed how the Horizontal Pod Autoscaler works and why it matters for Kubernetes services. Both kubectl and HPA are essential when it comes to scaling a microservice application.

From here, you can move on to other Kubernetes topics such as updating live containers with a rolling update, working with configuration files, moving from a monolith to microservices, integrating with Jenkins, working with a private Docker registry, or setting up and building a continuous delivery pipeline. Here is a good article for learning more advanced topics on Kubernetes development.

Feature image via Pixabay.

TNS owner Insight Partners is an investor in: Docker.