Logging for Kubernetes: What to Log and How to Log It
No matter what your Kubernetes environment looks like, logging and monitoring are among the first major challenges you’ll need to address as you begin your Kubernetes journey. Whether you have a single-node cluster running locally on your PC, a small collection of nodes that hosts a development environment, or a large-scale, multi-master cluster hosting production applications, it’s critical to be able to access monitoring data and logs in order to troubleshoot problems and optimize performance.
Given that Kubernetes consists of so many different parts, knowing where to look for that data and how to interpret it can be tricky. Indeed, Kubernetes logging and monitoring require working with multiple sources of data and multiple tools, because Kubernetes generates logs in multiple ways.
This article provides an overview of logging for Kubernetes. It covers which types of log data are available in Kubernetes and how to access that data. As we’ll see, there are multiple access methods available for most types of Kubernetes log data. This article explains how to assess your logging needs and devise a logging strategy suited to them.
Most of the information below applies to any type of Kubernetes environment. However, to ground the discussion, we’ll reference IBM Cloud Kubernetes Service (IKS) where relevant — to explain how the tools and practices we discuss can apply to a real-world production Kubernetes service.
What Is Kubernetes?
If you’re reading this article about Kubernetes logging, you probably already know what Kubernetes is. However, it’s worth briefly spelling out what it does, because understanding what Kubernetes can and can’t do is the first step in extending your logging strategy to support it.
Kubernetes offers several key types of functionality:
- Application hosting: Kubernetes’s first and foremost feature is hosting applications. Typically, those applications are hosted in containers, although it’s possible to run other types of workloads (such as virtual machines) with Kubernetes.
- Load balancing: Kubernetes automatically distributes traffic between different application instances to optimize performance and availability.
- Storage management: Kubernetes can manage access to storage pools that applications use to store stateful data.
- Self-healing: When something goes wrong, such as an application failure, Kubernetes attempts to fix it automatically. It doesn’t always succeed, however, which is one reason why Kubernetes logging is so important.
Kubernetes does other things too, but these are the core areas of functionality it offers.
Kubernetes and Logging: It’s Complicated
You’ll notice that logging and monitoring aren’t on the list of core Kubernetes features. That’s not because Kubernetes doesn’t offer any kind of logging and monitoring functionality. It does, but it’s complicated.
On the one hand, Kubernetes offers some very basic functions through kubectl for checking on the status of objects in a cluster, which we’ll discuss below. It also creates logs for certain types of data, and it exposes other types of data in ways that make it available for collection through third-party logging tools.
On the other hand, Kubernetes offers no full-fledged, native logging solution. Amazon Web Services has a built-in logging solution in the form of CloudWatch, and OpenStack has its own comprehensive logging solution. Stock Kubernetes, by contrast, has neither a complete native logging service nor a preferred third-party logging method. Instead, it expects you to use external tools to collect and interpret log data.
That said, certain Kubernetes distributions do come with built-in logging extensions based on third-party tooling, or at least a preferred logging method that they support. For example, as we’ll see below, IBM Cloud Kubernetes Service (IKS) integrates with IBM Log Analysis with LogDNA to collect Kubernetes log data and to enable real-time analysis and log management.
In most cases, it’s possible to use an alternative logging method, even on a Kubernetes distribution that has a preferred or natively integrated logging solution. However, the vendor-supported approach is usually simpler to implement.
What to Log in Kubernetes
No matter which logging option you choose, there are several log data types that you can collect in Kubernetes.
First and foremost are the logs from the applications that run on Kubernetes. The data stored in these logs consists of the information that your applications output as they run. Typically, this data is written to stdout (and stderr) inside the container where the application runs.
We’ll look at how to access this data in the “Viewing Application Logs” section below.
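To make the stdout convention concrete, here is a minimal sketch of an application that writes one JSON object per log line to stdout, using only Python's standard logging module. The names JsonFormatter and build_logger and the choice of fields are assumptions made for illustration; they are not part of Kubernetes or of any tool mentioned in this article.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        # Field names here are illustrative; pick whatever your log tool expects.
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            }
        )


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to stdout (or to `stream`)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False  # keep the root logger from printing a second copy
    return logger


if __name__ == "__main__":
    log = build_logger("checkout")
    log.info("order accepted")
```

Because the output goes to stdout rather than to a file inside the container, it is available to kubectl and to external collectors without extra configuration in the application itself.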
Kubernetes Cluster Logs
Several of the components that form Kubernetes itself, including the Kube-apiserver, Etcd, and Kubelet, generate their own logs.
These logs are usually stored in files under the /var/log directory of the server on which the service runs. For most services, that server is the Kubernetes master node. Kubelet, however, runs on worker nodes.
If you’re experiencing a cluster-level problem (as opposed to one that impacts just a certain container or pod), these logs are a good place to look for insight. For example, if your applications are having trouble accessing configuration data, you could look at Etcd logs to see if the problem lies with Etcd. If a worker node is failing to come online as expected, its Kubelet log could provide insights.
Kubernetes Events
Kubernetes keeps track of what it calls “events,” which can be normal changes to the state of an object in a cluster (such as a container being created or starting) or errors (such as the exhaustion of resources).
Events provide only limited context and visibility. They tell you that something happened, but not much about why it happened. They are still a useful way of getting quick information about the state of various objects within your cluster.
Kubernetes Audit Logs
Kubernetes can be configured to log requests to the Kube-apiserver. These include requests made by humans (such as requesting a list of running pods) and Kubernetes resources (such as a container requesting access to storage).
Audit logs record who or what issued the request, what the request was for, and the result. If you need to troubleshoot a problem related to an API request, audit logs provide a great deal of visibility. They are also useful for detecting unusual behavior by looking for requests that are out of the ordinary, like repeated failed attempts by a user to access different resources in the cluster, which could signal attempted abuse by someone who is looking for improperly secured resources. (It could also reflect a problem with your authentication configuration or certificates.)
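The "repeated failed attempts" check described above can be sketched in a few lines of Python. This sketch assumes the audit log is available as JSON lines in the audit.k8s.io/v1 event format (with stage, user.username, and responseStatus.code fields). The function name failed_requests_by_user and the threshold of three failures are hypothetical choices for illustration.

```python
import json
from collections import Counter


def failed_requests_by_user(lines, min_failures=3):
    """Count 401/403 responses per user and return users at or above the threshold.

    `lines` is any iterable of JSON-encoded audit events, one per line.
    """
    failures = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only completed requests carry the final response status.
        if event.get("stage") != "ResponseComplete":
            continue
        code = event.get("responseStatus", {}).get("code")
        if code in (401, 403):
            user = event.get("user", {}).get("username", "<unknown>")
            failures[user] += 1
    return {user: count for user, count in failures.items() if count >= min_failures}
```

A result such as a single user with many 403 responses across different resources is a prompt to investigate, not proof of abuse. As noted above, a misconfigured credential produces the same pattern.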
How to Access Kubernetes Log Data
The various types of log data described above can be accessed in different ways.
Viewing Application Logs
There are two main ways to interact with application log data. The first is to run a command like
kubectl logs pod-name
where “pod-name” is the name of the pod that hosts the application whose logs you want to access.
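In practice you will often want to narrow the output. The sketch below builds a kubectl logs invocation with a few commonly used options (-n, -c, --tail, --since, and --previous) and runs it through Python's subprocess module. The helper names kubectl_logs_command and fetch_logs are hypothetical, and running fetch_logs assumes kubectl is installed and configured for your cluster.

```python
import subprocess


def kubectl_logs_command(pod, namespace="default", container=None,
                         tail=None, since=None, previous=False):
    """Build the argument list for `kubectl logs` without running it."""
    cmd = ["kubectl", "logs", pod, "-n", namespace]
    if container:
        cmd += ["-c", container]          # pick one container in a multi-container pod
    if tail is not None:
        cmd.append(f"--tail={tail}")      # only the last N lines
    if since:
        cmd.append(f"--since={since}")    # e.g. "1h" or "15m"
    if previous:
        cmd.append("--previous")          # logs from the previous container instance
    return cmd


def fetch_logs(pod, **options):
    """Run the command and return its output; needs kubectl and cluster access."""
    result = subprocess.run(
        kubectl_logs_command(pod, **options),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, fetch_logs("pod-name", tail=100, since="1h") would return at most the last 100 lines written during the past hour.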
The kubectl method is useful for a quick look at log data. If you want to store logs persistently and analyze them systematically, you’re better served by using an external logging tool like IBM Log Analysis with LogDNA to collect and interpret the logs. The easiest way to go about this is to run a so-called sidecar container, which runs alongside the application, collects its logs, and makes them available to an external logging tool. On IKS, you can set up a LogDNA instance to perform this function for application logs (as well as for logs associated with Kubernetes itself) from the command line or by following a few steps in the IKS Web Console. For full instructions, check out the IBM Cloud documentation.
Viewing Cluster Logs
There are multiple ways of viewing cluster logs. You can simply log into the server that hosts the log you want to view (as noted above, that’s the Kubernetes master node server in most cases) and open the individual log files directly in a text editor, less, cat, or whatever command-line tool you prefer. Or, if the component runs as a systemd service (as Kubelet commonly does), you can use journalctl to retrieve and display its logs.
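When you read these logs by hand, most of the lines are informational. The following sketch filters for warnings and errors, assuming the component writes the klog-style text format that Kubernetes components commonly use (a severity letter I, W, E, or F, then the date, time, process ID, source location, and message). The function name filter_by_severity is hypothetical, and the sample lines in the usage note are made up.

```python
import re

# One klog-style line, for example:
#   E0115 10:23:46.000001    1234 sync.go:42] example failure message
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])\d{4} "            # severity letter, then month and day
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"    # time with fractional seconds
    r"\d+ "                                   # process or thread id
    r"(?P<source>[^\]]+)\] "                  # source file and line
    r"(?P<message>.*)$"
)


def filter_by_severity(lines, severities="WEF"):
    """Return (severity, source, message) for lines at the given severities.

    Lines that do not match the klog layout are skipped.
    """
    matches = []
    for line in lines:
        m = KLOG_LINE.match(line.rstrip("\n"))
        if m and m.group("severity") in severities:
            matches.append((m.group("severity"), m.group("source"), m.group("message")))
    return matches
```

You could feed this function the lines of a log file, or the output of journalctl run with -o cat so that only the message text is printed. Both the file path and the systemd unit name depend on how your cluster was installed.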
The most user-friendly solution is again to use an external logging tool like IBM Log Analysis with LogDNA. As noted above, IBM Cloud’s integration with LogDNA makes it easy to collect Kubernetes cluster logs and application logs and analyze them through a centralized interface, without having to worry about the tedious process of collecting individual logs from each of your nodes through the command line.
Viewing Events
You can view Kubernetes event data through kubectl with a command like
kubectl get events -n default
where the “-n” flag specifies the namespace whose events you want to view (default in the example above). The command
kubectl describe pod my-pod
will show you events data for a specific pod.
Because the context of events data is limited, you may not find it very useful to log all events. However, you can always redirect the CLI output from kubectl into a log file and then analyze it with a log analysis tool.
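If you do decide to capture events, filtering them first keeps the noise down. The sketch below asks kubectl for events as JSON (the -o json output option) and keeps only those of type Warning. The function names warning_events and fetch_events are hypothetical, and fetch_events assumes kubectl is installed and configured for your cluster.

```python
import json
import subprocess


def warning_events(payload):
    """Return (object, reason, message) for each Warning event in an event list."""
    rows = []
    for item in payload.get("items", []):
        if item.get("type") != "Warning":
            continue  # skip Normal events such as container starts
        obj = item.get("involvedObject", {})
        target = f"{obj.get('kind', '?')}/{obj.get('name', '?')}"
        rows.append((target, item.get("reason", ""), item.get("message", "")))
    return rows


def fetch_events(namespace="default"):
    """Run `kubectl get events -o json`; needs kubectl and cluster access."""
    result = subprocess.run(
        ["kubectl", "get", "events", "-n", namespace, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

Calling warning_events(fetch_events("default")) returns a list that you can print, append to a file, or hand to a log analysis tool.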
Viewing Audit Logs
To a greater extent than is the case for other types of Kubernetes log data, the way you view and manage audit logs varies significantly depending on which Kubernetes distribution you use and which log collector you want to use to collect these logs. There is no generic and straightforward way to collect audit logs directly from kubectl.
On IKS, audit events are routed to a webhook URL. From there, you can collect logging data with IKS’s native LogDNA integration by following the instructions in the IBM Cloud documentation.
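To give a sense of what sits at the other end of an audit webhook, here is a minimal sketch of a receiver. It assumes the generic Kubernetes webhook backend behavior of POSTing batches of events as a JSON EventList with an items array. It is not the IKS or LogDNA implementation, and the names summarize_event_list, AuditHandler, and serve are hypothetical. A real receiver would also need TLS and authentication.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_event_list(body):
    """Turn an EventList payload into one short summary line per event."""
    payload = json.loads(body)
    summaries = []
    for event in payload.get("items", []):
        user = event.get("user", {}).get("username", "<unknown>")
        verb = event.get("verb", "?")
        uri = event.get("requestURI", "?")
        code = event.get("responseStatus", {}).get("code", "-")
        summaries.append(f"{user} {verb} {uri} -> {code}")
    return summaries


class AuditHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender announced.
        length = int(self.headers.get("Content-Length", 0))
        for line in summarize_event_list(self.rfile.read(length)):
            print(line, flush=True)  # stdout, so a log collector can pick it up
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Listen for audit batches until interrupted (plain HTTP, for local testing only)."""
    HTTPServer(("", port), AuditHandler).serve_forever()
```

Printing the summaries to stdout keeps the receiver consistent with the application logging convention described earlier, so the same collector can handle both.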
How to Build a Kubernetes Logging Solution
As we saw above, there are multiple types of log data available in Kubernetes, and there are various approaches for accessing them. Devising the Kubernetes logging strategy and toolset that work best for you requires weighing several factors:
- Which types of Kubernetes logs do you need to collect? Some types of data may be more or less important to you. For example, if you run simple applications that don’t generate meaningful monitoring data, application logs may be less important than Kubernetes cluster logs.
- What are your logging goals? If you just want to view a log file quickly, kubectl (or journalctl, in certain cases) will do the job. But, if you need longer-term log management, you’ll want an external log collector.
- Do you need visualizations? When you access a Kubernetes log file, are you looking just for specific pieces of information, like the source of an API request? Or do you want to be able to visualize data to recognize trends or compare logs? In the latter case, an external tool or combination of tools that provide log collection and visualization is needed.
- Do you need to aggregate logs? Is your goal to access individual log files, or do you want to aggregate multiple logs together and analyze them collectively? You need external tools to do the latter.
- How long do you need to retain logs? Most of the log data stored within Kubernetes is deleted after a certain period of time, which varies depending on the type of log and your logging configurations. Therefore, if you want to hold onto historical log data for the long term, you’ll need to export it to an external logging platform.
There is a lot of nuance surrounding logging in Kubernetes. Although Kubernetes offers some basic built-in logging and monitoring functionality, it’s a far cry from a full-fledged logging solution. To get the most out of Kubernetes logging, you’ll need an external log collection, analysis, and management tool like LogDNA — which, as noted above, is very easy to set up on Kubernetes distributions like IKS, where it is one of the officially supported logging solutions.
This post is part of a larger series that explores the difference between logging for Kubernetes and logging for Red Hat OpenShift.