How to Give Kubernetes Immunity from Privilege Escalation

This is part of a series of contributed articles leading up to KubeCon + CloudNativeCon on Oct. 24-28.
Security across virtualized and physical infrastructure environments has been refined over the past 20 years into a highly sophisticated craft, one that enables precise control over the access and availability of application workloads. The advent of Kubernetes, the distributed container orchestration platform that sits atop this infrastructure, introduced new challenges that required innovative methods to ensure the same level of security across the digital estate.
One of the pillars of cybersecurity is application isolation through various forms of confinement, whether by logical or physical separation. Kubernetes leverages this practice, along with the ability to dynamically control the placement of application workloads, to enable security patterns that were previously not thought possible. One such pattern that has become widely adopted within Kubernetes is multitenant architecture.
Traditional Segmentation
Historically, this was tackled by isolating physical components such as switches, compute and storage devices. Now, with logical isolation boundaries and strict role-based access control (RBAC) per workload, Kubernetes constructs such as namespaces can assist in creating a shared environment that still allows for privacy and security. When augmented by additional components like advanced container network interface (CNI) and container storage interface (CSI) providers, network transport and data at rest can also be confined to their workload application's purview.
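As a minimal sketch of namespace-based tenancy (all "tenant-a" names here are hypothetical), a namespaced Role and RoleBinding grant a tenant's team rights only within its own namespace:

```yaml
# Minimal sketch of namespace-scoped tenancy; all "tenant-a"
# names are hypothetical.
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
---
# Role: rights exist only inside the tenant-a namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-a-developer
  namespace: tenant-a
rules:
- apiGroups: ["", "apps"]
  resources: ["pods", "services", "deployments"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
---
# RoleBinding: attach the role to the tenant's team; the group
# name is an assumption about your identity provider.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-developers
  namespace: tenant-a
subjects:
- kind: Group
  name: tenant-a-devs
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-a-developer
  apiGroup: rbac.authorization.k8s.io
```

Because the Role is namespaced rather than cluster-wide, a compromised tenant credential has no standing anywhere else in the cluster.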
The benefits of a segmented architecture are twofold: higher utilization per physical unit through increased workload density, and homogeneous configuration management. The latter is very important from a security perspective: if the system is constrained to a multitenant Kubernetes cluster, RBAC policies, entitlements (ACLs) and security tooling can be applied uniformly. Furthermore, confining an environment by default benefits the security posture of Kubernetes itself. A practical example is running the Kubernetes control plane on heavily locked-down and regulated nodes, disabling port access and privileged operations.
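A simple way to express "confined by default" inside a tenant namespace is a default-deny NetworkPolicy. This is a sketch and assumes a CNI provider that actually enforces NetworkPolicy objects (not all do):

```yaml
# Default-deny for the tenant namespace: with no other policies,
# pods can neither receive nor send traffic. Assumes a CNI
# provider that enforces NetworkPolicy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: tenant-a
spec:
  podSelector: {}        # selects every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
```

Traffic then has to be opened deliberately, one narrowly scoped policy at a time.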
While these patterns are becoming commonplace thanks to improvements in Kubernetes' security posture and overall awareness, bad actors are increasingly looking for weak entry points within the system. That weak link is often the Linux host operating system, which many operators neglect due to its complexity and a lack of domain-specific knowledge.
The Scary Reality of Container Escape
Containerized applications have given people the false impression of total workload isolation. However, the reality is much different. From Felix Wilhelm's initial 2019 tweet outlining how abuse of the cgroups release_agent feature could lead to a breakout from a privileged Docker container or Kubernetes pod, container escape has become one of the most widely abused security vulnerabilities in attacks against DevOps tools.
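To make that concrete, the following sketch shows the kind of pod specification that enables the release_agent breakout; the pod name and image are illustrative, and the point is what not to ship:

```yaml
# DANGEROUS, shown for illustration only. privileged: true grants
# CAP_SYS_ADMIN and device access, which lets a process inside the
# container mount the cgroup v1 filesystem read-write and register
# a release_agent that the *host* kernel executes as root.
apiVersion: v1
kind: Pod
metadata:
  name: risky-pod          # hypothetical name
spec:
  containers:
  - name: app
    image: alpine:3.18     # illustrative image
    command: ["sleep", "infinity"]
    securityContext:
      privileged: true     # the flag that makes the breakout possible
```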
Container escape can be extremely dangerous. Once an attacker accesses the host system, they can escalate their privileges to access other containers running on the machine or run harmful code on the host. Depending on how vulnerable the host is, the attacker could also reach other hosts on the network.
There are many avenues bad actors can take to carry out container escape attacks. However, they all boil down to two root causes: misconfiguration and excessive privileges.
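Excessive privileges, in particular, can be tackled declaratively. The following is a minimal least-privilege sketch, not a one-size-fits-all baseline; the image and names are hypothetical, and the exact settings depend on what the application actually needs:

```yaml
# Least-privilege sketch; image and names are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0   # hypothetical image
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false     # no setuid/setcap escalation
      runAsNonRoot: true                  # refuse to start as UID 0
      readOnlyRootFilesystem: true        # no tampering with the image
      capabilities:
        drop: ["ALL"]                     # drop CAP_SYS_ADMIN and friends
```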
While traditional countermeasures like stronger authentication practices, privilege management, tighter network policies and patching can go a long way toward making life harder for attackers, it is impossible to completely eradicate the threat. There will always be "that one application" requiring elevated privileges, and new CVEs will always be discovered (in the Linux kernel, for example).
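Guardrails can also be enforced at admission time, so misconfigured or over-privileged pods never get scheduled in the first place. As a sketch, the built-in Pod Security admission controller (stable since Kubernetes v1.25) is enabled with namespace labels:

```yaml
# Sketch: enforce the "restricted" Pod Security Standard on a
# namespace. Pods that ask for privileged mode or extra
# capabilities are rejected at admission.
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a           # reusing the hypothetical tenant namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
```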
It is therefore very important to begin thinking about Kubernetes and the host OS as a single, integrated system and implement security controls at multiple layers of the stack.
Operating System Confinement to the Rescue
This means the base OS needs a way to block Kubernetes, or any other running application, from accessing files, networks, processes or any other system resource it has not explicitly requested access to.
This practice is known as confinement. While the concept dates back to 1979 with Unix's chroot system call and command, modern operating systems have taken it to new heights, making access requests explicit to the administrator and using security features of the Linux kernel, including AppArmor, seccomp and namespaces, to prevent applications and services from accessing the wider system.
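In Kubernetes terms, these kernel features can be requested directly in the pod specification. A minimal sketch, with illustrative pod and container names:

```yaml
# Sketch: request kernel-level confinement in the pod spec itself.
apiVersion: v1
kind: Pod
metadata:
  name: confined-app
  annotations:
    # AppArmor via the long-standing annotation form; newer Kubernetes
    # releases also expose an appArmorProfile field in securityContext.
    container.apparmor.security.beta.kubernetes.io/app: runtime/default
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault   # filter syscalls with the runtime's profile
  containers:
  - name: app
    image: alpine:3.18       # illustrative image
    command: ["sleep", "infinity"]
```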
Confined host OS environments have emerged in response to this gap in security. While nascent, they are showing that they can offer increased cybersecurity resilience across the stack. When combined with traditional security practices at the physical, network, storage and virtualization layers, they ensure a higher degree of confidence when running application workloads within Kubernetes. Should a workload escape its Kubernetes confinement, the host OS that runs the container runtime has logical boundaries in place to block takeover, access escalation or backdooring from the container itself.
To hear more about cloud native topics, join the Cloud Native Computing Foundation and the cloud native community at KubeCon + CloudNativeCon North America 2022 in Detroit (and virtual) from Oct. 24-28.