What You Need to Know About the RunC Container Escape Vulnerability
A serious vulnerability was disclosed in runC, a runtime that underpins Docker and other Linux container engines. If left unpatched, it allows hackers to break out of sandboxes and gain root access on the host servers, compromising the entire infrastructure.
The vulnerability is tracked as CVE-2019-5736 but is also referred to as Runcescape. It was discovered by Polish researchers Adam Iwaniuk and Borys Popławski after they got the idea to investigate namespace-based sandboxes following a capture-the-flag security competition.
“We considered the two following attack vectors: a malicious Docker image [and] a malicious process inside a container (e.g. a compromised Dockerized service running as root),” the researchers said in a blog post Wednesday. “Results: we have achieved full code execution on the host, with all capabilities (i.e. on the administrative ‘root’ access level), triggered by either running ‘docker exec’ from the host, on a compromised Docker container [or by] starting a malicious Docker image.”
However, since runC is the plumbing that ties most container engines to the Linux kernel’s sandboxing features, Docker is not the only container platform affected by the flaw. RunC is also used to spawn and run containers by containerd, Podman, and CRI-O.
According to the runC maintainer Aleksa Sarai, LXC is also impacted by a variation of the same flaw. However, the vulnerability can only be exploited on privileged containers, which the LXC project considers unsafe by default.
The attack involves overwriting a binary inside the container, for example the /bin/bash shell, with a /proc/self/exe symbolic link. The kernel creates this link automatically for every running process and points it at that process's executable, in this case runC. So when /bin/bash gets executed in the container, the runtime actually executes a symbolic link pointing back to itself. The attacker can then try, in a loop, to overwrite the runC binary with a malicious version through a file descriptor. The write fails while runC is running, because a running binary cannot be overwritten, but it eventually succeeds once the runC process exits.
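The behavior of the /proc/self/exe link is easy to observe without touching any container. The following is a minimal, harmless sketch, assuming a Linux host with /proc mounted; the function name `running_binary_path` is made up for illustration and is not part of runC or the researchers' work.

```python
import os


def running_binary_path() -> str:
    """Return the path of the executable backing the current process.

    On Linux, /proc/self/exe is a kernel-provided symbolic link that
    always points at the binary of whichever process resolves it.
    """
    return os.readlink("/proc/self/exe")


if __name__ == "__main__":
    # When run under a Python interpreter, this prints the interpreter's
    # own path, because the interpreter is the process resolving the link.
    print(running_binary_path())
```

The same resolution happens when runC executes a path that leads to /proc/self/exe: the link resolves to the runC binary on the host.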
“To prevent this attack, LXC has been patched to create a temporary copy of the calling binary itself when it starts or attaches to containers,” the LXC maintainers explained. “To do this LXC creates an anonymous, in-memory file using the memfd_create() system call and copies itself into the temporary in-memory file, which is then sealed to prevent further modifications. LXC then executes this sealed, in-memory file instead of the original on-disk binary.”
Aleksa Sarai said that exploitation of this vulnerability is not blocked by the default AppArmor or SELinux policies on some distributions like Fedora. However, it is blocked by the correct use of user namespaces where the host root is not mapped inside the container’s user namespace.
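Whether host root is mapped into a container can be read from the kernel's UID map for a container process. The sketch below parses text in the /proc/<pid>/uid_map format, where each line holds an ID inside the namespace, the corresponding outside ID, and a range length. It assumes the outside IDs refer to the host's initial user namespace, and the helper name `maps_host_root` is hypothetical.

```python
def maps_host_root(uid_map_text: str) -> bool:
    """Return True if any mapping range covers UID 0 on the outside."""
    for line in uid_map_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        _inside_start, outside_start, length = (int(f) for f in fields)
        # UIDs are non-negative, so a range covers outside UID 0
        # only if it starts at 0 and is not empty.
        if outside_start == 0 and length > 0:
            return True
    return False


if __name__ == "__main__":
    # Identity mapping, typical when no user namespace is in use.
    print(maps_host_root("0 0 4294967295"))   # True
    # Container root mapped to an unprivileged host UID range.
    print(maps_host_root("0 100000 65536"))   # False
```

A result of False for a container's init process suggests that root inside the container is an unprivileged user on the host, which is the configuration Sarai describes as blocking the exploit.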
According to Red Hat, attempts to exploit the vulnerability are mitigated by the default SELinux policy in Red Hat Enterprise Linux and Red Hat OpenShift because on those systems the policy is set to targeted enforcing mode and is rarely disabled for containerized environments. Security-Enhanced Linux (SELinux) is a kernel security module that enables access control security policies.
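Administrators who want to confirm that SELinux is actually enforcing on a host can read the kernel's status file. This is a rough sketch that assumes the standard selinuxfs mount point at /sys/fs/selinux; the function name `selinux_mode` is made up for illustration, and the check does not say which policy (for example, targeted) is loaded.

```python
from pathlib import Path


def selinux_mode(enforce_path: str = "/sys/fs/selinux/enforce") -> str:
    """Report SELinux status as 'enforcing', 'permissive', or 'disabled'."""
    try:
        value = Path(enforce_path).read_text().strip()
    except FileNotFoundError:
        # No selinuxfs entry: SELinux is disabled or not available.
        return "disabled"
    return "enforcing" if value == "1" else "permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

Only the enforcing result corresponds to the mitigated configuration Red Hat describes. A permissive system logs policy violations but does not block them.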
The vulnerability has been rated 7.2 out of 10 in the Common Vulnerability Scoring System. This indicates high rather than critical severity because it cannot be exploited remotely, but as far as sandboxes and containers go, it’s as bad as it gets. That’s because it breaks the fundamental security isolation layer that containers are meant to provide and allows an attacker who gains access to a single privileged container to compromise all other services running on the underlying server.
The Shodan search engine shows around 4,000 Docker deployments exposed to the Internet, the majority of them being hosted on Amazon’s cloud computing infrastructure. Last year, a study by security firm Lacework found over 22,000 publicly exposed container orchestration and API management systems, about 300 of which could be accessed without any credentials.
“This isn’t the first major flaw in a container runtime to come to light and, as container deployments and interest in associated technologies increase, it’s unlikely to be the last,” said Scott McCarty, principal product manager for containers at Red Hat. “Just as Spectre/Meltdown last year represented a shift in security research to processor architectures from software architectures, we should expect that low-level container runtimes like runc and container engines like docker will now experience additional scrutiny from researchers and potentially malicious actors as well.”
Proof-of-concept exploit code for the vulnerability is expected to be released next week, but hackers probably have enough information to start launching attacks even before then. Organizations should review their container deployments and apply the necessary patches as soon as possible.
Amazon has released updated Docker packages and Amazon Linux AMI images for a variety of services including Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Container Service for Kubernetes (EKS), AWS Fargate, AWS IoT Greengrass, AWS Batch, AWS Elastic Beanstalk, AWS Cloud9, AWS SageMaker, AWS RoboMaker and AWS Deep Learning.
“This is the first major container vulnerability we have seen in a while and it further enforces the need for visibility of your hosts and containers both in the cloud and traditional data centers using docker and other containers,” said Dan Hubbard, chief product officer at Lacework, via email. “Security here starts with deep visibility into who is installing containers and what are their behaviors and, of course, timely patching.”
Red Hat is a sponsor of The New Stack.