To keep business and customer data safe, cloud workloads rely on security features such as authentication and encryption. These features require cryptographic keys and passwords that must be kept secret. Secure software development best practices call for keeping such materials out of software distribution and deployment images (containers, VMs, configuration files, and so on). However, a 2019 NDSS study (PDF) shows that even these basic guidelines are not always followed.
While many tools, as well as verification and compliance processes, are being developed to help organizations catch trivial human errors, software architects and developers must build their products so that such deployment mistakes are impossible in the first place. Their solutions must be secure by design.
This, however, is easier said than done. On one hand, we have to deal with a constantly growing number of software vulnerabilities (some of them not publicly known, a.k.a. zero-days); on the other hand, attack techniques and methodologies are becoming ever more sophisticated and better organized. Cryptographic secrets and passwords are a sweet spot for the majority of cyberattacks that aim to obtain valuable information.
This article examines the security aspects of distributing keys and secret material in cloud native environments, where software is split into hundreds or even thousands of microservices and workloads running sporadically across huge computation clusters, such as cloud VPCs, Kubernetes and Spark. To use infrastructure efficiently, it is increasingly common to spread these clusters over multiple infrastructures, such as multiple cloud vendors and public and private data centers. Contemporary architectures therefore should not rely on the capabilities of any specific infrastructure; and even when they do, those capabilities rarely provide an optimal answer to the distribution and, most importantly, the protection of cryptographic secrets.
Assuming best practices were followed and a workload starts up on a previously unknown computation node, it now requires one or more cryptographic keys in order to perform its task. While not “in use,” the keys are kept in some type of Key Management System (KMS), usually provided by a cloud vendor or an independent third party. Workloads communicate with a KMS over IP interfaces and may use the standard KMIP protocol or proprietary interfaces. In this article, I won’t question the security of the KMS implementation, but rather focus on the security aspects of the key distribution process.
Who Is Asking for a Key?
A workload must issue a request to the KMS in order to obtain its key(s). However, many workloads may be communicating with the KMS at any given time, so the KMS must be able to identify each workload in order to decide which key to provide. There are several techniques for this: key tokens, key IDs, or IAM tokens, for example.
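To make the pattern concrete, here is a minimal sketch of a token-based KMS lookup. The `KeyManagementService` class and its methods are hypothetical, not any vendor’s API; the point is that the token identifies the *key*, while nothing in the lookup establishes *who* is asking.

```python
import secrets

# Hypothetical in-memory KMS: maps key-ID tokens to key material.
class KeyManagementService:
    def __init__(self):
        self._keys = {}

    def create_key(self) -> str:
        """Generate a 256-bit key; return the token used to request it later."""
        token = secrets.token_hex(16)
        self._keys[token] = secrets.token_bytes(32)
        return token

    def get_key(self, token: str) -> bytes:
        # Anyone presenting a valid token receives the key: the token
        # identifies the key, it does not authenticate the caller.
        return self._keys[token]

kms = KeyManagementService()
token = kms.create_key()   # in practice, baked into the deployment
key = kms.get_key(token)   # any holder of the token gets the same key
```

Note that a copied token is indistinguishable from the original, which is exactly the weakness discussed below.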
Some of these techniques are infrastructure specific, others are neutral, but they all share a common conceptual problem: the KMS uses request tokens to identify the key, rather than to authenticate the requester. These tokens can be seen as a key to a safe in which the actual key is kept. Best practices require us to avoid shipping keys with the software, because we know how easy it is to steal them.
So how is a “key to the key” any better? Well, it is better than shipping the actual key, but not by much. It would require an attacker to make an extra round trip to the KMS, and perhaps exploit some software vulnerability in order to attack from within the cluster when the KMS is not externally accessible. But this hasn’t stopped attackers before. The conceptual problem remains: the KMS cannot authenticate the workload requesting a key.
Do Not Confuse Identification and Authentication
Key identifiers or tokens are perfectly sufficient to identify which key is requested. However, since they can be copied, the KMS has no means to make sure a request comes from a legitimate workload. Normally, such a problem would be solved by authentication; but authentication must be based on something that can’t be stolen or copied, which brings us back to secret materials. Since workloads start without any secret materials, reliable authentication cannot be performed, enabling attackers to pose as legitimate workloads and obtain keys right from the KMS.
Because practically all KMS products are well known, attackers know exactly what the tokens look like, where they are kept, and therefore how to operate. We call this the “chicken and egg” problem of key distribution: a workload needs to obtain secret materials from a KMS in order to operate, yet a “perfect” KMS would securely authenticate the requesting workload, which requires that workload to hold an authentication secret in the first place.
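The circularity is easy to see in a standard challenge-response scheme. The sketch below (a generic HMAC challenge-response, not a specific KMS protocol) authenticates the workload perfectly well, but only because both sides already share `auth_secret`; the workload needs a secret before it can fetch its secrets.

```python
import hashlib
import hmac
import secrets

# The shared secret both sides must already hold -- this pre-provisioning
# step is exactly the "chicken and egg" problem.
auth_secret = secrets.token_bytes(32)

def kms_challenge() -> bytes:
    """KMS issues a fresh random nonce for each key request."""
    return secrets.token_bytes(16)

def workload_response(challenge: bytes) -> bytes:
    """Workload proves possession of auth_secret via an HMAC over the nonce."""
    return hmac.new(auth_secret, challenge, hashlib.sha256).digest()

def kms_verify(challenge: bytes, response: bytes) -> bool:
    expected = hmac.new(auth_secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

nonce = kms_challenge()
assert kms_verify(nonce, workload_response(nonce))
```

An attacker without `auth_secret` cannot answer the challenge; but a freshly started workload without it cannot either, which is the deadlock the next section addresses.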
How to Solve This Deadlock?
Since secret materials cannot be used to authenticate the requesting workload, we need another authentication method that allows a KMS to make sure it is communicating with a legitimate requester. By analogy with human biometric authentication, the workload’s own software can serve as its “DNA.” A cryptographic signature of the workload software requesting a key provides the missing authentication factor, ensuring that the KMS communicates with a legitimate client. If the KMS could remotely verify such a cryptographic signature at runtime, as part of the key distribution protocol, it would prevent malicious software from obtaining cryptographic keys that belong to other workloads.
This method still requires additional measures to prevent cloning of legitimate software. Also, once delivered, the keys must be protected while they are “in use.”