Security of Software Update Systems in 2023
The security of software repositories is essential to software security since repositories serve such a large portion of the software in use today. Repositories are a popular source to download software for the first time and the de facto means to retrieve patches and new releases for fixes and enhancements. Given their ubiquity, a primary concern is how a software repository responds in the unfortunately common case of a repository compromise.
Let’s look at a few of the popular security techniques used in technologies that are currently at the height of their popularity, as well as security concerns that are not being adequately addressed and concerns that seem to be overblown.
Three Things Not Being Appropriately Considered:
1. Failure to namespace keys securely. The fundamental problem is that many systems just focus on who is trusted and who is not, rather than focusing on which key should be trusted to do specific actions. This leads to situations where possession of any one of a set of trusted keys confers access to do anything in the system. This is a clear violation of least privilege, because if any trusted user is compromised, loses their key or becomes malicious, it can lead to a massive loss of security for all users.
For example, take a software repository like PyPI that allows a user to upload GPG (GNU Privacy Guard) keys. PyPI tells a user who downloads a package which GPG keys are supposed to be trusted for this package. Note that PyPI can handle situations where a GPG key needs to be replaced by distributing the new key instead. Unfortunately, this means that an attacker who compromises PyPI could tell all users to trust the attacker’s key for a package instead.
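The idea of scoping each key to specific actions can be sketched as follows. This is a minimal illustration, not how PyPI or any particular tool works: the key IDs, the DELEGATIONS table and the is_key_trusted_for helper are all hypothetical. For the scoping to help against a repository compromise, we also assume the table itself is signed by a key that the repository does not hold online.

```python
from fnmatch import fnmatchcase

# Hypothetical delegation table: key ID -> package-name patterns it may sign.
# Assumption: this table is itself signed by an offline key, so a compromised
# repository cannot silently rewrite it.
DELEGATIONS = {
    "key-alice": ["requests", "requests-*"],
    "key-bob": ["numpy"],
}


def is_key_trusted_for(key_id, package, delegations=DELEGATIONS):
    """Return True only if key_id is delegated for this specific package."""
    patterns = delegations.get(key_id, [])
    return any(fnmatchcase(package, pattern) for pattern in patterns)
```

With this table, key-alice is accepted for requests and requests-toolbelt but rejected for numpy, so compromising one developer's key does not endanger every package.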
2. No means to adequately expire old metadata. Just because a package or image was valid at one point does not mean that it is still valid and should be installed later. This situation arises whenever you discover a vulnerability in an existing package and want to make sure that the package is not installed in the future. Unfortunately, it arises often, since vulnerabilities are discovered all too frequently.
It’s critical that the security system recognize that a signature that was valid a few weeks or a month ago does not necessarily remain valid for the signing key’s entire lifetime. There are several mechanisms to accomplish this, including functionality like the snapshot role inside of TUF (The Update Framework) or a form of transparency log, as is used in Rekor. While these features add overhead to the system, failing to securely expire old metadata can result in attacks on your repository compromising your user base due to the continued consumption of packages with known exploitable vulnerabilities.
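One simple form of expiration is a signed expiry timestamp that the client checks in addition to the signature. The sketch below assumes the metadata is a dictionary with an "expires" field holding an ISO 8601 timestamp with an explicit UTC offset; the function name and layout are illustrative, and signature verification is assumed to happen elsewhere.

```python
from datetime import datetime, timezone
from typing import Optional


def is_metadata_current(metadata: dict, now: Optional[datetime] = None) -> bool:
    """Reject metadata whose 'expires' time has passed.

    Assumption: the signature over the metadata was already verified, and
    'expires' is an ISO 8601 string with an explicit offset (e.g. +00:00).
    """
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromisoformat(metadata["expires"])
    return now < expires
```

A client that applies this check refuses a correctly signed but stale file, which forces the repository to re-sign metadata periodically and gives it the chance to drop vulnerable packages.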
3. Should keys be online or offline? It’s tempting to simply trust online services to always behave as they should. Unfortunately, historically, this has been a bad assumption. A system design meant to be resilient to attack needs to resist the concerted threat of sophisticated attackers who can compromise underlying infrastructure including servers and storage to gain access to keys.
Keeping keys offline is not only a valuable strategy to mitigate the threat of access and exfiltration of keys by an external attacker but also helps to minimize the possibility of an insider or a malicious provider misusing keys. If the use of offline keys is rare, then changes can involve procedures like setting a threshold of physical keys, where a number of offline keys are needed for decryption, and any attempt using fewer than that number can be considered out of the norm. In situations with offline keys, additional scrutiny in auditing is easier to apply, which helps to detect insider misuse and mishandling of key material.
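A threshold rule of this kind can be sketched as a check that enough distinct trusted keys approved an operation. In this illustration the operation is signing a message, and HMAC-SHA256 stands in for a real public-key signature scheme purely to keep the example within the standard library; the function names and key IDs are hypothetical.

```python
import hashlib
import hmac


def count_valid_signatures(message: bytes, signatures: dict, trusted_keys: dict) -> int:
    """Count distinct trusted keys that produced a valid signature.

    Assumption: HMAC-SHA256 is only a stand-in for a public-key signature.
    signatures maps key ID -> signature bytes, so each key counts at most once.
    """
    valid = 0
    for key_id, signature in signatures.items():
        secret = trusted_keys.get(key_id)
        if secret is None:
            continue  # a signature from an unknown key does not count
        expected = hmac.new(secret, message, hashlib.sha256).digest()
        if hmac.compare_digest(expected, signature):
            valid += 1
    return valid


def meets_threshold(message: bytes, signatures: dict, trusted_keys: dict, threshold: int) -> bool:
    """Accept the message only if at least `threshold` trusted keys signed it."""
    return count_valid_signatures(message, signatures, trusted_keys) >= threshold
```

With three offline keys and a threshold of two, a single stolen or misused key cannot authorize a change on its own.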
Three Overblown Concerns:
However, not all concerns have been shown to be equally valid. Some items have a substantial amount of hype without the corresponding studies to back them up. Three concerns that seem to be overblown:
1. Fork-* consistency. In theory, an attacker could gain control of a repository and serve different versions of the same files to different participants. For example, to target computers at a specific company, it could serve files of a different version. Note that this goes way beyond having different mirrors serve files in different regions, which is a common technique used for both benign and malicious reasons. A true fork-* consistency attack would enable different users to write files and then the repository would equivocally show different states to different users.
In practice, however, this attack has never been observed, likely because of both the need to keep complex state trees and the need to segregate users permanently and reliably. It requires an attacker to control a repository for a long period of time and, while doing so, to launch a fairly selective attack among groups of users. Those users must not communicate; if they do, the attack can be discovered. As a result of all of these logistical challenges, it does not appear that this attack is a major concern in practice since there are no documented instances of it.
Note that many systems do provide protection against this sort of attack in different scenarios. For example, TUF’s use of the snapshot metadata will enable any client who receives different, discordant metadata from a source to detect this as an attack. So as a result, if one were trying to segment all of the users in a geographical zone and a TUF user moved between zones with their laptop, this would immediately be flagged as an attack.
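The client-side check described here can be sketched as remembering the last accepted snapshot and comparing each new one against it. This is a simplified illustration rather than TUF's actual client workflow: the SnapshotTracker class, its return values and the use of a single digest per version are assumptions made for the example.

```python
class SnapshotTracker:
    """Remembers the snapshot a client last accepted and flags conflicts.

    Hypothetical helper: a real client would also verify signatures and
    expiry before comparing versions.
    """

    def __init__(self):
        self.version = 0
        self.digest = None

    def check(self, version: int, digest: str) -> str:
        if version < self.version:
            return "rollback"  # the source is offering older metadata
        if version == self.version and self.digest is not None and digest != self.digest:
            return "equivocation"  # same version, different contents
        self.version, self.digest = version, digest
        return "ok"
```

A laptop that accepted version 5 with one digest in one zone and is then offered version 5 with a different digest in another zone gets "equivocation" and can refuse the update.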
Another means of addressing this is via a centralized transparency service like Rekor that ensures no deletion and no equivocation in the logs. Since there is a single Rekor log that presents all parties with the same information, unless Rekor itself is compromised, this sort of attack would not merely be evident but would not be possible at all.
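The "no deletion and no equivocation" property can be illustrated with a small append-only hash chain. This is a sketch under simplifying assumptions and not Rekor's design: a plain hash chain stands in for a Merkle tree, the client is assumed to hold the full list of entries, and the function names are hypothetical.

```python
import hashlib


def chain_head(entries):
    """Fold all log entries into a single head hash."""
    head = b"\x00" * 32
    for entry in entries:
        head = hashlib.sha256(head + entry).digest()
    return head


def is_prefix_consistent(old_count, old_head, new_entries):
    """Check that the log a client saw earlier is a prefix of the log it sees now.

    Returns False if any earlier entry was changed or removed.
    """
    return chain_head(new_entries[:old_count]) == old_head
```

Two clients that exchange their entry counts and head hashes can run the same comparison against each other, which is how a log that showed them different histories would be caught.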
2. The value of ephemeral keys. One of the recent hot topics is the use of ephemeral keys inside security systems. This provides a way for a developer to avoid managing a key. While undoubtedly they have good usability properties, one of the other major benefits touted for ephemeral keys is security since an attacker who later compromises a system using them does not necessarily gain access to the private key of the developer using that system.
However, for a lot of reasons, the security benefit is not as straightforward as it might seem. In practice, many key types are protected with passphrases or physical keys. If done correctly, this effectively applies the same protection as the use of ephemeral keys. There are also security and usability concerns with ephemeral keys. One is that it is common when a project’s ownership changes to go through a process where you change or hand over a key to another party.
Ephemeral keys make this a lot more challenging, especially in the case where you wish to hand over a key, since this now involves handing over the sign-in for a valuable account. As a result, systems that attempt to do handover need to have key rotation for ephemeral keys designed very carefully.
A final note is that, while the number of times a key is stolen off a developer machine is not negligible, there are few known incidents in which ephemeral keys would have protected against the theft. It would be nice to see clearer documentation of cases in which ephemeral keys would have provided protection.
3. Homogeneous distributed/decentralized systems as a way to add redundant security checks. There are a variety of different systems that have a series of homogeneous nodes that perform the exact same security verification en masse. These are situations where many, but not all, systems will perform a verification process, then clients will ask some set of those systems about the result. Some examples of this include observers of transparency logs and most decentralized/cryptosystems.
While there is clearly a set of attacks that these systems will detect, these defenses are mostly limited to situations where one of the underlying homogeneous servers is compromised. This raises a few questions. Given that some member of a homogeneous set of parties is compromised, how likely is it that a significant threshold of them is compromised? We surmise that the likelihood is quite high in practice, because they often run the same software stack and thus inherit the same vulnerabilities and implementation quirks. Another question: if one is able to do something malicious and fool a homogeneous observer, how likely is it that other observers will catch the problem? We posit that this is not all that likely, as there are not a substantial number of known cases in practice. Finally, in practice, there are not that many success cases for this model.
While systems that are heterogeneous and appropriately compartmentalize trust into different entities have a long history of success, the same cannot be said for homogeneous verification. Instead, there are quite a few cases where, because there could be such a large set of potential verifiers, verification has not been effectively done. Clients are often unsure about how many verifiers to rely on and how to configure a system to get the right trust and efficiency properties. Yet another concern is what to do when some of the observers disagree.
Note that this does not necessarily mean that an attack is underway. While it could be that some actors are being malicious and falsely representing information, there could also be some corner cases in the data they have received, their way of interpreting it or the time at which the requests were received, which causes a different outcome from different observers. Hence systems with homogeneous distributed observers do not represent a panacea, but instead present a new challenge, which needs to be carefully studied and observed before relying on it.
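The configuration questions above (how many observers to rely on, and what to do when they disagree) can be made concrete with a small quorum check. This is only a sketch: the observer names, the quorum parameter and the summarize_observers function are hypothetical, and a disagreement it reports is a signal to investigate, not proof of an attack.

```python
from collections import Counter


def summarize_observers(reports: dict, quorum: int):
    """Return (agreed_value or None, names of observers to investigate).

    reports maps observer name -> the value (e.g. a log head hash) it saw.
    If no value reaches the quorum, nothing is agreed and every observer
    is listed for follow-up.
    """
    if not reports:
        return None, []
    value, votes = Counter(reports.values()).most_common(1)[0]
    if votes < quorum:
        return None, sorted(reports)
    dissenters = sorted(name for name, seen in reports.items() if seen != value)
    return value, dissenters
```

Note that if the observers share a software stack, they may all report the same wrong value, and a check like this one will not notice.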
We’d like to generate discussion about the security of software update systems, leaving you with three questions: Are there important concerns that we did not mention? What techniques can best address these issues? Is there important evidence for or against our claims that we missed?
Remember to build compromise-resilience into your systems, to apply least privilege and to reduce the impact of a compromise of online services and keys. Stay safe!