Can C++ Be Saved? Bjarne Stroustrup on Ensuring Memory Safety
There’s turmoil in the C++ community. In mid-January, the official C++ “direction group” — which makes recommendations for the programming language’s evolution — issued a statement addressing concerns about C++ safety. While many languages now support “basic type safety” — that is, ensuring that variables access only sections of memory that are clearly defined by their data types — C++ has struggled to offer similar guarantees.
This new statement, co-authored by C++ creator Bjarne Stroustrup, now appears to call for changing the C++ programming language itself to address safety concerns. “We now support the idea that the changes for safety need to be not just in tooling, but visible in the language/compiler, and library.”
The group still also supports its long-preferred use of debugging tools to ensure safety (and “pushing tooling to enable more global analysis in identifying hard for humans to identify safety concerns”). But that January statement emphasizes its recommendation for changes within C++.
Specifically, it proposes “packaging several features into profiles” (with profiles defined later as “a collection of restrictions and requirements that defines a property to be enforced” by, for example, triggering an automatic analysis). In this way, the new changes for safety “should be visible such that the Safe code section can be named (possibly using profiles), and can mix with normal code.”
And this new approach would ultimately bring not just safety but also flexibility, with profiles specifically designed to support embedded computing, performance-sensitive applications, or highly specific problem domains, like automotive, aerospace, avionics, nuclear, or medical applications.
“For example, we might even have safety profiles for safe-embedded, safe-automotive, safe-medical, performance-games, performance-HPC, and EU-government-regulation,” the group suggests.
Elsewhere in the document they put it more succinctly. “To support more than one notion of ‘safety’, we need to be able to name them.”
The proposed changes echo arguments that emerged from a kind of showdown with the federal government in December. The mid-January statement notes concerns raised about the safety of C++ by a particularly heavy-hitting organization: the U.S. Department of Commerce’s influential National Institute of Standards and Technology. And in November, America’s National Security Agency had also called out C++ in an information sheet on software memory safety (as part of its mission to identify threats to various federal systems and “issue cybersecurity specifications and mitigations”).
Maybe it was that high-level concern that ultimately planted the seeds of change…
A National Security Issue
The NSA had cited estimates from Microsoft and Google that, over several years, roughly 70% of their software vulnerabilities stemmed from memory safety issues. It followed this with a warning that these simple programmer mistakes can allow attackers to access sensitive information or even execute unauthorized code that leads to large-scale network intrusions. Whether the cause is a buffer overflow, a memory allocation bug, a race condition, or an uninitialized variable, the agency wrote, “all of these memory issues are much too common occurrences.”
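To make one of these error classes concrete, here is a minimal C++ sketch of an out-of-bounds read and a checked alternative. It is an illustration only, not code from the NSA's information sheet; the helper name `read_checked` and its fallback parameter are hypothetical. It relies on the standard behavior that `std::vector::at` throws `std::out_of_range` on a bad index, whereas `operator[]` with the same index is undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the NSA document): returns the element at
// `index`, or `fallback` when the index is outside the vector.
int read_checked(const std::vector<int>& values, std::size_t index, int fallback) {
    try {
        // at() performs a run-time bounds check and throws on a bad index.
        // values[index] would skip the check; an out-of-range index there is
        // undefined behavior and may silently read unrelated memory.
        return values.at(index);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

Called on a three-element vector, `read_checked(v, 1, -1)` returns the second element, while `read_checked(v, 3, -1)` returns the fallback instead of reading past the end of the buffer.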
Yes, software analysis tools and “operating environment options” can spot many of the issues, but the NSA had still recommended simply using a memory-safe language instead “when possible.”
The NSA defined this as a language in which, through a combination of compile-time and run-time checks, memory “is managed automatically as part of the computer language; it does not rely on the programmer adding code to implement memory protections.” As examples, the NSA listed C#, Go, Java, Ruby, Rust, and Swift.
Responding in December on the Open Standards website, Stroustrup had countered that he doesn’t consider those languages superior to C++ “for the range of uses I care about.”
Stroustrup also objected that the NSA’s discussion of safety “is limited to memory safety, leaving out on the order of a dozen other ways that a language could (and will) be used to violate some form of safety and security… There is not just one definition of ‘safety’, and we can achieve a variety of kinds of safety through a combination of programming styles, support libraries, and enforcement through static analysis.”
Along the way, Stroustrup also made a second argument: that in some real-world scenarios where performance is paramount, “Not everyone prioritizes ‘safety’ above all else.” So Stroustrup argued that the “sensible” thing to do is to make a list of safety issues (including undefined behavior), then find ways to prevent them as needed using tools that examine code before it runs (like static analyzers).
Patricia Aas (@pati_gallardo), in a tweet on January 22, 2023: “The fact that Bjarne & co seem to think that you can fix ‘safety’ in C++ with static analysis shows how completely out of touch they are with security research.” https://t.co/HI3k6hxfun
Along those lines, Stroustrup had already been calling for both compiler options and code annotations for C++ that request type safety (and resource safety), saying this “lets you apply the safety guarantees only where required and use your favorite tuning techniques where needed….”
The newly-proposed “profiles” seem like an in-language way of accomplishing just that.
Safety in C++
Stroustrup also objected to C++ being lumped in with C in the NSA’s document. He pointed out that even now “The C++ Core Guidelines specifically aims at delivering statically guaranteed type-safe and resource-safe C++ for people who need that without disrupting code bases that can manage without such strong guarantees or introducing additional toolchains.”
And those Core Guidelines are already supported by Microsoft’s Visual Studio analyzer (and its memory-safety profile), as well as many static analyzers. (Stroustrup also cites the linter clang-tidy, which he says has some support for the C++ Core Guidelines.) This approach allows C++ “to completely deliver those guarantees at a fraction of the cost of a change to a variety of novel ‘safe’ languages,” Stroustrup argued.
Stroustrup also cited another paper he wrote in 2021 which made the case that “Complete type-and-resource safety have been an ideal (aim) of C++ from very early on (1979) and is achievable though a judicious programming technique enforced by language rules and static analysis.” (Later Stroustrup writes that the solution is “a carefully crafted set of programming rules supported by library facilities and enforced by static analysis.”)
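As a rough illustration of what “programming rules supported by library facilities” can look like in practice, the sketch below uses `std::unique_ptr` so that a resource is released on every exit path. This is not code from Stroustrup's paper; the `Connection` type, its `live` counter, and `use_connection` are hypothetical names introduced only to make the cleanup observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource: counts live instances so cleanup can be observed.
struct Connection {
    static int live;
    Connection() { ++live; }
    ~Connection() { --live; }
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
};
int Connection::live = 0;

// Returns -1 on the early-exit path, otherwise the number of connections
// alive while the function is running.
int use_connection(bool fail_early) {
    // Ownership lives in the unique_ptr, so there is no manual delete to
    // forget: the destructor runs whenever this scope is left.
    auto conn = std::make_unique<Connection>();
    if (fail_early) {
        return -1;  // early return still destroys the Connection
    }
    return Connection::live;
}
```

After either call, `Connection::live` is back to zero. A rule such as “no owning raw pointers” is the kind of convention a static analyzer can check mechanically, while the library type does the actual cleanup.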
The paper acknowledged that “By default, the Core Guidelines do not provide complete type-and-resource safety,” but argued that such safety can be guaranteed by enforcing additional rules (“as implemented by the Core Guidelines checker distributed with Microsoft Visual Studio,” for example). In a nod to Rust’s compiler-based checking, Stroustrup wrote that “The compiler is not our only tool, and has never been,” providing specific examples of the powerful checks that static analysis can perform before a program ever runs. For example, static analysis can:
- Prevent unsafe type conversions
- Prevent the creation of uninitialized objects
- Ensure that no pointer “escapes” the scope of the object it points to, where it would be left dangling or pointing at something else.
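The three kinds of checks listed above can be pictured with a small C++ sketch showing, for each one, a style of code that such rules would accept. These examples are assumptions made for illustration, not taken from Stroustrup's paper; `to_short_checked`, `Reading`, and `make_label` are hypothetical names, and the narrowing helper is hand-written rather than a real library call.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// 1. Type conversions: instead of a silent narrowing conversion, verify that
//    the value survives the round trip and fail loudly if it does not.
short to_short_checked(int value) {
    const short narrowed = static_cast<short>(value);
    if (static_cast<int>(narrowed) != value) {
        throw std::range_error("value does not fit in short");
    }
    return narrowed;
}

// 2. Initialization: default member initializers mean a Reading can never be
//    created with indeterminate fields.
struct Reading {
    int sensor_id = 0;
    double value = 0.0;
};

// 3. Lifetimes: return by value rather than returning a pointer or reference
//    to a local, which would dangle once the function returns.
std::string make_label(int sensor_id) {
    std::string label = "sensor-" + std::to_string(sensor_id);
    return label;  // the caller receives its own object
}
```

A checker enforcing rules of this kind would flag the unsafe counterparts: an implicit `int` to `short` assignment, a `Reading` with uninitialized members, or a function returning `&label`.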
In December’s response to the NSA, Stroustrup wrote that we live in a world where “the billions of lines of C++ code will not magically disappear,” adding that what matters instead is a gradual adoption of these safety rules (and of different safety rules, where appropriate).
The NSA’s paper seemed to agree with some of this — to a point. The NSA paper included tips on “hardening” code written in a non-memory-safe language, recommending tools for both static analysis (examining the source code) and dynamic analysis (performed while the code is executing) — along with vulnerability correlation tools to simplify the results. “Working through the issues identified by the tools can take considerable work, but will result in more robust and secure code.”
And the NSA’s paper does note the “considerable protection” provided by “the use of added protections to non-memory safe languages.” (It also suggests hardening the compilation and execution environment through security features like Control Flow Guard, Address Space Layout Randomization, and Data Execution Prevention.)
A Long-Standing Design Goal
In a new interview for Honeypot’s “Untold Developer Stories”, 72-year-old Stroustrup looked back to his student days, when as a young man he’d discovered that he wasn’t as good at math as he thought he was — but that “machine architecture was really fun.”
But there was less to say in 2020 when someone asked Stroustrup what he’d change if he could go back in time. “That’s a time machine question, and we don’t have a time machine,” he replied.
“One of the interesting aspects of programming language design is that if you succeed, you have what you did many many years and decades ago, and you have to live with it. Once you get users, you have responsibilities, and one of the responsibilities is not to break their code… There’s a few hundred billion lines of C++ out there, and we can’t break them.”
Stroustrup stressed his faith in C++. “I think C++ can do anything Rust can do, and I would like it to be much simpler to use.” But he also said in that 2020 interview that basic type safety — ensuring variables access only their clearly-delineated chunks of memory — was one of his earliest design goals, and one he’s spent decades trying to achieve. “I get a little bit sad when I hear people talk about C++ as if they were back in the 1980s, the 1990s, which a lot of people do,” Stroustrup said in 2020.
“They looked at it back in the dark ages, and they haven’t looked since.”