More than half of open-source projects contain code written in a memory-unsafe language, a report from the U.S. Cybersecurity and Infrastructure Security Agency (CISA) has found. Memory-unsafe means the code allows operations that can corrupt memory, leading to vulnerabilities like buffer overflows, use-after-free bugs and memory leaks.
The report’s results, published jointly with the FBI, the Australian Signals Directorate’s Australian Cyber Security Centre, and the Canadian Centre for Cyber Security, are based on analysis of 172 critical projects defined by the OpenSSF’s Securing Critical Projects working group.
Out of the total lines of code in these projects, 55% were written in a memory-unsafe language, with larger projects containing proportionally more. In each of the 10 largest projects in the data set, memory-unsafe lines make up more than a quarter of the code; the median proportion among those 10 is 62.5%, and four of them are more than 94% memory-unsafe code.
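At its core, this kind of measurement comes down to classifying each file by language and totalling lines. A minimal sketch of that idea in Rust, assuming a hypothetical extension list (the report’s actual methodology and language classification are more involved):

```rust
// Hypothetical mapping: which file extensions count as memory-unsafe.
// The report's real classification is more nuanced than this list.
fn is_memory_unsafe(extension: &str) -> bool {
    matches!(extension, "c" | "h" | "cc" | "cpp" | "hpp" | "asm" | "s")
}

// Given (extension, line_count) pairs for a project, return the
// percentage of lines written in memory-unsafe languages.
fn memory_unsafe_share(files: &[(&str, u64)]) -> f64 {
    let mut unsafe_lines = 0u64;
    let mut total_lines = 0u64;
    for &(ext, lines) in files {
        total_lines += lines;
        if is_memory_unsafe(ext) {
            unsafe_lines += lines;
        }
    }
    if total_lines == 0 {
        return 0.0;
    }
    100.0 * unsafe_lines as f64 / total_lines as f64
}

fn main() {
    // Toy project: 700 lines of C, 300 lines of Python.
    let files = [("c", 700), ("py", 300)];
    println!("{:.1}% memory-unsafe", memory_unsafe_share(&files)); // prints 70.0% memory-unsafe
}
```

Note that a line count like this says nothing about which lines are reachable or exploitable; it is a proxy for exposure, which is how the report uses it.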
What are memory-unsafe languages?
Memory-unsafe languages, like C and C++, require developers to manually implement rigorous memory management practices, including careful allocation and deallocation of memory. Naturally, mistakes will be made, and these result in vulnerabilities that can allow adversaries to take control of software, systems and data.
On the other hand, memory-safe languages, like Python, Java, C# and Rust, automatically handle memory management through built-in features, shifting the responsibility to the interpreter or compiler.
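The contrast with manual management is visible in a few lines. In Rust, for example, heap memory is released automatically when its owner goes out of scope, and the compiler rejects code that would touch memory after ownership has moved (a minimal illustrative sketch; the function name is hypothetical):

```rust
// Hypothetical helper: returns an owned String. The language, not the
// developer, decides when the underlying heap buffer is freed.
fn make_label(name: &str) -> String {
    format!("project: {}", name) // heap allocation handled automatically
}

fn main() {
    let label = make_label("curl");
    println!("{}", label); // prints project: curl

    // No explicit free: `label` is dropped automatically at end of scope.
    // The compiler also rejects use-after-move at compile time:
    let owner = label;          // ownership transfers to `owner`
    // println!("{}", label);   // compile error: borrow of moved value
    println!("{}", owner);
}
```

In C or C++, the equivalent mistake (reading a buffer after `free()` or a double `free()`) compiles without complaint and only fails, if at all, at runtime.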
SEE: The 10 Best Python Courses Worth Taking in 2024
The report’s authors wrote: “Memory safety vulnerabilities are among the most prevalent classes of software vulnerability and generate substantial costs for both software manufacturers and consumers related to patching, incident response, and other efforts.”
They also analysed the software dependencies of three projects written in memory-safe languages, and found that each of them depended on other components written in memory-unsafe languages.
“Hence, we determine that most critical open source projects analysed, even those written in memory-safe languages, potentially contain memory safety vulnerabilities,” wrote the authors.
Chris Hughes, the chief security advisor at open source security company Endor Labs and cyber innovation fellow at CISA, told TechRepublic: “The findings certainly pose a risk to both commercial organisations and government agencies because of the prevalent exploitation of this class of vulnerabilities when we look at annual exploitation across classes of vulnerabilities. They are often among the most commonly exploited class of vulnerabilities year-over-year.”
Why is memory-unsafe code so prevalent?
Memory-unsafe code is prevalent because it gives developers the ability to directly manipulate hardware and memory. This is useful in instances where performance and resource constraints are critical factors, like in operating system kernels and drivers, cryptography and networking for embedded applications. The report’s authors observed this and expect it to continue.
Developers might use memory-unsafe languages directly because they are unaware of or unbothered by the risks. They can also intentionally disable the memory-safe features of a memory-safe language.
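In Rust, for instance, that opt-out is explicit: an `unsafe` block lets a developer dereference raw pointers, which the compiler would otherwise forbid (an illustrative sketch with a hypothetical helper, not tied to any project in the report):

```rust
// Hypothetical helper that reads through a raw pointer. Raw pointers
// carry no lifetime information, so nothing stops a caller passing a
// dangling one -- the burden of proof is back on the developer.
unsafe fn read_raw(ptr: *const u32) -> u32 {
    unsafe { *ptr } // the dereference safe Rust refuses to allow
}

fn main() {
    let value: u32 = 42;
    let read = unsafe { read_raw(&value) }; // fine: `value` is still live
    println!("{}", read); // prints 42
    // If `value` had already been freed, this would still compile,
    // and would be a use-after-free at runtime.
}
```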
However, those aware of the risks and who do not wish to incorporate memory-unsafe code might do so unintentionally through a dependency on an external project. Performing a comprehensive dependency analysis is challenging for a number of reasons, making it easy for memory-unsafe dependencies to slip through the cracks.
For one, languages often have multiple mechanisms to specify or create dependencies, complicating the identification process. Furthermore, doing so is computationally expensive, as sophisticated algorithms are required to track all the potential interactions and side effects.
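As a toy illustration of what even a crude dependency scan looks like, the sketch below checks Cargo-lock-style text against a hypothetical list of crates known to wrap C libraries (the crate names, the list, and the line-oriented parsing are all assumptions; real analysis must also handle transitive dependencies, build scripts and conditional features):

```rust
// Hypothetical deny-list of crates that wrap memory-unsafe C code.
const WRAPS_C: &[&str] = &["openssl-sys", "libz-sys", "curl-sys"];

// Naive scan: pull `name = "..."` entries out of lock-file text and
// flag any that appear on the list. Real tools go much further.
fn flag_unsafe_deps(lockfile: &str) -> Vec<String> {
    let mut flagged = Vec::new();
    for line in lockfile.lines() {
        if let Some(rest) = line.trim().strip_prefix("name = \"") {
            if let Some(name) = rest.strip_suffix('"') {
                if WRAPS_C.contains(&name) {
                    flagged.push(name.to_string());
                }
            }
        }
    }
    flagged
}

fn main() {
    let sample = r#"
[[package]]
name = "serde"

[[package]]
name = "openssl-sys"
"#;
    println!("{:?}", flag_unsafe_deps(sample)); // prints ["openssl-sys"]
}
```

Even this naive pass catches the direct case the report describes: a memory-safe project pulling in a memory-unsafe component one level down.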
“Somewhere underneath every programming language stack and dependency graph, memory-unsafe code is written and executed,” the authors wrote.
SEE: Aqua Security Study Finds 1,400% Increase in Memory Attacks
Hughes told TechRepublic: “Often, these (memory-unsafe) languages have been widely adopted and used for years before much of the recent activity to try and encourage the transition to memory safe languages. Additionally, there is a need for the broader development community to transition to more modern memory safe languages.
“It would be difficult to change many of these projects to memory safe languages because it would require resources and efforts from the maintainers, to refactor/rewrite to memory safe languages. The maintainers may not have expertise in the memory safe language and even if they do, they may not be incentivized to do so, given they are largely unpaid volunteers not being compensated for the projects they’ve created and maintained.”
He added that organisations should offer monetary incentives and other resources to encourage open-source developers to transition their code, but also need to monitor any efforts to ensure that secure coding practices are implemented.
Recommendations to reduce risks of memory-unsafe code
The report refers to CISA’s The Case for Memory Safe Roadmaps document and the Technical Advisory Council’s report on memory safety for recommendations on how to reduce the prevalence of memory-unsafe languages. These recommendations include:
- Transition existing projects to memory-safe languages, as recent advancements mean they can now rival the performance of memory-unsafe languages.
- Write new projects in memory-safe languages.
- Create memory-safe roadmaps that include clear plans for integrating memory-safe programming into systems and addressing memory safety in external dependencies.
- Manage external dependencies by ensuring third-party libraries and components are also memory-safe or have mitigations in place.
- Train developers in memory-safe languages.
- Prioritise security in software design from the beginning of the software lifecycle, such as by adhering to Secure by Design principles.
Efforts from officials to reduce prevalence of memory-unsafe code
Federal officials and researchers in the U.S. have been working to reduce the amount of memory-unsafe software in circulation in recent years.
An October 2022 report from Consumer Reports noted that “roughly 60 to 70 percent of browser and kernel vulnerabilities — and security bugs found in C/C++ code bases — are due to memory unsafety.” Then, the National Security Agency released guidance for how software developers could protect against memory-safety issues.
In 2023, CISA Director Jen Easterly called on universities to educate students on memory safety and secure coding practices. The 2023 National Cybersecurity Strategy and its implementation plan were then published, which discussed investing in memory-safe languages and collaborating with the open source community to champion them further. That December, CISA published The Case for Memory Safe Roadmaps and the Technical Advisory Council’s report on memory safety.
In February this year, the White House published a report promoting the use of memory-safe languages and the development of software safety standards, which was backed by major technology companies including SAP and Hewlett Packard Enterprise.
The U.S. government’s efforts are being supported by a number of third-party groups that share their aim of reducing the prevalence of memory-unsafe code. The OpenSSF Best Practices Working Group has a dedicated Memory-Safety Special Interest subgroup, while the Internet Security Research Group’s Prossimo project wants to “move the Internet’s security-sensitive software infrastructure to memory safe code.” Google has developed the OSS-Fuzz service that continuously tests open-source software for memory-safety vulnerabilities and other bugs using automated fuzzing techniques.