Socket Adds ChatGPT to Its Vulnerability Detection Arsenal
It’s too hard for developers to do the right thing, according to Feross Aboukhadijeh, an avid open source maintainer and founder of Socket, a startup focused on the security of open source components. The company has found ChatGPT a good fit for ferreting out the types of vulnerabilities those components contain.
Too often a developer has a Friday deadline to meet and grabs an open source package or library to get the job done faster without fully understanding the provenance of that code or the many dependencies within it.
He said that as a twentysomething random person writing code and putting it on GitHub, he was amazed that large companies were incorporating his libraries into their offerings. He had a friend who was writing code from a boat out in the ocean and publishing it to GitHub from his phone.
“So I started seeing stuff like that, and I realized, wow, honestly, it’s kind of cool. It’s the power of open source, all these different people that don’t even have to know each other, but they can all work together. So it was cool, but it was also kind of worrying,” Aboukhadijeh said during an interview at Open Source Summit North America.
A package might not be maintained anymore. The maintainer might have dropped it and someone else took it on. That someone might have added something malicious.
The recent supply chain attacks led to the Executive Order on Improving the Nation’s Cybersecurity and the call for software bills of materials (SBOMs) so organizations know the “chain of evidence” for all that makes up their applications. Also recently, the Open Source Security Foundation (OpenSSF) released Supply-chain Levels for Software Artifacts (SLSA, pronounced Salsa) to ensure the software remains tamper-proof and can be securely traced to its source.
The Bay Area startup Socket detects more than 70 signals of supply chain risk in open source code. Because it is sometimes hard to tell whether a piece of code is malicious, the tool flags potential problems and those issues still require human review. Depending on how a company wants to configure the tool, some actions can be blocked straightaway.
Aboukhadijeh likens it to the nutrition labels on food packages: It tells what’s inside, then it’s up to the consumer to decide whether it’s a good choice.
He said he believes developers want to do the right thing; they just don’t have time to inspect every line of code in every dependency.
He maintains that about half the npm repository contains projects of questionable nature, including Binky, which purports to provide the “ability to precisely determine the binkiness of objects” and as of January had 11,460 versions in the public npm registry. Then there’s a group of Chinese developers calling itself “ApacheCN” using npm to store thousands of ebooks and magazines, apparently attempting to bypass government censors. Developers need a reliable way to filter out unreliable or suspicious open source code, he said.
Using ChatGPT’s Strengths
To that end, Aboukhadijeh said he’s found ChatGPT is quite adept at flagging suspicious code. Parsing code with static analysis and graph queries has proved challenging, and using humans to do it is wildly expensive and doesn’t scale. And so much changes constantly.
After about four months of comparing ChatGPT with humans, Socket released its ChatGPT-based source code analysis tool for npm and PyPI packages. When a potential issue is found, the tool asks ChatGPT to summarize its findings. Socket is not yet making ChatGPT-flagged issues blocking by default; it wants to gather more feedback first.
“The first thing that is the most impressive is that it actually can synthesize information from different sources and then produce an analysis,” he said, explaining that it’s good at the sort of low-level look at code and determining whether it’s malicious.
The second thing is that it can summarize findings and explain the significance of them in plain language so that non-experts can understand.
One of the more common concerns in a security risk situation is what data can be extracted, he said, and by combining that question with capability analysis, the AI is able to detect and explain when this occurs. For example, mathjs is a popular package with about 500,000 weekly downloads. A copycat, mathjs-min (now reported and removed), was caught by the AI with the following analysis:
“The script contains a discord token grabber function which is a serious security risk. It steals user tokens and sends them to an external server. This is malicious behavior.”
Socket’s ChatGPT tool has some of the same limitations as other AI systems: its limited context window makes extremely large files hard to handle, and it has difficulty with highly obfuscated code. Since such files are suspicious anyway and would require a human to take a look, the company doesn’t consider these huge drawbacks. It’s still working on the limitations posed by cross-file analysis and on mitigating emerging threats like prompt injection, which specifically target AI systems.
To boost its static analysis capabilities, Socket is looking to further integrate large language models into its systems for more complex analysis.
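The article does not say how Socket works around the context window limit. One common workaround, sketched here purely as an illustration, is to split a large source file into pieces that each fit a size budget and analyze them separately. The function name `split_for_context` and the `max_chars` budget are hypothetical, and a character count is only a rough stand-in for a model’s token limit.

```python
def split_for_context(source: str, max_chars: int = 8000) -> list[str]:
    """Split source text into chunks of at most max_chars, breaking on line ends.

    Assumption: max_chars approximates a model's context budget.
    A single line longer than max_chars (typical of minified code) is kept
    whole as its own oversized chunk rather than being cut mid-line.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in source.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the budget.
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Chunking like this trades away context: behavior that only looks suspicious when two distant parts of a file (or two files) are read together can be missed, which is consistent with the cross-file limitation described above.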
An App Between Developer and GitHub
Aboukhadijeh, also a part-time lecturer on web security at Stanford University, launched Socket in 2020 amid the pandemic. The fully remote company has grown to 10 people, largely open source maintainers, and raised a $4.6 million seed round in May 2022.
The company originally focused on typo squat detection but added new features with its 1.0 release of Socket for GitHub in June 2022. Installed in GitHub repositories, the app analyzes open source components in real time, detects changes and flags potential issues in the tool where the developer is working.
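As a rough illustration of what typosquat detection involves, the sketch below compares a package name against a short list of well-known names and reports a lookalike. It is not Socket’s implementation: the package list, the suffix list, the 0.85 similarity threshold and the function names are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical list of well-known names; a real tool would draw on registry data.
POPULAR_PACKAGES = ["express", "lodash", "mathjs", "react"]

# Suffixes a copycat might append (illustrative, not exhaustive).
COMMON_SUFFIXES = ("-min", "-js", ".js")


def strip_suffix(name: str) -> str:
    """Remove one common suffix so 'mathjs-min' compares as 'mathjs'."""
    for suffix in COMMON_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def possible_typosquat(name: str, threshold: float = 0.85) -> Optional[str]:
    """Return the popular package that `name` resembles, or None."""
    name = name.lower()
    if name in POPULAR_PACKAGES:
        return None  # an exact match is the real package
    base = strip_suffix(name)
    for popular in POPULAR_PACKAGES:
        if base == popular:
            return popular
        # Similarity ratio in [0, 1]; the threshold is an assumed cutoff.
        if SequenceMatcher(None, base, popular).ratio() >= threshold:
            return popular
    return None
```

With these assumptions, `possible_typosquat("mathjs-min")` returns `"mathjs"`. A name match is only a signal for review, since plenty of legitimate packages have names close to popular ones.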
It’s also working toward the 1.0 release of Socket CLI, an alternative to using Socket on GitHub, and has its npm wrapper, called “safe npm,” in beta. It protects against 11 issues including typosquats, install scripts, protestware, telemetry and more.
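Install scripts are on that list because npm runs a package’s `preinstall`, `install` and `postinstall` scripts automatically during installation, so code can execute before a developer has looked at it. The minimal sketch below lists those hooks from a `package.json` manifest. The function name `find_install_scripts` and the sample manifest are hypothetical, and this is not how Socket’s tooling is implemented.

```python
import json

# npm lifecycle hooks that run automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_scripts(package_json_text: str) -> dict:
    """Return the install-time scripts declared in a package.json string."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts")
    if not isinstance(scripts, dict):
        return {}  # no scripts section, or a malformed one
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest for illustration only.
sample = json.dumps({
    "name": "example-package",
    "version": "1.0.0",
    "scripts": {"test": "node test.js", "postinstall": "node setup.js"},
})
print(find_install_scripts(sample))
```

Finding an install script does not prove a package is malicious, since many legitimate packages use these hooks to compile native code. It does mark a spot worth a human look.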
It also recently released Project Health Reports. Unlike the real-time alerts, these reports help security teams understand the holistic supply chain risk of repositories within the organization. They provide a full list of dependencies used in a project and their associated risks.
Socket customers include Firma, Vercel, the BBC, and others, including “a large Canadian telecom” he wouldn’t name.
Where Developers Work
Application security is a crowded space with the likes of Snyk, Guardrails and SonarQube.
Aboukhadijeh argues the big security companies focus on the big, known vulnerabilities when there are plenty of smaller issues in even popular software libraries, frameworks and packages that can give bad actors a way into your systems.
“We’re coming at it from a developer’s perspective, which is pretty new,” he said. “Everybody’s everyone else is really like security folks, coming from it from a security background and they don’t really know what developers would encounter,” he explained, about putting the flags into the tools where developers are already working.
“The security team will, like, once a quarter go through and they’ll scan all the developers’ work from the last quarter. And then they’ll send them an email and say we found 25 problems in your code that were committed over the last quarter. Now go fix them all, please. And the developers like, ‘I did that two months ago. Why didn’t you tell me earlier?’ They’ve forgotten about what they even did back then.
“We help them get there earlier … when they’re actually trying to make a decision about what to do, then we can help them not even make the mistake in the first place.”