
AI-Assisted Dependency Updates without Breaking Things

Startup Infield mixes continuous monitoring with aggregated upgrade data to provide a step-by-step path to safe open source component management.
Jan 22nd, 2024 10:30am

Everyone wants to keep software up to date, but there’s fear that changing one thing will break something else, so those projects pile up. Then there’s this overwhelming backlog and it only gets worse.

That’s a problem that New York-based Infield is tackling with AI and data.

“We want to help software engineering teams keep all of their open source dependencies up to date, and we’re doing that by providing them all the information they need to avoid breaking production when they upgrade, because the No. 1 reason why developers let all these upgrades linger is that they’re scared that something is going to go wrong … I’m gonna break production by doing this upgrade … But if I just leave it alone, it’s not going to break,” said co-founder and CEO Steve Pike.

Infield’s solution involves continuous monitoring for recommended updates of open source components and a tool that provides a step-by-step guide for getting to the ideal status, which might involve updating various subcomponents in a particular order to avoid problems.

Complex Network of Interconnections

The average software application incorporates more than 500 open source components, according to application security vendor Synopsys. While larger companies might have a team dedicated to keeping components updated, most do not, and maintaining software takes engineers away from work on the core product and new features.

Foundation Capital, an investor in Infield’s recently announced $3 million seed round, put it this way:

“OS software comprises 70-90% of any software solution today and each component requires regular updating for security, performance, and reliability. Yet 85% of codebases contain components that are more than four years out of date. Moreover, many dependencies rely on additional packages, resulting in transitive or chained dependencies. Updating one dependency can sometimes break the whole chain if not managed carefully. The technical term for this complex network of interconnections is ‘dependency hell.’”
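To make the idea of transitive, or chained, dependencies concrete, here is a minimal Python sketch. The package names and the `direct_deps` mapping are invented for illustration and are not taken from Infield or Foundation Capital; the function simply walks the graph to collect everything a package pulls in, directly or indirectly.

```python
def transitive_dependencies(package, direct_deps):
    """Collect every package reachable from `package`, directly or indirectly."""
    seen = set()
    stack = list(direct_deps.get(package, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            # Follow the chain: the dependency's own dependencies count too.
            stack.extend(direct_deps.get(dep, []))
    return seen


# Hypothetical project: names are made up for illustration only.
direct_deps = {
    "my-app": ["web-framework", "http-client"],
    "web-framework": ["template-engine", "json-parser"],
    "http-client": ["json-parser"],
    "template-engine": [],
    "json-parser": [],
}

print(sorted(transitive_dependencies("my-app", direct_deps)))
```

In this toy graph, the app declares only two dependencies but ends up relying on four packages, which is why a change to one shared package can ripple through the chain.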

Infield approaches it as a data problem.

“We know that these upgrades have been done before, but when people go to do them, they really are doing them for the first time,” Steve Pike said, adding that companies, in isolation, are reinventing the wheel.

“They’re a little universe, and then they never think about them again. And so what we’re doing is gathering up all of the unstructured information that’s out there about open source dependencies and their upgrades.”

That information could be a changelog, notes from a maintainer about what’s changing in the code itself, or other community-generated content on GitHub or elsewhere on the internet, as well as data from the experiences of Infield users. Have they often rolled the upgrade back? Does doing the upgrade safely require changes to code? Infield also maintains its own database of undocumented incompatibilities.

“So all of this data about these upgrades, we’re storing, structuring, and then proactively surfacing to you when [you set out to upgrade],” Steve Pike said.

Taking a Data-Centric View

Co-founders Steve and Allison Pike met at SevenFifty, a tech startup that integrates data from more than 1,000 alcoholic beverage wholesalers to produce a dataset of products and prices for restaurants. Steve wrote the original code and became CTO while Allison was COO, selling a data product and working with data engineers. From there, they created Syndetic, which they described as a “Shopify for datasets,” and went through Y Combinator. But COVID-19 intervened with lockdowns.

As Allison Pike tells it, “We never got to have our demo day that most YC companies get to have. We were stuck out in Mountain View.”

But the husband-and-wife team raised enough money to keep Syndetic going, and Steve took on consulting work, mainly keeping software up to date for companies with technical debt. The idea of taking a data-centric view of open source dependency maintenance grew out of that experience.

Their third co-founder, Andrew Lenehan, was previously a product manager at AppNexus, now part of Microsoft, and co-founded Roster, which became Punchcard, a data exploration tool for revenue teams.

Allison Pike said using data to improve upgrade management only makes sense.

“It’s an interesting use of AI because a lot of what’s been written about AI recently, or really since ChatGPT came out, has been about applying AI to code itself. Changelogs are really interesting in that they are a document that’s written by a human, the maintainer who’s maintaining the open source project, but it is code. So it’s in the GitHub repo, but it’s a text document that was written by a person. And so you can go into more detail.”

All that information goes into a large language model (LLM) used to define the optimal strategy, which the company calls Upgrade Path, for getting from Point A to Point B in an update. That can save months of time in a large project like keeping everything in Ruby on Rails up to date, according to the company.

First, you connect the Infield web app to your codebase in GitHub. It scans your code to determine the underlying dependencies, and then recommends the steps needed to upgrade safely for your codebase. That might involve updating a subcomponent to an intermediate version rather than the latest version to avoid breakage downstream.
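The ordering problem described here can be sketched in a few lines of Python. This is an illustration only, not Infield’s Upgrade Path: the package names, versions, and the `blocked_by` mapping are hypothetical, and the sketch uses a plain topological sort from the standard library’s `graphlib` module (Python 3.9 or later) to make sure each upgrade comes after the ones that block it.

```python
from graphlib import TopologicalSorter

# Hypothetical data: each upgrade maps to the upgrades that must land first.
# Package names and versions are made up for illustration.
blocked_by = {
    "web-framework 7.0": {"auth-lib 4.2", "orm-adapter 2.1"},
    "auth-lib 4.2": {"crypto-lib 3.0"},
    "orm-adapter 2.1": set(),
    "crypto-lib 3.0": set(),
}


def upgrade_path(blocked_by):
    """Return the upgrades in an order where every blocker comes first.

    TopologicalSorter raises graphlib.CycleError if the blockers form a loop.
    """
    return list(TopologicalSorter(blocked_by).static_order())


path = upgrade_path(blocked_by)
for step, upgrade in enumerate(path, start=1):
    print(f"{step}. {upgrade}")
```

A real tool would also have to pick the target versions, such as an intermediate release instead of the latest one. This sketch assumes those choices have already been made and only orders them.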

“Once you’ve got this backlog of 100, let’s say, candidate upgrades you could do, you can use our data to prioritize them,” Steve Pike said. “So we show you information about both risk — what are you exposed to by not upgrading this dependency? — and effort. How much work is there going to be involved in doing the upgrade? Are there breaking changes or are there other packages in your project that need to be upgraded first that are blocking this upgrade?

“So you can kind of like run filters to slice those two things against each other and find like, I can clear out a dozen stale dependencies without hitting any breaking changes. So maybe I can do these in one pull request as long as my tests pass. But here’s other high-risk stuff where actually there are major breaking changes. So this needs to be more of a project.”
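The filtering Pike describes can be pictured with a short Python sketch. Everything in it is an assumption made for illustration: the `Candidate` fields, the numeric `risk` score, and the sample backlog are invented and do not reflect Infield’s data model. The idea is to separate upgrades with no breaking changes and no blockers, which could plausibly go into one pull request, from those that need to be planned as projects.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    risk: int              # hypothetical score: exposure from NOT upgrading
    breaking_changes: int  # hypothetical count of known breaking changes
    blocked_by: list[str] = field(default_factory=list)


def split_backlog(candidates):
    """Split a backlog into low-effort upgrades and larger projects.

    Both lists are sorted so the highest-risk items come first.
    """
    quick_wins, projects = [], []
    for c in candidates:
        if c.breaking_changes == 0 and not c.blocked_by:
            quick_wins.append(c)
        else:
            projects.append(c)
    quick_wins.sort(key=lambda c: c.risk, reverse=True)
    projects.sort(key=lambda c: c.risk, reverse=True)
    return quick_wins, projects


# Made-up backlog for illustration only.
backlog = [
    Candidate("json-parser", risk=2, breaking_changes=0),
    Candidate("web-framework", risk=9, breaking_changes=4, blocked_by=["auth-lib"]),
    Candidate("http-client", risk=6, breaking_changes=0),
    Candidate("auth-lib", risk=7, breaking_changes=1),
]

quick_wins, projects = split_backlog(backlog)
print("One pull request:", [c.name for c in quick_wins])
print("Plan as projects:", [c.name for c in projects])
```

As in the quote, the batch of quick wins would still need to pass the test suite before merging.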

Allison Pike maintains that existing solutions like Snyk tend to focus on uncovering security vulnerabilities or, like Dependabot, leave users with a task list but no contextual information customized for their specific system. A few years ago, I wrote about Depfu, a European firm that uses automation and sends no more than seven tasks at a time to keep users from being overwhelmed.

Fully automated dependency management has its detractors, including consultant Gerald Benischke, who makes a case against it in this blog post. Infield takes a more human-assisted approach.

While Infield detects breaking changes automatically, relying on the linters and package managers of each language or framework, it does not automate the actual updates. If a code change is required, the user can do it or rely on Infield’s managed service to do it. Originally aimed at Ruby on Rails, it recently added support for JavaScript/TypeScript and Python. TypeScript and JavaScript share the same package manager.
