Last month, Wojtek Borowicz, a community manager at sensor-based analytics platform Estimote, turned his thoughts to a timely theme: modern software complexity, and where it’s coming from. On Medium he’s posted a new series of eight interviews with prominent developers, each a specialist in a different way, “to explore and explain why software is more complicated than it looks…”
“Some of those reasons are technical, while others stem from human fallibility and how the discipline of software development bears the consequences of these failings,” Borowicz wrote. Covering everything from dependencies and cryptography to containers and monitoring systems, each post weighed in on another piece in the chain of complexity that’s crept into our modern infrastructure.
Borowicz begins his exploration with operations and database engineer Charity Majors, co-founder and chief technology officer of Honeycomb, the company behind an observability tool for “introspecting and interrogating” production distributed systems. Majors points out that “we’re in the middle of this great shift, from the monolith to microservices or from one to many. You’re depending on all those loosely coupled, far-flung services that aren’t even yours, yet you’re still responsible for your availability.”
“We’ve sliced and diced and spread everything around that now there is a hundred points of failure instead of just one” — Honeycomb.io’s Charity Majors.
This interview tackles “the thorny questions around why software breaks and how engineers approach fixing bugs and outages,” and begins with a vivid introduction: “The first minutes of an emerging outage are frantic. Alerts start pinging, number of support tickets goes through the roof, engineers and customer service scramble to assemble the response team.” The post observes that “There is no other industry so accustomed to its products breaking that it’s considered part of the daily routine.”
Majors notes that some risks sneak past our testing because of unpredicted interactions between multiple systems. “The problem is usually like half a dozen impossible things combined,” she said. “It’s mind-boggling to sit back and actually think about how complex these systems are. What amazes me is not that things fail but that, more or less, things work.”
Majors describes one problem that was ultimately traced to a bug affecting only one router in eastern Romania. “You can’t predict this stuff. You shouldn’t even try. You should just have the instruments and be good at debugging your system. That’s all you can do.”
When Borowicz probes just how often code is changed, Majors says, “your system is built on top of another system, built on top of another system… so it could be someone introducing a change you have no visibility into and no control over. Timing doesn’t matter. You should just assume that changes are happening literally all the time. That’s the only way to plan for risk…”
Later Majors goes even further. “You should build it with the assumption that it’s failing all the time and that’s mostly fine. Instead of getting too hung up on failures, we need to define SLOs — Service Level Objectives… This is really liberating. It’s a number we’ve all agreed on, so it has the potential to ease a lot of frustrations that many teams have had for years and years.”
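To make the idea of an agreed-upon number concrete, here is a minimal sketch of how an SLO translates into an "error budget," the count of failures a team can tolerate before the objective is missed. The function names, the 99.9% target, and the request counts are illustrative assumptions, not figures from the interview or from any Honeycomb tooling.

```python
def error_budget(slo_target: float, total_requests: int) -> int:
    """Requests allowed to fail in a window while still meeting the SLO.

    slo_target is a fraction such as 0.999 (a hypothetical 99.9% objective).
    """
    return round((1.0 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> int:
    """Failures still 'affordable'; a negative result means the SLO was missed."""
    return error_budget(slo_target, total_requests) - failed_requests


if __name__ == "__main__":
    # Hypothetical month: one million requests, 250 of them failed.
    print(error_budget(0.999, 1_000_000))           # 1000
    print(budget_remaining(0.999, 1_000_000, 250))  # 750
```

Under these assumed numbers, a team with a 99.9% objective could absorb 1,000 failed requests in the window, which is one way to read Majors' point that constant failure can be "mostly fine" as long as it stays inside the agreed limit.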
Did We Choose Complexity?
Borowicz gets a different response from David Heinemeier Hansson, creator of Ruby on Rails and co-founder of the Basecamp project management software. Hansson argues, “You don’t have to let complexity roll over you. You choose to.”
“If you are unable to contain the complexity of a monolithic application, what on earth makes you think you are competent to distribute that complexity across a fleet of services that now have to interact and deal with network unreliability, retries, two-phase commits, and all these other complexities that simply do not exist when you deal with method calls, parameters, and the basics of running a singular process. There’s very little worse harm you can do to an application on the complexity level than distribute it. You can take even the smallest problem and as soon as you distribute it, it grows an order of magnitude in complexity.”
Hansson gives the eight-part series a philosophical finish, in a rollicking interview challenging which inherent complexities are — and aren’t — baked into our current software development methodologies. “If you were to summarize the entire endeavor of software development, you’d say: ‘The project ran late and it got canceled’. Planning work doesn’t work, so to speak.”
“Until we figure a way to beat the speed of light, the best thing you can do is make sure the assets are as close to the end user as possible.” — Cloudflare’s Rita Kozlov.
Hansson remains skeptical of estimates and modern implementations of Agile development. “The project definition that is vague is actually more realistic,” he said. “Highly specific project definitions usually go astray very quickly. Vague enough definitions allow for creativity and selectivity for the people doing the work.” But he seems to suggest some complexity could be reduced by carefully and thoughtfully examining the whole software development process.
“There is a mythology of the 10x programmer. But that’s not the programmer who is heroic in their implementation of the problem. The 10x programmer is the programmer who restates the problem.”
But it seems like complexity is everywhere. Borowicz discusses networking with Cloudflare product manager Rita Kozlov. Kozlov explains several approaches to building a robust infrastructure, including multi-cloud solutions and primary and secondary DNS providers. “But ultimately it’s just a bunch of wires hacked together. People are writing code that has bugs, as anyone’s code does. Sometimes there is not much you can do about outages.”
There’s also a discussion about our peripherals and the challenges of hardware integration with Linux kernel maintainer Greg Kroah-Hartman, who ultimately explains that in some sense it’s software all the way down. “It is very rare that any peripheral made in the past 10 years would not have a processor running software written for it,” the kernel maintainer said in his interview.
No Finish Line
Security software engineer Anastasiia Voitova weighs in on the issues of security and cryptography, warning that “Unfortunately, there is no finish line here. There’s no sign that says ‘hello, you’ve done everything and are 100% secure’.”
Voitova points out we’ve at least attained some awareness of where our complexity is coming from. “There are companies whose main business it is to keep an eye on dependencies.” But at the same time, we’ve also learned there are limits to our modern infrastructure. “Nothing is impenetrable. You can protect your application from the most common threats but you cannot protect it against vulnerabilities that will be revealed, say, next month.”
The series continues its panoramic tour of modern software development with a discussion of performant software engineering with Jeff Fritz, a program manager for Microsoft’s web development framework ASP.NET, a conversation which eventually leads them to virtual machines and containers. Borowicz also interviews Sina Bahram, president of an accessibility design firm. (“It needs to be woven into the entire product development lifecycle, from conception until post-production and maintenance.”)
Borowicz even interviews Bianca Berning, an engineer/designer/creative director at a type design studio named Dalton Maag. (“There are writing systems, such as Arabic and Devanagari, in which letter shapes vary depending on the context in which they appear.”)
In the end, it all seems to prove the simple but irrefutable title for Borowicz’s series of interviews. At the end of the day — computers are hard. At the very least we should have some sympathy for the developers we expect to deliver precise estimates of development times, Hansson argued.
“Software, in most cases, is inherently unpredictable, unknowable, and unshaped. It’s almost like a gas.”
- Engineering Manager for Ubuntu: No, Microsoft is not rebasing Windows to Linux.
- Legendary videogame creator Sid Meier publishes his memoir.
- The Commodore 64 game hidden in a 1984 vinyl album.
- Microsoft devs answer the question: which is better, spaces or tabs?
- Winners announced in annual Python game-making jam.
- The day a bot started posting on Reddit.
- National Novel-Generating Month has begun!
Honeycomb is a sponsor of The New Stack.
The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Honeycomb.io.