As software engineers take the time — or struggle to find it — to digest yet another heaping truckload of new ideas, information, and maybe inspiration from the latest KubeCon + CloudNativeCon in Copenhagen, a sober, respectful, and revealing new dialogue is emerging about the state of Kubernetes and its ecosystem.
The discussion began as a Hacker News thread that ostensibly asked whether Kubernetes introduces more complexity into an organization than it removes. The thread drew relatively few responses, and most of them offered reassurance that any complexity introduced by, or by way of, the orchestrator would certainly be compensated for by efficiency in scheduling and execution.
The dialogue then moved to Twitter, where Google Compute Engine co-creator and now Heptio chief technology officer Joe Beda made the effort to isolate and identify just where the orchestrator’s pain points might be. Beda acknowledged that such an organically constructed component, along with the services surrounding it, is difficult to introduce to newcomers. Yet he continued the general theme: Kubernetes creates a new world of staging that can be bewildering at first but is ultimately beneficial.
“Kubernetes provides a set of abstractions that solve a common set of problems,” Beda wrote. “As people build understanding and skills around those problems they are more productive in more situations. There is still a steep learning curve! But that skill set is now valuable and portable between environments, projects and jobs… The story of computing is creating abstractions. Things that feel awkward at first become the new norm. Higher layers aren’t simpler, but rather better suited to different tasks.
“I think that, as engineers,” Beda continued, “we tend to discount the complexity we build ourselves vs. complexity we need to learn.”
I’d Like to But I Can’t, Hokay
Up to that point, the topic had not been all that much of a dialogue. But then it was joined by Datadog software engineer Jason Moiron, who steered it toward his own personal blog, and a post which raised a very relevant issue: Does Kubernetes create complexity through its need for explicitness, mainly to make things easier for itself? And in so doing, does it create a situation where the context of the applications it manages becomes far less scalable than their deployments? Put another way, are you stuck with what you’ve built?
“We’ve used service discovery along with health checks and self-diagnostics to be able to do some fairly interesting things during outages and, more importantly, slowdowns,” wrote Moiron. “These are hard problems, and we’ve solved them rather crudely in places, but these solutions still give us a level of proven, necessary sophistication for system stability.
“Unfortunately, our approach is slightly incompatible with Kubernetes’ centralized one. Usually, when this happens, your needs are just too sophisticated, but in this case Kubernetes’ approach is already explicitly complex in order to try to deal at the proper level of sophistication. It’s a thoughtful and mature approach, but its structure is just inverted from ours. It’s complex in incompatible ways, the worst of both worlds.”
Service discovery is, essentially, the system whereby an application or system may leverage other services, such as DNS, to get connected to the specific services it needs, when those services are not architected directly into the system. The reason Microsoft Windows always slows down over time is that its solution to the service discovery problem was simple but gargantuan: a colossal System Registry database, like a Yellow Pages for a city populated by tribbles.
When two Windows applications, separated by distance, needed to share functionality, they had to have installed in their respective, local editions of the Registry the same “type library” — the same page ripped out of the same phone book. This would guarantee that the function being referred to by the sender was the same one as the function being looked up by the receiver. The danger was that the newer edition of an application might not be able to place a remote procedure call (RPC) to an older edition of the same application. In short, the only way to scale was to scale together.
Windows’ old solution made obvious the problem of distributed systems: They work best when the system being distributed is replicated as exactly as possible. Change the system, and who knows whether you’ll ever be able to locate the service you’re looking for, when the application is installed someplace other than your own system.
DNS gave Web services a way to enable the receiving system containing the requested service to be dynamic and flexible, and for its own applications to scale on their own time. Maybe an RPC would work and maybe it wouldn’t, but the result would never be catastrophic. It could be handled; there were options.
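To make the contrast concrete, here is a minimal Python sketch of name-based lookup in which a failed resolution is an ordinary, recoverable outcome rather than a crash. The function names `discover` and `first_resolvable`, the `resolver` parameter, and any service names passed in are hypothetical and chosen only for illustration; `socket.getaddrinfo` is the one real standard-library call involved.

```python
import socket
from typing import Callable, List, Optional


def discover(name: str, port: int,
             resolver: Callable = socket.getaddrinfo) -> Optional[List[str]]:
    """Return the addresses registered for a service name, or None if lookup fails."""
    try:
        records = resolver(name, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name could not be resolved. Report that instead of crashing,
        # so the caller can choose a fallback.
        return None
    # Each record is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({record[4][0] for record in records})


def first_resolvable(names: List[str], port: int,
                     resolver: Callable = socket.getaddrinfo) -> Optional[str]:
    """Try each candidate name in order and return the first one that resolves."""
    for name in names:
        if discover(name, port, resolver) is not None:
            return name
    return None
```

The caller decides what a missing service means: try another name, degrade gracefully, or give up. The service behind a name can move or change without the caller being rebuilt, which is the flexibility the paragraph above attributes to DNS.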
Moiron’s Complaint — as it may come to be known — raises the issue of whether Kubernetes’ (and thus Google’s) solution to the service discovery issue applies only to a Google context. It’s the Yellow Pages-for-tribbles problem, just at another level: By centralizing services around its own context, does Kubernetes ensure — the same way Windows and Windows Server ensured — that the systems being orchestrated cannot evolve? Sure, they can get bigger or smaller, but that’s not the point: Can they grow?
Writing in SMS-speak, Moiron commented, “I’d like to run another process oh I cant unless I sidecar it and intimately describe its every relation with the parent’s environment via yaml hokay.”
“Kubernetes was designed by systems engineers, for systems engineers,” stated Kate Kuchin, an engineer with Heptio, during the last KubeCon. “Which is great, if you’re a systems engineer. For the rest of us, Kubernetes is really, really intimidating. With the exception of those people who created Kubernetes who were there at the very, very beginning, everyone in this room was probably a new Kubernetes user at some point, or is a new Kubernetes user now, or will be a new Kubernetes user next week. So you all already know that it can be pretty daunting.”
She learned about how the orchestrator works, Kuchin told her audience, from her boss — Joe Beda himself. And while his presence and experience inspired her to pursue mastery of the system for herself, she admitted that right away, she became competent enough to explain Kubernetes’ basic concepts thoroughly without having too much of a clue about what they meant or what they did. (If I were to admit I empathized with her situation, I would reveal way too much about my job as a technology journalist.)
Kuchin made that observation while introducing ideas for how a better user experience (UX) could make the whole Kubernetes experience simpler for the everyday enterprise. The notion that Kubernetes is hard is, arguably, a pillar of the ecosystem that surrounds it. Since its inception, nearly every vendor offering Kubernetes as a service has framed itself as a simplifier of the orchestrator. If it were easy to grasp, not only would we not be having this conversation, but we might not have the ecosystem we currently have.
The Good, the Bad, and the Undigested
Curiously, one of Jason Moiron’s most pointed grievances concerns the use of YAML for expressing configuration intent. Borrowing a bit of SMS-speak, he asks in his blog post, “How can I even what is all this yaml?”
Citing one of Yale professor Alan J. Perlis’ famous “Epigrams in Programming,” “A programming language is low level when its programs require attention to the irrelevant,” Moiron conceded that Kubernetes’ well-known “opinionated” approach to deploying distributed systems does coincide well with what a systems analyst may expect, and helps people who aren’t distributed systems analysts with making the first decisions in that direction. In that sense, the orchestrator may not be too low-level for some folks who are at the opposite end of the distributed systems scale from Google.
But for engineers who have already made service discovery decisions for their own systems, Moiron says, the decisions that the orchestrator would make on its own would tend to be architecturally incompatible with those decisions.
Heptio has made almost this exact argument. Its premier tool, called ksonnet, is a kind of intermediate interpreter that uses Jsonnet, a data templating language that extends JSON, to enable a more declarative set of configuration files, often with more than one combined together. The product of that combination is a single YAML file digestible by Kubernetes, although the intermediary permits a degree of flexibility and extensibility that YAML might not permit on its own.
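The general idea of generating a manifest from smaller, reusable pieces can be sketched in a few lines of Python. This is not ksonnet or Jsonnet, and it does not reflect how either is implemented; the helper names `container`, `deployment`, and `render`, along with the image names, are hypothetical. The sketch emits JSON, which Kubernetes accepts in place of YAML, so it needs only the standard library.

```python
import json
from typing import Dict, List


def container(name: str, image: str) -> Dict:
    """Describe one container in a pod."""
    return {"name": name, "image": image}


def deployment(name: str, containers: List[Dict], replicas: int = 1) -> Dict:
    """Build a Deployment manifest, writing the label wiring only once."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template's labels.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": containers},
            },
        },
    }


def render(manifest: Dict) -> str:
    """Serialize the manifest as JSON text."""
    return json.dumps(manifest, indent=2, sort_keys=True)


# Hypothetical application with a second process running as a sidecar.
web = deployment(
    "web",
    [container("app", "example/web:1.0"),
     container("log-shipper", "example/log-shipper:1.0")],
    replicas=3,
)
print(render(web))
```

In this sketch, adding the second process Moiron describes is one more `container(...)` call, while the repeated label and selector plumbing lives in a single function. That is the kind of day-to-day relief from hand-written YAML that such generators aim for.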
“I’m not saying that I have this magical tool here, that can actually solve all your problems with a generator,” explained Heptio’s Bryan Liles, during another KubeCon session explaining how to use ksonnet, “because that’s silly. I want easy things to be easy, hard things to be possible. And what we’re doing here is, we’re trying to remove the need for YAML from your day-to-day. Not get rid of it, but remove some of the need.”
In a new Twitter thread that responded to Moiron’s post, Heptio’s Joe Beda essentially validated every one of Jason Moiron’s complaints. “When we started with YAML we never intended it to be the user-facing solution,” Beda wrote. “We saw it as ‘assembly code.’ I’m horrified that we are still interacting with it directly. That is a failure. This is not an easy problem and I don’t think there is a silver bullet. By having a raw form we do enable an ecosystem of solutions. Those are still early and are a part of that raw chaotic primordial soup that is parts of Kubernetes at its current stage.
“We run the risk of solving problems by introducing even more complexity,” the Heptio CTO continued. “I worry about this with efforts like Istio. It can do amazing things but is both early and yet more complexity piled on top. We have a limited capacity to absorb this stuff and k8s isn’t digested yet.”
KubeCon + CloudNativeCon is a sponsor of The New Stack.
Feature image: Construction of the Thames Tunnel, circa 1830, by an unknown artist, in the public domain.