Netflix’s Open Source Orchestrator, Conductor, May Prove the Limits of Ordinary Scalability
How many times has this happened to you? You’re minding your own business one day, running your international delivery system for high-definition dramatic and comedic video with 16-channel stereo sound on tens of thousands of concurrent channels. Your content suppliers in Tokyo or Copenhagen or wherever, just like every other day, are flooding you with raw or semi-edited video, and are seeking your input on possible edits and scene changes. They’re using an international standard video delivery format created just for this purpose, and maybe you’re the only customer for this format, but that’s never bothered you before.
Suddenly you have three or four hit shows, and it’s a holiday, and everybody’s binge-watching them at one time. Your suppliers are hitting you with requests they need to have fulfilled so they can get their series published in January. There’s a traffic spike. Your viewers’ resolutions have to be downshifted below 4K high-def just to compensate. There’s no workflow established in advance for this contingency. Little children all over the world notice they’re being downgraded to 1080p and start crying. Christmas is just a few days away. What will you do? What… Will… You… Do?
It’s one thing to imagine your data center running a little bit, or a lot, like Google. It’s another thing entirely to imagine it working like Netflix. Nevertheless, for anyone who is so inclined, Netflix released into open source last month the orchestration engine its engineers developed for running its entire operation in the cloud — specifically, on Amazon’s cloud. It could very well be the engine for the world’s single most sophisticated distributed processing system.
So the question, for our sake, becomes, does it scale down? More to the point, if Netflix’ good ideas remain just as good on a smaller scale, then does Conductor hold an advantage over little ol’ Kubernetes open source container orchestration engine to which we need to be paying attention?
Real Orchestration Doesn’t Exist Yet
“With peer-to-peer task choreography,” wrote Netflix architects Viren Baraiya and Vikram Singh last month for their company’s engineering blog, “we found it was harder to scale with growing business needs and complexities.”
We’ve talked about microservices here in the past as a way to achieve efficiencies by breaking down services into their most minimal components and opening up communications channels between them. At a certain scale, making things indeterminately smaller creates an inordinate amount of crosstalk between them. At a Netflix scale, it may be the only thing more difficult to orchestrate than the current U.S. presidential transition.
In a November 2016 white paper for the IEEE [PDF], a team of six researchers, including IBM Zurich staff researcher Lydia Y. Chen presented as complete an inventory as they could of the number of very-large-scale microservices scheduling platforms utilizing common container-based and hypervisor-based technologies. Their count was exactly zero.
“Despite the clear technological advances in container and hypervisor-based virtualization technologies,” the team wrote, “we are yet to realize a standard large-scale, performance-optimized scheduling platform for managing an ecosystem of microservices networked together to create a specialized application stack, such as a multi-tier Web application and Internet of Things (IoT) application.”
Reading The New Stack over the last few years, one might have thought this goal had already been achieved. But as this team concluded, modern container orchestration systems rely upon the capability of configuration management systems to apply the necessary customizations to every node that runs services. And on a microservices-scale, not even a calculator, they wrote, contains the digits necessary to project the number of possibilities. “We, therefore, need new research,” they wrote, “that focuses on developing techniques for accurately modeling, representing, and querying configurations of microservices and data center resources.”
Or, you can adopt an architecture that is far less reliant upon “bespoke configurations.” This is what Netflix Conductor, and the architecture that supports it, apparently achieve: Conductor relies on the idempotence of services, in addition to their statelessness. That is to say, any service in the scheme is trusted to achieve exactly the same results, in exactly the same way. There is an air of serverless-ness to this ideal: At a fundamental level, microservices that perform the same function are treated as one function.
This being the case, there’s much less chatter that such a function needs to actually communicate, in the process of performing its job. As long as the workflow for the job is explicitly defined, and the result of that job is either “Here’s what you asked for,” “I did it,” or “Oops,” the sorts of things that render typical workflows impossible at huge scale, never come into play.
There is still communication. But since tasks are broken down so granularly, the jobs to which they contribute end up being easier to monitor, so the relative state of progress can be directly ascertained. Instead of polling the application with a request that’s the equivalent of, “How you doin’?,” the component responsible for maintaining workflow — which Netflix has dubbed “Decider” — assesses the collective progress of each individual service, and from there determines the state of the application for any moment in time.
Trust, But Verify
Here is where the real debate begins. Whether we like it or not, microservices are the latest culmination of service-oriented architecture (SOA). This means that, eventually, some of the unresolved arguments from the 1990s will rear their ugly heads (or head their ugly rears) today. One of them deals with delegation. Specifically, how much should an orchestrator trust the services that constitute a process, to manage that process for themselves? The more trustworthy a process is, the less burden they convey to the orchestrator during a high-traffic situation.
At the turn of the century, well before Linux containers or forms of them were considered as receptacles for business processes, the orchestration argument centered around the nature of business transactions. One objective of SOA architects was to derive a concept of a transaction that was Atomic, Consistent, Isolated, and Durable (ACID) — an indivisible unit. From their perspective, a computing function could map to a business transaction. So you could conceivably develop an API around these transactions, using terms and techniques that would map to how a business person would understand the transactions.
Here was the problem: Transactions take place in chains. Maybe each link was indivisible, but the next link was dependent upon the same data source that the previous link used and probably modified.
In 2001, a research team including members of HP and Compaq’s Tandem division (which were on the verge of merging) produced a white paper [PDF] that explored the problem of mapping business transactions to computer processes, introducing people to the competing approaches to finding a solution at that time.
“In particular, the properties of atomicity and isolation could be usefully applied to business processes (or at least to parts of a business process),” the HP/Compaq team wrote. “For example, one might want to declare that the payment of an invoice and crediting the payee’s account were part of an atomic unit of work.
“However, it was also recognized that it would be overkill to treat an entire business process as a single ACID transaction. First, since business processes typically are of long duration, treating an entire process as a transaction would require locking resources for long periods of time. Second, since business processes typically involve many independent database and application systems, enforcing ACID properties across the entire process would require expensive coordination among these systems. Third, since business processes almost always have external effects, guaranteeing atomicity using conventional transactional rollback mechanisms is infeasible, and may not even be semantically desirable. It became apparent that the existing database transaction models would have to be extended.”
The HP/Compaq team explored various forms of messaging, including the publish/subscribe model, as ways for tasks in a business process chain to “pass the baton,” if you will, between each other. This way, the next link in the chain would have access to the data the previous link used, and would only gain that access once the previous link was finished.
The evidence they presented at this stage revealed that you really couldn’t directly map business transactions, like chunks on a flowchart, to computer processes, without creating an inordinate amount of work for whatever was orchestrating them. Thus began a trend toward loose coupling, with the acceptance that a process that appears indivisible on a business level should be composable on a computer level.
On Second Thought, Don’t Trust, But Watch Very Closely
Publish/subscribe appeared to be suitable, partly due to its use of topics as a way of maintaining context for the various links in the chain. At the start of this decade, Sattanathan Subramanian of Norway’s Uni Research laboratory proposed a concept [PDF] called Data-Flow Delegation (DFD), as a way for orchestrators to instruct services in a chain how to participate in a workflow, and how to exchange the data upon which these services would be mutually dependent. DFD uses a concept that sounds like SDN: separating the control plane from the data plane, “in order to delegate the data-flow responsibilities to the component services.”
In DFD, there’s a clear set of instructions designating where the data comes from, and where it’s going. Services refer to those instructions and work from there. It would appear to be a workable solution to the cross-talk problem between microservices — one that would be applicable to an orchestrator like Kubernetes, paired with a messaging system like Kafka.
But ironically, Netflix found the components of this concept too binding — in other words, not decoupled enough. Wrote Netflix’ Viren Baraiya and Vikram Singh last month, “Pub/sub model worked for simplest of the flows, but quickly highlighted some of the issues associated with the approach.” In their list, they said process flows tend to be “embedded” within applications, and that the assumptions around their workflow, such as SLAs, tend to be tightly coupled and non-adaptable. Finally, there wasn’t an easy way to monitor their progress.
So Netflix’ approach with Conductor takes the next step, not unlike breaking down subatomic particles into quarks. Conductor seeks atomicity and idempotence at this very tiny scale, delegating absolutely nothing to these processes except an expectation of completion. The state of the workflow is determined centrally, by Conductor’s Decider, by comparing each worker’s progress against what it calls the “blueprint.” While researchers like Subramanian might suggest that such an approach opens up a potential single point of failure, Singh and Baraiya note that this centralization opens up the avenue of a single UI for visualizing the real-time state of the system — something that’s effectively impossible under a delegated approach.
Netflix Conductor is the latest evidence that, for computing processes to be more effective and viable at huge scales, they have to behave less and less like anything a rational businessperson would expect. In applying logic to the latest incarnation of the model that sought to prove that services are like people, Netflix is proving the opposite.