Mesosphere’s ‘Container 2.0’ Unites Stateless and Stateful Workloads
The argument over the viability of stateful container-based applications versus stateless ones is long settled. In real-world multi-tenant production environments, applications need access to persistent data stores and volumes. It’s ridiculous to make developers jump through hoops — even open-source, device-agnostic, standardized hoops — so that they can send messages or record entries in a key/value store or a log file.
Mesosphere has worked out a way to manage both stateful and stateless container workloads, along with workloads not even using containers, all on the same shared infrastructure, using DC/OS (Mesosphere’s Data Center Operating System), in both its commercial and open source editions.
The trick is to allow some distributed programs to handle their own scheduling. Container orchestrators, such as Kubernetes and the Docker Engine, use a single “monolithic” scheduler, noted Florian Leibert, Mesosphere’s CEO, in a blog post. “Because there is no single scheduler that can optimize for all workloads, users end up with non-optimal operating constraints, including being forced to create separate clusters for each service,” he wrote.
To this end, Mesosphere has partnered with some companies, allowing them to connect their own schedulers to DC/OS, including Confluent, which manages Apache Kafka, and DataStax, which manages Apache Cassandra. The company also expanded an existing partnership with Lightbend to extend the Lightbend microservices scheduler to work with DC/OS.
The company began promoting this style of orchestration, where workloads communicate with persistent volumes without the use of plug-ins, as “Container 2.0.”
“Running multiple schedulers on the same cluster — simultaneously, multi-tenant on shared nodes — is the only way to maximize resource utilization and accommodate the wide range of Container 2.0 workloads,” Leibert wrote.
Rev the Odometer
There’s nothing inherently incompatible between Apache Mesos and Apache Kafka; open source projects to host the latter on the former have existed since the dark ages (2014). But in a containerized environment, there isn’t much point in establishing a stateful message stream or a commit log if the applications within the containers don’t perceive it as a consistent entity.
Confluent recently released the 3.0 commercial edition of its implementation of Kafka 0.10. Confluent’s goal is for its Enterprise 3.0 platform to serve as a central repository for real-time data streams, accessible by any manner of client. With that release, Confluent and Mesosphere jointly announced that 3.0 would be supported on DC/OS.
Kafka could very well hold the key to a containerized environment interacting seamlessly with streaming analytics components such as Apache Spark, and a row-store model database such as a Cassandra cluster, without the need to re-engineer the container engine, and also without the prerequisite of building some strange containment system for these other components.
“We have all these systems — Cassandra, Spark, and so forth — that all make up DC/OS, and that go beyond just scheduling containers,” said Leibert, in a follow-up interview with The New Stack. “They’re really something that extends that Container 1.0 concept.”
But as Leibert explained to us, Mesosphere’s effort to rotate the proverbial odometer wheel this early is meant to symbolize a change of mindset.
With analytics and messaging in a standard configuration management environment, he pointed out, only one Kafka instance is recognized at a time. Scaling the instance out required reconfiguring it, which was an exercise in scripting. With Confluent Enterprise in DC/OS, he said, scaling a Kafka instance up or down is reduced to a command.
So if it’s all just one command now, just like any other DC/OS command, we asked Leibert, then can’t this command be automated or, more specifically, included as part of a script or policy that automatically responds to an event or a condition?
Since Kafka is capable of reporting its own capacity at any time, the CEO answered, a script may indeed be used to send an appropriate scale-up command, or an alert can be sent to an operator who can run the command manually. Either way, no downtime is incurred. He told the story of his experiences at Airbnb, where scaling up Kafka for a server meant taking that server down.
“Whenever we wanted to upgrade it, we had to stop it, make the configuration changes, and start it back up,” he said. “Oftentimes that meant that logs could have potentially been lost, and we had to undergo a lot of engineering afterwards to make sure that these logs weren’t lost. What was even more tricky was, if you wanted to add another broker beforehand, or add capacity to Kafka, what you had to do was go into pretty much every node and change the configuration. With us, you can do this change in-process, while it’s running. We already have the nodes provisioned, so we can just dynamically start something new there.”
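The policy Leibert outlined, in which a script reads the capacity Kafka reports and then either issues a scale-up command or alerts an operator, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Policy` thresholds, the broker ceiling, and the `decide` function are hypothetical names invented here, and the sketch deliberately stops short of calling any real DC/OS or Kafka command.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    SCALE_UP = "scale_up"
    ALERT_OPERATOR = "alert_operator"


@dataclass(frozen=True)
class Policy:
    # Illustrative assumptions, not DC/OS or Kafka defaults.
    scale_up_at: float = 0.80   # fraction of capacity in use that triggers a response
    max_brokers: int = 12       # ceiling beyond which a human has to decide
    auto_scale: bool = True     # False means always alert an operator instead


def decide(used_fraction: float, brokers: int, policy: Policy) -> Action:
    """Map the capacity Kafka reports onto one of three responses."""
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0 and 1")
    if used_fraction < policy.scale_up_at:
        return Action.NONE
    if policy.auto_scale and brokers < policy.max_brokers:
        return Action.SCALE_UP
    return Action.ALERT_OPERATOR


if __name__ == "__main__":
    policy = Policy()
    for used, brokers in [(0.55, 10), (0.85, 10), (0.85, 12)]:
        print(used, brokers, decide(used, brokers, policy).value)
```

Keeping the decision separate from whatever actually runs the command means the same policy can drive either path Leibert mentioned: an automated scale-up or a notification to a person.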
Before DC/OS, he said, a 10-node Kafka cluster was a physical thing, to which you couldn’t add another node without racking in a new machine, installing Linux, installing Kafka, and then altering the configurations of the 10 existing nodes. At the beginning of the containerization era, these node configuration processes became more automatic, but the problem was, they still existed.
Now, adding a node is a matter of installing DC/OS on that node (physical or virtual), and enabling the system to absorb it into the existing framework without manual reconfiguration. Kafka is already engineered, Leibert explained, to accept the redistribution of streams among brokers when a new one is added to the cluster, simply by changing their stream IDs. So stream workloads may be redistributed as part of the node addition process, also without downtime.
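To picture what redistributing work onto a newly added broker involves, consider this minimal Python sketch. It is not Kafka’s own reassignment logic, and the `rebalance` function and the partition and broker names are hypothetical. It simply moves the fewest partitions needed to give the newcomer a fair share, taking each one from whichever broker currently holds the most.

```python
from collections import defaultdict


def rebalance(assignment: dict[str, str], new_broker: str) -> dict[str, str]:
    """Return a new partition-to-broker map that includes new_broker.

    The input map is left untouched. Partitions are moved one at a time
    from the most heavily loaded broker until the new broker holds its
    fair share (rounded down).
    """
    if new_broker in assignment.values():
        raise ValueError("broker already in assignment")

    by_broker: dict[str, list[str]] = defaultdict(list)
    for partition, broker in sorted(assignment.items()):
        by_broker[broker].append(partition)
    by_broker[new_broker] = []

    target = len(assignment) // len(by_broker)
    result = dict(assignment)
    while len(by_broker[new_broker]) < target:
        # Pick the busiest existing broker; ties are broken by name.
        donor = max(
            (b for b in by_broker if b != new_broker),
            key=lambda b: (len(by_broker[b]), b),
        )
        partition = by_broker[donor].pop()
        by_broker[new_broker].append(partition)
        result[partition] = new_broker
    return result


if __name__ == "__main__":
    before = {f"p{i}": ("b1" if i < 3 else "b2") for i in range(6)}
    print(rebalance(before, "b3"))
```

With six partitions split evenly across two brokers, adding a third moves exactly two partitions, leaving each broker with two.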
Leibert told us another story concerning one of his customers: a major global telco, whose shared infrastructure runs Cassandra, Kafka, and Mesosphere Marathon on thousands of nodes. In a similar sense to “serverless architecture,” this telco’s ability to upgrade its Kafka clusters on the fly without downtime may be considered “container-less architecture.” Yes, there are containers, just as there are servers. But who cares?
“And they’re running a consumer service on the side,” explained Leibert, “that entails everything from serving Web traffic to collecting the logs, piping these log files through Kafka, then doing Spark analysis on them, and then saving the rollups of the analysis in a Cassandra database.”
State of the Union
The leap of faith (and perhaps, in turn, of logic) that Leibert would like us to take, however, is that the nature of how containers are orchestrated, in a system that simultaneously maintains persistent streams and databases, somehow changes the nature (if not the format) of what containers actually are.
“We’re trying to make sure everybody understands that containers themselves, and the way that they’re orchestrated today by most other systems — Container 1.0 — are really only about stateless applications,” he said. “DC/OS can do much more than that — it’s much more than just stateless containers. It’s stateful workloads that all live together on the same shared infrastructure.”
Indeed, Mesosphere is working toward an orchestration framework where the format of the container is less important to the management of the system. In a recent interview, the company’s founder and chief architect, Ben Hindman, reminded us that Mesosphere is slated to support Windows containers come September, when Microsoft is scheduled to release Windows Server 2016.
In adopting Windows containers, and workloads made for Microsoft’s Hyper-V hypervisor, Hindman said, it becomes easier for contributors to the new Apache Mesos 1.0 to add support for different kinds of workload packages. And by that, he didn’t mean just the newer ones.
“I wouldn’t be too surprised if, in 2016, we add to the [unified] containerizer the ability to run VMs,” Hindman told The New Stack. “Think of it as an image format, just like you have Docker images as an image format and appc images. I can imagine we’ll also get something like VMs. I think that’s probably how you’ll see this stuff evolve for Mesos.”
There are not all that many months remaining in 2016, so it’s doubtful that Hindman would have raised that possibility if someone in the project weren’t already pretty far along with it.
A DC/OS, coupled with Marathon, which could simultaneously orchestrate hypervisor-coupled workloads along with containerized ones, would truly alter the value proposition for Mesos that had been bandied about at this time last year. Although Mesos preceded the rise of Docker, in 2015 it was being talked about as the alternative to virtual machines, because of how it treated workloads as “elastic.” Some touted Mesos as the harbinger of a cultural revolution, in the vein of Franz Kafka himself, breaking developers free at last, free at last from the bindings of barbarous bureaucracies, casting VMware aside and bidding it adieu.
Let’s not forget, Hindman’s design for Mesos has always been based on a two-level scheduler. That second level has always held out the promise of providing an abstraction layer between any kind of packaged workload, and the manager of resources.
“The two-level scheduler concept allows you to give a certain priority to a certain framework — and that’s how Twitter and a lot of other companies are running their data centers today, in this multi-tenant way,” explained Florian Leibert. “You give Kafka a certain priority, and you might give Spark a certain priority or Hadoop MapReduce. And once it approaches a certain threshold — for example, 90% utilization — it won’t allow you to run any Spark jobs until the utilization drops, in order to ensure that Kafka can continue to run.”
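Leibert’s example can be modeled as a toy two-level scheduler in Python. What follows is a sketch under stated assumptions, not Mesos’s actual allocation algorithm: the `Allocator` and `Framework` classes, the CPU-only resource model, the "higher number goes first" priority convention, and the idea of a "protected" set of frameworks are all invented for illustration. The first level decides which framework is offered free resources and how much; the second level, inside each framework, decides how many of its own tasks to launch with the offer.

```python
from dataclasses import dataclass


@dataclass
class Framework:
    name: str
    priority: int          # assumed convention: higher number is offered first
    cpus_per_task: float
    pending_tasks: int
    launched: int = 0

    def consider(self, offered_cpus: float) -> float:
        """Second level: the framework decides how much of an offer to use."""
        fits = int(offered_cpus // self.cpus_per_task)
        wanted = min(self.pending_tasks, fits)
        self.pending_tasks -= wanted
        self.launched += wanted
        return wanted * self.cpus_per_task


@dataclass
class Allocator:
    total_cpus: float
    threshold: float = 0.90                       # the 90% figure from the example
    protected: frozenset = frozenset({"kafka"})   # hypothetical: may exceed the threshold
    used_cpus: float = 0.0

    def offer_round(self, frameworks: list[Framework]) -> None:
        """First level: offer free CPUs to frameworks in priority order."""
        for fw in sorted(frameworks, key=lambda f: -f.priority):
            if fw.name in self.protected:
                free = self.total_cpus - self.used_cpus
            else:
                # Unprotected frameworks only see capacity below the threshold.
                free = max(0.0, self.threshold * self.total_cpus - self.used_cpus)
            self.used_cpus += fw.consider(free)


if __name__ == "__main__":
    kafka = Framework("kafka", priority=2, cpus_per_task=2.0, pending_tasks=10)
    spark = Framework("spark", priority=1, cpus_per_task=4.0, pending_tasks=30)
    alloc = Allocator(total_cpus=100.0)
    alloc.offer_round([spark, kafka])
    print(kafka.launched, spark.launched, alloc.used_cpus)
```

In this toy run, Kafka claims its 20 CPUs first and Spark fills the cluster only up to the 90 percent line. If Kafka later needs more brokers, it can still grow into the reserved headroom, while Spark receives no further offers until utilization drops.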
Hindman’s suggestion opens up the very real possibility that Mesos could provide elasticity for VM-based workloads in the same way. If that’s the case, then there’s a viable argument for “Container 2.0” as a new way of thinking about workload containment and partitioning — specifically, as a way to think less about it.
Mesosphere is a sponsor of The New Stack.
Feature image: Kinetic sculpture of Franz Kafka’s head, in a town plaza in Prague, by David Černý, licensed through Wikimedia Commons.