Mesosphere’s Data Center Operating System Now Includes a Scheduler and Orchestrator

One of the guiding principles of open source software, especially in the infrastructure space, is the “batteries included but swappable” ethic that maintains that users and implementers should always have a clear choice of pluggable components. It’s part of the “agnosticism” that defines many of the companies in The New Stack space — for example, Mantl, whose entire use case is predicated on flexibility and interchangeability.
In 2014, Docker Inc. was being praised for its revolutionary approach to system infrastructure, which at the time stood in stark contrast to VMware’s fortress mentality around hypervisor-driven virtual machines. “I don’t think there’s any danger,” wrote one contributor to Hacker News at the time, “as long as everything is open and interchangeable, and you can steer clear of Docker (the company) even if you like Docker (the format).”
‘Default’ or ‘Other’
Today, VMware allows its technology to be infused with Docker, and Docker has bundled its own orchestration manager into its core container runtime engine. Although welding its Swarm orchestrator onto Docker version 1.12 does not preclude implementers from using Kubernetes, Rancher, or Marathon instead, having Swarm right there on hand arguably makes choosing Swarm, as was posited here in these pages, “a no-brainer.”
While the debate over Docker’s creativity was going on, Mesosphere made a similar move. For version 1.8 of the Data Center Operating System (DC/OS), Mesosphere opted to graft its Marathon orchestrator directly onto the system, exposing it as “DC/OS Services.”
“The result,” wrote Mesosphere’s Derrick Harris in a company blog post, “is that users can access service management and container orchestration (powered by Marathon at the backend) directly from the main DC/OS dashboard — via the same DC/OS UI they know and love.”
“We always had a Marathon shipping with DC/OS,” explained Mesosphere founder and Mesos co-creator Ben Hindman, speaking with The New Stack. “What we really did here was, we said, not only do we want to have a default Marathon, but we wanted the default DC/OS control plane to have that Marathon integrated in from an API aspect, from a UI and CLI perspective. That being said, we also wanted to make sure that we could have something swap in and speak any of the other APIs that they wanted to, and they could run other Marathons or Aurora or any of the other schedulers that you can run today.
“I think that there is a distinction from what happened with Docker with Swarm,” Hindman continued. “It’s not like we didn’t have that stuff shipping before, and we also didn’t bake it in so you couldn’t run a different scheduler. We just made it so that there was a UI component and a CLI component for DC/OS that would natively speak directly to what we call the root scheduler in Marathon.”
The Switch to Avoid a Switch
Hindman explained that, in previous editions of DC/OS, its UI gave users one-click access to Marathon. That could be considered convenient if you liked flipping back and forth between things. With version 1.8, tasks that are being managed by the root scheduler — whatever that may be — are exposed in the UI without the one-click.
“To be perfectly honest, I think we’ll do it for all the frameworks and schedulers that you run on top as well,” he continued. “Because we have all that information from Mesos, we can pretty cleanly integrate that information. It’s like when web companies do analyses, and they find it took five clicks to buy a product instead of two clicks, or five clicks to get to the content instead of two. Is there something we can do to make it two clicks?”
The Marathon and Metronome frameworks, with version 1.8, become DC/OS Services and DC/OS Jobs, respectively. Metronome was developed as a scheduling library particularly for “one-off tasks,” or truly ephemeral containers that can be allowed to expire; by comparison, typical orchestration would have such a task be re-launched. While Metronome has been compared with Chronos (which is another cron scheduler for Mesos), the Metronome scheduler has had the virtue of sharing the same code base as Marathon.
As an example, Hindman described an analytics pipeline made up of a series of tasks that depend upon one another. If one task should fail, the rest of the pipeline depending upon that task should be discontinued rather than re-launched. While Apache Chronos could handle that manner of scheduling, he said Metronome can take advantage of security features being built into the DC/OS platform.
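The pipeline behavior Hindman describes can be illustrated with a minimal Python sketch. This is not Metronome's API or implementation; the `run_pipeline` function, the task names, and the status strings are all hypothetical, and the sketch assumes every dependency named is itself a task in the pipeline. It only shows the idea that a failed task causes everything downstream of it to be skipped, not re-launched.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, deps):
    """Run tasks in dependency order and skip anything downstream of a failure.

    tasks: dict mapping a task name to a callable that returns True on success.
    deps:  dict mapping a task name to the set of task names it depends on.
    Both structures are hypothetical, chosen only for this illustration.
    """
    graph = {name: deps.get(name, set()) for name in tasks}
    status = {}
    # static_order() yields each task only after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        if any(status[dep] != "succeeded" for dep in graph[name]):
            # An upstream task failed or was skipped: discontinue, do not retry.
            status[name] = "skipped"
            continue
        try:
            ok = bool(tasks[name]())
        except Exception:
            ok = False
        status[name] = "succeeded" if ok else "failed"
    return status


if __name__ == "__main__":
    tasks = {
        "extract": lambda: True,
        "transform": lambda: False,  # simulated failure
        "load": lambda: True,
        "report": lambda: True,
    }
    deps = {"transform": {"extract"}, "load": {"transform"}, "report": {"load"}}
    print(run_pipeline(tasks, deps))
```

In this sketch a service-style orchestrator would differ only in the failure branch, where it would re-launch the failed task instead of recording the failure and moving on.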
“‘Scheduler’ is such a tough word for me to use sometimes,” Hindman admitted. “When I was at Berkeley, and we used this word ‘scheduler,’ we were just academics and it wasn’t a big deal to use that word. These days, people are like, ‘Oh, a scheduler, like Kafka! That’s crazy, why would you need Kafka for?’ Which is one of the reasons why we use the word ‘framework’ more and more.”
Access control, authentication, and authorization are among the security factors Hindman singled out as critical to DC/OS’ evolution. The passing of secrets to containers at launch (for instance, non-public parameters) has historically been an issue. Some developers have actually been re-building entire container images with the secrets embossed into the file system as documents, a dangerous situation if the registry should be compromised and the file system hacked into. DC/OS 1.8 uses SSL, Hindman noted, to encrypt all communications between the platform components, although it will continue to be up to developers how or whether to encrypt communications among the containers themselves.
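To make the contrast concrete, here is a minimal Python sketch of the alternative to baking a secret into an image: the application reads the value from its environment at launch and refuses to start without it. This is not a DC/OS API, and the article does not describe how DC/OS delivers secrets; the `load_secret` helper and the `DB_PASSWORD` variable name are hypothetical, and injection through environment variables is assumed purely for illustration.

```python
import os


def load_secret(name, env=None):
    """Return a secret supplied at launch time, not stored in the image.

    name: the variable name (hypothetical, e.g. "DB_PASSWORD").
    env:  a mapping to read from; defaults to the process environment.
    """
    source = os.environ if env is None else env
    value = source.get(name)
    if not value:
        # Fail fast so a container never runs with a missing or empty secret.
        raise RuntimeError(f"secret {name!r} was not provided at launch")
    return value


if __name__ == "__main__":
    try:
        password = load_secret("DB_PASSWORD")
        print("secret received, length:", len(password))
    except RuntimeError as err:
        print("refusing to start:", err)
```

Because the image itself carries no secret in this arrangement, a compromised registry would expose the application code but not the credential.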
It remains Mesosphere’s goal, he said, to enable any or all schedulers / frameworks to run unencumbered on the DC/OS platform, including Kubernetes. But the environment in which DC/OS’ newly built-in services will run will appear (at least to the user of the UI) more seamless.
Avoiding a Completely Different Switch
When Ben Hindman spoke to us a few months ago, he discussed how plug-ins as a component of a container ecosystem could become irrelevant. He painted a verbal picture of how an orchestrator can enable networking between containers in such a way that the container engine would not need to be jury-rigged to facilitate it.
Theoretically, a plug-in is one type of component that could be artfully, perhaps seamlessly, “included,” to borrow a phrase, with a container engine or an orchestrator. That is unless such an inclusion would appear self-serving.
DC/OS 1.8 now assigns a single IP address to a container. That address may be assigned by way of an IP Address Manager (IPAM), or through a VxLAN, though a VxLAN is not the only way. Alternatively, DC/OS enables multiple nodes to be addressed as a service, by way of a virtual IP (VIP) address, which can be automatically created with a unique name when the service is installed in the scheduler. As Hindman explained, “we can assign IPs that don’t necessarily have to come from network overlays, just as easily as we can with overlays. . . Those IPs might be able to be dynamically provisioned by the networking infrastructure, that an organization already has in its data center.” He cited Juniper Networks’ Contrail as one example of a pre-existing SDN platform with which DC/OS may be integrated.
It is the type of job that an extension could perform, assuming the need for creating an extension in such a system were not altogether artificial.
“By all means, an organization can have the networking infrastructure,” explained Hindman, “to be able to dynamically assign an IP [address] that is not for a network. But regardless, developers can stop having to worry about the support stuff, and focus on the IPs that they’re actually getting assigned to them.”
The quest for seamlessness in a software platform has historically been one of the most praised forms of “innovation.” Part of improving the customer experience, it’s been said, is reducing the number of clicks in moving between Service A and Service B.
But it’s a very familiar argument, which is why any kind of “bundling” or “tying” of platform components raises an ethical issue. Sadly, when someone takes that issue to its ultimate extreme, we sometimes feel ashamed for questioning every little thing, lest we become mistaken for conspiracy theorists. Yet making a market more open and transparent does not make it any less competitive. When openness is defined as presenting choice, and the choice changes — even for the most technical of reasons — it’s only fair and reasonable that we should know why.
The truth is, in the realm of technology, nothing ever has been, and nothing ever will be, “a no-brainer.”
Cisco, Docker and Mesosphere are sponsors of The New Stack.
Feature image: Ben Hindman. Photo by Scott M. Fulton.