The container ecosystem is shifting continuously, which means there are conflicting views about virtual machines, containers and the roles they play in automated, scaled-out environments. There are also many unknowns, as most technical professionals have little experience developing on distributed platforms. Most have developed on single hosts, or have worked as administrators managing machines and their virtualized environments.
Running clusters and orchestrating thousands of containers is an entirely new game. It puts software at center stage, with a greater need to monitor individual components than the application as a whole. It’s a world that requires an advanced appreciation of automation via APIs, and orchestration that takes into account the need for scheduling, cluster management and a host of other matters, such as securing nodes, health checks and prioritization.
Today we see how orchestration for containers changes the lens through which we view applications. Microservices orient developers to focus on services and how those services differentiate applications. Containers are catalysts for making these applications behave according to the components on distributed clusters. Orchestration tools such as Apache Mesos, Docker Swarm, HashiCorp Nomad and Kubernetes are good examples: they help systems operators model their clusters and make services accessible.
Many software companies have built platforms that are almost entirely ephemeral. For companies adopting elastic, scaled-out platforms, the concept is often about how to eliminate the VM entirely, build on bare metal and create container-based clusters.
This all adds up to a change in how we think about automation and orchestration. But one principle remains: there’s a push to implement programmable infrastructure, applying methods and tooling established in software development to the management of IT infrastructure, across thousands of clouds. It’s a movement that will lead to any number of new practices and discoveries.
Facing the Current, Complex Reality
The software industry at large has failed to truly implement the ideas behind programmable infrastructure, said Solomon Hykes, Docker founder and chief technology officer. It’s difficult to become an effective developer without ten years or more of programming experience, and scale now has to mean the scale of the Internet, Hykes said.
“So clustering, orchestration, networking between containers, storage across more than one container, that shared storage, security, things like that, verification of contents so that you can track the authenticity and provenance of the containers you’re about to deploy,” Hykes said. “All of these are part of the problem of building and deploying distributed applications. So it’s a big list of problems and you need a combination of tools to address these problems. And you need those tools to integrate. They need to work together to form a platform, otherwise you’ve got a giant puzzle and the pieces will never fit together. You’ll never complete the puzzle.”
Hykes will tell you that Docker takes an incremental approach with tool development and does not try to “boil the ocean” by developing one platform to do it all. It’s an approach that applies to the overall Docker and container ecosystem. That is, in part, due to the complexity that scale brings and the very nature of container technology itself.
Docker and containers are processes. They make delivering components easier. A container does not carry its own operating system kernel; it shares the host’s. That makes the container lighter and easier to manage. It does not require configuring the machine itself. And a container is disposable: it embodies the concept of immutable infrastructure, in which live instances are never directly changed, but rather replaced as their configuration changes. Because they are lightweight and easily reproduced, containers, in their current packaged form, introduce new complexities not previously understood.
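The replace-rather-than-change idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Instance`, `launch` and `roll_out` are hypothetical names invented here, not part of Docker or any orchestration tool.

```python
from dataclasses import dataclass
from itertools import count
from typing import List

_ids = count(1)  # hypothetical source of unique instance IDs


@dataclass(frozen=True)
class Instance:
    """A running container; frozen, so its configuration cannot be edited in place."""
    instance_id: int
    image: str


def launch(image: str) -> Instance:
    """Start a fresh instance from the given image (illustrative only)."""
    return Instance(next(_ids), image)


def roll_out(running: List[Instance], desired_image: str) -> List[Instance]:
    """Return the new fleet: instances on the wrong image are replaced, never patched."""
    fleet = []
    for inst in running:
        if inst.image == desired_image:
            fleet.append(inst)                   # already correct, keep as-is
        else:
            fleet.append(launch(desired_image))  # discard the old one, start fresh
    return fleet
```

Calling `roll_out(fleet, "web:2")` on a fleet running a hypothetical `web:1` image yields entirely new instances, while calling it again on the result changes nothing.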
With containers comes a change in the role that data centers will play. The research and development that has existed internally will move outwards into open source communities. This kind of change could have happened with VMs, but the tooling for containers, and the speed with which they can be managed and ported, make it far more natural than it was with VMs.
The software will, in turn, continue to get developed at an even more rapid pace, built as integrated technologies for the developer and the operations person, who can no longer separate their duties. There is no wall anymore between Dev and Ops, just a pipeline of continuous delivery.
Removing that wall between Dev and Ops is the most important shift of all that will come as container adoption becomes more widespread, accelerated by open source development. The software to manage clusters will be the orchestration platforms, working on data planes that make container-based clusters fully mobile. The infrastructure itself becomes centered on the application. Components will be like threads and connect across distributed platforms via APIs and other means, depending on the context.
Cloud-Native and Programmable Infrastructure
The concept of cloud-native has set the stage for how organizations develop a programmable infrastructure, and the complexity is astounding.
There are a number of overlapping domains, especially when we think about container management, said Chris Ferris, distinguished engineer and chief technical officer of open technology at IBM Cloud, in an interview with The New Stack. Kubernetes, Swarm, Mesos and initiatives around OpenStack Magnum, among others, all orchestrate, manage and schedule containers. Then you’ve got the likes of Cloud Foundry, which is also doing container scheduling and orchestration, but it’s a little bit more hidden.
“All of these technologies are independently developed in independent groups, in independent communities, …ultimately these things have to start coalescing, coming together, or at least providing the ability that we can integrate between OpenStack and Kubernetes. For instance, if I’m running containers in a Kubernetes pod and I want to integrate those capabilities with something I’ve got running in a VM and OpenStack, how do I do that from a networking and storage perspective, how do I share the networking storage across those platforms?”
There are arguments about what goes underneath, in the middle or on top, Ferris said. OpenStack Magnum is getting built to provision the likes of Kubernetes and Mesos.
This may lead to different integrations that allow customers to build from components that suit their needs, and potentially an architecture that everybody can agree upon. What is needed is fault tolerance, self-healing, easy roll-outs, versioning, and the ability to easily scale up or down. The goal is to run containers on a cloud service or your own hardware at whatever scale is necessary, never going down and never paging the Ops team. This is what people call orchestration.
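Self-healing and scaling up or down are often described as reconciling a desired state with the observed one. The following Python sketch shows a single reconciliation pass under simple assumptions; the `reconcile` function and the `replica-N` naming are hypothetical and do not correspond to any particular orchestrator’s API.

```python
from typing import Dict


def reconcile(desired: int, replicas: Dict[str, bool]) -> Dict[str, bool]:
    """Run one reconciliation pass.

    `replicas` maps a replica name to whether it is currently healthy.
    The result contains exactly `desired` healthy replicas.
    """
    # Self-healing: failed replicas are dropped rather than repaired.
    healthy = {name: True for name, ok in replicas.items() if ok}

    # Scale up (or replace the failures) until the desired count is reached.
    n = 0
    while len(healthy) < desired:
        name = f"replica-{n}"
        if name not in healthy:
            healthy[name] = True
        n += 1

    # Scale down if there are more replicas than desired.
    while len(healthy) > desired:
        healthy.popitem()

    return healthy
```

An orchestrator would run a pass like this continuously, which is what lets the system recover without paging anyone.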
Defining Automation and Orchestration
IBM’s Doug Davis gave us his thoughts on defining the automation and orchestration space:
“I think largely, when we think about automation, you think about writing the scripts that are integrated into a platform, like Chef, Jenkins or Ansible …that is actually driving the actual behavior; and we think about orchestration as the platforms themselves that are providing that facility to be able to orchestrate the order in which things are going. That’s the orchestration. The automation itself is just the actual execution of the point-in-time script.”
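One way to picture this distinction is a small Python sketch in which each automation step is a plain callable and the orchestrator only decides the order of execution. The `orchestrate` function and the task names in the usage note are hypothetical; the standard-library `graphlib` module (Python 3.9 and later) stands in for a real platform’s ordering logic.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set, Tuple


def orchestrate(
    tasks: Dict[str, Callable[[], object]],
    depends_on: Dict[str, Set[str]],
) -> List[Tuple[str, object]]:
    """Run each automation task after all of its prerequisites.

    `tasks` maps a task name to the script-like callable that does the work.
    `depends_on` maps a task name to the names that must run before it.
    """
    graph = {name: depends_on.get(name, set()) for name in tasks}
    order = TopologicalSorter(graph).static_order()  # prerequisites come first
    return [(name, tasks[name]()) for name in order]
```

Given hypothetical `build`, `test` and `deploy` tasks where `test` depends on `build` and `deploy` depends on `test`, the tasks run in that order no matter how the dictionary is arranged.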
Many of our first questions to experts addressed this same distinction between automation and orchestration, and experts had numerous ways of thinking about it. Ben Schumacher, Innovation Architect at Cisco, agreed with the inherent relationship between automation and orchestration. “We quickly learned that, while there is some separation into what experts and users most closely associate with each label, they are serving essentially the same purpose,” he said. His colleague, Ken Owens, chief technology officer of cloud infrastructure services (CIS), described the two strategies in more detail:
“As you move … to this new container ecosystem, you’re seeing all of that underlying infrastructure becoming infrastructure as code,” Owens said. “And the ecosystem around containers, and Mesos, and then Kubernetes around orchestration and scheduling with Marathon as well, brings in a whole new interesting layer. And from Cisco’s standpoint, we’re very interested in not just what’s happening in that layer from a cloud-native development standpoint, but we’re also interested in what are the enterprise-like feature sets around quality, and around workload and cluster management, to grant the granularity that our customers require. And how do we enhance that with better networking capability, better service discovery and service management capability, and better security capability?”
Creating New Architectures and Pipelines
Orchestration helps to complete the end-to-end DevOps pipeline in many ways. For the developer, it starts with the local environment on their laptop. But for the platform to work, it necessitates automation of the entire infrastructure.
Mesosphere’s Datacenter Operating System (DCOS) abstracts the resources of an entire cluster of machines and makes them available to the developer like one big box, said Michael Hausenblas, a data center application architect and DevOps advocate at Mesosphere. Frameworks run on DCOS.
“For many people out there, I think the main thing, really, is how can you realize this next step after Docker build, Docker run? How to really put the containers into production and keep it running and fulfill these AppOps [application operations] tasks? Part of it is naturally the CI/CD pipeline. Again, for us, that’s the thing where Mesos is kind of unique and able to run all kinds of different, what we call, frameworks.”
For example, Jenkins may cover the CI/CD pipeline, Cassandra, Kafka and Spark may handle analytics, and an NGINX web server may serve the website. All of these different applications and frameworks in your entire life cycle can run together in one cluster.
Container orchestration will mostly require developers to just start making new projects, tools, and solutions, Hausenblas said. How people learn to manage container orchestration will, in turn, transform how they think about automated environments.
Containers make for a realistic mechanism to build these new architectures, but with implementation comes a need for new tooling and self-healing to manage how the systems work across distributed platforms.
“To me, an orchestration platform is a platform that can orchestrate multiple other system tools, orchestration engines — I would be talking about specific schedulers at that point, things like Zookeeper and Mesos and Marathon,” Owens said. “Things that are kind of the end point to what you’re trying to orchestrate, plus configuration management systems, plus automation tools, tooling systems or platforms. So it’s kind of like orchestrator of orchestrators, orchestrator of runtime and configuration management tools or toolsets.”
And there are so many tools still needed. Yelp, for instance, discovered issues with “zombie processes” spawned by signaling issues when using containers. To eliminate the signaling issues, Yelp developed dumb-init, an initialization system that runs inside of Docker containers.
The clear need for tooling comes with the assumption of DevOps practices in container orchestration environments. As we have stated in our own research, it becomes apparent that addressing the needs of specific job roles is essential when considering tool environments. In our survey of container users, 58 percent said integrated tools for both application development and IT operations are extremely important.
Discovering What Tools Are Needed
The tooling question will take some time to define as the use cases for container orchestration are just emerging. And due to the immaturity of the platforms, many expert users are just now discovering what tools are needed for these distributed platforms to operate efficiently.
Cluster management for scaled-out container orchestration can still be rudimentary. Its immaturity means that issues, such as prioritization of workloads, are yet to be completely defined.
Definitions for cluster management require policies for who can manage the clusters, and when, where and what mechanisms are needed for it, said Ken Robertson, lead architect at Apcera. These issues can be defined with cluster administration tools, but then there are the problems that come with rolling out and the automated communications between different aspects of the technology stack.
It’s a lot of automation in terms of managing the individual blocks, but also the entire facade.
And, despite the effort to move to application architectures, it’s the resources, the machines making up the clusters, that define how the provisioning is managed.
“A lot of that falls on — what are the machines making up this cluster environment?” Robertson said. “If you’re running on AWS, GCE, or cloud environments, you have APIs to be able to provision more machines. In some case, say you provision machines with a lot of CPU and RAM, but are talking about hardware mapping, you have something that can only be consumed by one thing at a time.”
This means the user will have a limit on how many of these workloads they can run at a time. The operator will need visibility into the limitation around that resource and the ability to provision more if needed. But the resources involved in provisioning aren’t infinite.
“How do you turn on more, and how do you — in some cases — procure more of them where there might be a few weeks lead time to get them?” Robertson asked.
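The limit Robertson describes, a resource that only one workload can hold at a time, can be sketched as a simple placement check in Python. All names here (`Node`, `Cluster`, `place`, `free_devices`) are hypothetical and exist only to illustrate the visibility an operator would need.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    cpus: int
    exclusive_devices: int  # hardware that only one workload at a time may hold
    used_cpus: int = 0
    used_devices: int = 0


@dataclass
class Cluster:
    nodes: List[Node]

    def place(self, cpus: int, needs_device: bool = False) -> Optional[str]:
        """Return the name of a node that fits the workload, or None if none does."""
        for node in self.nodes:
            fits_cpu = node.cpus - node.used_cpus >= cpus
            fits_device = (not needs_device) or node.used_devices < node.exclusive_devices
            if fits_cpu and fits_device:
                node.used_cpus += cpus
                node.used_devices += int(needs_device)
                return node.name
        return None  # the operator must provision or procure more capacity

    def free_devices(self) -> int:
        """Visibility for the operator: exclusive devices still unclaimed."""
        return sum(n.exclusive_devices - n.used_devices for n in self.nodes)
```

In this sketch a node with plenty of CPU but a single exclusive device accepts one device-bound workload and rejects the next, even though most of its CPU is still free.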
Herein lies the complexity of distributed systems. There are different generations of hardware architecture that have to be managed. And with cluster management, there’s also the interest in managing cost. It may be that a customer wants to be notified if there is a price drop on a service. The input would be fed into the orchestration platform to take advantage of the pricing. The integrations would come through the third-party services, eliminating the need for one end-all interface.
Addressing the Complexity of Scale
The complexities with scale speak to the need for autonomic systems, said Alex Heneveld, co-founder and chief technology officer at Cloudsoft. It’s a concept dreamed of for decades, but with container-based workloads, there’s at least now a discussion point, as humans can’t manage scaled-out architectures and fix them manually. This creates a need for self-healing environments that can break complex tasks into smaller tasks, handled by subsystems that generally do what they need to do.
In autonomic systems, a sensor emits a signal when there’s a problem. A management platform analyzes the sensor readings. Effectors, also described as levers, exist to make changes. It’s this combination of sensors giving metrics, and effectors letting users control things, that is the encapsulation of a system.
“Systems can be managed by another system that itself has sensors and effectors,” Heneveld said. “Within it, it’s doing the monitoring, analyzing, planning and executing to take care of the systems underneath it. But you can almost view it like an org chart or a hierarchy. We have it looser than some very strict theories in this realm, but the idea is you can make very complex systems just by composing simpler systems. If we look at what’s happening around microservices, we’re building up complex systems from simpler systems.”
This hierarchy of systems has similarities to microservices that also build up complex systems from simpler systems.
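The sensor and effector roles described above can be sketched as one pass of a monitor, analyze, plan and execute loop. This Python example is a minimal illustration under assumed names: `mape_step`, the `scale_out` and `scale_in` effector keys, and the load thresholds are all hypothetical.

```python
from typing import Callable, Dict, Optional


def mape_step(
    sensor: Callable[[], float],
    effectors: Dict[str, Callable[[], None]],
    high: float,
    low: float,
) -> Optional[str]:
    """Run one monitor-analyze-plan-execute pass; return the effector used, if any."""
    load = sensor()            # monitor: read the metric from the sensor
    if load > high:            # analyze and plan: choose which lever to pull
        action = "scale_out"
    elif load < low:
        action = "scale_in"
    else:
        return None            # healthy: nothing to change
    effectors[action]()        # execute: pull the lever
    return action
```

Because a manager like this exposes its own readings and levers, it could in turn be wrapped as the sensor and effectors of a higher-level manager, which is the kind of composition the hierarchy above refers to.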
Apache Brooklyn is an open source project that, at its core, is built on the principles of autonomic systems. It allows the user to unpack a complex system to look inside it. The user can keep unpacking the system to look deeper inside, all the way to the plumbing underneath, if need be.
“If I’m lucky and someone is hosting some of my substrate, I don’t need to look that far,” Heneveld said. “If I’ve got a hosted container service, for instance, I’m not interested in the plumbing underneath it. But I do want to know that every one of those containers is doing the right thing. So, the metrics that we get back from each piece becomes absolutely essential to informing that model, allowing us to maintain a healthy system, and correct it, and improve it when it’s needed, whether that’s scaling, or failing over, or DR [disaster recovery]. And in our world, those are implemented as policies, where this ‘monitor, analyze, plan, execute’ logic sits within the autonomic framework.”
For users and vendors caught between a VM-dominated world and a container-based one, this is not so much a problem as a space rich with emerging practices, technologies and solutions. This balance of current forces is bringing the Dev and Ops teams closer together than ever before. And even as the space changes, and views about the role of VMs and containers in automated environments at scale mature, there is still much to be learned about managing virtualized environments. The space is still new for many users and vendors alike, and there are many unknowns, but that just places greater emphasis on the need to manage and orchestrate the various components.
From automation and orchestration to scheduling, cluster management, service discovery, security and more, it all adds up to a change in how we think about automation and orchestration. Embracing this change, and learning to focus on developing best practices, is a larger journey for the entire container ecosystem; a journey fueled by open source software, the need for connectivity, passionate communities of users, and economic factors in the business models of vendors.