The Continuum of Cloud-Native Topologies
Like a new universe, the cloud-native ecosystem has many technologies and projects quickly spinning off and expanding from the initial core of containers. An especially intense area of innovation is in workload deployment and management technologies. While Kubernetes has become the industry standard general purpose container orchestrator, other technologies like serverless attempt to abstract complexity associated with managing hardware and operating systems. The differences between these technologies are often small and nuanced which makes it challenging to understand the benefits and tradeoffs between them. One of the most common questions we hear from customers is around how these different technologies address their scenarios and how to choose between them.
A useful way to think of cloud-native technologies is as a continuum spanning from virtual machine (VM) to containers to serverless. On one end are traditional VMs operated as stateful entities, as we’ve done for over a decade now. On the other are completely stateless, serverless apps that are effectively just bundles of app code without any packaged accompanying operating system (OS) dependencies. In between are things like Docker, AWS’s new Fargate service, Containers as a Service (CaaS) platforms and other technologies that try to provide a different balance between compatibility and isolation on one hand, and agility and density on the other. That balance is the reason for such diversity in the ecosystem. Each technology tries to place the fulcrum at a different point, but the ends of the spectrum are consistent: One end prioritizes familiarity and separation while the other trades off some of those characteristics for increased abstraction and less deployment effort.
There’s a place for all these technologies — they’re different tools with different advantages and tradeoffs and we typically see customers using at least a few of them simultaneously. That heterogeneity is unlikely to change as organizations bring in increasingly more critical workloads into their cloud-native stacks, especially those with deep legacy roots.
As we consider all of these technologies, keep in mind the Cloud Native Computing Foundation’s Charter, which defines cloud-native systems as having the following properties:
(a) Container packaged. Running applications and processes in software containers as an isolated unit of application deployment, and as a mechanism to achieve high levels of resource isolation. Improves overall developer experience, fosters code and component reuse and simplifies operations for cloud-native applications.
(b) Dynamically managed. Actively scheduled and actively managed by a central orchestrating process. Radically improves machine efficiency and resource utilization while reducing the cost associated with maintenance and operations.
(c) Microservices oriented. Loosely coupled with dependencies explicitly described (e.g. through service endpoints). Significantly increases the overall agility and maintainability of applications. The foundation will shape the evolution of the technology to advance the state of the art for application management, and to make the technology ubiquitous and easily available through reliable interfaces.
While it may be surprising to see VMs discussed in the context of cloud native, the reality is that the vast majority of the world’s workloads today run “directly” (non-containerized) in VMs. Most organizations we work with don’t see VMs as a legacy platform to eliminate, nor simply as a dumb host on which to run containers. Rather, they acknowledge that many of their apps have not yet been containerized and that the traditional VM is a still a critical deployment model for them. While a VM not hosting containers doesn’t meet all three attributes of a cloud-native system, it nevertheless can be operated dynamically and can run microservices.
VMs provide the greatest levels of isolation, compatibility, and control in the continuum and are suitable for running nearly any type of workload. Examples of VM technologies include VMware’s vSphere, Microsoft’s Hyper-V and the instances provided by virtually every IaaS cloud provider, such as Amazon’s EC2. VMs are differentiated from “thin VMs” to their right on the continuum because they’re often operated in a stateful manner with little separation between OS, app and data.
Less a distinct technology than a different operating methodology, “thin” VMs are typically the same underlying technology as VMs, but deployed and run in a much more stateless manner. Thin VMs are typically deployed through automation with no human involvement, are operated as fleets rather than individual entities and prioritize separation of OS, app and data. Whereas a VM may store app data on the OS volume, a thin VM would store all data on a separate volume that could easily be reattached to another instance. While thin VMs also lack the container attribute of a cloud-native system, they typically have a stronger emphasis on dynamic management than traditional VMs. Whereas a VM may be set up and configured by a human operator, a thin VM would typically be deployed from a standard image, using automation tools like Puppet, Chef or Ansible, with no human involvement.
Thin VMs are differentiated from VMs to their left on the continuum by the intentional focus on data separation, automation and disposability of any given instance. They’re differentiated from VM integrated containers to their right on the spectrum by a lack of a container runtime. Thin VMs have apps installed directly on their OS file system and executed directly by the host OS kernel without any intermediary runtime.
For some organizations, especially large enterprises, containers provide an attractive app deployment and operational approach but lack sufficient isolation to mix workloads of varying sensitivity levels. Recently discovered hardware flaws like Meltdown and Spectre aside, VMs provide a much stronger degree of isolation but at the cost of increased complexity and management burden. VM-integrated containers, like Kata containers and VMware’s vSphere Integrated Containers, seek to accomplish this by providing a blend of a developer-friendly API and abstraction of app from OS, while hiding the underlying complexities of compatibility and security isolation within the hypervisor.
Basically, these technologies seek to provide VMs without users having to know they’re VMs or having to manage them. Instead, users execute typical container commands like docker run and the underlying platform automatically and invisibly creates a new VM, starts a container runtime within it and executes the command. The end result is that the user has started a container in a separate operating system instance, isolated from all others by a hypervisor. In general, these VM-integrated containers typically run a single container (or set of closely related containers akin to a pod in Kubernetes) within a single VM. VM-integrated containers possess all three cloud-native system attributes and typically don’t even provide manual configuration as an optional deployment approach.
VM-integrated containers are differentiated from thin VMs to their left because they’re explicitly designed to solely run containers and tightly integrate VM provisioning with container runtime actions. They’re differentiated from pure containers to their right on the continuum by the mapping of a single container per OS instance and the integrated workflow used to instantiate both a new VM and the container it hosts via a singular, container-centric flow.
Containers deliver all three cloud-native system characteristics and provide a balanced set of capabilities and tradeoffs across the continuum. Popularized and best known by the Docker project, containers have existed in various forms for many years, and have their roots in technologies like Solaris Zones and BSD Jails. While Docker is a well-known brand, other vendors are adopting its underlying technologies of runc and containerd to create similar but separate solutions.
Containers balance separation (though not as strong as VMs), excellent compatibility with existing apps and a high degree of operational control with good density potential and easy integration into software development flows. Containers can be complex to operate, primarily due to their broad configurability and the wide variety of choice they present to operational teams. Depending on these choices, containers can be either completely stateless, dynamic, and isolated or highly intermingled with the host operating system and stateful, or anywhere in between. This degree of choice is both the greatest strength and the great weakness of containers. In response, the market has created systems to their right on the continuum (such as serverless) to both make them easier to manage at scale and to abstract out some of their complexity by reducing some configurability.
Containers are differentiated from VM-integrated containers to their left by not using a strict 1:1 mapping of container to VM, nor wrapping the provisioning of the underlying host operating system into the container deployment flow. They’re differentiated from Container-as-a-Service platforms to their right by requiring users to be responsible for deployment and operation of all the underlying infrastructure, including not just hardware and VMs but also the maintenance of the host operating systems within each VM.
As containers grew in popularity and diversity of use cases, orchestrators like Kubernetes (and its derivatives like OpenShift), Mesos and Docker Swarm became increasingly important to deploy and operate containers at scale. While abstracting much of the complexity required to deploy and operate large numbers of microservices, composed of many containers and running across many hosts, these orchestrators themselves can be complex to set up and maintain. Additionally, these orchestrators are focused on the container runtime, and do little to assist with the deployment and management of underlying hosts. While sophisticated organizations often use technologies like thin VMs wrapped in automation tooling to address this, even these approaches do not fully unburden the organization from managing the underlying compute, storage and network hardware. CaaS platforms provide all three cloud-native characteristics by default and, while assembled from many more generic components, are highly optimized for container workloads.
Since major public cloud IaaS providers already have extensive investments in lower level automation and deployment, many have chosen to leverage this advantage to build complete platforms for running containers that strive to eliminate management of the underlying hardware and VMs from users. These Container-as-a-Service platforms include Google Container Engine, Azure Kubernetes Service and Amazon’s EC2 Container Service. These solutions combine the container deployment and management capabilities of an orchestrator with their own platform-specific APIs to create and manage VMs. This integration allows users to more easily provision capacity without the need to manage the underlying hardware or virtualization layer. Some of these platforms, such as Google Container Engine, even use thin VMs running container-focused operating systems, like Container-Optimized OS or CoreOS, to further reduce the need to manage the host operating system.
CaaS platforms are differentiated from containers on their left by providing a more comprehensive set of capabilities that abstract the complexities involved with hardware and VM provisioning. They’re differentiated from on-demand containers to their right by typically still enabling users to directly manage the underlying VMs and host OS. For example, in most CaaS deployments, users can SSH directly to a node and run arbitrary tools as a root user to aid in diagnostics or to customize the host OS.
While CaaS platforms simplify the deployment and operation of containers at scale, they still provide users with the ability to manage the underlying host OS and VMs. For some organizations, this flexibility is highly desirable, but in other use cases, it can be an unneeded distraction. Especially for developers, the ability to simply run a container, without any knowledge or configuration of the underlying hosts or VMs can increase development efficiency and agility.
On-demand containers are a set of technologies designed to trade off some of the compatibility and control of CaaS platforms for lessened complexity and ease of deployment. On-demand container platforms include Amazon’s Fargate and Azure Container Instances. On these platforms, users may not have any ability to directly access the host OS and must exclusively use the platform interfaces to deploy and manage their container workloads. These platforms provide all three cloud-native attributes and arguably even require them; it’s typically impractical to not build apps for them as microservices, and the environment can only be managed dynamically and deployed as containers.
On-demand containers are differentiated from CaaS platforms to their left by the lack of support for direct control of the host OS and VMs and the requirement that typical management occurs through platform-specific interfaces. They’re differentiated from serverless on their right because on-demand containers still run normal container images that could be executed on any other container platform. For example, the same image that a user may run directly in a container on their desktop can be run unchanged on a CaaS platform or in an on-demand container. The consistency of an image format as a globally portable package for apps, including all their underlying OS-level dependencies, is a key difference from serverless environments.
While on-demand containers greatly reduce the “surface area” exposed to end users and, thus, the complexity associated with managing them, some users prefer an even simpler way to deploy their apps. Serverless is a class of technologies designed to allow developers to provide only their app code to a service which then instantiates the rest of the stack below it automatically. In serverless apps, the developer only uploads the app package itself, without a full container image or any OS components. The platform dynamically packages it into an image, runs the image in a container and (if needed) instantiates the underlying host OS, VM and hardware required to run them. In a serverless model, users make the most dramatic trade-offs of compatibility and control for the simplest and most efficient deployment and management experience.
Examples of serverless environments include Amazon’s Lambda and Azure Functions. Arguably, many PaaS platforms such as Pivotal Cloud Foundry are also effectively serverless even if they have not historically been marketed as such. While on the surface, serverless may appear to lack the container-specific, cloud-native attribute, containers are extensively used in the underlying implementations, even if those implementations are not exposed to end users directly. Serverless is differentiated from on-demand containers to their left by the complete inability to interact with the underlying host and container runtime, often to the extent of not even having visibility into the software that it runs.
Today, some vendors are already experimenting with combinations of these approaches — like Amazon’s EC2 Kubernetes Service being integrated with their Fargate technology to effectively create an on-demand CaaS platform. Additionally, as users gain more experience running these platforms, the available options may naturally narrow down so only the top tool choices remain on the market.
The trend of “software eating the world” is pushing nearly every organization to invest in software as a competitive differentiator for their business. This is driving great demand for platforms that enable developer agility and operational scale, which has led to a wide variety of choice for cloud-native topologies. Each brings its own set of advantages and tradeoffs — but the decision is not which one to use, but rather which ones. Few organizations will find a single option that’s a great fit for all their needs and instead will find several options, each providing advantages for different workloads and use cases as they change and grow.