Containers: What’s New, What Isn’t, What Matters?

I remember the first time someone excitedly explained Solaris Zones to me, and thinking to myself, “Neato, but why would you want to do that?” At the time I was a software developer with a basic understanding of systems and operations. I understood the ‘what’, but the ‘why’ wasn’t immediately obvious.
That was a decade ago and a lot has happened since then.
While there have been high-profile projects and services using containers (Google contributed cgroups to the Linux kernel in 2007, and notably Heroku, Cloud Foundry and dotCloud have all been built on containers since at least 2011), mainstream IT didn’t get excited about the potential of Linux containers until quite recently. So what changed? To be perfectly blunt, the main difference has been the introduction of Docker (which was originally spawned as a project at dotCloud).
Linux operating system virtualization, or ‘containers’, has been one of the primary tools Google developed to manage and leverage infrastructure. There were already a number of projects that provided tools for managing containers, but Docker provided a ‘how’ that made the technology accessible and compelling for mass adoption. LXC, OpenVZ, Google’s lmctfy, Cloud Foundry’s warden and now Docker’s libcontainer are all projects for managing different aspects of Linux containers. (Docker wrapped LXC until libcontainer was created to unify the interfaces to cgroups, namespaces and any other fiddly bits at the bottom.) Some of these projects have been available for years, so what did Docker change? In my opinion, two big things: convenient defaults and image management, both of which shifted containers from an enabling technology for people with specialized understanding to something the average developer could actually have up and running in a spare hour or two. I’ve been on projects that got great benefit from LXC/OpenVZ and/or artisanal, hand-crafted copy-on-write image management solutions. We never connected the dots to put these together, and ended up with little that was generic or reusable outside of those projects. Docker brought image versioning and Linux containers together, packaged for mass consumption.
…if nothing else, the exuberance for Docker is forcing organizations to revisit the processes and architectures that hold them back
Historical Context
In the old days, servers were relatively expensive and getting the most out of one often meant running a number of unrelated services on the same hardware, which often led to conflicting software dependencies and operational confusion between processes contending for resources. Hypervisors running virtual machines allowed the services to be broken apart into more manageable deployments, but at scale the overhead of virtualization can become conspicuous. Containers provide service isolation without much more overhead than starting any other process. The best way to think of containers is as a single kernel space supporting multiple user spaces that don’t know the others exist (although in practice this isn’t entirely true). For the sake of completeness, projects like OpenMirage and OSv take this a step further, running applications without the overhead of a full multi-user operating system, and one can even imagine a future where some services get implemented in silicon. (This is mostly about the server, but projects like ZeroVM, Qubes OS and Bromium are also worth noting in the evolution of these virtualized stacks, though they have a slightly different focus in practice. Additionally, networking and storage are undergoing a similar evolution.)
These technologies follow similar adoption curves, all predicated on the proliferation of relatively cheap, relatively available and ever faster commodity hardware. Distributions facilitated mass adoption of Linux by packaging the low-level drivers, userland tools and software, which lowered the specialized knowledge required to be successful and provided a better user experience. Hypervisors were pioneered at IBM in the 1960s, but adoption didn’t explode until the late 1990s, when implementations and tooling made virtualization technology accessible. Docker has effectively lowered the barrier to entry for using containers while simultaneously enabling flexible new workflows that produce deployable, indexed images that can be shared and socialized, privately or publicly.
Diving A Little Deeper
Docker brought together image management and Linux container process management with a unified interface. The images are the standardized artifacts of deployment that make the ‘shipping container’ metaphor work. Another notion of ‘container’, which might properly be considered an accident of history based on the LXC (LinuX Containers) project name (as opposed to jails, zones, partitions or something else), is that Docker processes run in cgroups and namespaces. Container in this sense is less about standardizing units and more about ‘containment’. The easiest way to frame how cgroups and namespaces work is that cgroups set limits on access to resources, while namespaces limit visibility. To make this more concrete: by combining namespaces, cgroups and images, processes use an image as their root filesystem with limited access and visibility to things like CPUs, memory, mounts, users, processes and networks, in such a way that they appear to be independent machines. Trying to find documentation for cgroups and namespaces, and reading what can be found, will quickly convince most people that these can be complicated and cumbersome to manipulate directly. LXC wrapped all of this with command line tooling, and many used LXC to great advantage, including Docker before libcontainer. By constraining the options and providing defaults, Docker means the average developer can have a container running a few minutes after installing some packages and, more importantly, any modification to the image becomes a deployable artifact. In a sense, nothing is new, but putting existing things together in a novel way can be a force-multiplying innovation. People who never knew LXC existed, and in some cases have never used Linux, are using boot2docker as part of their software development and deployment process.
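To make the split between cgroups and namespaces a little more concrete, here is a minimal sketch in Go (the language libcontainer is written in) that starts a command inside new UTS and PID namespaces, the ‘visibility’ half of the story. This is only an illustration of the kernel features involved, not how Docker itself does it; it is Linux-only, assumes root privileges, and leaves out cgroups, images and all the other fiddly bits.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: nsdemo <command> [args...]")
		os.Exit(1)
	}

	// Run the requested command (e.g. /bin/sh) as a child process,
	// wired up to our own stdin/stdout/stderr.
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr

	// Ask the kernel to put the child in new UTS (hostname) and PID
	// namespaces. Namespaces limit what the process can *see*; cgroups
	// (not shown here) would limit what it can *use*, e.g. by writing
	// limits under /sys/fs/cgroup.
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID,
	}

	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Run it as root with something like `sudo go run nsdemo.go /bin/sh` (the file name is just for illustration): inside that shell, `echo $$` prints 1 because the shell is the first process in its new PID namespace, and changing the hostname is invisible to the rest of the machine. Add a chroot into an image, mount, network and user namespaces, and cgroup limits, and you are most of the way to what the projects above automate.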
Docker brought together image management and Linux container process management with a unified interface.
Where Do We Go Now?
Containers are finally going to be adopted by mainstream IT (all the Solaris and BSD admins are nodding and saying ‘finally’ under their breath), but I have a personal wager that the preponderance of containers will run inside VMs for the foreseeable future, if not forever. This is especially true in the context of cloud computing, but even behind the firewall most organizations will gladly pay the ‘virtualization tax’ in exchange for the mature tooling and security VMs currently provide. While VMs have stronger resource and security isolation, using containers to slice and dice capacity inside VMs provides more flexible, granular control over resources (which can be especially beneficial when comparing per-GB public cloud pricing across different configurations and sizes). There is also a portability issue. A hypervisor can run any operating system that would work on the presented machine architecture, but containers are dependent on particular kernels. Docker-style image management and workflows for every platform would be great, but that is not going to make Linux containers run on Windows (or Solaris, or BSD, etc.) or vice versa, and attempts to do so would inevitably start to look suspiciously like a hypervisor (or worse).
Using Docker in particular can also be transformative to workflows when the primary product of work becomes a deployable image that can be put directly into production or serve as the baseline for further collaboration. The operational advantage of being able to deploy full containers in seconds from a single image transforms how applications will be managed going forward. Docker adoption will also drive greater awareness of Linux containers, which is likely to drive kernel innovation in cgroup and namespace features, along with an ecosystem of tools for managing networks, storage and all the other things that need to be handled around containers to do useful work. This is already happening now. If the recent announcement between Docker and Microsoft is any indication, Linux container adoption will also help motivate Microsoft to bring better process isolation and resource limits to its ecosystem. These are all great things.
We are finding better ways of creating software and getting that software onto computers. The pieces Docker brings together may not technically be new, but this novel combination of technology can be enabling, and if nothing else, the exuberance for Docker is forcing organizations to revisit the processes and architectures that hold them back. I’m long on computers, but computers are only as useful as the software running on them. Computers running software is not a zero-sum game. There will be more and more computers of all shapes and sizes: hardware, hypervisors and containers. Most solutions reveal (or create) new problems (opportunities?). Containers don’t solve all our problems, but container adoption represents an inflection point in the evolution of technology, driven by a need for speed and scale.
That should be enough to digest for the moment, and a nice segue into what I hope to talk about next time: solving the problem of going fast and safe at scale.
After grad school, fate brought Andrew Clay Shafer to venture funded startups, where he became fascinated with creating value while optimizing tools and process at the intersection of people and technology. He’s currently doing a tour of duty as senior director of technology at Pivotal but along the way, Andrew co-founded Puppet Labs, was the first VP of Engineering at Cloudscaling, helped with many an event as a core organizer of devopsdays, and has given more than his share of conference presentations on everything from OpenStack to Organizational Learning.