The writing is on the wall: Docker and the container ecosystem are growing like crazy. Developers, operations … the entire breed is on the road to experiment, learn and share as much as possible by embracing the growing wave. But the writing on the wall still misses the big picture: Organizations set apart from unicorns — those enterprises living with a monolith and multiple apps — are still in the early stages of using containers and the ecosystem projects.
It’s not hard to tell why: A non-unicorn will do a mainstream roll-out of a piece of technology (read: production), and other non-unicorns will see that and follow suit. Call it the mirror neuron effect of IT in the enterprise.
Container Adoption Lifecycle
To fully understand this, let’s look more closely at the adoption lifecycle of these non-unicorns. They would mostly have a monolith, or some form of multi-app setup, in their enterprise architecture. Each such app could itself be a monolith. Some non-unicorns would be experimenting with self-contained services, or microservices. Containers typically get advertised internally as micro-VMs by development and operations folks in order to get attention from the business side of things, essentially to get funding to operate them in non-critical parts of the overall architecture.
More likely than not, this adoption of containers follows the topology illustrated below, starting from the bottom:
The foundation of this adoption is building confidence in container technology. This is usually practiced through having the developers evaluate containers for dev and test environments. The biggest hurdle is containerization of the applications. Most of these candidate applications are designed for a non-container environment. Ideally, these applications have to be refactored to make them convenient for containerization. Here are the key ideas around containerizing applications:
- The process begins with identifying base images that are trusted and relatively lightweight. Any such base image must have a history in the form of a Dockerfile that can be reviewed and verified for potential issues or vulnerabilities. The choice between a multi-process and a single-process container is debatable, and is generally weighed on its pros and cons.
- For a container that needs data persistence, like a database, tools like Flocker from ClusterHQ, which allows management of stateful containers through portable datasets, are useful. Alternatively, for the development environment, it would be easy to start with disposable data containers. These disposable containers, especially databases, can host seed data through data scripts. The data scripts load the database with the information needed to test the application. This allows for a quick test bed for running containerized applications that need persistence in dev and test infrastructure.
- A containerized application tier, like a web server, application or database, needs its runtime configuration to be injected from external sources like environment variables, or a centralized infrastructure like a service discovery system. The runtime configuration may include endpoints to other dependent services or environment-specific parameters. For starters, it is easier to change the application so that all such configuration is made available through OS environment variables. A long-term refactoring may include the use of service discovery infrastructure through changes in the application implementation.
- A lot of best practices are on the Web for creating usable Dockerfiles and Docker images. Docker released its security whitepaper, which also addresses key considerations when setting up Docker and creating container images. Learning to build optimized Dockerfiles is essential for development teams to gain confidence in their journey towards containerization.
- In the case of multi-app architectures deployed as independent containers on a single host, port numbers are used as a means to do container discovery. This allows for inter-communication between various tiers of the application.
- One other aspect of containerizing such applications is to have the relevant process run inside the container in the foreground, with its logs routed to standard output. This allows for a convenient lifecycle for application containers, where the termination of the process leads to the termination of the container itself. Also, the logs available on the standard output of these containers can be routed to a log server, like Loggly, or any other logging service that already exists within the infrastructure. These practices are also advocated in the best practices for composing Dockerfiles.
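Two of the ideas above, configuration through environment variables and logging to standard output, can be sketched in Python as follows. This is a minimal illustration rather than a recommended layout: the variable names `APP_DB_URL`, `APP_PORT` and `APP_LOG_LEVEL`, and their defaults, are assumptions made up for the example.

```python
import logging
import os
import sys

# Accepted log level names; anything else falls back to INFO.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def load_config(environ=os.environ):
    """Build the runtime configuration from environment variables."""
    return {
        # Endpoint of a dependent service (hypothetical default for local use).
        "db_url": environ.get("APP_DB_URL", "postgres://localhost:5432/app"),
        "port": int(environ.get("APP_PORT", "8080")),
        "log_level": environ.get("APP_LOG_LEVEL", "INFO").upper(),
    }


def configure_logging(level_name):
    """Route log records to standard output so the container runtime can collect them."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("app")
    logger.handlers = [handler]
    logger.setLevel(LEVELS.get(level_name, logging.INFO))
    return logger


if __name__ == "__main__":
    config = load_config()
    log = configure_logging(config["log_level"])
    log.info("starting on port %d", config["port"])
```

Because nothing here is written to a file inside the container, the same image can move between dev and test hosts with only its environment changing.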
Once the application containerization is completed, a development workflow needs to be created that allows developers to continuously test changes in the application. Ideally, this is done by using something like Docker Compose to orchestrate the creation of the application containers when deploying on a single host.
One of the important infrastructure ingredients in this step of adoption is having a centralized registry of all container images. The container images are stored in this registry along with their metadata, and the registry can also act as a version control system for images. Docker offers the private Registry 2.1 distribution that can be set up internally, or alternatively, services like Docker Hub and Quay.io offer similar capabilities as a SaaS.
The main goal of this foundation layer is to have developers get acquainted with the moving parts of a containerized environment.
The next layer of adoption includes usage of a container orchestration system, like Mesosphere, Kubernetes or Docker Swarm. Orchestration involves deploying the container workload across a pool of hosts, each running a Docker daemon. Such orchestration tools also provide a deployment manifest that exposes a declarative format to capture deployment topology and data. Tools like Mesosphere and Kubernetes also provide fault tolerance and efficient utilization of the participating hosts through clustering techniques. For public cloud users, Google Container Engine and AWS Elastic Container Service offer similar capabilities.
The other participatory ingredient in this layer is the practice of using a bakery.
The term bakery came into prominence due to extensive coverage of the Netflix Cloud infrastructure.
The core idea is relatively simple: the changes made to an application, whether to source code or infrastructure, are all deployed as immutable machine images.
Tools like Aminator from Netflix remain an effective reference implementation of this concept. There are two different schools of thought around the implementation of a bakery.
For some, the output of a bakery is a sealed machine snapshot that remains environment- and infrastructure-agnostic. In this case, the source code, configuration and all artifacts related to the particular service or function are packaged as a build-once-run-anywhere deployable.
The other school of thought addresses a bakery in a slightly different way. This includes having a completely prepared machine image that lacks the source code or configuration. Only at deployment time does the runtime instance (container or VM) decide where and how to fetch the source code and configuration. This allows a certain amount of flexibility in the overall bakery process, and essentially produces parameterized baked images.
The bakery must build standardized base container images that are then used for deployment through the orchestration tool. Usually the bakery can be implemented as a hierarchy of global and local bakeries. The scope of a global bakery is the entire enterprise, while a local bakery exists only at the project level. Each such bakery also co-exists with its own image registry that holds the container images. The following illustrates the overall flow between global and local bakeries:
For example, an Ops staff member working at the level of the global bakery composes standardized base container images that are then pushed to the global image registry. These container images in the global image registry are then used by the local bakery, which remains specific to a particular project. The Ops staff working at the scope of the local bakery can pull these global container images and either customize or extend them to suit the project’s needs. These customized images are then pushed to the local image registry. The development workflow in a particular project uses the local registry specific to that project. In an organization, there is usually a single global bakery and multiple local bakeries.
Bakeries can be extended to include test automation, which is used to test the images being built in the bakery. Using this automation, bakery operators (ideally Ops) can create test cases to ensure the baked images meet the expectations set for container images. These expectations can include verification checks that cover some of the following ideas:
- A container image does not have too many intermediate container images in its history.
- The container image exposes the appropriate ports required from the application container.
- The container image should avoid non-required utilities (otherwise it becomes bloatware).
- The container image must not assume the specifics of a particular Docker host, like the IP address or the operating system.
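A few of these checks can be sketched as a small Python function. This is only an illustration under stated assumptions: it expects a dictionary shaped like one entry of `docker inspect` output for an image (using the `RootFS.Layers`, `Config.ExposedPorts` and `Config.Env` fields), and the layer limit and the list of host-specific variable names are hypothetical values chosen for the example.

```python
def check_image(inspect_data, required_ports, max_layers=20,
                host_specific_env=("HOST_IP", "DOCKER_HOST")):
    """Return a list of human-readable problems found in the image metadata."""
    problems = []
    config = inspect_data.get("Config") or {}

    # Too many layers suggests a long chain of intermediate images.
    layers = (inspect_data.get("RootFS") or {}).get("Layers") or []
    if len(layers) > max_layers:
        problems.append(f"image has {len(layers)} layers, limit is {max_layers}")

    # Every port the application needs must be exposed, e.g. "8080/tcp".
    exposed = set(config.get("ExposedPorts") or {})
    for port in required_ports:
        if port not in exposed:
            problems.append(f"required port {port} is not exposed")

    # Environment entries look like "NAME=value"; flag host-specific names.
    for entry in config.get("Env") or []:
        name = entry.split("=", 1)[0]
        if name.startswith(host_specific_env):
            problems.append(f"environment variable {name} assumes a specific host")

    return problems
```

A bakery test step could run a function like this against each freshly built image and fail the build when the returned list is not empty. Checking for non-required utilities would need access to the image's filesystem and is left out of this sketch.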
The Docker host pool, managed by the orchestration system, could also scale out by adding more Docker hosts. Enterprises could reuse any pre-existing infrastructure automation that they already own to get Docker hosts quickly provisioned and added to the pool. So, essentially, you could automate the provisioning of a new Docker host, which would be configured behind the orchestration engine and start receiving container deployment requests. This is, of course, a no-brainer, considering many non-unicorns may already be doing it in their enterprise when provisioning host machines for various reasons.
The next layer of adoption includes tying the CI and CD pipeline to the orchestration tools and bakery. Organizations would typically reuse their existing CI and CD infrastructure and integrate it with the orchestration tooling. Integration servers, like Jenkins, offer post-build steps that take the ready-to-deploy builds and package them as container images. These images are then pushed to the image registry local to the project. The pipeline ends with triggering the deployment of either the entire application stack (i.e., all related container images for that application), or a partial update to the already-running state of the application. The latter is the most commonly adopted, unless there is an implicit dependency between the built image and other participating running containers. Most orchestration tools offer multi-modal interfaces to integrate with the CI and CD tooling, like RESTful APIs or a friendly CLI. The deployment manifest needs to use the new version of the built container images, and is then dispatched over the interface to the orchestration tools.
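The last step, pointing the deployment manifest at the newly built image, can be sketched in Python. The manifest layout below loosely follows a Marathon-style application definition, and the registry host, application id and tags are hypothetical; the sketch only prepares the payload and leaves the actual dispatch to whatever REST API or CLI the orchestration tool provides.

```python
import copy
import json


def with_new_image(manifest, repository, tag):
    """Return a copy of the manifest that references repository:tag."""
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    updated["container"]["docker"]["image"] = f"{repository}:{tag}"
    return updated


def render(manifest):
    """Serialize the manifest as JSON, ready to be sent by a CI post-build step."""
    return json.dumps(manifest, sort_keys=True)


if __name__ == "__main__":
    current = {
        "id": "/shop/web",
        "instances": 2,
        "container": {
            "type": "DOCKER",
            "docker": {"image": "registry.local/shop/web:41"},
        },
    }
    print(render(with_new_image(current, "registry.local/shop/web", "42")))
```

Keeping the update as a pure function makes it easy to call from an existing CI job and to test without a running cluster.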
Orchestration tools, like Marathon-Mesos of Mesosphere, offer health checks and dependency-aware deployments that can take care of partial updates to the application topology, preventing the need to restart the entire application.
The goal for this layer of adoption is to provide a similar experience to the end developers without sacrificing ease of use and deployment habits. The integration of orchestration tools with the pre-existing toolset and deployment infrastructure offers a similar user experience to the end developers, and thereby increases their confidence in containerized deployments.
If the enterprise has already invested in a logging infrastructure, like Logstash or Flume, it makes sense to extend the same capability into the containerized environment. Thankfully, with the support of various logging drivers in the Docker daemon, as well as projects like Logspout, it is relatively easy to tie in with the running Docker hosts. The logging drivers and Docker logging containers can also route the incoming logs to a centralized logging server for aggregation. Using tools like Elasticsearch and Kibana with a logging server like Logstash offers the convenience of diagnosing issues in logs, as well as a consistent dashboard to view log metrics.
The container ecosystem offers a wealth of monitoring capabilities, so this becomes a decision based on convenience, budget and overall experience. Tools like cAdvisor provide a wealth of information on the state of each container’s utilization, and are a good first step toward monitoring. Popular tools, like Datadog and New Relic, already offer native integration with Docker hosts, and are a good choice for those needing container monitoring. Apart from this, certain orchestration tools, like Mesosphere, offer metrics about running a containerized application on a Mesos cluster through the Marathon API.
In conclusion, this layer bundles up the various missing pieces to put the needed infrastructure in place, and provides a consistent experience for the development team, so they can be ready for continuous delivery.
With all the necessary pieces in place, it is time to move up the layers of adoption. This next adoption step is the critical litmus test for the success of container adoption in the organization. The idea is to bring up the continuous delivery pipeline, allowing developers to rapidly integrate and deploy changes in the application across all environments, right through to production.
One of the challenges of this layer is to be able to maintain consistency of the deployment configuration across multiple environments — from development to production. Across these diverse environments, there are usually variations in the deployment topology, the choice of network configuration and persistent data requirements. These variations offer opportunities to manage the container infrastructure in a more generic way.
The production environment demands strong control over how the container images are built, especially considering the access credentials and keys that are needed to operate the application inside the container. The key is to avoid baking these credentials into the images, instead relying on an external key repository that can be accessed from the runtime instance of the container. Solutions like Vault from HashiCorp and Keywhiz from Square are useful options for implementing key repositories for applications in containers.
The choice of filesystem driver is also an important consideration when rolling out Docker in a production environment. The default filesystem driver, AUFS, is usually not the preferred choice in production, but it is still a good fit for dev and test usage. Newer filesystem drivers, like OverlayFS, are a better fit, but they are still not mainstream and require the latest kernel versions, something that most organizations are still far away from.
Once the non-unicorn is able to get useful experience running containers in production, it creates a domino effect for all other non-containerized applications inside the enterprise. The experience adds a layer of trust and working knowledge regarding operation of containers at scale and with critical workloads. This is essential to enable a larger adoption roadmap within the enterprise. The larger adoption requires a system more along the lines of a managed platform as a service (PaaS), which offers multiple projects the same containerization building blocks as discussed here. Titan from Netflix is one such internal PaaS that received good coverage at DockerCon 2015, and it essentially offers similar capabilities. The conference also included coverage of eBay, which was running a similar managed infrastructure offering on its team’s container platform.
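On the application side, keeping credentials out of the image can look roughly like the following Python sketch. It does not use the Vault or Keywhiz APIs; it simply assumes that some runtime mechanism has either placed the secret in a file whose path is given by a `<NAME>_FILE` environment variable, or set it directly as an environment variable. That naming convention and the function name are assumptions made for illustration.

```python
import os


def read_secret(name, environ=os.environ):
    """Fetch a secret supplied at runtime, never from the image itself."""
    # Prefer a file, so the value does not appear in the process environment.
    file_path = environ.get(f"{name}_FILE")
    if file_path:
        with open(file_path, encoding="utf-8") as handle:
            return handle.read().strip()

    # Fall back to a plain environment variable injected at deploy time.
    value = environ.get(name)
    if value is None:
        raise KeyError(f"secret {name} was not provided at runtime")
    return value
```

Failing loudly when the secret is missing keeps a misconfigured container from starting with an empty or default credential.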
Reaching the Pinnacle
One of the key needs that emerges after a rollout of containers in production is the ability to roll out interesting use cases for the enterprise. This includes the ability to perform a canary release for container instances in production, and the ability to perform auto-scaling based on business events. This is achieved through tools that offer a declarative rollout of changes in the production environment by seamlessly controlling the load balancer layer. Tools like Vamp from Magnetic.IO are an interesting pick: they offer Netflix-style canary releases and SLA-driven auto-scaling, and provide a basis for non-unicorns to implement such use cases. Vamp integrates with an existing orchestration platform, like Kubernetes or Mesos, to offer this functionality. One of the interesting use cases that leverages such capabilities is A/B testing for architectural changes, like the choice of database or a new design. The changed application state, in the form of a new set of containers, is run in production side by side with the older version of the application. This allows for quick feedback to the development team so they can iterate on architectural changes by testing them in production, where it matters the most. For non-unicorns, investing time and effort to build this capability from scratch is an afterthought.
The adoption ladder, which goes from testing containers in dev and test right up to having the capability to auto-scale and perform canary releases for containers, provides a basis for non-unicorn starters. This account is not intended to be prescriptive, but rather a conversation starter for non-unicorns that want to give their best shot at building and operating hyper-agile infrastructure for their enterprise.
With more non-unicorns sharing their stories, and the upcoming DockerCon in Europe, it will become clearer how things are faring with running containers in organizations. The path to containers is not an easy one, hence it requires more than just an experimental mindset to avoid the hype and start testing it inside your enterprise.
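The traffic-splitting idea behind a canary release can be illustrated with a few lines of Python. This is a toy sketch of weighted routing, not the API of Vamp or of any load balancer; the version names and the 90/10 split are assumptions for the example.

```python
import random


def pick_version(weights, rng=random):
    """Choose which application version serves a request, in proportion to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


if __name__ == "__main__":
    # Send roughly 10% of requests to the new set of containers.
    split = {"stable": 90, "canary": 10}
    served = [pick_version(split) for _ in range(1000)]
    print("canary share:", served.count("canary") / len(served))
```

Raising the canary weight in steps, while watching the health of the new containers, is what a declarative rollout tool automates at the load balancer layer.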
CoreOS, Docker and New Relic are sponsors of The New Stack.
Feature image: “wild horse rainbow watering painting, scott richard (2009)” by torbakhopper is licensed under CC BY-ND 2.0.