Running Puppet Inside Docker Containers: Useful Tool or Cool Trick?
On Thursday, Puppet announced that its automation platform can now run as a set of containerized services. Container images for the portable Puppet agent, Puppet Server, and the PuppetDB data warehouse will be available on Docker Hub beginning Thursday.
With these containerized versions of components, Puppet can itself be hosted by Docker or by CoreOS, orchestrated by Kubernetes or by DC/OS, and perhaps automated by… well, by Puppet.
But as Puppet Senior Software Engineer Gareth Rushgrove explained in an interview with The New Stack, microservices architecture is making it feasible for infrastructure services to materialize into existence from within the very containers whose infrastructure those services would seek to automate.
“I would agree with the definition of containers as things that can come and go. But out of a number of containers comes a service,” explained Rushgrove. “And I think Docker, and containers in general, are designed for running services that don’t go away.”
Service from Within
Rushgrove reminded us that services designed to run within containers communicate with the outside world via APIs. Puppet is no exception here: an entire Puppet automation engine, or just the necessary parts of that engine, can be instantiated, can respond to API function calls, and can quit. These container images are all identical; in fact, it’s that image which Puppet is generally releasing Thursday, as part of its Enterprise 2016.2 version.
From the perspective of the resources that rely upon Puppet for provisioning, such as Puppet agents, the fact that there is no persistent, 24/7/365 Puppet server running in a data center doesn’t matter, the senior engineer continued. All that needs to persist is the data governing the active state of the various resources that constitute the data center’s infrastructure. The connections between server and client, after all, are just HTTP. Now that Docker and other container systems enable persistent containers for data, persistent server containers do not need to be instantiated to make Puppet effectively containerized.
The automation code being executed by Puppet would need to be mounted in a persistent container, however, Rushgrove said, along with the data produced by Puppet agents as they “call home.”
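As a rough sketch of that arrangement, a Docker Compose file could pair the server and database images with named volumes, so that the Puppet code and the agents’ report data outlive any individual container. The image names and mount paths below are illustrative assumptions, not Puppet’s documented layout:

```yaml
# Sketch: containerized Puppet Server and PuppetDB with persistent volumes.
# Image names and paths are illustrative, not an official reference.
version: "3"
services:
  puppetserver:
    image: puppet/puppetserver        # containerized Puppet Server (assumed image name)
    ports:
      - "8140:8140"                   # agents "call home" to this HTTPS port
    volumes:
      - code:/etc/puppetlabs/code     # persistent volume holding the automation code
  puppetdb:
    image: puppet/puppetdb            # PuppetDB data warehouse (assumed image name)
    volumes:
      - pdb-data:/opt/puppetlabs/server/data/puppetdb  # agent-reported data persists here

volumes:
  code:      # survives even when the puppetserver container winks out of existence
  pdb-data:
```

The point of the sketch is the volumes block: the service containers themselves remain disposable, while the state they depend on lives in data volumes that Docker keeps across container restarts.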
The new Docker image option is not an effort at replacing the construction of platforms that organizations already have in place, he maintained. In other words, Puppet is not pushing microservices and containerization as a panacea for automating old applications on old infrastructure.
Rather, the option enables Puppet to be run on platforms that couldn’t support it before — for example, Red Hat’s Project Atomic, a minimalized Linux such as CoreOS, Mesosphere Marathon, Kubernetes, or even Docker Swarm. Running Puppet on Swarm, he said, would let Docker’s orchestration tool manage the load balancing and self-allocation aspects.
The goal, so it’s been said, is automating the infrastructure. If the infrastructure of a data center can be automated, then theoretically the applications that infrastructure supports can be made more reliable, and more independent of the support functions of the underlying computer.
“When infrastructure is managed as code,” reads marketing literature published by Puppet (formerly Puppet Labs), “multiple teams can collaborate, and you bring proven agile software development practices to IT: versioning, peer review, automated testing and continuous delivery.”
The reason infrastructure is typically visualized as a “stack” is because there are layers of support, one layer making the next layer feasible. At the top layer of the stack in many models involving containerization is a multitude of containers, most of them intended to run ephemerally, answering requests for services and winking out of existence.
“The advantage there is, these are platforms with operating systems that don’t have a native Puppet agent,” said Rushgrove. On platforms where all running software is in a container, Puppet used to be locked out, making it impossible to gain insight into how resources in those inaccessible environments were running. “By using the packaging from other containers,” he said, “we can bring those into management.
“We see lots of users who basically have an [attitude], ‘Everything runs on my hypervisor.’ And it’s reasonable to assume there are some very forward-thinking, new companies, without this automated legacy problem, where they’ll be saying, ‘All the software we’ll be running is in a container because it standardizes how we release and deploy all of our software. If it’s not in a container, we don’t run it, or we have to package it ourselves.’ In the Puppet community, there’s a lot of prior art around people doing this themselves. I was speaking last week to somebody running Puppet infrastructure on top of Rancher.”
That anecdote was perhaps an “edge case,” Rushgrove admitted. The reason Puppet wants to release this new form of packaging today — instead of waiting until microservices are more prevalent among everyday businesses — is “to get ahead of that, if that becomes the norm — if containers become the de facto unit of software.”
The new Docker images for Puppet are being released as part of a regular refresh of the platform which also, as the company’s senior director for product marketing, Tim Zonca, told The New Stack, improves the platform’s visibility for DevOps engineers and administrators. The key phrase Puppet is pushing is “situational awareness” — a vision of comprehending the active state of any one application or service, relative to the holistic view of the infrastructure hosting it.
It’s not a new phrase; in fact, it’s the goal professed over the past decade of products in the application performance management field — in the Dynatrace and New Relic space. With an agent reporting situational data back to a server for storage and review from a database… is Puppet now competing with APMs?
“As an operator or a DevOps engineer, one of the roles that person has is [to determine], ‘Do we understand the situation?’, ” said Zonca. “Specific to Puppet, there’s two broad categories of work that we’re doing there: One is probably similar to what you see out of the likes of APM vendors, but a different flavor. We’re looking at different things, but it’s that level of fidelity and detail that’s happening around what’s happening on your software. What changes have rolled out successfully? Where do things fault, fail, or get tripped up? What things have changed, and was that change intentional or not?”
The other situational awareness category Zonca brought up is, he said, more basic: a simpler understanding of the inventory of a system — a requirement for any organization that faces compliance issues. With more traditional operating systems and virtualization platforms, simply accessing a full software inventory requires a traversal of access trees, privileges, and rights, with the dexterity that would make any malicious user envious.
Of course, since such processes typically don’t work automatically, the entire process ends up taking place a bit more personally.
“If I want to find out what’s going on, do basic inventory on a machine, I have to figure out who owns that thing,” he said. “Then I have to send that person a note asking about the inventory on a machine. The likelihood that person is actually capable of getting it all back accurately is low. So in a few weeks, I may have an answer that I’m pretty skeptical about.”
We’ve heard this pitch before: Puppet enables a fuller disclosure not just about a server’s specifications, and its operating characteristics and conditions, but about which software resources it operates and who has responsibility for those resources.
Yet as any container developer will affirm, most of this construct assumes a kind of circa-Y2K perspective of the data center, where the client is an operating system, and the user is a security principal seeking access to an application on the server through that client. Puppet as a container image would have a very different view of what a “machine” is. And because that view would be different, it would take a multiplicity of Puppet agents running within containers in tandem, to provide collective snapshots that can be quilted together (to borrow a metaphor from my wife) to produce a clearer picture of the actual infrastructure.
In fact, that is the plan, as Puppet’s Gareth Rushgrove told us.
“Ultimately, Puppet is composed of small parts. And the individual parts are packaged up into containers. In the same way that you might add resilience to a stack outside containers, you might run multiple instances and load-balance them. The nice thing about when you’re within one of the container orchestration systems is that they tend to have those primitives built in. That’s what’s adding value.”
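Those built-in primitives might look like the following Kubernetes sketch, in which a Deployment keeps several Puppet Server replicas alive and a Service load-balances agent traffic across them. The image name and replica count are assumptions for illustration:

```yaml
# Sketch: replicated Puppet Server behind a load-balancing Service.
# Image name and replica count are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: puppetserver
spec:
  replicas: 3                        # the orchestrator keeps three instances running
  selector:
    matchLabels:
      app: puppetserver
  template:
    metadata:
      labels:
        app: puppetserver
    spec:
      containers:
        - name: puppetserver
          image: puppet/puppetserver # assumed image name
          ports:
            - containerPort: 8140
---
apiVersion: v1
kind: Service
metadata:
  name: puppet
spec:
  selector:
    app: puppetserver
  ports:
    - port: 8140                     # agents connect here; the Service spreads the load
```

If one replica dies, the Deployment replaces it and the Service routes around the gap — exactly the resilience that, outside an orchestrator, an operator would have to build by hand with load balancers and process supervisors.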
Docker is a sponsor of The New Stack.
Feature image: The F. W. Woolworth building under construction in 1912 from the U.S. Library of Congress, in the public domain.