Gareth Rushgrove is an experienced software and operations engineer, working as a product manager at Docker. Additionally, he’s the editor of one of the most widely read newsletters in the DevOps community, “DevOps Weekly.”
In this interview, I spoke with Rushgrove about his current work at Docker, the Cloud Native Application Bundle (CNAB), the importance of CI/CD pipelines, and his DevOps trend predictions for 2019 (which mostly focus on costs).
Your job title is product manager at Docker. That puts you (and your team) directly behind the success of a product that is now the cloud software industry standard. Could you let us know what your regular day at work looks like, and what you are focusing on in terms of where Docker is heading?
I’m formally the product lead for what we call the developer solutions group at Docker. I focus predominantly on the developer uses of Docker. I don’t think there’s a normal week or day. I do everything from working with the engineering teams to make sure they’re understanding what it is they’re working on, to talking to customers a lot, because ultimately that’s the best source of feedback about how to improve what we’re doing, as well as talking to the broader Docker user community.
Obviously, we’ve got customers of Docker Enterprise, but we’ve got a whole long tail of other people using Docker for all sorts of different things. So, it’s spending time at events like this Configuration Management Camp 2019 in Ghent, near Brussels, and meeting that crowd of people. In terms of the things I’m actually working on, part of it is launching Docker Desktop Enterprise, a commercial version of the Docker Desktop products that lots of people use. Docker for Mac and Windows are super-popular — they have a couple million monthly active users. A lot of that work is classic product planning and looking at business cases. On the other side of that is the work that I and my colleagues are doing around CNAB (Cloud Native Application Bundle) — a new specification for a meta-package format for API-driven infrastructure.
Why do you think CNAB is worth developing (DX and business-wise)? Isn’t it just reinventing the wheel and adding an additional layer to the already complicated world of cloud native tooling?
CNAB came about through a couple of main routes. One of them was what we were doing internally on a tool we call Docker App. That was all really around sharing and reuse of Docker Compose-based applications. Docker Compose is super-popular, and there are hundreds of thousands of Compose files public on GitHub, and undoubtedly a lot more elsewhere. But a Compose file is just a source-based artifact, so it doesn’t have a great reuse story. You reuse things like that via copy-paste, and that’s true of lots of other things too — CloudFormation templates and code in general. So let’s have a packaged version of that! A Dockerfile is a source-based thing that you can totally share, but most people actually use the images built from it. Puppet has Puppet modules, and it’s the Puppet modules people reuse; Terraform has Terraform modules and a registry. That pattern of there being a package is really common, and we were working on Docker App to bring that to Compose. At the same time, a bunch of the folks at Microsoft (actually some of the original people who created Helm) were looking at that general pattern of packaging. In particular, they were looking at it from the point of view of multiple toolchains. People might be using various resource management templates along with Helm, so rather than saying you have multiple packaging standards, why can’t we have an approach that can deal with multiple tools?
We got together and combined the best ideas of both, of what we were doing and what they were doing, and that resulted in CNAB. For us, it’s the implementation details and tools we’re building, but actually, it’s also a lot more generally applicable. There are other tools that we and anyone else can then build on top. But it’s not trying to necessarily replace something, and it’s also not trying to solve all the problems. It’s first and foremost a specification that was announced at DockerCon and Microsoft Connect() in December 2018. We kicked off our first community meeting last week, and there will be another one tomorrow, so it’s all very new.
It’s also not intended to be a large consumer-type affair. It’s not Docker, it’s not Helm, it’s not Terraform. It’s ultimately for people who are building those types of tools, including tools that don’t exist yet. Actually, if we all agree on these fairly low-level bits — a schema for some metadata, a file system layout, an execution model, and then a distribution model, none of which is in any way really differentiating — it’s all about the agreement at that point. If you look at previous sets of tools (Chef, Puppet, and Terraform being good examples), all of them have some metadata that you create, all of them have a prescribed file system layout for the package and some on-disk representation, but no one who uses any of those tools can properly tell you what any of those are. If you are using several of those tools, you end up with multiple registries and you can’t store everything in one place, so you end up paying the cost for something that no one really cares about. CNAB is all about reaching agreement at a level where we can have shared tooling.
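To make the "schema for some metadata" concrete, here is a minimal sketch in Python of the kind of bundle descriptor CNAB defines in its bundle.json file. The field names (`schemaVersion`, `invocationImages`, and so on) follow the early CNAB working draft, but the spec was brand new at the time of this interview, so treat the exact shape as illustrative rather than normative:

```python
import json

# Illustrative sketch of a CNAB-style bundle descriptor (bundle.json).
# Field names follow the early CNAB working draft; exact shape may differ.
bundle = {
    "schemaVersion": "v1.0.0-WD",  # the spec was a working draft at the time
    "name": "example/helloworld",
    "version": "0.1.0",
    "invocationImages": [
        {
            # The invocation image knows how to install the application,
            # regardless of toolchain (Compose, Helm, Terraform, ...).
            "imageType": "docker",
            "image": "example/helloworld-cnab:0.1.0",
        }
    ],
    "parameters": {},   # user-tunable installation inputs
    "credentials": {},  # secrets the installer needs at runtime
}

def required_fields_present(b):
    """Check the minimal agreement Rushgrove describes: a metadata schema
    plus an execution model (at least one invocation image)."""
    return (
        all(k in b for k in ("schemaVersion", "name", "version"))
        and len(b.get("invocationImages", [])) > 0
    )

print(required_fields_present(bundle))
print(json.dumps(bundle, indent=2))
```

The point of the exercise is exactly what Rushgrove argues: none of these fields is differentiating on its own; the value comes from every tool agreeing on them so registries and installers can be shared.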
In one of your talks about cloud native developer workflows, you argue that “continuous integration is a fundamental practice of modern software development and (…) this is the thing you invest in before you get good at everything else.” Could you tell us more?
CI acts as the glue between a lot of the other tooling that is out there. It’s not really possible to embrace infrastructure as code with whatever tools you happen to be using without adopting automation. To a certain degree, CI/CD has become the workflow engine for infrastructure and application deployments.
In your talk from 2017 “In Praise of Slow (Continuous Delivery),” you describe the best practices in continuous delivery (or lack thereof) when it comes to enterprise software scenarios (on-prem, etc.). It’s been two years since that talk — do you think that anything has changed for the better? Or should it change in the first place?
The context of the talk was actually about ultimately adopting continuous integration and continuous delivery patterns in environments where you can’t go faster because of some actual realistic constraint. I think that the large organizations that aren’t going fast are often not constrained by anything physical — they’re constrained by what they believe to be the best approach towards shipping software.
There’s a disconnect that often isn’t visible if you’re thinking of a world of continuous delivery to a single web property, where there’s conceptually only one thing in a steady state (if such a state ever exists). Packaged software is quite different, as is the delivery mechanism by which you generally push it out, and it’s consumed however consumers want to consume it. Large organizations anticipate running their own software and third-party software for as long as they can without necessarily changing anything.
Supporting things for two or three, or five or ten, years is realistic, especially as you get into environments outside of the web property business, which in large organizations is really common. We do some work with people who are deploying devices in oil fields or on cruise ships, or in industrial premises, and you’ve actually got an awful lot of software and hardware that benefits from CI/CD, but then the actual implementation of those patterns doesn’t look like the one-button deploy to a Web property.
It’s similar with measuring and metrics in these environments. If you’re used to a world where Google Analytics is a freely available service that lets you get an amazing amount of information out of a web browser, it’s different with open source software, graphical desktop tools, or tools that live in edge environments. Getting instrumentation and metrics from those is much more custom. In some cases there also isn’t the same connectivity and networking: these systems might be temporarily disconnected, or they might never have a connection to the Internet because of how they run. So how you get data from them, get through the CI/CD cycles, and then evaluate them is really interesting.
You’re curating the DevOps Weekly newsletter, which has tens of thousands of readers, and you’re also reviewing a lot of conference talk proposals — that and your other activities keep you very close to the community and its pains and solutions. If you could try to figure out what 2019 will bring to the infrastructure/DevOps/cloud-native space, what would it be?
One of the trends I actually see a lot more people talking about is cost as a first-class citizen within the infrastructure. The spend has always been there, but it was never really treated as anything more than a side topic. I think with the emergence of serverless to a certain degree (and also better workload scheduling), optimizing for cost is becoming an interesting area.
Exactly! That’s why with Semaphore 2.0 we’ve departed from the traditional flat-rate pricing based on a fixed number of boxes. Instead, it scales according to your team’s actual needs, and you won’t pay when you’re not using it. How would you name this trend of focusing on the costs of infrastructure?
We don’t need more names, but I’ve heard some folks use CostOps. It’s an area where I think we have the technology to do a bunch of new things now. That has the potential again to bring engineers and developers and other partners closer to the business.
I also think having this good cost information will allow for a whole bunch of quite powerful conversations about trade-offs.
That’s very interesting, because it actually might influence the way companies price tools. Not at an early stage, but you know somewhere in the future.
Yes, but I’m much more interested in how the transparency of that information, often at quite a granular level, helps organizations make choices and change over time. Because actually most infrastructure costing is done on a long-term, upfront cost model. It’s still done through a budget process, which might take several years in larger organizations.
Interesting. So, it’s all about the money?
We were always paying for services and paying for our infrastructure, but having the technology to break that down to something as fine-grained as workload-specific costing, and having the tooling to reason about it, opens up a whole bunch of classic DevOps conversations, where people from different areas of the organization get together. It’s similar to the way security became much more of an issue, which would have been my answer a few years ago. People have always cared about security, but the security people weren’t always in the room. I think that’s changing in lots of organizations. Will this bring the finance people into that room, to have much more interesting, reasoned conversations with the technology teams? I think so. Certainly, in some cases, a lot of these things take time to make everyone happy. But when it comes to what’s picking up steam this year, it’s definitely one of the things I’m interested in.
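The workload-specific costing Rushgrove describes can be sketched very simply: given per-workload resource usage and unit prices, break total spend down by workload. The record format and unit prices below are invented for illustration; real cloud billing exports are far richer, but the underlying aggregation is the same:

```python
from collections import defaultdict

# Hypothetical unit prices per resource-hour (illustrative only).
PRICES = {"cpu_hours": 0.04, "gb_hours": 0.005}

# Hypothetical usage records, one row per workload per billing window.
usage = [
    {"workload": "checkout", "cpu_hours": 120.0, "gb_hours": 480.0},
    {"workload": "reporting", "cpu_hours": 30.0, "gb_hours": 2000.0},
    {"workload": "checkout", "cpu_hours": 60.0, "gb_hours": 240.0},
]

def cost_by_workload(records, prices):
    """Aggregate spend per workload so engineering and finance can have
    the trade-off conversations described above."""
    totals = defaultdict(float)
    for rec in records:
        for resource, price in prices.items():
            totals[rec["workload"]] += rec.get(resource, 0.0) * price
    return dict(totals)

print(cost_by_workload(usage, PRICES))
# e.g. the memory-heavy "reporting" workload can cost more than the
# CPU-heavy "checkout" workload — exactly the kind of granular insight
# that brings finance into the room.
```

With numbers like these in hand, "should we reschedule this workload?" becomes a question both teams can answer from the same data.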