The Docker plugin system will be put to the test with a new driver plugin that merges Flocker, the container data volume manager pioneered by ClusterHQ, with a stateful, software-defined storage scheme officially launched last March by a new firm called Hedvig.
The new plugin comes with the relatively quiet release two weeks ago of Docker 1.8, in which developers finalized a system for Docker plugins that appears, according to some of those developers’ reports, to be stable and manageable.
The debate about whether Docker containerization would eventually lead to microservices-based data centers everywhere appeared to be put to rest last June at DockerCon in San Francisco. Stateless microservices may have their place in computing, it’s been generally decided. But interoperability requires communication, and communication requires some kind of shared context that reflects a shared agreement and understanding.
If Hedvig doesn’t yet ring a bell (no, it’s not Harry Potter’s owl), then perhaps its CEO should. While at Facebook, Avinash Lakshman co-engineered the Cassandra column-indexed database system. And before that, while at Amazon, he co-engineered Dynamo, one of the world’s first cloud-based distributed storage systems.
According to Hedvig’s vice president of marketing, Rob Whiteley, speaking with The New Stack, Lakshman “could see a clear inflection point in the market where not only is the commodity hardware that’s necessary for this equation becoming cheaper and more readily available, but customers’ appetite to procure it and deploy their own software on top of it has finally come around. Just five years ago, only the Web giants — the Amazons, Facebooks and Googles — did it.”
When it was first launched, Hedvig referred to its system as a “platform.” It’s really more of a connector, providing distributed applications with holistic access to existing storage through a software-defined route, using whatever metaphor the developer wishes to employ: block, file, or object (iSCSI, NFS, SMB, S3, Swift).
“We don’t want applications or OSes to have to conform to our view of the world,” remarks Whiteley. “We want them to be able to consume storage, and then we just want that storage to be software-defined and elastic in nature.”
The architectural issue for Hedvig was twofold: It could have created a kind of uniform API of its own, a “Tower of Babel” which could connect to any type of storage on the backend using its own singular vocabulary. But then, applications would have to be designed specifically, and perhaps exclusively, for Hedvig. Arguably, that would be nice, but it’s difficult for any company to push a self-serving lexicon as a standard in its own right.
The second side of the problem concerns the nature of containers themselves. No, unlike what their title implies, they’re not completely self-contained. Yet the original idea was for them to be detached, or at least “loosely coupled.” The microservices ideal has been that components of programs should only share the minimum data they require to fulfill their primary functions. That implies they shouldn’t be bound to large databases or storage structures outside of themselves.
In practice, that ideal works for highly mathematical, reusable functions. Yet databases are facts of modern software development. The constructs put in place to execute the microservices ideal for an application that utilizes huge data sets are so bizarre and unwieldy that they threaten to invalidate the ideal as a whole.
One Volume, or Volume One
Docker, Inc.’s tack has been to implement its own Volume API, a generic access language that would forestall any single vendor’s effort to commandeer the volume storage space. Yes, it’s a language of its own, but because it attaches to Docker, it applies to any service that connects to data on the backend. Like many Docker open source projects, it’s a work in progress, but it has progressed significantly.
Hedvig’s tack is to attach to Docker’s Volume API. This way, it’s up to the implementer whether to deploy Hedvig on the backend. Containers can call the Volume API, and be connected to data sources through Hedvig or whatever happens to be there. Developers will be more inclined to use the Volume API for accessing databases or large data stores, and Hedvig will have the opportunity to address a larger market.
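To make the idea of a driver attaching to the Volume API more concrete, here is a minimal sketch in Python. It is not the Hedvig or ClusterHQ driver, whose internals this article does not describe. It assumes the endpoint names of Docker's volume plugin protocol (`/Plugin.Activate`, `/VolumeDriver.Create`, `/VolumeDriver.Mount`, and so on) and simplifies the JSON payloads. The class name, the root directory, and the in-memory bookkeeping are hypothetical stand-ins for whatever a real storage backend would do.

```python
import posixpath

# Hypothetical root directory. A real driver would provision storage on its
# backend rather than hand out paths under a local directory.
VOLUME_ROOT = "/tmp/sketch-volumes"


class SketchVolumeDriver:
    """Toy dispatcher for the volume plugin endpoints Docker is assumed to call."""

    def __init__(self, root=VOLUME_ROOT):
        self.root = root
        self.volumes = {}  # volume name -> host path

    def handle(self, endpoint, payload):
        """Return the JSON-style reply (as a dict) for one plugin request."""
        name = payload.get("Name", "")
        if endpoint == "/Plugin.Activate":
            # Handshake: tell Docker which plugin interface this process implements.
            return {"Implements": ["VolumeDriver"]}
        if endpoint == "/VolumeDriver.Create":
            # Record the volume; creating the same name twice is harmless.
            self.volumes.setdefault(name, posixpath.join(self.root, name))
            return {"Err": ""}
        if endpoint == "/VolumeDriver.Remove":
            self.volumes.pop(name, None)
            return {"Err": ""}
        if endpoint in ("/VolumeDriver.Mount", "/VolumeDriver.Path"):
            if name not in self.volumes:
                return {"Mountpoint": "", "Err": "no such volume: " + name}
            # Docker bind-mounts the returned host path into the container.
            return {"Mountpoint": self.volumes[name], "Err": ""}
        if endpoint == "/VolumeDriver.Unmount":
            return {"Err": ""}
        return {"Err": "unsupported endpoint: " + endpoint}


driver = SketchVolumeDriver()
print(driver.handle("/Plugin.Activate", {}))
print(driver.handle("/VolumeDriver.Create", {"Name": "data"}))
print(driver.handle("/VolumeDriver.Mount", {"Name": "data"}))
```

In a real deployment, a handler like this would sit behind an HTTP server that Docker discovers as a plugin, and a container would request a volume with something along the lines of `docker run --volume-driver=<driver name> -v data:/data ...`. The container itself never learns which backend is serving the volume, which is the point of the arrangement described above.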
For any of this to be the least bit feasible, however, someone had to implement a reliable connection — a way for containers to be extensible and still be containers. This is where ClusterHQ enters the picture.
“The notion of having Docker designed to be pluggable, so that third-party software could connect with Docker and be managed as a first class, supportable part of the Docker infrastructure, was an idea that was already in the heads of the Docker people,” acknowledges Mark Davis, ClusterHQ’s CEO, speaking with The New Stack. “What happened, though, was figuring out how to get there, and making sure it really happened in a way that truly was pluggable, so that a third-party piece of infrastructure can become part of Docker itself, in a runtime, and behave exactly as if it had been shipped directly from Docker.”
Contrary as this may seem to the commonly told story, separate open source projects with the same basic goal have not always converged. Only with the advent of GitHub has the act of “contributing upstream” become common. The story of Docker plugins may become a historical exception.
As Davis tells the story, ClusterHQ CTO Luke Marsden and other third parties found themselves at an impromptu meeting at DockerCon Europe in Amsterdam last December. They managed to pin down Docker, Inc. CTO Solomon Hykes, and pleaded with him for relief. They were all producing their own APIs for the same purpose: reliable plugins. But plugins don’t make sense unless there’s one way to do them (just ask Mozilla). They could develop that one way themselves, but why exclude Docker from a Docker-oriented process?
“That impromptu meeting turned into a multi-hour session,” relates Davis, “getting a handful of people in the core team at Docker, Inc. and a few other folks in the industry around the table for a few hours, to decide what were the requirements, what had to happen? It was a collaborative process to get to the point of designing, implementing, and shipping an experimental version in June, and then more recently [Docker] 1.8.”
“We absolutely contributed to it, but lots of other people did as well, inside of Docker and outside.
“From our point of view,” he continues, “having the kind of cooperation and support that we got from Docker — all of Docker, including from the technical side, from Solomon — was incredibly helpful. And it was a really fruitful exercise, collaborating together, and we’re very glad that Docker did it in conjunction with its partners, rather than going off in a corner and doing it by themselves.”
Hedvig’s Whiteley describes the joint solution between his company and ClusterHQ as an open source driver, not really a co-branded product. The Hedvig platform, it’s important to note, is not open source. For Hedvig’s existing customers, many of whom need to produce stateful applications using Docker for production environments, Hedvig can now suggest a production-ready solution that does not use beta, or otherwise incomplete, components.
“We aren’t pitching this as a joint solution that a customer buys and does,” states Whiteley. “Instead, we’re just saying, if they think Hedvig’s a good storage match for your Docker environment, we point them to this driver, and involve our team to help with the orchestration of the volumes as well.”
It is not the philosophy-laden, metaphor-rich concept of high-minded programming that containerization appeared, at least at first, to bring forth. But it came about through an intense, community-driven cooperation that is actually more like the greater ideal of open source. This time, as one wishes he could tell Chief Dan George, the magic may have worked.
Docker is a sponsor of The New Stack.