Docker Swarm Wins Scaling Benchmark but Don’t Take That as Gospel

How Docker Swarm and Kubernetes compare is a question posed with increasing frequency with the advent of container-based orchestration platforms.
Arguably, the two frameworks are different enough that each will perform better under certain circumstances.
So it should not be too surprising that performance benchmarks have now entered the increasingly competitive container orchestration market, with a Docker-sponsored study finding that Docker Swarm appears to be better at scaling to very large workloads than Google's Kubernetes.
In the study, “Evaluating Container Platforms at Scale,” software engineer Jeff Nickoloff found that Swarm is, on average, five times faster than Kubernetes in terms of container startup time.
The study questions the assumption that while Docker Swarm is a good choice for smaller orchestration workloads, truly large-scale workloads are best handled by Kubernetes, or some other orchestration framework such as Mesos. Recognizing that people may find a Docker-sponsored study to be biased, Nickoloff published all the raw data for the study, encouraging further inspection.
Nickoloff set out to empirically answer the question of how Swarm and Kubernetes perform in real, large clusters. The question is pertinent to, say, businesses that may want to offer large-scale, responsive, container-based services.
To date, there has not been a large-scale study comparing the performance of orchestration tools, at least not any tests that have done feature-by-feature comparisons and provided enough information to be easily reproduced, Nickoloff wrote.
Methods and Results
Nickoloff tested both Kubernetes (v1.2.0-alpha.7) and Swarm (v1.1.3-rc2) in clusters of 1,000 nodes on Amazon Web Services. Both clusters relied on etcd (v2.2.1) for the key-value database. Each node would run 30 containers, for a total workload of 30,000 containers.
The study looked at the times it took both orchestration engines to spin up a new container when their respective clusters were 10 percent, 50 percent, 90 percent, 99 percent, and 100 percent full.
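As a rough illustration of the kind of measurement involved, the sketch below times a batch of container starts and reports percentile latencies. This is not Nickoloff's actual harness; the sample count, test image, and use of the plain `docker run` CLI are illustrative assumptions.

```python
#!/usr/bin/env python3
"""Minimal sketch: time container startups and report percentile latencies.
Assumptions: a reachable Docker endpoint, the `alpine:3.3` image, and a
sample count of 100 -- none of these come from the study itself."""

import statistics
import subprocess
import time

SAMPLES = 100        # number of containers to time (assumption)
IMAGE = "alpine:3.3"  # hypothetical test image (assumption)


def time_one_start() -> float:
    """Start one short-lived container and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(
        ["docker", "run", "--rm", IMAGE, "true"],
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.monotonic() - start


def main() -> None:
    latencies = sorted(time_one_start() for _ in range(SAMPLES))
    for pct in (50, 75, 90, 99):
        # quantiles(n=100) returns the 1st..99th percentile cut points
        value = statistics.quantiles(latencies, n=100)[pct - 1]
        print(f"p{pct}: {value:.2f}s")


if __name__ == "__main__":
    main()
```

In the study itself, such startup measurements were repeated against clusters pre-filled to the utilization levels listed above, so that scheduling latency could be compared as the cluster approached capacity.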
Both Swarm and Kubernetes performed well when their respective clusters were less than half utilized. Between 50 and 90 percent, however, “Kubernetes passes some threshold where performance degrades much quicker than it did” between 10 percent and 50 percent full, Nickoloff noted.
Kubernetes’ completion time at the 50th percentile was 6.45 seconds, while at the 75th percentile it was 28.93 seconds. In contrast, Swarm did not start to suffer until its cluster was 90 percent full.
Overall, Swarm’s 99th-percentile performance is between four and six times faster than Kubernetes’ 10th-percentile performance at all tested levels, Nickoloff found.
“These benchmarks show that the tested operations are faster on Swarm than Kubernetes,” Nickoloff concluded. “We can infer that the reason for the difference is rooted in architecture or algorithm choice.”
Of the two systems, Kubernetes is more architecturally complicated, and so requires more effort to support, Nickoloff added.
Discussion
That Kubernetes is architecturally different is an important point to keep in mind, countered Kubernetes contributor and Google evangelist Kelsey Hightower, in a series of tweets. Kubernetes is more of a framework for distributed systems, he noted.
Kubernetes and Docker Swarm focus on different things. Kubernetes aims to build a complete, all in one, framework for distributed systems.
— Kelsey Hightower (@kelseyhightower) March 9, 2016
“Does Docker Swarm win in a few isolated benchmarks? Yep. Can you really compare the two projects? Right now the answer is no,” Hightower wrote.
Docker Inc., naturally, was pleased with the study’s outcome.
“It’s one thing to scale a cluster to 30,000 containers, and it’s a completely different thing to be able to efficiently manage that environment,” wrote Docker senior technical marketing engineer Mike Coleman, in a blog post. “System responsiveness under load is critical to effective management. In a world where containers may only live for a few minutes, having a significant delay in gathering real-time insight into the state of the environment means you never really know what’s happening in your infrastructure at any particular moment in time.”
“We are not surprised by some of the conclusions in the paper,” wrote Tyler Jewell, CEO of DevOps tool vendor Codenvy, in an e-mail. Codenvy embedded Swarm into its Che on-demand developer workspace software, after evaluating a number of different container orchestration providers, including Kubernetes.
In the evaluation process, Codenvy looked at three essential criteria: latency and speed of container activation, linear container scalability on physical nodes, and a low configuration footprint, Jewell explained.
Swarm excelled in all three categories, he noted. “We can get a custom Codenvy installed into a new account in less than 10 minutes that can scale to support thousands of nodes, or hundreds of thousands of workspace containers.”
“It’s hard to see Swarm and Kubernetes as directly competitive,” Jewell said. “Kubernetes is intended to provide container orchestration in environments with a high degree of governance. That governance control comes with necessarily more complexity that will impact speed, setup and scale. Swarm doesn’t have the same objectives.”
“There are a number of factors to take into account, but we believe enterprises should focus on performance, simplicity and portability,” agreed David Messina, Docker senior vice president of marketing, in a follow-up e-mail. “Asking themselves questions like: How fast can I get containers up and running at scale? How responsive is the system when under load? What’s the learning curve to set up and ongoing burden to maintain? Will my applications seamlessly move from dev to test to production? Will I be locked into a specific data center or cloud environment?”
“I would like to learn more about how much time is being saved in the bigger scheme of things,” wrote The New Stack analyst Lawrence Hecht, in an internal Slack channel. “Plus, I would bet, but don’t know, that tech decisions for orchestration are not being primarily driven by technology criteria.”
Hecht is leading a study for The New Stack to learn more about which orchestration engines are being used and how they are being used (Please feel free to participate here).
The release of the study certainly is timely, given that on Thursday the non-profit Cloud Native Computing Foundation (CNCF) formally assumed control of the Kubernetes project, with Google transferring the software to the Foundation’s Technical Oversight Committee (TOC).
The New Stack Editor-in-Chief Alex Williams contributed to this article.
Docker is a sponsor of The New Stack.