VMworld 2018: Pivotal Container Service and the Long Road to NoOps
One year after VMware’s and Pivotal’s joint introduction of their Pivotal Container Service (PKS), it’s far too soon to say that organizations have “adopted” it. But enterprises are investigating whether the commercial Kubernetes platform, available on Google Cloud Platform as well as on vSphere and IBM Cloud Private, is robust enough to help them move their applications and services off monoliths and into distributed systems. This panel discussion was a public disclosure of what some of these firms were learning.
At the Sound of the Tone
Initially, PKS promised to deliver a continually rolling-updated Kubernetes core into data centers automatically, along with sophisticated infrastructure monitoring that would provide a wealth of behavioral data to both network and security operators.
Since that time, for at least some, the phrase “Kubernetes dial-tone” appears to have become something of a dog whistle, suggesting that IT operations managers were not necessary to its operation, and that PKS was some manner of fully managed service. Indeed, VMware did position PKS as a platform for a managed service.
Then last June, VMware lifted the veil on a completely separate Kubernetes platform, VMware Kubernetes Engine (VKE), which it describes as a “fully managed SaaS-based service running on AWS.” So for a while, PKS was portrayed as a boon to operators, before appearing to hold out promises of pink slips. Then came VKE, which provides the NoOps environment that some operators feared PKS (along with other PaaS platforms) would inevitably become, repositioning the earlier offering somewhere between the two extremes.
“Both PKS and VKE have a shared promise to our customers and to their developers. Our customers are the IT operators, and they’re here to serve their development team,” remarked Paul Fazzone, VMware’s senior vice president and general manager for cloud-native applications, in an interview with The New Stack.
“What that promise is,” Fazzone continued, “is that the Kubernetes dial-tone we provide is 100 percent vanilla, whether you’re operating it on-prem with PKS, or you’re consuming it and operating in a public cloud environment with VKE as a SaaS service. They’re effectively identical in terms of developer interaction.”
The “dial-tone” that Fazzone portrays is like an assertion of the presence of a “common carrier,” for the benefit of developers who effectively consume the service and produce applications. Back in the days when the telephone conjured images of circuit switches, wires, and endless miles of long lines, the dial tone was the only signal the user needed that the service was on, and that it was functional. That dial-tone did not render operators unnecessary. Quite the contrary: It provided service assurance that thousands of network “linemen” (and women) were doing their jobs.
Is this type of service assurance what the early adopters of PKS are bringing to fruition? During day one of VMworld 2018 in Las Vegas, the company hosted a panel of IT representatives from companies either adopting or actively evaluating PKS. It was a surprisingly frank discussion, and although you can expect VMware would not have selected customers who had little respect for PKS, their stories spoke to the challenges of “Day 2 Operations”: integrating the platform into their work habits and processes once it’s installed and fully operational.
When Microsegmentation Is Neither Micro Nor Segmented
Priority Payment Systems is based in Alpharetta, Georgia, a northern suburb of Atlanta. It has been building a containerization platform for the better part of three years, until recently centered around HashiCorp’s Nomad scheduler and Consul service mesh. But as Jeff Levy, the company’s vice president of cloud platforms, told the audience, the firm felt it needed “an enterprise-grade container solution in place.” So it set about relocating the work it had already done to straight Kubernetes and, moreover, toward a microservices architecture that integrated functionality from its big data platform.
Levy’s team has been evaluating PKS as a proof-of-concept project for his company. Because his developers had already made progress towards containerized microservices, they were well into devising container networking with both VXLAN and Weave. Although VMware has used the phrase “plain vanilla” to describe PKS’ Kubernetes implementation, it’s built to be a delivery mechanism for its own NSX network virtualization platform (specifically, with the NSX-T derivative framework for container environments). Theoretically, the system could continue using network overlays meant for “plain vanilla,” though as Levy admitted, there have been issues with trying to do so.
“We have microsegmentation in place already,” said Levy, referring to VMware’s architecture for applying network and security policy to specifically identified workloads. “One of the challenges we have in our current environment is, because we can’t do microsegmentation at the container level, we’re doing it at the virtual machine level. So we have to have these groups of VMs that we’re deploying containers to, that have similar network policies.”
Intentionally using VMs as corrals for containers, according to the security policy that applies to them, takes the “micro” out of “microsegmentation,” as well as the “distributed” out of “distributed systems.” In fact, it undercuts a central benefit of Kubernetes: collecting containers together into pods because they work together. At the conference, VMware CEO Pat Gelsinger officially announced his company’s “Microsegmentation 2.0” initiative, but without the specificity that many were looking for: namely, the option to apply policy at the container or pod level in place of the VM level.
It was Levy who informed the audience it is indeed possible for architects to use NSX-T to apply microsegmentation policies to designated namespaces. “That provides us with the ability to have one large pool of worker nodes,” he said, “to deploy our microservices out to, and then we just use the firewalling to segment the container workloads.”
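For readers who want a concrete picture of namespace-level segmentation, here is a minimal sketch using the standard Kubernetes NetworkPolicy resource. It is not Levy’s configuration, and the panel did not describe how NSX-T rules were actually written; the sketch simply assumes the cluster’s network plugin enforces NetworkPolicy objects. The namespace name “payments” and the policy name are hypothetical.

```python
import json


def same_namespace_only_policy(namespace: str) -> dict:
    """Build a NetworkPolicy manifest that admits ingress only from pods
    in the same namespace. Names here are illustrative assumptions."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-same-namespace", "namespace": namespace},
        "spec": {
            # An empty podSelector applies the policy to every pod
            # in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A bare podSelector under "from" matches pods in this
            # namespace only, so ingress from other namespaces is not allowed.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


if __name__ == "__main__":
    # "payments" is a hypothetical namespace name.
    print(json.dumps(same_namespace_only_policy("payments"), indent=2))
```

The resulting JSON could be applied with kubectl like any other manifest. The point of the sketch is that the policy follows the namespace, not the VM, so worker nodes can stay in one shared pool.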
Binding Namespaces to People Isn’t Safe
Yet that might not be enough segmentation and isolation for the Cloud Labs project at Swisscom AG, Switzerland’s premier telco. Stephan Massalt, the vice president in charge of Swisscom Cloud Labs, told the audience his firm provides cloud hosting services, especially for that country’s financial institutions.
“With the security requirements that these customers have today,” said Massalt, “we have seen that this separation of namespace — even though there are solutions in the market claiming they can do it — simply is not safe enough. There’s still a lot of issues that you will have, especially when you’re working with Kubernetes and Docker.”
Acknowledging that VMworld is certainly not a developers’ conference, Massalt remarked that, in his experience, developers only ask for reasonable access to their applications, but they’re lazy in enforcing policy for themselves. So for operators to meet developers’ requests, along with the multitude of security policies they entail, systems end up running developer sessions with containers in encrypted states. Since that encryption is tied, via TLS, to the developers’ active session, the entire node becomes bound to the developer. It’s no longer a multi-tenant node; other customers have to be redirected to other nodes, through policies that require custom Swiss hand-crafted precision.
Imagine an “Occupied” sign from an airplane lavatory bolted onto a Kubernetes node, and you get an idea of what Massalt’s team was working with.
As a service provider, he continued, Swisscom needs to offer multitenancy at the cluster level. Namespaces should be freely created inside those clusters, he advised, in order that customers can freely isolate lines of business (LOB) within their company, without having to spin each LOB into a separate cluster.
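The arrangement Massalt describes, one cluster per customer with a namespace per line of business inside it, can be sketched as a set of standard Kubernetes Namespace manifests. This is only an illustration under stated assumptions: the tenant name, the LOB names, and the “tenant” and “lob” label keys are hypothetical, and nothing here reflects Swisscom’s actual tooling.

```python
import json
import re


def slug(text: str) -> str:
    """Lowercase and hyphenate text so it can be used in a Kubernetes name.
    (Length limits on names are not enforced in this sketch.)"""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def lob_namespace(tenant: str, line_of_business: str) -> dict:
    """Build a Namespace manifest for one line of business of one tenant."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": f"{slug(tenant)}-{slug(line_of_business)}",
            # Hypothetical label keys, useful for selecting namespaces later.
            "labels": {"tenant": slug(tenant), "lob": slug(line_of_business)},
        },
    }


if __name__ == "__main__":
    # Hypothetical customer and lines of business sharing a single cluster.
    for lob in ["Retail Banking", "Wealth Management"]:
        print(json.dumps(lob_namespace("Example Bank", lob), indent=2))
```

Each line of business gets its own namespace inside the customer’s cluster, rather than a cluster of its own.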
Massalt polled the audience, asking how many of them had experience with updating their Kubernetes clusters. No one, in a reasonably full ballroom, raised a hand.
“There’s a reason for this: because it’s a painful process,” he said. It’s why Swisscom had already adopted BOSH as an automated deployment tool for replacing old versions and updating the underlying platform, thus taking care of a large chunk of Day 2 operations.
Is PKS a Different World for Developers and Operators?
I asked Levy, of Priority Payment Systems (PPS), and Swisscom’s Massalt whether their development teams had to make any measurable or meaningful strategy shifts when they began examining PKS as a viable platform for their organizations.
Levy told me that the payment processing monoliths his team had to re-architect were based in .NET on the front end, and SQL Server on the back end. “You’re talking about tens of thousands of lines of code within that monolithic application,” he said. “So it was very challenging for us to be very agile in getting new features to the application, and even doing hotfixes… We are refactoring all of those .NET applications into running as microservices. This one application may now have a few hundred pods, a few hundred microservices.”
There was no switch-throwing ceremony; the legacy environment continued to run as the new architecture was being brought in a piece at a time. Luckily, every business department was on board, including the engineering team, platform team, and software development team (note that these are separate teams). “It still has the same core feature functionality,” he said. “We have thrown in a lot of new functionality, but leveraging various distributed database systems as well, getting away from SQL [Server], where you had the inherent locking and blocking, and long-running queries that are consuming resources, moving to these distributed database platforms where you don’t have that anymore. The application itself is just that much faster and more responsive, being able to deploy microservices or containers pretty much any time during the day, not having to wait until 10 o’clock at night just to deploy a hotfix.”
“You get to a point with PKS where you really don’t care about virtual machines anymore, because BOSH takes care of it,” Levy said.
Massalt described an effort where architects systematically deconstructed Swisscom’s existing platform into microservices, being careful to be completely vendor-agnostic in the process. Once you eliminate vendor polarization as a factor in re-architecture, he advised, it’s perfectly feasible to build a microservices-based application for Kubernetes out of a mainframe application — maybe not in a single container, but within a distributed environment nonetheless.
But the re-architecture can be focused on Kubernetes in general, said Massalt, not on PKS in particular. “That is really all about running microservices on these cloud native principles. PKS just makes Day 2 operations, keeping it up and running, way easier than what we have seen so far with other solutions.”
Both Massalt’s and Levy’s experiences spoke of early attempts to retrofit two relatively new technologies to work in older and more predictable ways — PPS by dumping containers sharing security policies into the same VMs, Swisscom by encrypting container sessions in such a way that the node hosting the container can’t be used by anyone else at the same time. In both situations, the new technologies were providing the biggest challenges, because they were jointly incompatible with the way their organizations expected them to work.
It’s not so much that VMware PKS provides the solution to these dilemmas, but rather that it doesn’t interfere with Kubernetes providing that solution. In a way, that’s even better news.
Pivotal and VMware are sponsors of The New Stack.
Images by Scott M Fulton III.