Juniper Networks: Should Workloads and Infrastructure Be Managed with the Same Platform?
“The problem with OpenStack has been that it has focused too much on the infrastructure,” declared Chandan Dutta Chowdhury, an engineer with Juniper Networks, at the outset of his Tuesday morning session at OpenStack Summit in Boston a few weeks ago.
It’s like saying the problem with Shakespeare was his intense attention to language. Yet Chowdhury does cut to the heart of the main debate among OpenStack practitioners, in enterprises, the public sector, and smaller businesses: Should the job of managing workloads and that of managing infrastructure be the same job?
“If you look at applications,” Chowdhury continued, “applications are more tuned towards thinking about what kinds of resources they’ll have. They think in terms of memory, CPU usage, storage, network bandwidth allocation… while all of these features are actually controlled by the infrastructure provider. OpenStack can provide you APIs for controlling all this infrastructure. So we see there is a gap between what the application is actually wanting to use, and what OpenStack is providing.”
Enter App Infra
An application is capable of requesting resources from the infrastructure layer, and Chowdhury acknowledges the presence of a reliable API. But that’s not the same as communication, he argued. Suppose an application is capable of dynamically requesting resources from OpenStack as it needs them. This application could periodically report on its resource usage, through sets of statistics streamed directly to an OpenStack component.
A solution Chowdhury suggests is a kind of communication component he calls the Application Infrastructure API, or App Infra. Such a system would refocus OpenStack, he said, toward the more application-centric mindset adopted by containerization systems.
“Let the application developer himself request you for more resources, or for what the application finds is lacking, instead of a third party monitoring the application as a process, and asking for those resources,” he suggested.
OpenStack has earned a reputation as a platform for reliably provisioning and managing data center infrastructure. But there’s an ongoing dispute over how easy that job has been. Developers, argues Chowdhury, have the most difficulty engineering their applications to consume resources reliably in a distributed systems infrastructure.
In such an environment, he points out, a running application should be capable of producing a running profile of its consumption. It’s not really an automatic process — not like a process monitor that oversees the application from above, and renders predictions on what will happen next. Rather, Chowdhury’s idea relies on developers making judicious use of the API he envisions to provision the infrastructure resources they need — including in a microservices environment, where multiple instances of an application may co-exist.
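To make the idea concrete, here is a minimal sketch of an application instance reporting its own usage and asking for more memory when it runs close to its limit. Everything in it is an assumption made for illustration: `HypotheticalAppInfra`, `UsageReport`, `report_and_maybe_request`, the 90 percent headroom threshold, and the 256 MB step are invented names and values, and the class is an in-memory stand-in, not an existing OpenStack API or anything Chowdhury presented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UsageReport:
    """One sample of what an application instance says it is consuming."""
    instance_id: str
    cpu_percent: float
    memory_mb: float


@dataclass
class HypotheticalAppInfra:
    """In-memory stand-in for the proposed App Infra endpoint (hypothetical)."""
    memory_limit_mb: Dict[str, float] = field(default_factory=dict)
    reports: List[UsageReport] = field(default_factory=list)

    def report_usage(self, report: UsageReport) -> None:
        # A real service would stream this to an infrastructure component.
        self.reports.append(report)

    def request_memory(self, instance_id: str, extra_mb: float) -> float:
        # Grant the request unconditionally; a real service would apply policy.
        new_limit = self.memory_limit_mb.get(instance_id, 0.0) + extra_mb
        self.memory_limit_mb[instance_id] = new_limit
        return new_limit


def report_and_maybe_request(
    infra: HypotheticalAppInfra,
    instance_id: str,
    cpu_percent: float,
    memory_mb: float,
    headroom: float = 0.9,   # assumed threshold: ask when usage reaches 90% of the limit
    step_mb: float = 256.0,  # assumed increment per request
) -> float:
    """Report current usage, then request more memory if the instance is near its limit.

    Returns the memory limit in effect after the call.
    """
    infra.report_usage(UsageReport(instance_id, cpu_percent, memory_mb))
    limit = infra.memory_limit_mb.get(instance_id, 0.0)
    if memory_mb >= headroom * limit:
        return infra.request_memory(instance_id, step_mb)
    return limit


infra = HypotheticalAppInfra(memory_limit_mb={"web-1": 1024.0})
report_and_maybe_request(infra, "web-1", cpu_percent=40.0, memory_mb=500.0)  # well under the limit
report_and_maybe_request(infra, "web-1", cpu_percent=70.0, memory_mb=950.0)  # near the limit, so it asks
```

Each instance is keyed by its own identifier, so several instances of the same microservice could report and request independently. The decision about when to ask sits in the application's own code, which is the shift Chowdhury describes.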
It’s an idea that would shift the burden of administering resources away from an administrative platform, an outside orchestrator, or a third-party performance monitor, and onto the application itself.
It might be another revolutionary idea from Juniper Networks. That is, if Juniper Networks were behind it.
“There’s not a one-size-fits-all,” stated Scott Sneddon, Juniper’s senior director for software-defined networking (SDN) and cloud, in an interview with The New Stack.
Having been a principal solutions architect with Nuage Networks, and before then the chief solutions architect for Brocade’s Vyatta SDN, Sneddon is one of the world’s most accomplished engineers in software-defined networking. It’s SDN that makes it feasible for virtual components to be virtually networked along a virtual infrastructure. The system that OpenStack thinks it has assembled is hooked together through virtual addresses and overlays made possible by SDN. Like it or not, the network makes the stack possible.
But up until now, the network has been the purview of folks who deal with Juniper, or with companies in its field like Brocade and Cisco, every day. Juniper’s Chowdhury sees an opportunity to expand that purview into the realm of developers. And Juniper’s Sneddon sees the limits to that approach.
“We see this in a gaming developer that wants to take advantage of serverless; we see this in a Netflix that wants to build a massively distributed system that has CDN [content delivery networking] involved,” said Sneddon. “There are different requirements for all of these apps. So that model that he’s describing might be really interesting in an NFV environment, maybe a trading environment, where the performance, latency, and I/O of that specific application is key to the delivery of that application.
“Whereas a Web developer who is developing the back end for the next Words with Friends or Farmville, or something, might have a different requirement for latency and throughput, and might be able to build scale to work around infrastructure limitations. And that’s one of the promises of Kubernetes: Whatever analytics platform I’m using, if it detects that there’s an anomaly in the performance, it can just auto-scale to more servers, assuming I have a limitless supply of resources. Which is the opposite approach to having visibility into the knobs that are available on the infrastructure, and taking advantage of them at runtime.”
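Sneddon doesn’t name a specific mechanism here. One concrete instance of the scale-out approach is Kubernetes’ Horizontal Pod Autoscaler, whose documented core rule sets the desired replica count to the current count multiplied by the ratio of the observed metric to its target, rounded up. The sketch below shows only that rule plus a clamp to minimum and maximum bounds; the function name, parameter names, and default bounds are assumptions for illustration, and the real autoscaler adds tolerances and stabilization behavior omitted here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,    # assumed lower bound for this sketch
    max_replicas: int = 10,   # assumed upper bound for this sketch
) -> int:
    """Scale the replica count by the ratio of observed metric to target, rounded up."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds, since the supply of servers is never truly limitless.
    return max(min_replicas, min(max_replicas, raw))


# Four replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90.0, 60.0))
```

Nothing in this calculation looks at the infrastructure’s own settings. It simply adds or removes replicas, which is the contrast Sneddon draws with tuning the “knobs” at runtime.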
I asked Chowdhury to explain the advantages, as he perceived them, of the App Infra approach to dynamically managing resource consumption as it happens, versus the Kubernetes approach of relying upon an orchestrator. After all, from the very beginning, container architecture was supposed to be about making user-facing code independent from the systems that run it.
“If you want to monitor an application from an external point of view, an external agent can work for a lot of applications,” he responded. “It is very generic. It can look at the attributes of a process, and come up with some values for how much resources the application uses.”
Compare that generic report, Chowdhury continued, with what an application designer is capable of building for herself — what he describes as a dynamic performance matrix — and that generic report will never measure up. But how would such a developer understand the infrastructural requirements, if that infrastructure has historically been managed by someone else? Will she know what to do?
“You can look at it at a different point,” he responded. “One way of looking at it is, yeah, if you are using an API agent, you are handing it off to a third party, who might be specialized in monitoring applications. But if you want to squeeze out the maximum from your platform, use the elastic platform to your advantage, then giving the application developer a way to actually communicate with the infrastructure, gives the application developer a lot of power.”
Is “a lot of power” anything that a network administrator — anywhere in the world — would want an application developer to have?
“I think the developer who wants to be aware of that much detail when they deploy,” said Juniper’s Sneddon, “is the exception to the norm. I don’t think that most developers want to tune.
“I feel that Chandan’s idea here — ultimately, the user of that is probably going to be some A.I., machine learning-type application,” Sneddon said at one point. “You want the infrastructure to give every last widget and knob of information to a machine learning platform, that is then going to abstract that.”
Sure, organizations want choice, admit both Sneddon and Chowdhury. But it’s critically important that they make the right one — and from each of their perspectives, there’s a different outcome. If the infrastructure is open to dynamic provisioning, and OpenStack stays one stack instead of — as has been considered before, and rejected each time — several, then Chowdhury’s forecast will happen: the developer will have the power. But that power may be useless or unused in an organization where the network administrator still has both those skills and that job.
Sneddon perceives a class of customer that will always need more deterministic performance, citing the partnership of Intel and NASDAQ as one use case. He calls that group a “corner class” — occupying a large corner of the market, but a corner nonetheless.
The Risk of Failure
Is there a way to reliably partition system architectures for the classes of customers for which they’re best suited? Not really, explains Jason Venner, vice president of architecture and technical marketing for Juniper Networks.
“I think what we’ll see, actually, is the PaaS layer evolving so that you can get either the direct contact you need by having full control, for those people who need it,” Venner told me during an OpenStack Summit session, “or providing tight guide rails for people who don’t have the time or expertise, or the need for it. The average developer probably doesn’t, but if you’re doing some kind of deep learning application that requires GPU access, you want a lot more control over how things are done, than if you’re doing another ad network.”
The goal of a PaaS, he continued, is to deliver resources efficiently and securely for people and organizations that would rather not manage efficiency or security directly. In a later discussion with me and a handful of other attendees, Venner said he’d observed that organizations that rely on veteran developers to guide their architecture have moved or are moving to Kubernetes. The other side of the market, in his opinion, is a group that just wants the architecture problem solved, done, and out of the way. And that group, he observed, has moved or is moving to Cloud Foundry.
“I’ve been through the cloud journey maybe seven, eight years,” said Venner at one point. “And it’s still evolving. Really, it’s about finding ways to manage your risk while moving faster. I spent a lot of time in the financial sector. I think about a lot of things as risk management. Most of our current practices are about, how do we manage the risk of failure? We build byzantine, complex processes that take inordinate amounts of time, and cost a lot of money, as a way of reducing the risk of catastrophic failure. And what we see is that the lean model of continuous delivery in DevOps provides another way to manage risk, that lets us see the results of what we’re doing much quicker, and reduce the overall cost of failure.”
Put another way, architects have a way of building systems that are so systematic that it seems impossible that they could fail in their entirety, all at once. Simplicity gives one a clearer picture of what the risks are… and perhaps that’s the problem that drives architects into building complexity.
TNS Research Analyst Lawrence Hecht contributed to this article.
Cloud Foundry is a sponsor of The New Stack.