CNCF Adds Google’s gRPC Remote Call Project to Its ‘Big Tent’
The Cloud Native Landscape gives its viewers the impression of a vast garden teeming with a variety of projects, many of them in the same category, all coexisting harmoniously with one another. Yet the Cloud Native Computing Foundation’s move Wednesday morning to acquire stewardship of gRPC — the remote procedure call protocol for distributed systems — from its originators at Google has produced a six-tier assembly of interrelated components for distributed applications that looks a lot more like a stack.
“We deliberately built CNCF to be a tent for cloud-native technologies,” stated Chris Aniszczyk, the CNCF’s chief operating officer, in an interview with The New Stack, “which include not only the orchestration bit — Kubernetes helps us fulfill that bit — but other things around cloud-native bits: storage, coordination, networking, and so on. All these pieces that you need to stitch together to build what we consider a cloud-native solution.”
“Essentially, gRPC is one of those pieces which helps with the microservices aspect of things, and there’s pretty tight integration between gRPC [and] the Kubernetes folks. There’s integration cooking between the open tracing community and gRPC,” Aniszczyk said.
The gRPC protocol is a means for services staged on servers to establish communication with one another, to coordinate, exchange data, and share common data sources. The “RPC” part means what it always has: The remote procedure call has always been the trickiest part of distributed computing to pull off.
Fending Off the Ghost of DCOM
Back before the web, the first attempts at distributed computing on a large scale involved architectures built around sophisticated, centralized resources — or, in lieu of an omnipotent overlord, the replication of centralized resources across all participating servers. In the case of Microsoft’s Distributed Component Object Model (DCOM, its distributed version of Windows’ COM) in the early 1990s, there was a centralized registry, parts of which were shared among servers. It provided unique identities for specific components, and pointers to “type libraries” that catalogued all the functions that each component could perform. Developers invoked these type libraries with special #include macros in their source code, ensuring that all compiled applications were on the same page with one another.
gRPC has much the same objective today as DCOM did, and as the Common Object Request Broker Architecture (CORBA) did before the turn of the century. What’s missing is the element Microsoft called marshaling: an overseer of operations upon which the success of the interaction solely depends. When the gRPC project began in 2015, Google cited this absence as a key advantage, freeing system architects to develop or adopt their own choice of messaging methodology.
“The way I think about it is, there are two systems that need to talk to each other at a core level,” explained Varun Talwar, gRPC’s lead developer at Google (who will continue his work with the project). “They need to know what methods each other supports. So if I’m A, and I call B’s method, I need to know what B’s method is. I need to know what arguments B’s methods take, and whatever arguments they are, I need to serialize them and send them over the wire. B should understand them; B should be able to unmarshal them, accept them, and process them.”
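Talwar’s A-and-B exchange can be sketched in plain Python, leaving aside gRPC’s actual Protocol Buffers machinery. Both sides agree on a method signature up front; the caller serializes typed arguments for the wire, and the callee unmarshals and processes them. The names here (`SumRequest`, `serialize`, `handle_sum`) are illustrative, not part of any gRPC API.

```python
import json
from dataclasses import dataclass, asdict

# A and B agree on this contract up front: the method name,
# its argument types, and its return type.
@dataclass
class SumRequest:
    a: int
    b: int

def serialize(req: SumRequest) -> bytes:
    # Caller (A) marshals the arguments for the wire.
    return json.dumps(asdict(req)).encode("utf-8")

def handle_sum(wire_bytes: bytes) -> int:
    # Callee (B) unmarshals, accepts, and processes.
    fields = json.loads(wire_bytes.decode("utf-8"))
    req = SumRequest(**fields)  # raises TypeError if the contract is violated
    return req.a + req.b

result = handle_sum(serialize(SumRequest(a=2, b=3)))
```

In real gRPC, the shared contract lives in a `.proto` file and the serialization is binary Protocol Buffers rather than JSON, but the division of labor is the same.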
In the old days of distributed systems, the type library would have resolved these requests locally. However, the library needed to be installed locally as well. Imagine if every city published its own telephone directory, such that the only way to call someone in that city was to own a copy of it yourself. Eventually, you’d have a wall full of useless phone books… which pretty much describes what happened to the early RPC systems.
gRPC resolves these issues through protocol-driven interaction. As Talwar explained it, the protocol enables services written in any language to negotiate with one another for common ground. Each negotiation leads to a strictly defined contract between components, ensuring that all exchanged data is of the expected type, and that handoffs are properly coordinated.
Issues that a distributed systems developer might normally encounter, such as circuit breaking (preventing cascading failures), flow control, request cancellation, and out-of-order receipt of message sequences (a common occurrence on the network), are all managed under the hood by gRPC, said Talwar. This not only negates the need for a regimented oversight component, he acknowledged, but also the impetus to build a custom message handler with every application — which the first hyperscale applications certainly did.
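The circuit-breaking idea Talwar mentions can be illustrated with a minimal sketch — this is the general pattern, not gRPC’s implementation, and the class and threshold here are hypothetical. After a run of consecutive failures, the breaker trips open and refuses further calls, so a failing downstream service isn’t hammered into a cascading outage.

```python
class CircuitBreaker:
    """Minimal sketch: trip open after `threshold` consecutive failures."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def call(self, fn):
        if self.open:
            # Fail fast instead of hammering a struggling downstream service.
            raise RuntimeError("circuit open: call refused")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # a success resets the count
        return result

breaker = CircuitBreaker(threshold=2)

def flaky():
    raise IOError("downstream unavailable")

# Two consecutive failures trip the breaker open.
for _ in range(2):
    try:
        breaker.call(flaky)
    except IOError:
        pass
```

Production breakers also add a “half-open” state that periodically probes the downstream service to see whether it has recovered; that is omitted here for brevity.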
“As you start to break down bigger, monolith components into microservices,” he remarked, “and you get into a world where each of the components are doing [things] in their own custom way, then overall manageability, uniform policy applications across entire systems, and an overall system view become pretty hard.”
Put another way, when microservices from disparate applications cohabit a platform, and none of these applications can bend their ears, if you will, to the same communications channel, it’s a security issue.
With gRPC, said Talwar, “you don’t have to think about, what would I do if a call gets canceled to me from the user, and what will happen to all the ten [or so] downstream service calls to ten other systems, as a result of that? For flow control, if a deadline occurred on a given request — the deadline was 20 ms, and I processed my part in 5 ms — the rest of the system knows that it has 15 ms to process the rest of it, and if it takes longer, it’s a timeout for that request. Things like that — which I won’t call ‘orchestration,’ but communication- and connection-level policies — are taken care of, and you don’t have to either answer them up-front in a regimented way or figure out your own custom ways in their own individual services.”
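Talwar’s deadline arithmetic can be made concrete: each hop subtracts its own processing time from the request’s remaining budget and propagates what is left downstream. This is a sketch of the idea in plain Python, not gRPC’s deadline API; the function names are illustrative.

```python
def remaining_budget(deadline_ms: float, elapsed_ms: float) -> float:
    """Time left for downstream calls; <= 0 means the request has timed out."""
    return deadline_ms - elapsed_ms

def timed_out(budget_ms: float) -> bool:
    return budget_ms <= 0

# A request arrives with a 20 ms deadline; this service spends 5 ms on
# its own work, so the ten downstream calls inherit a 15 ms budget.
budget = remaining_budget(20.0, 5.0)
```

In actual gRPC, the client attaches the deadline to the call and it propagates automatically through the metadata of each downstream request, so no service has to do this bookkeeping by hand.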
Seal of Approval
So are these goals of manageability, policy application, and single-pane monitoring made more feasible if we stitch together this system — to borrow a phrase from Aniszczyk — around the Kubernetes container orchestration tool at the core?
“An orchestrator is at a different level,” responded Talwar. “But in a network and communications sense, yes… Kubernetes is more for deploying and managing containers at scale. It’s a slightly higher level of abstraction.”
As an example, Talwar posited the creation of a container that hosts a gRPC service. It’s deployed on Kubernetes, but the wire-level communications between the containers in the system are handled by gRPC. Yes, it may be staged on Kubernetes, but the functionality here is completely decoupled.
The purpose of a foundation created under the auspices of the Linux Foundation is to focus on a specific set of issues and how an open source component may address them. That’s how CNCF started, focused on orchestration. But as Aniszczyk reminded me, from the time of its founding, CNCF has had as its stated goal the aim of assembling the open source components that an organization may require to build a container application infrastructure.
“CNCF is really a tent of technologies that are related to bringing cloud-native to the rest of the industry and world,” said Aniszczyk (who serves simultaneously as the executive director for the Open Containers Initiative), “whereas OCI is very narrowly focused and more of a standards body dedicated to the container format and the runtime bits. Things like networking and storage are completely out-of-scope for OCI, and that was done deliberately.”
CNCF does appear to have already assembled a fairly complete stack. Besides gRPC and Kubernetes, there’s Fluentd for amassing disparate data sources into a unified logging layer; Linkerd as the service discovery component that replaces the need for a service registry database in distributed systems; Prometheus as the service monitoring platform (the first project after Kubernetes to join CNCF, last May); and OpenTracing as its workflow propagation management component.
If you add Kubernetes’ CRI-O project to the mix, you bring container dispatch and deployment into the stack. The matter of container creation is left as an exercise for the developer.
So despite the crowded, shopping-mall-like appearance that the Cloud Native Landscape presents, CNCF’s stack does look more like it has selected its preferred “anchor stores.” Except that its leaders may wish we wouldn’t use that word.
“We wouldn’t word it as ‘preferred,’” said Aniszczyk. “We would word it as one of the options. We could potentially have competing projects within CNCF. We’re not there to be kingmakers per se. We’re there to integrate and bring together projects under one umbrella that push forward this notion of cloud-native computing that companies like Netflix, Google, Twitter, Facebook, and so on have been running within their own systems for quite a long time — and try to bring that to the rest of the industry.”
The Cloud Native Computing Foundation (CNCF) is a sponsor of The New Stack.
Feature image: An engraving based on a painting entitled “The American Circus in France” by Frederic Arthur Bridgman, from the Miriam and Ira D. Wallach Division of Art, Prints and Photographs: Print Collection, of the New York Public Library.