Distributed Graph with GraphQL
In creating Netflix’s advertising-management system Monet, its marketing technology team ran into network bottlenecks when using traditional REST APIs.
Even simple pages called for data from a variety of sources, but the data retrieved was far more than what was required. By instituting a GraphQL layer, the team achieved an 8x performance boost. Because pages can now fetch only the data they need, those that had been retrieving 10MB of data now receive about 200KB.
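To illustrate the idea of fetching only what a page needs, here is a minimal sketch in Python. It is not Netflix's code or a real GraphQL engine: the record, its field names and the `select` helper are all hypothetical, and the selection dictionary only mimics the shape of a GraphQL query.

```python
from typing import Any

# Hypothetical full record, standing in for what a fixed REST endpoint might return.
FULL_AD = {
    "id": "ad-1",
    "title": "Spring launch",
    "creative": {"url": "https://example.com/a.png", "width": 1200, "height": 628},
    "history": [{"rev": 1}, {"rev": 2}],
}


def select(record: Any, selection: dict) -> Any:
    """Return only the fields named in `selection`.

    A value of True marks a leaf field; a nested dict is a sub-selection.
    Lists are handled by applying the selection to each item.
    """
    if isinstance(record, list):
        return [select(item, selection) for item in record]
    result = {}
    for field, sub in selection.items():
        value = record[field]
        result[field] = value if sub is True else select(value, sub)
    return result


# Roughly the shape of the GraphQL query: { ad { title creative { url } } }
selection = {"title": True, "creative": {"url": True}}
trimmed = select(FULL_AD, selection)
print(trimmed)
```

The client names the fields it wants and nothing else comes back, which is where the reduction in payload size comes from.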
The team also found other benefits that helped it add features more quickly, according to a blog post, including reusable abstractions and query wrappers to manage the logic for network requests.
The growing popularity of GraphQL prompted The Linux Foundation last November to set up the GraphQL Foundation to build a vendor-neutral community around the query language for APIs (application programming interfaces). Facebook developed the technology in 2012 and it was open sourced in 2015.
Apollo GraphQL has recently released Apollo Federation, an architecture for composing multiple GraphQL services, that stitches different teams’ parts into the whole, called a graph.
In the past, building a website meant a web server somewhere talking to a database or maybe two databases, explained Geoff Schmidt, CEO of Apollo.
“But we’re moving to a world of apps that’s much more complicated. Many different channels on many different platforms. iOS and Android, maybe Apple Watch and home assistant … It’s not one server talking to one database, they’re stitching together different microservices and databases, maybe third-party APIs. An app might have 100 pieces to it,” he said.
In addition to serving as a front-end for external services, the query language offers an easy interface for functions-as-a-service, enabling more complex business logic for serverless and microservices.
“You have to stitch together all the places that data is coming from in the cloud — all the different microservices and databases, to get the particular combination of data for that screen on that app. You wind up having an N by N problem trying to cover all the combinations of data that your app uses.”
The old way to do this was REST APIs, a 20-year-old technology now, he said.
“It was designed for a very different world. The problem with REST is that it’s point-to-point. It’s based on the idea that you’re going to pre-negotiate every combination of data that you’d need to get from the back end.”
You have to build a ton of boilerplate glue code just to fetch the data for each feature and each screen in your app, he said.
“Not only does it waste developers’ time, but you’re building all this code that’s security-sensitive, performance-sensitive. It’s really hard to maintain,” he added. “Not only are you wasting a lot of time building that code, but you’re spending a lot of time coordinating between the front-end team and the backend team. The front-end team has to beg for favors to get the work done.”
Instead, Apollo promotes the creation of a data graph, a layer that sits between the devices and the cloud and provides a map of all your data and services in the cloud and their interrelationships. GraphQL is the query language for it.
Apollo Federation is designed to replace schema stitching and solve problems such as coordination, separation of concerns and brittle gateway code.
“Beyond 10 or 20 developers, people want to federate their graph. They want to have many teams that are all building their own little part of the graph. Maybe a catalog service keeps track of products, shipping service keeps track of shipping, reviews and recommendation service…,” he said.
Yet, as when shopping on Amazon, you can put products from different suppliers into one shopping cart.
Clients send their requests to Apollo Gateway, which contains a distributed query planner. It breaks a query that could touch multiple services down into the pieces or sub-queries that need to go to each of those services. And it understands the interrelationships between those services. It can execute that query over all those backends, then put all those pieces together into one result for the client.
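The planning step described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Apollo Gateway's actual planner: the three services, the `FIELD_OWNERS` map and the `plan` and `execute` functions are hypothetical, and the sketch handles only flat fields on a single product.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical backends. Each accepts a list of field names and returns their values.
def catalog_service(fields: List[str]) -> dict:
    data = {"name": "Table", "price": 899}
    return {f: data[f] for f in fields}

def shipping_service(fields: List[str]) -> dict:
    data = {"shippingEstimate": 50}
    return {f: data[f] for f in fields}

def reviews_service(fields: List[str]) -> dict:
    data = {"averageRating": 4.5}
    return {f: data[f] for f in fields}

SERVICES: Dict[str, Callable[[List[str]], dict]] = {
    "catalog": catalog_service,
    "shipping": shipping_service,
    "reviews": reviews_service,
}

# Assumed ownership map: which service is responsible for which field.
FIELD_OWNERS = {
    "name": "catalog",
    "price": "catalog",
    "shippingEstimate": "shipping",
    "averageRating": "reviews",
}

def plan(requested: List[str]) -> Dict[str, List[str]]:
    """Group the requested fields into one sub-query per owning service."""
    sub_queries: Dict[str, List[str]] = defaultdict(list)
    for field in requested:
        sub_queries[FIELD_OWNERS[field]].append(field)
    return dict(sub_queries)

def execute(requested: List[str]) -> dict:
    """Run each sub-query against its service and merge the pieces into one result."""
    merged: dict = {}
    for service_name, fields in plan(requested).items():
        merged.update(SERVICES[service_name](fields))
    # Return fields in the order the client asked for them.
    return {field: merged[field] for field in requested}

print(plan(["name", "shippingEstimate", "price"]))
print(execute(["name", "shippingEstimate", "price"]))
```

The client sends one request and gets one result; only the gateway needs to know that two services were involved.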
It’s similar to the way a SQL query planner combines data from multiple tables, but rather than pulling data from multiple disk drives, it’s pulling data from multiple microservices and the multiple teams that built those microservices.
Federation is based on these core principles:
- Building a graph should be declarative. You compose a graph declaratively from within your schema instead of writing imperative schema stitching code.
- Code should be separated by concern, not by types.
- The graph should be simple for clients to consume.
For complex environments, federation includes features like user-specified primary keys, explicit query plans, and flexible ways to use denormalized data across service boundaries when available.
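A rough Python sketch can show how declarative composition and primary keys fit together. In Apollo Federation the key is declared in the GraphQL schema itself, with the `@key` directive; the plain dictionaries, sample rows and the `compose` and `resolve` functions below are hypothetical stand-ins for illustration only.

```python
# Each hypothetical service declares, as plain data, the type it contributes to,
# the primary key it uses for that type, and the fields it owns.
CATALOG = {
    "type": "Product",
    "key": "upc",
    "fields": ["name", "price"],
    "rows": {"1": {"name": "Table", "price": 899}, "2": {"name": "Couch", "price": 1299}},
}
REVIEWS = {
    "type": "Product",
    "key": "upc",
    "fields": ["reviews"],
    "rows": {"1": {"reviews": ["Sturdy"]}, "2": {"reviews": []}},
}

def compose(*services: dict) -> dict:
    """Merge service declarations into one graph: type -> key and field owners."""
    graph: dict = {}
    for svc in services:
        entry = graph.setdefault(svc["type"], {"key": svc["key"], "fields": {}})
        if entry["key"] != svc["key"]:
            raise ValueError(f"conflicting keys for type {svc['type']}")
        for field in svc["fields"]:
            if field in entry["fields"]:
                raise ValueError(f"field {field} is declared by two services")
            entry["fields"][field] = svc
    return graph

def resolve(graph: dict, type_name: str, key_value: str, requested: list) -> dict:
    """Look up one entity by its primary key, pulling each field from its owner."""
    entry = graph[type_name]
    result = {entry["key"]: key_value}
    for field in requested:
        owner = entry["fields"][field]
        result[field] = owner["rows"][key_value][field]
    return result

graph = compose(CATALOG, REVIEWS)
product = resolve(graph, "Product", "1", ["name", "reviews"])
print(product)
```

No imperative stitching code joins the two services here. The shared key in each declaration is what lets the composed graph treat catalog data and review data as one `Product`.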
Early users on a Spectrum discussion had a lot of questions about it, and consultant Ben Awad, in a YouTube video, wonders whether it covers every possible query.
“What does it look like if I need to customize a certain query at the gateway level?” he asks.
Apollo recently had its first million-download week, totaling 25 million downloads in the past year, and just raised $22 million in its first venture capital round.
Meanwhile, FaunaDB recently added a GraphQL API and the Postman API development environment has added support as well in version 7.2.
The Linux Foundation is a sponsor of The New Stack.