
Decoupling Frontends and Backends with GraphQL

15 Apr 2021 7:25am, by

Anant Jhingran
Anant is the founder and CEO of StepZen — a startup with a new approach for simplifying how developers access the data they need to power digital experiences. With a career that spans IBM Fellow, CTO of IBM’s Information Management Division, CTO of Apigee, and product leader at Google Cloud, Anant has spent his career at the forefront of innovation in databases, machine learning, and APIs. At StepZen, Anant is enjoying creating a company that brings his love of these technologies together to simplify, accelerate and scale front-end development.

Frontend developers need to manage data for the apps they are building. For example, they may want to provide the delivery status of a customer’s order. So they need APIs that expose these constructs. Backend systems, on the other hand, typically expose APIs that reflect the shape of the data on the backend. For example, they might return the status of a package given its TrackingID. They are just different concerns. A good API design must acknowledge that and the implementations must code around it. A further advantage of this decoupling is that backends can come and go — a company might switch from using FedEx and UPS, to USPS and DHL — and yet how the frontend APIs are called does not have to change. The delivery status of a customer’s order only depends on the order, not on any particular shipper. The frontend developer doesn’t have to write logic to anticipate every delivery provider.

Simple as it sounds, getting this right is tricky. In this article, we argue that a design using GraphQL as the frontend API makes things easier. We also argue that a few other constructs (interfaces, routing, and declarative specifications) can make this decoupling even easier.

Before we get into more details, I would like to take you through some history (albeit seen through my eyes!).

This Author’s Experiences in Decoupling

Early on in my career, I worked a lot on database research and development — first as a graduate student at UC Berkeley with my advisor Prof. Mike Stonebraker (of Ingres and Postgres fame), and then at IBM on DB2. The world of databases back then was a world of abstract entities called “tables,” with the implementation hidden from the user. And yes, while it was clear that abstractions were good end-user constructs, the people who ran the databases needed core access to the implementations — setting indexes, normalizing and denormalizing, and running statistics to influence query performance. So in the world of databases, abstractions (how users interacted with the system) and implementations (how admins interacted with the system) lived side by side — serving different audiences. This concept of having two views can be called abstraction (as I have done in this paragraph), and it can also be called decoupling of frontends and backends.

A random discussion with Stuart Feldman (of “make” fame), who was my boss for some time at IBM, began with him asserting that everyone thinks abstractions are beautiful, but that if each abstraction makes performance 2x worse, then ten abstractions layered upon each other make it roughly 1,000x worse (2^10 = 1,024). I know I have butchered the profound way in which he said it, but it got me thinking: is the price of abstraction always a degradation in performance?

I next encountered the concept of abstraction at Apigee, in a deep way. Early on, we worked with Dan Jacobson, who was then at Netflix. Dan wrote some profoundly influential thoughts on layering — what he called resource and experience APIs. While he did not address it in terms of abstractions, the concept was the same. APIs that reflected the domain/implementation gave a lot of power to the API user; but on the other hand, perhaps gave too much power. In contrast, “experience APIs” (for the developers building device-specific capabilities) gave the right amount of flexibility and control.

The same concept has been called “backend for frontend,” or BFF for short. On Valentine’s Day, my colleague Brian Rinaldi wrote a blog post on BFF entitled “Learn to Love Your Jamstack BFF.” His central assertion was that even in more modern APIs, like GraphQL, care must be taken to build the right abstractions.

And now, at StepZen, as we release our product that helps developers build GraphQL APIs by abstracting backend concerns, we again find ourselves in conversations about whether GraphQL’s dominant pattern should be an abstraction. To fully understand the power of abstraction (decoupling) in GraphQL, we need to understand APIs and query languages.

APIs

APIs have become the de facto standard for how frontend developers get data from backends. But for this to work, APIs have to exist (duh!), which means someone has to write them. Typically that someone is a team, because writing and maintaining APIs against complex backend systems is a big task. As a result, APIs cannot change frequently.

But an API by itself is not sufficient. An API that returns customer data is incredibly useful. But if a web page displays all the outstanding orders for a logged-in customer, then the page also needs to fetch product and delivery data. So now, someone has to write the combination logic. A good architectural pattern is to avoid coding these combinations directly in the application and instead push them into an API tier (this is the BFF pattern I mentioned earlier). Now you have a two-layer API system: a backend/service-centric API tier and an experience/frontend-centric API tier.
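To make the two tiers concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the three backend stubs (ordersApi, productsApi, deliveryApi), their field names, and the outstandingOrdersFor function are hypothetical, and the calls are synchronous in-memory stand-ins for what would normally be asynchronous network requests.

```typescript
// Hypothetical backend/service-centric APIs, stubbed in memory.
// In a real system each of these would be an asynchronous network call.
interface Order { orderId: string; customerId: string; productId: string; trackingId: string; }
interface Product { productId: string; name: string; }
interface Delivery { trackingId: string; status: string; }

const ordersApi = {
  byCustomer: (customerId: string): Order[] =>
    [
      { orderId: "o1", customerId: "c1", productId: "p1", trackingId: "t1" },
      { orderId: "o2", customerId: "c2", productId: "p2", trackingId: "t2" },
    ].filter((o) => o.customerId === customerId),
};

const productsApi = {
  byId: (productId: string): Product => ({ productId, name: `Product ${productId}` }),
};

const deliveryApi = {
  byTrackingId: (trackingId: string): Delivery => ({ trackingId, status: "IN_TRANSIT" }),
};

// Experience/frontend-centric tier: one call returns exactly what the page needs.
interface OutstandingOrderView { orderId: string; productName: string; deliveryStatus: string; }

function outstandingOrdersFor(customerId: string): OutstandingOrderView[] {
  // The combination logic lives here, not in the application.
  return ordersApi.byCustomer(customerId).map((order) => ({
    orderId: order.orderId,
    productName: productsApi.byId(order.productId).name,
    deliveryStatus: deliveryApi.byTrackingId(order.trackingId).status,
  }));
}
```

The page calls outstandingOrdersFor once and never learns that three backends were involved, which is the point of the second tier.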

So far, so good. But there are more considerations. An API typically “exposes” all bits of information that someone might want. After all, as we discussed, APIs are difficult to change, so it is worthwhile exposing a union of bits that anyone may need. However, this makes the pieces of data that are exposed “provider” centric. The frontend developer needs a customer-centric view, because the customer is the consumer of the application. So an experience/frontend-centric API tier must also convert provider-centric data to customer-centric data.

To give an example, assume that the retailer uses UPS as its shipping company. The API call to UPS returns not only this fragment:

…but a huge amount of other information too, like all other events, “Processing at UPS Facility”, “Arrived at Facility”, “Shipper created a label, UPS has not received the package yet,” and so on. In reality, what the developer needs is these three fields:

And finally, what happens if you switch your provider from UPS to FedEx? Do you need to update all your mobile apps? That sounds like a bad idea. The frontend should have a view that is completely agnostic to the provider.
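The provider-agnostic view can be sketched as a small adapter per shipper. The payload shapes below are invented for illustration and are not the real UPS or FedEx response formats; the three customer-centric field names (status, statusDate, expectedDelivery) are likewise assumptions, chosen only to show the idea.

```typescript
// Customer-centric shape the frontend sees. Field names are assumptions.
interface DeliveryStatus { status: string; statusDate: string; expectedDelivery: string; }

// Hypothetical provider-centric payloads. These are NOT the real UPS or FedEx formats.
interface UpsLikeResponse {
  events: { description: string; timestamp: string }[]; // assumed newest first
  scheduledDelivery: string;
}
interface FedExLikeResponse {
  latestStatus: { text: string; at: string };
  estimatedArrival: string;
}

// An adapter converts one provider's payload into the shared shape.
type Adapter<T> = (raw: T) => DeliveryStatus;

const fromUpsLike: Adapter<UpsLikeResponse> = (raw) => {
  const latest = raw.events[0]; // keep only the most recent event
  return {
    status: latest ? latest.description : "UNKNOWN",
    statusDate: latest ? latest.timestamp : "",
    expectedDelivery: raw.scheduledDelivery,
  };
};

const fromFedExLike: Adapter<FedExLikeResponse> = (raw) => ({
  status: raw.latestStatus.text,
  statusDate: raw.latestStatus.at,
  expectedDelivery: raw.estimatedArrival,
});
```

Switching shippers then means adding or swapping an adapter in the API tier; the shape the mobile apps consume stays the same.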

By now, we have seen that:

  • backend APIs change infrequently
  • a lot of mixing and matching of data needs to happen
  • a customer-centric view is preferred over the provider-centric view
  • backends can come and go

So a frontend API tier must almost always be present in addition to a backend tier. An abstraction layer must exist over the backends.

Query Languages

As I discussed in the first part of this article, query languages are awesome — for all the carping, the concept of “here’s what I want, give it to me, and I do not care how you get it” is pretty powerful. The most common form of this is, of course, SQL. But there are some other excellent examples around us.

Take Google search. It delivers great results, right? The computer science, the PageRank algorithm, the machine learning magic: it all works against a simple two- or three-word query. One of the most significant innovations at Google was to shift the user from being precise in their asks (“Anant within 3 words of Jhingran and Title contains StepZen”) to specifying something as simple as “Anant Jhingran StepZen”, and expecting Google to do the right thing.

In both of these, query languages capture the concept of abstractions very well. One end of it (the user-facing end) communicates in a language that is easy for the user. The other end (the implementation end) communicates in the backend languages (tables, websites, search indexes, whatever).

GraphQL

GraphQL combines the best of APIs and query languages. It is an API because a simple POST returns the data requested. And it is a query language because the user can ask for what she wants (as long as it is permissible in the definition of the GraphQL API endpoint).
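As a rough illustration of the “simple POST,” the sketch below builds the request a client would send. GraphQL over HTTP is commonly a JSON body carrying a query string and optional variables. The helper name buildGraphQLRequest, the selected fields (name, orders, deliveryStatus), and the endpoint URL in the comment are assumptions for this example.

```typescript
// Shape of the options object handed to an HTTP client such as fetch.
interface GraphQLRequestInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: wraps a query and its variables in a JSON POST body.
function buildGraphQLRequest(
  query: string,
  variables: Record<string, unknown> = {},
): GraphQLRequestInit {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

// The caller states what it wants; the selected field names are illustrative.
const customerQuery = `
  query ($id: ID!) {
    customerById(id: $id) {
      name
      orders { deliveryStatus }
    }
  }`;

const request = buildGraphQLRequest(customerQuery, { id: "c1" });

// Where fetch is available, the call could then look like this
// (the URL is a placeholder, not a real endpoint):
//   fetch("https://example.com/graphql", request).then((res) => res.json());
```

The response would contain only the fields named in the query, which is what makes the same endpoint behave like a query language.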

GraphQL has three distinct concepts:

  1. Types (such as Customer, Order, etc.) that the user (frontend developer) interacts with. These types are linked together in a graph — for example, a customer might have orders — hence the name GraphQL. It has an additional abstraction, an interface, that can be used to further hide types. This is particularly useful when there are multiple different implementations.
  2. Queries, such as customerById, which return data of a type. Queries are just entry points into the graph.
  3. Resolvers, which describe the implementation of the queries and generation of the bits of data associated with types. For example, there might be a resolver that says the query customerById can be executed by issuing a SQL statement against a MySQL database, whereas the query orderByCustomer requires a GET against a REST endpoint.
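The three concepts can be sketched together as follows. No GraphQL library is used here; the resolver map only mimics the shape that JavaScript GraphQL servers commonly use, and the final function walks one query by hand. The sqlDb and restApi objects are in-memory stand-ins, and field names such as name and total are assumptions.

```typescript
// Types and query the frontend would see, in GraphQL schema language (for reference):
//   type Customer { id: ID!  name: String!  orders: [Order!]! }
//   type Order    { id: ID!  total: Float! }
//   type Query    { customerById(id: ID!): Customer }

interface Customer { id: string; name: string; }
interface Order { id: string; total: number; }

// Stand-ins for two different backends (both hypothetical).
const sqlDb = {
  // Pretend this runs a SQL SELECT against a MySQL customer table.
  selectCustomer: (id: string): Customer | null =>
    id === "c1" ? { id, name: "Ada" } : null,
};
const restApi = {
  // Pretend this issues a GET against a REST orders endpoint.
  getOrders: (customerId: string): Order[] =>
    customerId === "c1" ? [{ id: "o1", total: 42 }] : [],
};

// Resolvers: each field names the backend that produces it.
const resolvers = {
  Query: {
    customerById: (_parent: unknown, args: { id: string }) =>
      sqlDb.selectCustomer(args.id),
  },
  Customer: {
    orders: (parent: Customer) => restApi.getOrders(parent.id),
  },
};

// Hand-rolled execution of: { customerById(id: ...) { name orders { total } } }
function runExample(id: string) {
  const customer = resolvers.Query.customerById(null, { id });
  if (customer === null) return null;
  const orders = resolvers.Customer.orders(customer);
  return { name: customer.name, orders: orders.map((o) => ({ total: o.total })) };
}
```

The caller of runExample sees a customer with orders; that one part came from SQL and the other from REST is visible only in the resolvers.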

The separation of types (or interfaces) and resolvers (with queries mediating between them, executing through resolvers and returning data conforming to a type) gives GraphQL a wonderful opportunity to do the right decoupling or abstraction. In fact, with interfaces, you get the additional advantage of the best of both worlds: staying at the abstraction when it suffices, and diving into details when it does not. As an example:

This code says that if the result of the weather query comes from AccuWeather, then description is returned as well; otherwise only the interface fields are returned.
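The listing that paragraph describes is not reproduced here. A query of that kind might look like the one in the comment below, followed by a small TypeScript model of what it selects. The city argument, the temp field, and the OtherWeather type are assumptions; only weather, AccuWeather, and description come from the description above.

```typescript
// A query of the kind described, using an inline fragment on a concrete type:
//
//   {
//     weather(city: "Berkeley") {
//       temp
//       ... on AccuWeather { description }
//     }
//   }

// The interface: fields every weather provider is assumed to supply.
interface WeatherBase { temp: number; }

// Concrete types implementing the interface; __typename tells them apart.
interface AccuWeather extends WeatherBase { __typename: "AccuWeather"; description: string; }
interface OtherWeather extends WeatherBase { __typename: "OtherWeather"; }
type Weather = AccuWeather | OtherWeather;

// Mimics what the query above would select from a resolved value.
function selectWeatherFields(w: Weather): { temp: number; description?: string } {
  return w.__typename === "AccuWeather"
    ? { temp: w.temp, description: w.description } // fragment matches: add the detail
    : { temp: w.temp };                            // otherwise interface fields only
}
```

The frontend can stay at the interface for most screens and reach for provider-specific detail only where it adds value.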

But just because the opportunity exists does not mean that it is turned on by default. Imagine a backend MySQL database. At a simple level, it can have a customer table called “legacy_101”, an order table called “purchase_list”, a product table (“gid_index”), and a linking table that connects orders to products (“purch_x_gids”). The column names in these tables can be as meaningless as “cust_id,” “yrs_since_mat,” or whatever. A default GraphQL implementation would create a type for each table, and queries for either all columns or some subset of columns that have indexes on them. You would get gobbledygook that is meaningless for a frontend developer. And don’t get me started on the more unstructured data sources, such as NoSQL or content management systems. Sigh.

But one can use the opportunity to one’s advantage. You can do either of these two:

  • You start with the types and interfaces that the application needs to see, and the queries that the application can make. Then you write resolvers to map the types and the queries to the correct backend data. The challenge is that the resolvers might now become inefficient — every query requires a three-way join? Eek.
  • You start with the introspection of the backends — here’s what they have, and here’s the default type and query system that can be generated from it. Then you pare down and modify and map till you get a type and query system that makes sense for the frontend developer. The challenge is that if you have a new backend, the process breaks down; but when you have only one, this can work reasonably well.
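Either way, the work ends in a mapping between backend names and frontend names. Below is a minimal sketch of such a mapping for the “legacy_101” table. The meaning given to yrs_since_mat (years as a member) is a guess made purely for illustration, and the int_flag_7 column is invented to show a field being pared away.

```typescript
// Row shape as introspection of the "legacy_101" table might report it.
// int_flag_7 is a made-up column that the frontend never needs.
interface Legacy101Row { cust_id: number; yrs_since_mat: number; int_flag_7: string; }

// What the frontend developer should see.
// Reading yrs_since_mat as "years as a member" is an assumption.
interface Customer { id: string; yearsAsMember: number; }

// Declarative mapping: each frontend field states how it derives from the row.
const customerMapping: { [K in keyof Customer]: (row: Legacy101Row) => Customer[K] } = {
  id: (row) => String(row.cust_id),
  yearsAsMember: (row) => row.yrs_since_mat,
};

// Apply the mapping; columns without an entry are simply not exposed.
function toCustomer(row: Legacy101Row): Customer {
  return {
    id: customerMapping.id(row),
    yearsAsMember: customerMapping.yearsAsMember(row),
  };
}
```

Because the mapping is keyed by the frontend type, the compiler flags a missing field if Customer grows, while backend columns can be added or dropped without touching the frontend names.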

In practice, you can combine the two; I call this the Chunnel strategy. At one end (say the north) is the frontend developers’ perspective. On the other end (say the south) are the backends. You dig from both sides, adjusting as you go along, till you have the best of both worlds and meet in the middle!

Good GraphQL endpoints (and the systems used to create them) must balance the abstractions with the implementations. The language is powerful enough to permit that and to make it easy to do. But the developers building a GraphQL API must be cognizant of it and make conscious efforts to do the right thing. Systems and tools that push towards the balance, in the end, will be the best tools for the developers who build and the developers who consume GraphQL APIs. Frontends and backends should (and can) be decoupled, but it does not come for free.

I love abstractions. GraphQL has phenomenal capabilities — including allowing frontend developers to interact using their concepts and for the GraphQL layer to mediate with the backend implementations. Of course, just because that pattern is easy to do does not mean that you should choose to do it, or that the system you use to implement GraphQL makes it easy to do. At StepZen, we believe that these abstractions are powerful for developers and we aim to make it very easy for you to build them.

