GraphQL: Break Free of Backend-for-Frontend Sprawl

The backend-for-frontend (BFF) pattern is a popular way for engineering teams to abstract backend services away from clients. But as your product scales to accommodate more customers and their needs across mobile clients and other platforms, the pattern begins to strain.
Before long, your organization has a case of “BFF sprawl” on its hands. It is entirely possible for an enterprise to find itself maintaining dozens of BFFs.
BFF sprawl is a dangerous state for an engineering organization to be trapped in. There is a risk of frequent service outages as your team is stretched between building new features and addressing the ballooning tech debt. In the worst case, your architecture becomes a web of BFF services with duplicate code and a backlog of interdependent maintenance tasks. All this typically happens just when the company most needs to scale its product up and out quickly to meet customer demand and drive the business to its next phase.

BFFs are effective at serving a small number of clients (left), but faced with a large set of clients with diverse requirements (right), they can quickly become difficult to build and maintain.
Self-Service Data Fetching with GraphQL
BFFs are really just REST APIs that orchestrate calls to other backend APIs and serve as gateways to highly distributed systems. Faced with a BFF layer that may expand rapidly in both size and complexity, teams often turn to open source GraphQL to manage API queries for client applications. Rather than the BFF service dictating the data contract, GraphQL provides service developers with a declarative schema definition language (SDL) to define entities and the relationship between them within a schema. Any number of client teams can use a declarative query language and a single endpoint to request only the data they need in a self-service fashion.
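The self-service idea can be illustrated with a toy sketch in Python. This is not a GraphQL implementation, only an imitation of how a selection set trims a response to the fields a client asks for. The `CUSTOMER` record, the `select` helper and the two example clients are hypothetical names invented for illustration.

```python
from typing import Any

# Hypothetical record, as a backend service might return it in full.
CUSTOMER = {
    "id": "c-1",
    "name": "Ada",
    "email": "ada@example.com",
    "orders": [
        {"id": "o-1", "total": 42.0, "status": "shipped"},
        {"id": "o-2", "total": 13.5, "status": "pending"},
    ],
}


def select(data: Any, selection: dict) -> Any:
    """Return only the fields named in `selection`.

    A selection maps a field name to a nested selection, or to None for a
    leaf field, loosely mirroring the shape of a GraphQL selection set.
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = select(value, sub) if sub else value
    return result


# Roughly the query: { name orders { total } }
mobile_view = select(CUSTOMER, {"name": None, "orders": {"total": None}})

# Roughly the query: { id email }
admin_view = select(CUSTOMER, {"id": None, "email": None})
```

Both clients read from the same data source, yet each receives only the shape it requested, without a dedicated endpoint per client.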
But as GraphQL usage grows across an organization, it can be hard to keep it scalable and self-service. Companies will typically build a monolithic GraphQL API that can expand with the business's needs, but it can become overloaded and cause performance bottlenecks that negate many of its benefits.
Splitting the monolith into a web of separate GraphQL APIs is no better: it recreates the problems of the BFF architecture, becoming difficult to maintain while unnecessarily duplicating code and leading to costly overlaps in infrastructure investment. It also creates problems for security and site reliability engineering (SRE) teams, who must enforce policies and monitor performance across every change to every GraphQL schema.
To avoid these pitfalls, engineering organizations need a solution that leverages the flexibility and power of GraphQL but that can still serve as a “data gateway.” The solution must allow organizations to abstract the microservice layer away from the client applications it serves. It must also be capable of serving a large number of clients with a diverse set of requirements.
Implementing GraphQL at Scale
A federated GraphQL implementation has been shown to help solve the BFF sprawl problem. It replaces the BFF layer with distinct GraphQL APIs that different teams can own. Clients still see a single API: requests are routed through a composition layer that can resolve a query across multiple GraphQL APIs at runtime. Lower latency and greater flexibility are among the immediate benefits of replacing a large BFF or monolithic GraphQL layer with a single federated GraphQL API.
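As a rough sketch of what a composition layer does at runtime, the Python below groups the requested fields by the service that owns them, calls each service once, and merges the partial results. Everything here is an assumption made for illustration: the two subgraph functions, the `FIELD_OWNERS` table and the `plan` and `execute` helpers are hypothetical, and a production router does considerably more (nested entities, parallel fetches, error handling).

```python
from typing import Callable, Dict, List


# Hypothetical subgraphs: each owns some product fields and resolves them by id.
def catalog_subgraph(product_id: str, fields: List[str]) -> dict:
    data = {"name": "Kettle", "price": 30}
    return {f: data[f] for f in fields}


def reviews_subgraph(product_id: str, fields: List[str]) -> dict:
    data = {"rating": 4.5, "reviewCount": 12}
    return {f: data[f] for f in fields}


SUBGRAPHS: Dict[str, Callable[[str, List[str]], dict]] = {
    "catalog": catalog_subgraph,
    "reviews": reviews_subgraph,
}

# Which subgraph owns which field. In a real system this would be derived
# from the composed schema rather than written by hand.
FIELD_OWNERS = {
    "name": "catalog",
    "price": "catalog",
    "rating": "reviews",
    "reviewCount": "reviews",
}


def plan(fields: List[str]) -> Dict[str, List[str]]:
    """Group the requested fields by the subgraph that owns them."""
    steps: Dict[str, List[str]] = {}
    for field in fields:
        owner = FIELD_OWNERS.get(field)
        if owner is None:
            raise KeyError(f"no subgraph owns field {field!r}")
        steps.setdefault(owner, []).append(field)
    return steps


def execute(product_id: str, fields: List[str]) -> dict:
    """Call each involved subgraph once and merge the partial results."""
    result = {"id": product_id}
    for name, owned in plan(fields).items():
        result.update(SUBGRAPHS[name](product_id, owned))
    return result
```

Calling `execute("p-1", ["name", "rating"])` touches both services but returns one merged object, which is what lets client teams treat the federated graph as a single endpoint.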

A single endpoint that can fetch data from any number of services
A federated approach delivers the simplicity of a monolith for client teams, but the agility of a decoupled approach for service teams. Different teams can define and maintain their own graphs without running the risk of breaking existing infrastructure. New application features can be created at the same velocity a BFF architecture allows, while the time spent on monitoring and maintenance is reduced because the graphs are exposed through a single GraphQL API.
The federated graph paradigm also encourages maintainability and service health. GraphQL favors evolving a schema over versioning the API: existing clients keep working as fields are added, so a change in one client's requirements does not force a major versioned release. The federated graph model also means a team's changes to one graph place minimal load on the teams that own other graphs.
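One way to make schema evolution concrete is a check that a proposed schema only adds to the previous one. The sketch below is a deliberate simplification under stated assumptions: schemas are modeled as plain dictionaries, the `breaking_changes` helper and the `Product` schemas are hypothetical, and any change of field type is treated as breaking, whereas real schema-checking tools apply more nuanced rules.

```python
from typing import Dict, List

# Simplified schema model: type name -> {field name -> field type}.
Schema = Dict[str, Dict[str, str]]


def breaking_changes(old: Schema, new: Schema) -> List[str]:
    """List removals and type changes that could break existing clients."""
    problems: List[str] = []
    for type_name, old_fields in old.items():
        new_fields = new.get(type_name)
        if new_fields is None:
            problems.append(f"type {type_name} removed")
            continue
        for field, field_type in old_fields.items():
            if field not in new_fields:
                problems.append(f"field {type_name}.{field} removed")
            elif new_fields[field] != field_type:
                problems.append(
                    f"field {type_name}.{field} changed type "
                    f"{field_type} -> {new_fields[field]}"
                )
    return problems


V1: Schema = {"Product": {"id": "ID!", "name": "String!"}}

# Adds a field: clients written against V1 are unaffected.
V2_ADDITIVE: Schema = {"Product": {"id": "ID!", "name": "String!", "rating": "Float"}}

# Renames a field: clients still selecting `name` would fail.
V2_BREAKING: Schema = {"Product": {"id": "ID!", "title": "String!"}}
```

Here `breaking_changes(V1, V2_ADDITIVE)` returns an empty list, while `breaking_changes(V1, V2_BREAKING)` reports the removed `name` field. Running a check of this kind in CI is one way a team could keep evolving its graph without coordinating versioned releases.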
By contrast, in a BFF or a monolithic GraphQL model, any change to the BFF or graph layer has the potential to bring down service for the clients that depend on it, affecting multiple teams.
Less Complexity
Engineering leaders are often averse to large changes to the architecture. The idea of introducing a complex federated graph architecture with potential stability implications just to serve as an abstraction layer sounds like an anti-pattern at first. But the tradeoff in complexity from adopting a federated GraphQL structure is not as large as it seems.
The reality is that for tech organizations facing BFF sprawl or a GraphQL explosion, those anti-patterns are already in effect. If your systems are highly distributed (such as in a microservice environment), your BFF layer is most likely already doing the job of federating your microservice data and composing it into a shape consumable by your clients. But because of the constraints inherent in REST APIs, it comes with a larger data contract than the client often needs — and a higher maintenance footprint in turn.
By contrast, a federated graph paradigm accomplishes the same result but with reduced infrastructure and engineering investment. This also means fewer internal debates on how best to maintain esoteric or sprawling architectures, which distract from the product-building mission, and more focus on how the technology can serve customers and drive revenue for the company.
Rather than building out a web of BFF APIs that require their own containerization, CI/CD and orchestration pipelines, the organization can build a single GraphQL API with centralized monitoring and one set of SRE resources.
Work Small to Big
Netflix serves as an example of a large organization that relies on a federated GraphQL implementation to solve its API aggregation-layer challenges. While Netflix shows what is possible, not every company can dedicate that level of technical resources to build its own ecosystem as Netflix has done. Typically, the code used to implement the GraphQL API and compose the data schemas from the graphs has to be written and owned purely by the company’s developers.
Smaller organizations and startups should thus look for a flexible federated GraphQL solution that can be quickly deployed and easily supported by their engineering teams, rather than building one from scratch.
An option such as Apollo Federation can serve as an abstraction layer capable of combining different graph schemas into a “supergraph.” The supergraph is made up of multiple “subgraphs,” the GraphQL graphs that teams would normally write. It also makes use of a “router,” which receives queries from the client and draws on the different subgraphs to assemble the appropriate response.
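The idea of combining subgraph schemas can be sketched in a few lines of Python. This is not Apollo's composition algorithm, only a toy merge under simplifying assumptions: schemas are plain dictionaries, the `compose` helper and the `catalog` and `reviews` subgraphs are hypothetical, and the shared `id` field stands in for the key that lets two subgraphs contribute to the same entity.

```python
from typing import Dict

# Simplified subgraph schema: type name -> {field name -> field type}.
Subgraph = Dict[str, Dict[str, str]]


def compose(subgraphs: Dict[str, Subgraph]) -> Dict[str, Dict[str, dict]]:
    """Merge subgraph types into one schema.

    Each field records its type and the subgraphs that define it. Two
    subgraphs declaring the same field with different types is an error.
    """
    supergraph: Dict[str, Dict[str, dict]] = {}
    for sg_name, types in subgraphs.items():
        for type_name, fields in types.items():
            merged = supergraph.setdefault(type_name, {})
            for field, field_type in fields.items():
                entry = merged.setdefault(
                    field, {"type": field_type, "subgraphs": []}
                )
                if entry["type"] != field_type:
                    raise ValueError(
                        f"{type_name}.{field}: {entry['type']} "
                        f"conflicts with {field_type} in {sg_name}"
                    )
                entry["subgraphs"].append(sg_name)
    return supergraph


# Two hypothetical teams each describe the part of Product they own.
SUBGRAPHS: Dict[str, Subgraph] = {
    "catalog": {"Product": {"id": "ID!", "name": "String!"}},
    "reviews": {"Product": {"id": "ID!", "rating": "Float"}},
}

supergraph = compose(SUBGRAPHS)
```

The resulting `Product` type carries `id`, `name` and `rating`, along with a record of which subgraph can resolve each field. Failing at composition time when two teams disagree on a field's type is the kind of safeguard that lets teams change their own graphs independently.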
The Apollo Router offers a pre-compiled runtime that can fit into the router component of Apollo Federation. According to Apollo’s benchmarks, latencies are 10 milliseconds or less for high-traffic production environments. Both Apollo Federation and Apollo Router have been shown to provide flexibility and increased performance while reducing the costs of implementing the federated GraphQL paradigm.
A federated GraphQL platform that meets the needs of small-, medium- and large-sized organizations should allow architects to define the relationship between graphs, but without the need to spend the time writing and testing code to compose those graphs. In this way, frontend engineers can focus on writing client-level business logic and enriching the user experience.
Backend engineers can focus on writing core service logic and ensuring the product scales well instead of struggling to maintain an increasingly complicated BFF or monolithic GraphQL layer. And SREs and infrastructure teams can advance their own initiatives, such as improving security and observability or shifting to Kubernetes, rather than worrying about the health of the BFF layer.
Ultimately, the goal of any architecture is to best serve the product by delivering value to the customer and the business. Keeping the product and tech stacks healthy by reducing complexity, limiting sprawl and carefully managing infrastructure investments is a top priority for startups whose product is beginning to scale.
Replacing an expansive BFF layer or a large, monolithic GraphQL setup with a federated structure allows the company to sharpen its focus on features that drive revenue. It also helps the organization avoid wasting resources on the painful maintenance of a top-heavy tech stack.