Wrangle API Sprawl with a Resilient Platform
If you haven’t heard much about API sprawl, an uncontrolled proliferation of APIs and services, you will in 2024. According to a recent F5 Office of the CTO Report, API sprawl happens when APIs become widely distributed without a holistic strategy that includes governance and best practices. With the rise of AI-driven experiences, we’ll only see the number of APIs explode further. Marco Palladino, CTO and co-founder of Kong, recently wrote:
“When AI interacts with the world, it does so with APIs, which will fundamentally drive an exponential increase in the number of APIs as more and more products and services want to enable themselves to be consumed by AI.”
If left unchecked, this exponential growth in the number of APIs will inevitably lead to a dangerous form of technical debt. We’re all accustomed to taking on and paying down tech debt. After all, it’s like any other form of debt: We take it on for a specific reason, such as meeting a market-driven deadline or addressing a surprise outage. Then, for one reason or another, we find ourselves not paying it down for months or even years.
API sprawl is an especially hazardous form of that debt. It carries hard costs from hosting and scaling a fleet of APIs, real security risks and a heavier operational burden from the sheer number of APIs that platform engineering teams need to manage. At its core, however, API sprawl is the result of a platform that isn’t resilient.
How do you know if you have a resilient API platform? Here’s an easy litmus test. If someone approached you tomorrow and asked what it would take to support a new customer experience, would you have a pit in your stomach? Would your mind immediately race to all the upstream dependencies, the backends-for-frontends (BFFs) your team already manages or the process for deploying a new set of APIs? If so, your API platform may not be resilient.
When defining the characteristics of a resilient API strategy, we can start from the pressures we plan to put our platform under and then group those pressures by similar attributes. For instance:
- How might client-side teams be informed about changes to APIs?
- How will the server-side teams collaborate with client-side teams?
- How might we support and manage different types of APIs simultaneously?
- How will we prevent accidental breaking changes when a server-side dependency changes?
- How might we add new presentation clients without deploying a new experience API?
- How might we use our APIs more efficiently?
- How might we fully understand the data in our system and who is using it?
- How might we apply a principled approach to the services we deploy?
Taken together, these questions point to four key characteristics that contribute to resiliency in our API platforms: rapid self-service, insulating layers, magnification of existing investments and strong governance. Designing and building our API platforms with these characteristics is a strategic way to manage the API sprawl we already face and head it off before it worsens.
Rapid Self-Service
We build APIs for one purpose: to be used. When asked how they define the success of their platform, 58% of respondents to the 2023 State of APIs report said that they measured success by usage. With the pressure to grow API adoption at the center of our platform strategies, we need ways to make our APIs simple to both discover and implement.
The easier it is for frontend and product teams to discover available services on their own, the faster they can design and ship new experiences, and the better they can ask informed questions about potential gaps in functionality.
Insulating Layers
Nothing we ship is perfect the first time. Even if it’s close, the technology, teams and business surrounding it shift and evolve perpetually. Whether planning for a new presentation experience, charting a major version upgrade of an underlying system of record or responding to a common vulnerabilities and exposures (CVE) incident, we need a way to plan for turbulence in different layers of our stack and devise a strategy for accommodating change without disrupting other teams.
Because updates to our services and presentation experiences are inevitable, we can mitigate the risks of those technical migrations with an insulating layer that combines the benefits of loose coupling with loose contracts.
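One practical guard an insulating layer can provide is a check for breaking changes before a service update ships. The following is a minimal sketch under stated assumptions, not a production tool: the schema is modeled as a plain dictionary of type names to field names and field types, and `find_breaking_changes`, `current` and `proposed` are hypothetical names introduced only for illustration.

```python
# Hypothetical, simplified schema model: type name -> {field name -> field type}.
Schema = dict[str, dict[str, str]]


def find_breaking_changes(old: Schema, new: Schema) -> list[str]:
    """Describe every change that could break a client built against `old`."""
    problems: list[str] = []
    for type_name, old_fields in old.items():
        new_fields = new.get(type_name)
        if new_fields is None:
            problems.append(f"type removed: {type_name}")
            continue
        for field_name, old_type in old_fields.items():
            new_type = new_fields.get(field_name)
            if new_type is None:
                problems.append(f"field removed: {type_name}.{field_name}")
            elif new_type != old_type:
                problems.append(
                    f"field type changed: {type_name}.{field_name} "
                    f"({old_type} -> {new_type})"
                )
    # Added types and fields are ignored: in this simplified model,
    # additions do not break existing clients.
    return problems


# Illustrative data only.
current = {"User": {"id": "ID", "email": "String"}}
proposed = {"User": {"id": "ID", "email": "Email", "name": "String"}}

for problem in find_breaking_changes(current, proposed):
    print(problem)
```

Run in a build pipeline, a check like this could fail the build whenever the returned list is not empty, so client teams learn about a risky change before it reaches production.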
Magnify Existing Investments
APIs are an investment and a commitment. Beyond the initial cost to design and develop a service, there are years of maintenance, upgrades and the internal lock-in of many teams having production contracts with any given service. We must find ways to make our existing investments in the services and APIs we already have even more valuable. Throwing away what we already have in favor of a rewrite just isn’t practical.
Whether it’s by finding new frontend experiences to implement the services we have today or identifying service-level areas for reuse, considering ways to magnify our existing investments is a crucial characteristic of a resilient API strategy.
Strong Governance
When considering governance for our API platforms, we can divide the discussion into two primary areas of concern: data governance and service governance. Data governance is concerned with answering questions like, “Who has access to PII?” and “What experiences are implementing the ‘Users’ service?” Service governance, on the other hand, answers questions such as, “What is our policy for controlling API sprawl?” and “What is our process for identifying and mitigating zombie APIs?” (forgotten or abandoned APIs that are still running).
Any API strategy we implement must consider the data and service governance required to secure the future of our APIs without compromising availability, velocity and reliability.
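Both kinds of governance questions become answerable once API usage is recorded somewhere queryable. The sketch below assumes a hypothetical call log of (API name, client, timestamp) tuples and a hypothetical 90-day idle threshold. The API names, the client names and the functions `consumers_of` and `zombie_candidates` are illustrative and not taken from any particular tool.

```python
from datetime import datetime, timedelta

# Hypothetical usage log: (API name, calling client, time of call).
calls = [
    ("users", "web-app", datetime(2024, 1, 10)),
    ("users", "mobile-app", datetime(2024, 1, 12)),
    ("orders", "web-app", datetime(2024, 1, 11)),
    ("legacy-cart", "web-app", datetime(2023, 6, 1)),
]
# Hypothetical registry of every API the platform team has deployed.
registered_apis = ["users", "orders", "legacy-cart", "promo-v1"]


def consumers_of(api, log):
    """Data governance: which clients call a given API?"""
    return sorted({client for name, client, _ in log if name == api})


def zombie_candidates(apis, log, now, max_idle=timedelta(days=90)):
    """Service governance: APIs with no recorded call within `max_idle`."""
    last_seen = {}
    for name, _, when in log:
        if name not in last_seen or when > last_seen[name]:
            last_seen[name] = when
    return [
        api for api in apis
        if api not in last_seen or now - last_seen[api] > max_idle
    ]


now = datetime(2024, 1, 15)
print(consumers_of("users", calls))
print(zombie_candidates(registered_apis, calls, now))
```

An API flagged here is only a candidate: low traffic can be legitimate, so a real process would confirm with the owning team before retiring anything.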
Designing a Resilient API Strategy
As with most architectural decisions, there’s no single way to design a strategy for our APIs that perfectly balances flexibility with stability. In the past, we’ve tried managing this complexity with tactics like client-side orchestration or backends-for-frontends (BFFs). Unfortunately, these have led to the API sprawl we see today, burdening teams with an operational weight that only threatens to grow.
This API expansion, whether it takes the form of BFFs, experience APIs or bespoke microservices, is a symptom of a rigid platform that can’t sustainably support new requests from the business or client teams. In searching for more resilient API platforms that can expand and contract sustainably, organizations worldwide, including Wayfair, Volvo and Netflix, are turning to GraphQL. Teams at these organizations have recognized that GraphQL is more than just another API. With advancements in architectural patterns, like federated GraphQL, it’s a way to codify a middle layer in their tech stacks between the presentation and service layers that can drive resiliency.
Federated GraphQL enables teams to deliver GraphQL’s benefits at a greater scale, transforming GraphQL from just another API to a layer in a stack that sits on top of existing services. This graph of graphs provides access to any number of services with a single endpoint. It also enables teams to share entities and domain models across those subgraphs.
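To make the idea of a shared entity behind a single endpoint concrete, here is a toy model in plain Python. It is a sketch of the concept only and does not reflect how any real federation router is implemented: each subgraph is modeled as a function, and `accounts_subgraph`, `orders_subgraph`, `FIELD_OWNERS` and `resolve_user` are hypothetical names with made-up data.

```python
# Each "subgraph" is modeled as a function that returns the fields it owns
# for a User entity, looked up by the shared key `id`. All data is made up.

def accounts_subgraph(user_id):
    data = {"1": {"name": "Ada", "email": "ada@example.com"}}
    return data.get(user_id, {})


def orders_subgraph(user_id):
    data = {"1": {"orderCount": 3}}
    return data.get(user_id, {})


# Which subgraph owns which field of the shared User entity.
FIELD_OWNERS = {
    "name": accounts_subgraph,
    "email": accounts_subgraph,
    "orderCount": orders_subgraph,
}


def resolve_user(user_id, requested_fields):
    """Single entry point: return only the requested fields,
    calling each owning subgraph at most once per request."""
    result = {"id": user_id}
    fetched = {}  # subgraph function -> data it returned for this request
    for field in requested_fields:
        owner = FIELD_OWNERS.get(field)
        if owner is None:
            raise KeyError(f"no subgraph provides User.{field}")
        if owner not in fetched:
            fetched[owner] = owner(user_id)
        result[field] = fetched[owner].get(field)
    return result


print(resolve_user("1", ["name", "orderCount"]))
```

In this model, a new presentation client that needs a different mix of fields simply requests them from the same entry point, and a team adding a capability registers another subgraph function rather than deploying a new experience API.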
Rather than exposing a sprawl of BFFs or experience APIs, federated GraphQL gives service teams a central platform where they can contribute any service to the graph, improving API resilience and offering a clear path to managing API sprawl.