How Platform Ops Teams Should Think About API Strategy
This post is part of a four-part series.
- Creating an API-First Culture and Company (Part 1)
- Creating an API-First Culture and Company (Part 2)
- Manage Your APIs Like a Four-Star Restaurant (Part 3)
- How Platform Ops Teams Should Think About API Strategy (this post)
A Platform Ops team is responsible for choosing the tools and solutions for developers and application teams. Increasingly, they also play a role in managing and selecting APIs and building the API strategy for use of modern platforms such as Kubernetes.
So what are the key capabilities that a Platform Ops team needs to bake into the technology supporting their APIs and API strategy? Gartner lists five crucial attributes: granularity, separation/isolation, distribution, connectivity and operations. We believe there are three more: security, speed and composability. Some of these capabilities are best delivered by an API management (APIM) solution and developer portal. Others are best provided via API gateways, ingress controllers or application delivery controllers, depending on the operating environment and the orchestration technologies.
To be honest, all of these capabilities are interdependent to a greater or lesser degree, and a strategy for one affects multiple others. Further, the Platform Ops team needs to collaborate closely with the security operations (SecOps), network operations (NetOps) and developer operations (DevOps) teams to deliver a full solution. Here’s a breakdown of the magnificent eight and how to think about building a strategy for each one.
Granularity
Granularity defines the functionality of APIs and how many tasks they are designed to perform. API granularity is a hot-button issue and one that Platform Ops teams can help resolve or clarify. At issue is the fundamental trade-off between complexity and comprehensiveness: whether it’s better for APIs to be as simple and granular as possible (performing one primary function), or compound (performing multiple essential but closely related functions). Fans of REST tend to prefer simple APIs, while supporters of GraphQL prefer the compound type.
Traditionally, most developers have stacked several very basic REST APIs together to compose the necessary functions of a service. GraphQL, on the other hand, encourages API definitions that incorporate a broader range of functionality as a way to reduce round trips and latency, at the expense of more complicated API structure and documentation. As the arbiters of your API strategy, the Platform Ops team needs to have a strong opinion on the level of API granularity that is required and acceptable. API granularity considerations become more urgent as you start adopting Kubernetes, which uses APIs for all communication.
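To make the round-trip trade-off concrete, here is a minimal sketch that counts requests for the same task done two ways. `FakeBackend` and its endpoints are hypothetical, in-memory stand-ins, not a real REST or GraphQL client; the only point is that the granular path costs two round trips where the compound path costs one.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBackend:
    """Hypothetical in-memory service that counts every request it receives."""
    round_trips: int = 0
    users: dict = field(default_factory=lambda: {1: {"name": "Ada", "team_id": 10}})
    teams: dict = field(default_factory=lambda: {10: {"name": "Platform Ops"}})

    # Granular endpoints: one resource per call.
    def get_user(self, user_id: int) -> dict:
        self.round_trips += 1
        return self.users[user_id]

    def get_team(self, team_id: int) -> dict:
        self.round_trips += 1
        return self.teams[team_id]

    # Compound endpoint: resolves the related resource on the server side.
    def get_user_with_team(self, user_id: int) -> dict:
        self.round_trips += 1
        user = self.users[user_id]
        return {"name": user["name"], "team": self.teams[user["team_id"]]["name"]}


def profile_granular(api: FakeBackend, user_id: int) -> dict:
    user = api.get_user(user_id)          # round trip 1
    team = api.get_team(user["team_id"])  # round trip 2
    return {"name": user["name"], "team": team["name"]}


def profile_compound(api: FakeBackend, user_id: int) -> dict:
    return api.get_user_with_team(user_id)  # single round trip
```

Both functions return the same profile. The compound version saves a round trip, but its endpoint has more behavior to document and version, which is the trade-off the Platform Ops team has to weigh.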
Separation
Separation is a key element of resilience. APIs must be largely independent of one another so that failures in one API minimally disrupt the others. In addition, separation makes it easier to switch, deprecate and update API versions. Platform Ops teams should coordinate with the API team to enforce the separation of APIs at both Layer 4 and Layer 7, and allocate the proper infrastructure and resources to match each API’s performance needs. This might mean, for example, adding a highly critical API to a failover cluster or round-robining APIs to different data-center locations if application traffic follows the sun.
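One way to picture failure isolation at Layer 7 is a circuit breaker per API. This is a pattern we are bringing in for illustration rather than something the capabilities list prescribes. The sketch below is deliberately small: the class, its thresholds and the API names `orders` and `catalog` are all assumptions.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; allows a trial call after `reset_after` seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock          # injectable so the behavior can be tested without sleeping
        self.failures = 0
        self.opened_at = None       # None means the breaker is closed

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # After the cool-down, let a trial request through (half-open).
        return self.clock() - self.opened_at >= self.reset_after

    def record(self, success: bool) -> None:
        if success:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()


# One breaker per API (hypothetical names), so a failing API trips only its own breaker.
breakers = {"orders": CircuitBreaker(), "catalog": CircuitBreaker()}
```

Because each API has its own breaker state, repeated failures in `orders` stop calls to `orders` only, and `catalog` keeps serving traffic.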
Distribution and Discoverability
If you hope to create a self-service API culture, the way you distribute your APIs for consumption is important. To a certain degree, distributing APIs simply means publishing them in your API management solution or catalog, the workspace and toolbelt your organization adopts for building APIs. Distribution may also mean creating reference designs with API recommendations based on the application or service need.
Platform Ops teams can help API teams define how APIs are created and distributed. For example, new versions of APIs may need to be deployed in staging and development environments for software in the development pipeline to take advantage of new features. To efficiently manage and secure API traffic for these APIs, you need to deploy API gateways in those environments as well as others such as public cloud and on-premises. The way APIs are published, cataloged and distributed can strongly affect the way developers structure their applications and call resources when building them. The Platform Ops team must create sensible publishing and distribution policies and practices that allow developers to move quickly, but not rashly.
Related to distribution, every API published and in use in production must be programmatically discoverable. Full stop. This is an ironclad rule for a number of reasons. First, the security team needs to know about every API in order to ensure it is protected and not being abused. If an API is not easily discoverable, but malicious actors manage to find it anyway, it might be a long time before you realize something is amiss. Second, discoverability is only the first step of distribution: automated discoverability combined with distribution is a crucial element of version control and is often the best path to rapid updating of APIs.
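As a rough illustration of what programmatic discovery enables, the sketch below compares the endpoints listed in a catalog with the endpoints actually seen in traffic. The function name, the endpoint strings and the idea of feeding it from gateway logs are all assumptions made for the example.

```python
def find_uncataloged(cataloged: set[str], observed: set[str]) -> dict[str, set[str]]:
    """Compare the API catalog with endpoints actually seen in traffic."""
    return {
        # Serving traffic but unknown to the catalog: needs a security review.
        "shadow": observed - cataloged,
        # Published but never seen in traffic: candidates for a deprecation review.
        "unused": cataloged - observed,
    }


# Hypothetical inputs; in practice these would come from the catalog and from gateway logs.
cataloged = {"GET /v1/orders", "POST /v1/orders", "GET /v1/catalog"}
observed = {"GET /v1/orders", "POST /v1/orders", "GET /v1/debug/dump"}

report = find_uncataloged(cataloged, observed)
```

Here `report["shadow"]` contains `GET /v1/debug/dump`, exactly the kind of surprise the security team wants to hear about before an attacker finds it.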
Connectivity
Rules and policies that control how APIs connect, both with third parties and internally, are a critical foundation of modern apps. At a high level, connectivity policies dictate the terms of engagement between APIs and their consumers. At a more granular level, Platform Ops teams need to ensure that APIs can meet service-level agreements and respond to requests quickly across a distributed environment.
At the same time, connectivity overlaps with security: API connectivity rules are essential to ensure that data isn’t lost or leaked, business logic is not abused and brute-force account takeover attacks cannot target APIs. This is the domain of the API gateway. Unfortunately, most API gateways are designed primarily for north-south traffic. East-west traffic policies and rules are equally critical because in modern cloud native applications, there’s actually far more east-west traffic among internal APIs and microservices than north-south traffic to and from external customers.
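A simple way to keep east-west traffic from being an afterthought is to require a policy entry for both directions and deny anything without one. The following is a minimal sketch under that assumption. The policy fields, the limits and the `orders` API are hypothetical, and a real gateway or service mesh would express this in its own configuration rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    require_auth: bool
    max_requests_per_minute: int


# Hypothetical policy table: each API gets an entry per traffic direction.
POLICIES = {
    ("north-south", "orders"): Policy(require_auth=True, max_requests_per_minute=600),
    ("east-west", "orders"): Policy(require_auth=True, max_requests_per_minute=6000),
}


def is_allowed(direction: str, api: str, authenticated: bool, requests_this_minute: int) -> bool:
    """Decide whether one more request fits the policy for this direction and API."""
    policy = POLICIES.get((direction, api))
    if policy is None:
        return False  # default deny: no policy means no connectivity
    if policy.require_auth and not authenticated:
        return False
    return requests_this_minute < policy.max_requests_per_minute
```

The default-deny branch is the design choice worth noticing: an internal API that nobody wrote an east-west policy for is unreachable rather than wide open.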
Operations
Aspects of the operations capability overlap with most of the other strategic capabilities we’re discussing. That said, Platform Ops should consider itself the chief operating officer of all things platform, including APIs. Someone has to make sure that the required infrastructure and resources are running, updated, secure, high-performance and sufficient to meet API and application requirements.
The team that runs the API gateway or API management solution often takes primary responsibility, but the Platform Ops team is the ultimate provider of infrastructure and resources for cloud native approaches to applications and IT. So Platform Ops must develop an API strategy and resource plan with the operators of API-specific infrastructure.
Because APIs are often both single points of failure and one of the most critical elements of applications, Platform Ops needs to treat the most critical APIs as crown jewels and engineer resilience and high availability. This requires analysis of which APIs are the most critical, what type of traffic and load they are expected to bear, and what steps to take to ensure that APIs are available and operating well when needed.
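For the capacity side of that analysis, a back-of-the-envelope calculation can help. The sketch below estimates how many replicas a critical API needs to hit an availability target. It assumes replicas fail independently, which is optimistic (shared networks, zones and deployments correlate failures), so treat the result as a lower bound rather than a plan.

```python
def combined_availability(per_replica: float, replicas: int) -> float:
    """Availability of `replicas` copies, ASSUMING they fail independently."""
    if not 0.0 <= per_replica <= 1.0 or replicas < 1:
        raise ValueError("per_replica must be in [0, 1] and replicas >= 1")
    # The service is down only when every replica is down at the same time.
    return 1.0 - (1.0 - per_replica) ** replicas


def replicas_needed(per_replica: float, target: float, limit: int = 10) -> int:
    """Smallest replica count whose combined availability reaches `target`."""
    for n in range(1, limit + 1):
        if combined_availability(per_replica, n) >= target:
            return n
    raise ValueError(f"target not reachable with {limit} replicas")
```

Under the independence assumption, two replicas that are each available 99% of the time are both down only about 0.01% of the time, so a 99.9% target needs two replicas and a 99.999% target needs three.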
Security
For cloud native environments, APIs are both gateways to the external world and logical pathways for horizontal traversal inside applications and sensitive environments, making API security crucial both externally and internally. No surprise, then, that as APIs have proliferated and become more deeply embedded in corporate infrastructure over the past five years, attacks against them have risen dramatically as well.
API security practices, such as preventing DoS attacks, may apply equally to internal APIs, even if the DoS is inadvertent due to a misconfiguration (for example, setting retry or keep-alive parameters that generate a barrage of traffic).
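The retry case is easy to quantify. The sketch below shows the worst-case request volume when every client exhausts its retries, along with one common mitigation: capped exponential backoff with full jitter. The function names and default values are assumptions for illustration, not settings from any particular client library.

```python
import random
from typing import Optional


def worst_case_requests(clients: int, max_retries: int) -> int:
    """Upper bound on requests if every client exhausts its retries."""
    return clients * (1 + max_retries)


def backoff_delays(
    max_retries: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Delay in seconds before each retry: exponential growth, capped, with full jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)
        # Full jitter: pick a random delay up to the ceiling so clients do not retry in lockstep.
        delays.append(rng.uniform(0.0, ceiling))
    return delays
```

With 1,000 clients and five retries each, `worst_case_requests` gives 6,000 requests against an API that was sized for 1,000. Backoff with jitter does not lower that ceiling, but it spreads the retries over time instead of delivering them as a single barrage.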
Platform Ops teams need to work with SecOps teams, including Red and Blue Teams, to create threat models for all APIs and map the proper levels of control, observability and protections on a per-API basis. Here is where API discovery plays a critical role; it can ensure there are no unprotected APIs and no surprises.
Web application firewalls (WAFs) also play a key role in eliminating threats. API observability and metrics tooling can be a collaboration between the Platform Ops, SecOps, NetOps, and DevOps teams. In reality, each team needs to understand API traffic to do its job well.
Speed and Agility
One of the goals of Platform Ops is to help developers move faster and ship code more quickly. That extends to making it easier for developers to create, secure, publish, maintain, monitor and version APIs. APIM, API gateways and API catalogs were all created for that purpose.
The Platform Ops team must translate this need for speed and agility into a wide array of practical measures. These may include creating known paths for new APIs to be quickly vetted and registered with global firewalls and load balancers in large enterprises, enabling new APIs to receive rapid threat modeling from security teams and making it easy for API publishers to access the CPU, memory and networking resources required to maintain low latency for their APIs.
As the coordinator responsible for making APIs fast and agile, Platform Ops teams need to oversee a comprehensive checklist to ensure that API creators can get what they need fast.
Composability
Closely related to separation and granularity, composability speaks to ease of building compound applications and services from existing components. Without it, application teams are forced to replicate functionality and reinvent the wheel. The hallmark of cloud native and microservices design is the Unix ethos of composing solutions from many smaller tools that are good at their specific jobs. APIs are the superglue that makes composability effective by enabling components and modules to easily communicate and connect, even if they are written in different languages.
For the most part, APIM solutions help enforce composability by generating API reference designs and specifications to ensure API compatibility. Platform Ops helps provide painless and easily scalable composability by ensuring that APIM maps neatly to application development practices with a consistent approach. Consistency is the key to composability and compatibility, which translates into other benefits like agility and security.
Conclusion: Platform Ops Is API Ops
If you haven’t guessed by now, Platform Ops is API Ops to some degree. Without a solid API strategy shepherded by Platform Ops, an organization’s development efforts can degenerate into a mess of partially composable software modules linked by partially compatible APIs with partially compatible security and networking policies.
Platform Ops teams that want to improve the API experience for their organizations can look at how the eight capabilities discussed here overlap and relate, then construct complementary policies and approaches without being overly permissive or dangerous.
APIs are the superglue of the platform approach. Platform Ops teams that can collaborate to create and execute on an intelligent API strategy will make the lives of all the other teams — Dev, DevOps, SecOps, NetOps — easier, safer and more productive.