Back when I started learning front-end web development and first encountered the term, “API” sounded like it should be some kind of craft beer. Then I found out the hard way that, if you go to a bar and try to order an API, the bartender will throw a “404: Resource not found” error. Oh ho ho, I crack myself up.
So! APIs are something we all use all the time. Despite their ubiquity, though, many people, even in tech, have a pretty vague understanding of what they are and how they work. Seriously: ask your coworkers to quickly explain APIs. If any of them can do more than basically tell you, “API stands for Application Programming Interface. It is the interface that allows software application programs to communicate with one another…” I will personally buy that person a beer. Because, seriously, most of us really can’t explain it effectively, if at all.
Let’s change this.
“A” Is for Application.
The first part of the term is heavily context-dependent. Depending on the specific use case, “application” can actually refer to many things: the whole server. An entire application itself, and the data it requires. Or just a small part of the app.
So let’s look at these contexts: server, entire app, smaller subsection of app. We begin by envisioning the web as an immense global network of connected servers. Every single page on the internet is stored somewhere on one of these remote servers — i.e., a remotely located computer that is optimized to process requests to serve up this particular website to your browser.
Thus, if you type www.github.com into your browser bar, Chrome (or Firefox or Safari yadda yadda) sends a request to GitHub’s server, which politely sends back all the code necessary to display the page and its content on your local computer. When your browser receives this response, it interprets the code and displays the page.
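To make that round trip concrete, here is a minimal sketch in Python. It does not talk to GitHub. A tiny local server (the `PageHandler` class, a name made up for this illustration) stands in for the remote machine, and `request_page` plays the browser's part.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PageHandler(BaseHTTPRequestHandler):
    """Stand-in for a remote web server: answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body><h1>Hello from the server</h1></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def request_page():
    """Play the browser's role: send one GET request and return the response."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 lets the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    connection = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        connection.request("GET", "/")
        response = connection.getresponse()
        body = response.read().decode("utf-8")
        return response.status, response.getheader("Content-Type"), body
    finally:
        connection.close()
        server.shutdown()
        server.server_close()
```

Calling `request_page()` returns the status code, the content type and the HTML text, which is roughly what a browser has in hand just before it renders a page.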
Server as API: To your browser (also known as the client), GitHub’s server acts as an API. This means that every time you visit a page on the web, you interact with some remote server’s API. Strictly speaking, an API in this context isn’t the same thing as the remote server. Rather, it is the part of the server that receives requests and sends responses.
Entire Application as API: On initial call, the GitHub server sends the whole web app: the presentational structure (the layout of the site, how it looks), plus all the content of the website. The presentational part is pretty set, and gets sent as HTML code, which is rendered by the browser. The content — dynamic information contained in the website — gets sent as data, often in JSON format, which is then rendered in the appropriate locations on the page.
So if we are looking at a typical GitHub page, the presentational parts — like the nav bar at the top, user photo and bio on the left, pinned repositories in the middle — these stay pretty much the same. But the box with those little green squares representing daily GitHub activity levels? That changes according to the user’s contributions. When we push project work to GitHub and then go check to make sure it has been credited on our profile page, the API is what tells our browser to color today’s square green, and exactly what shade it should use. Everything else, though, stays the same.
Different kinds of APIs allow our browser to make a call for specific kinds of information and update just that relevant bit of data — without also needing to reload all the other stuff, which has not changed.
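A rough sketch of that idea, assuming a page whose state is held as a simple dictionary: only the dynamic region gets replaced, and the layout pieces are left alone. The `fetch_contributions` function and the dates are invented stand-ins for a real data-only API call.

```python
import json

# Page state as a browser might hold it: static layout plus one dynamic region.
page = {
    "nav_bar": "<nav>site navigation</nav>",
    "bio": "<aside>user photo and bio</aside>",
    "contributions": {"2024-01-01": 0},
}


def fetch_contributions():
    """Hypothetical data-only API call; a real client would make an HTTP request here."""
    return json.dumps({"2024-01-01": 3})


def refresh_contributions(current_page):
    """Return a copy of the page with only the contributions region replaced."""
    updated = dict(current_page)
    updated["contributions"] = json.loads(fetch_contributions())
    return updated
```

After `refresh_contributions(page)`, the nav bar and bio are the same objects as before. Only the contribution counts have changed.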
Parts of the App as API: When building a web app, it is much faster and easier (and often more reliable) to construct it from pre-existing pieces. If you can imagine it, chances are there is a library, prebuilt platform, or *-As-A-Service to provide it.
But how do these components talk to each other in order to perform together as one united app?
“P” Is for Programming, “I” Is for Interface
The application end of an API can vary greatly, but no matter what context we are talking about, the ultimate job of the API remains the same: communication and coordination.
The “P” in API stands for Programming, but in practice it works like a protocol: a set of agreed-upon methods that other software uses to talk to a given API and request or receive the relevant information from it.
Interface refers to the intermediary aspect of the API, as the actual functionality that enables two applications to talk to each other.
So, fundamentally, an API can be thought of as a kind of agreement or contract between two pieces of software, the “glue layer” that enables them to interface and work together. In essence, the API is saying, “If you give me this instruction, I will perform this action/return this information.”
Metaphor time: An API is like the beer taps at a microbrewery. Each tap corresponds to a particular type of brew, so when you pull the tap handle marked “Porter” you know your glass will fill with a hefty brown barley-scented beer, and the “Pilsner” will bring a light and almost crisp yellow brew. In the same way, clients requesting API output know which data “tap” to pull in order to obtain the desired output. Porter, for example, is expected to come out of the porter tap, not stout or lager. Meanwhile, the user doesn’t even need to know or care how the tap works inside. You can rearrange the lines or optimize your product (the brew or app you’re serving) without affecting your users, because the interface remains the same.
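The tap metaphor translates fairly directly into code. In this small sketch (the `TapRoom` class and its keg names are invented for illustration), callers only ever use `pour`, so the plumbing behind it can be rearranged without anyone noticing.

```python
class TapRoom:
    """The interface is pour(style); everything else is internal plumbing."""

    def __init__(self):
        # Internal detail: which keg feeds which tap. Callers never see this.
        self._kegs = {"Porter": "keg-1", "Pilsner": "keg-2"}

    def pour(self, style):
        """Return a glass of the requested style, or fail if there is no such tap."""
        if style not in self._kegs:
            raise KeyError(f"no tap labeled {style!r}")
        return f"a glass of {style}"

    def rearrange_lines(self, style, new_keg):
        """Change the internals; the result of pour() stays the same."""
        self._kegs[style] = new_keg
```

Pull the “Porter” handle before and after `rearrange_lines("Porter", "keg-7")` and you get the same glass either way.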
APIs don’t only push out data; they also accept it. And here is where our beer metaphor breaks down, because beer only flows one way. So let’s mix our metaphors to illustrate how data goes into an API. Envision one of those shape sorter toys for young children. Pieces shaped like circles, stars and triangles get inserted through the appropriately shaped opening; a star can only go in through the star-shaped hole. In an API, data arrives in a defined form (think circles or triangles) and can only pass through the interface by way of the corresponding opening. The API expects a certain format, and will reject data that does not fit. Do not try to put the triangle data into the square hole. Thus, clients are compelled to organize their inputs according to the API builder’s specifications (i.e., the protocol), which set the expectations for the transaction.
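Here is one way the shape sorter might look in code. The expected fields (`name` and `quantity`) are a made-up request format, not taken from any real API. The point is only that input of the wrong shape gets rejected.

```python
# Hypothetical input format: the "holes" this API accepts.
EXPECTED_FIELDS = {"name": str, "quantity": int}


def accept_order(payload):
    """Accept a payload only if it has exactly the expected fields and types."""
    if set(payload) != set(EXPECTED_FIELDS):
        raise ValueError(f"expected fields {sorted(EXPECTED_FIELDS)}, got {sorted(payload)}")
    for field, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(payload[field], expected_type):
            raise ValueError(f"{field!r} must be of type {expected_type.__name__}")
    return {"status": "accepted", "order": dict(payload)}
```

`accept_order({"name": "Porter", "quantity": 2})` goes through, while `{"name": "Porter", "quantity": "two"}` raises a `ValueError`: triangle data, square hole.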
No matter what metaphor we use to explain it, an API can be thought of as an agreement, or contract, between two pieces of software saying: “If you give me this instruction, formatted in this way, I will perform this specified action or return this information.”
APIs as a Product
Beyond being the vector for information exchange between browser, server, software and databases, APIs can also be packaged and sold as a product. For example, Weather Underground sells access to its weather data API. This comes as a set of dedicated URLs that return pure data responses, in this case, up-to-the-minute weather forecasts, for you to link to and use to furnish data within your own application. Just the facts, ma’am: you are not getting the presentational stuff Weather Underground uses in its own app or on its website. You get to build your own graphical user interface yourself.
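As a sketch of what “build your own interface” means in practice: the payload below is invented for illustration (its field names are assumptions, not Weather Underground’s actual format), and `render_forecast` is one of many possible presentations of the same data.

```python
import json

# Hypothetical data-only response; the field names are invented for this example.
SAMPLE_RESPONSE = '{"city": "Baltimore", "temp_f": 72, "conditions": "Sunny"}'


def render_forecast(raw_json):
    """Turn a pure-data response into our own one-line presentation."""
    data = json.loads(raw_json)
    return f"{data['city']}: {data['conditions']}, {data['temp_f']} degrees F"
```

The API supplies the facts, and how they look on screen is entirely up to the code that consumes them.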
That said, you can absolutely make requests to an API with your browser and view the resulting data, GUI or no GUI. Since the actual HTTP transmission of the requested data occurs as text, your browser will typically be able to render the response. For example, you can access GitHub’s API directly with your browser without even needing an access token. Here’s an excerpt of the JSON response you get when you visit a GitHub user’s API route in your browser. Let’s take a look at mine:
"name": "Michelle Gienow",
"location": "Baltimore, MD",
"bio": "Front-end web developer & recovering journalist - I write web dev/JS/Node/Python @TheNewStack",
So when you visit my GitHub page, your browser makes a call to the GitHub API for the presentational code (HTML/CSS) to display the page, and another call to a different GitHub API to obtain the data that is unique to me. There are several other calls to other APIs for content in other areas of the page. The browser receives all of these, then knows to plug the data into the page to generate the final, unified presentation.
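The same data request can be made from code instead of a browser. This is a minimal sketch using Python’s standard library and GitHub’s public `/users/{username}` route. The helper names are made up for this example, and the username is a placeholder you would supply yourself (`octocat`, GitHub’s demo account, works for experimenting). Unauthenticated calls are rate-limited, so treat it as something to poke at rather than production code.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def user_url(username):
    """Build the API route for one user's public profile data."""
    return f"{API_ROOT}/users/{username}"


def fetch_user(username):
    """Request a user's profile as JSON (makes a real network call when invoked)."""
    request = urllib.request.Request(
        user_url(username),
        headers={"Accept": "application/vnd.github+json", "User-Agent": "api-article-example"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def profile_summary(user):
    """Pick out the three fields shown in the excerpt above."""
    return {key: user.get(key) for key in ("name", "location", "bio")}
```

With network access, `profile_summary(fetch_user("octocat"))` returns a dictionary with just the name, location and bio, ready to be dropped into whatever page layout you like.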
Putting It All Together
Basically, any piece of software that can be distinctly separated out from its runtime environment can become the “A” in API for that particular context. It will, itself, probably also have some sort of API. For example, say you’re using a third-party library in your code. Once incorporated into your code, that library becomes a permanent part of your overall app. Being a distinct piece of software, however, the library exposes an API (which comes pre-packaged along with it, no worries) to allow interaction with the rest of your code.
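A library’s API is simply the set of functions it lets you call. Python’s built-in `json` module makes a handy illustration (it ships with the language rather than being third-party, but the principle is the same): two documented functions, and no need to know how the parsing works inside.

```python
import json

profile = {"name": "Michelle Gienow", "location": "Baltimore, MD"}

# json.dumps and json.loads are the library's API: the agreed-upon way in and out.
encoded = json.dumps(profile, sort_keys=True)  # Python dict -> JSON text
decoded = json.loads(encoded)                  # JSON text -> Python dict
```

Here `decoded` ends up equal to the original `profile`, and at no point did our code need to care how the library does its job.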
So: APIs can be just about anything, really. Servers, applications, even products to be bought and sold. Which is why they are kind of difficult to explain, even for those of us who work with them every day.
Perhaps the most apt way to define the essence of APIs would be using Legos. Lego bricks have a universal way of connecting to each other through a system of small pegs on one block and correlating indentations for these to fit into on other blocks. This provides a simple and structured way to allow all pieces to snap together, all in the same way. At the same time, the possible combinations of pieces are endless. Similarly, software can use APIs to connect the information we seek with the interface for viewing it, creating a unique combination of services which, together, form an application.
Legos are indeed a helpful way to understand the value of APIs for developers. With APIs, developers don’t have to start from scratch every time they write a new program. They no longer have to build a core application that tries to do everything. Instead, they can contract out certain responsibilities by using already created pieces that do the job better. So APIs are the Lego bricks of software development: standardized tools for software to communicate with other software, leading to faster building and deployment. And, thankfully, faster load times for everybody.
Feature image by Michael Kulesca, Creative Commons license from Behance.net