Webhooks Provide an Efficient Alternative to API Polling
Popular websites like GitHub and Slack don’t offer webhooks only because they are a convenient way to inform users of new events; webhooks are also a way for these sites to maintain stability.
At OSCON Europe last week, Lorna Mitchell, a developer advocate with IBM Cloud Data Services, provided some insight into “The wonderful world of webhooks,” which provide a way to trigger an action within a website or web app by way of a user-defined HTTP callback.
Before webhooks, developers probably would have written code that hits a site’s API every few minutes to check for changes. Multiply this by a million or so users over millions of repositories, and the polling method of checking for changes would create a performance nightmare for popular sites, like GitHub. Enter webhooks as an elegant solution for handling all of the many notifications that someone might want to receive, Mitchell explained.
Webhooks make things happen, she explained; they allow us to exchange data between systems, components within a system, or microservices: anything attached via HTTP. Webhooks are used for Slack notifications, continuous integration services, and other integrations.
Mitchell described webhooks as “key building blocks of modern applications, allowing systems to exchange data in response to events.”
APIs vs. Webhooks
Mitchell talked about the difference between webhooks and APIs. When using an API to get data from a server, the client requests the data and the server sends it back. The client has no idea if there is any new data or the status of the information on the server until it makes the request.
Webhooks are the opposite. The server knows what information the client needs, and it sends it to the client as soon as it has some new data. The client then sends back a “thanks :)” also known as an HTTP “200 OK” acknowledgment to let the server know that the request was received and that there is no need to try to send it again.
Viewed over time, Mitchell showed, webhooks can be far more efficient than polling an API for the same task. With polling, you might make many requests that return no changes at all. With webhooks, you only exchange data when there is new information.
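To put rough numbers on that comparison, here is a minimal sketch that counts requests under each approach. The event times, the five-minute polling interval, and the eight-hour window are all made-up values for illustration, not figures from the talk.

```python
def count_polling_requests(event_times, poll_interval, duration):
    """Return (total polls made, polls that found at least one new event)."""
    pending = sorted(event_times)
    total = 0
    useful = 0
    idx = 0
    t = poll_interval
    while t <= duration:
        total += 1
        found = False
        # Consume every event that happened since the previous poll.
        while idx < len(pending) and pending[idx] <= t:
            idx += 1
            found = True
        if found:
            useful += 1
        t += poll_interval
    return total, useful


def count_webhook_requests(event_times):
    """With webhooks, the server sends one request per event."""
    return len(event_times)


# Hypothetical workload: four changes over eight hours (times in minutes).
events = [7, 130, 131, 400]
total_polls, useful_polls = count_polling_requests(events, poll_interval=5, duration=480)
webhook_calls = count_webhook_requests(events)
print(total_polls, useful_polls, webhook_calls)
```

In this invented workload the poller makes dozens of requests to learn about four changes, while the webhook sender makes exactly four.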
Designing Webhooks for Your Application
Mitchell says that “webhooks require prearrangement” with the service, and you need to provide the information required for it to know where and how to send you data.
In the case of GitHub, for example, this would include a payload URL, content type, secret (token for security), and the events you want to trigger the webhook. She also cautions that “most webhooks are like the Internet of Things,” meaning that they are “not very good” when it comes to security. So you need to think about how to handle security to avoid DDoS and other potential security issues when using webhooks.
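One common way to use such a secret is for the sender to attach an HMAC of the request body, which the receiver recomputes and compares. The sketch below assumes an HMAC-SHA256 signature with a `sha256=` prefix, the format GitHub documents for its `X-Hub-Signature-256` header; other services use different schemes, and the function names here are hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute the signature a sender would attach to the request (assumed format)."""
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Recompute the signature over the raw body and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


# Hypothetical values for illustration only.
secret = b"s3cret"
body = b'{"action": "opened"}'
signature = sign_payload(secret, body)
print(is_valid_signature(secret, body, signature))
```

The check has to run over the raw bytes exactly as received, so a receiver would keep the unparsed body and the signature header together.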
She also mentions two points to consider when designing your webhooks:
- “Try to include all information for common outcomes.”
- “Consider impact of payload size vs potentially many follow-up API calls.”
When receiving your webhook data, Mitchell emphasizes that “It’s just a POST request!” and offers this advice:
- “DO: accept, store and acknowledge quickly.”
- “DON’T: validate or process before acknowledging.”
Mitchell says, “The internet can be bursty — accept and acknowledge the request. Do the processing later to avoid holding open the network connection and creating a bottleneck.”
The easiest way to do this is to accept the data, drop it into a database along with a status field (new / processed / failed), and acknowledge it before doing any processing of the data. This allows you to create a cron job for a convenient time to process the data, do whatever work you need to do, and update the status field.
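The database approach might look something like the following sketch, which uses SQLite from the Python standard library. The table name, column names, and helper functions are assumptions made for illustration; Mitchell did not prescribe a schema.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) table holding raw payloads and their status."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS webhook_events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "body TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'new')"
    )
    return conn


def receive(conn, raw_body: str) -> int:
    """Store the payload untouched, then return the HTTP status to send back."""
    conn.execute("INSERT INTO webhook_events (body) VALUES (?)", (raw_body,))
    conn.commit()
    return 200


def process_pending(conn, handler):
    """Run later (for example from cron): handle every row still marked 'new'."""
    rows = conn.execute(
        "SELECT id, body FROM webhook_events WHERE status = 'new' ORDER BY id"
    ).fetchall()
    for event_id, body in rows:
        try:
            handler(json.loads(body))
            status = "processed"
        except Exception:
            status = "failed"
        conn.execute(
            "UPDATE webhook_events SET status = ? WHERE id = ?", (status, event_id)
        )
    conn.commit()


conn = open_store()
receive(conn, '{"event": "push"}')
receive(conn, "not json")  # stored and acknowledged anyway; fails at processing time
process_pending(conn, handler=lambda event: None)
```

Nothing is parsed or checked inside `receive`, so the acknowledgment goes out quickly and a malformed payload simply ends up marked as failed when the batch job runs.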
“If you outgrow the database method, use a queue,” Mitchell then advised. In this scenario, you accept the data, drop it in a queue, and acknowledge it.
For this task, Mitchell admitted that she’s “drunk the RabbitMQ Kool-Aid,” but other options include Beanstalkd and Gearman. Whichever back-end queue you choose, putting the data in a queue gives you an “easy way to link many workers with work to do” to process your data, she said.
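The shape of the queue pattern can be sketched with Python's in-process `queue.Queue` and a few threads. This is only a stand-in: it does not use RabbitMQ, Beanstalkd, or Gearman, and the payloads and handler are invented for illustration.

```python
import queue
import threading


def start_workers(jobs, handler, count=3):
    """Start worker threads that pull payloads off the queue until told to stop."""

    def worker():
        while True:
            payload = jobs.get()
            try:
                if payload is None:  # sentinel value: shut this worker down
                    return
                handler(payload)
            except Exception:
                pass  # a real worker would record the failure somewhere
            finally:
                jobs.task_done()

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads


jobs = queue.Queue()
results = []
results_lock = threading.Lock()


def handle(payload):
    """Hypothetical processing step: just remember which payloads were seen."""
    with results_lock:
        results.append(payload["id"])


workers = start_workers(jobs, handle)
for i in range(10):
    jobs.put({"id": i})  # receiver side: enqueue, then acknowledge right away
jobs.join()  # wait until every queued payload has been handled
for _ in workers:
    jobs.put(None)  # one stop sentinel per worker
for thread in workers:
    thread.join()
```

With a real message broker the receiver and the workers would be separate processes, but the division of labor is the same: the receiver only enqueues, and any number of workers drain the queue.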
If you have a popular online service, you might also need to offer webhooks for your users to consume services. Mitchell said that offering webhook integrations is ideal if:
- you have clients polling your API a lot.
- it’s common for another system to react to changes in your system.
- you want to offer notifications for specific events.
- any of the above apply either internally or externally.
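On the sending side, a service offering webhooks has to POST each event to the subscriber's URL and try again if it does not get a success response back. The following is a minimal sketch using only the standard library; the retry count, the delay, and the function names are assumptions rather than anything described in the talk.

```python
import json
import time
import urllib.error
import urllib.request


def http_post(url: str, body: bytes, timeout: float = 5.0) -> int:
    """POST the body as JSON and return the HTTP status (0 if unreachable)."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def deliver(url, event, post=http_post, attempts=3, delay=1.0, sleep=time.sleep):
    """Send one event, retrying until the subscriber acknowledges with a 2xx."""
    body = json.dumps(event).encode("utf-8")
    for attempt in range(1, attempts + 1):
        status = post(url, body)
        if 200 <= status < 300:
            return True  # acknowledged, no need to send again
        if attempt < attempts:
            sleep(delay * attempt)  # wait a little longer before each retry
    return False
```

A call such as `deliver("https://example.com/hook", {"event": "push"})` (a placeholder URL) returns `True` once the subscriber acknowledges. The `post` and `sleep` parameters exist so the function can be exercised without a network.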
Mitchell summarized the talk at the end with one simple statement: “Webhooks … are awesome :)”