Three Key Design Considerations for an Event-Driven Architecture
Why would a company be interested in moving to event-driven architecture (EDA)? Ajay Nair, Principal Product Manager (AWS Lambda) at Amazon Web Services (AWS), speaking at the recent Emit Serverless Conference in San Francisco, offered one example.
An AWS Lambda customer has 130 functions in production. After the move to EDA, its deployment time dropped from thirty minutes to seconds, Nair explained. It now ships fifteen times more features every month and saw a 97 percent reduction in cost. The company has realized benefits in cost, agility, and time-to-market, he said.
The Emit conference was focused on event-driven architecture, and AWS is leading the drive to evangelize this technique. In this new world, it’s important to be a good citizen, said Nair. What it boils down to is a simplified architecture practice. “All logic is embodied as functions,” he said. “There are some things called events that trigger these functions, and then the functions talk to something down the stream, which in turn may themselves be event processors or actors acting on that particular thing,” he explained.
The key characteristics are that EDA communicates through APIs and events, and the actors are ephemeral and stateless. This creates a separation of logic from data, cache, and state.
So how do you go about being an effective event-source provider in this particular environment? Nair mapped out three key considerations.
Start with a good thought process about what your scenario is for the event itself. “If it needs your service to be involved in it,” Nair said, “make it a notification.” If it’s something you can just pass forward and stay completely disconnected, put the payload in the notification.
Be smart about what goes into your payload, he said. Don’t overstock it with information you don’t need. As a baseline, all events must contain provenance information: Who’s the source? When did it happen? What was the relevance of the event? Nair pointed out that timestamp information is particularly useful.
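As a rough illustration, the provenance baseline Nair describes can be sketched as a small event envelope. This is a minimal Python sketch with hypothetical field names (`source`, `type`, `timestamp`, `detail`), not an AWS-defined schema:

```python
import json
import time
import uuid

def make_event(source: str, event_type: str, detail: dict) -> dict:
    """Build an event envelope carrying baseline provenance:
    who produced it, when it happened, and why it is relevant."""
    return {
        "id": str(uuid.uuid4()),    # unique id for this event
        "source": source,           # who's the source?
        "type": event_type,         # what was the relevance of the event?
        "timestamp": time.time(),   # when did it happen?
        "detail": detail,           # keep this lean -- don't overstock it
    }

event = make_event("orders-service", "order.created", {"order_id": "o-123"})
print(json.dumps(event, indent=2))
```

The `detail` field is deliberately separate from the envelope, so consumers can rely on the provenance fields without parsing the payload.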
What happens next depends on your philosophy of the event scenario. One path is to use the event as a notification. The event tells you that something interesting has happened. One of Nair’s teammates calls this the “passive-aggressive notification,” he said, “in the sense that, ‘I’ll tell you something happened but if you want to know more you have to come back and talk to me. I’m not going to tell you what happened.’”
The trade-off, said Nair, is that you have much more live communication across the services. You also end up with tighter coupling, because each downstream service is now aware of the one upstream. The other important trade-off is the potential doubling of traffic on your event source. “Are you okay with that additional traffic coming back to you?” he asked.
This model assumes that the function can talk back to the original service and ask what was interesting about the event. The second model is the opposite: you can’t talk back to the event source. This makes sense in a connected-devices story, Nair said, where an Internet of Things device is creating the event, or where your service doesn’t have a public endpoint that consumers can call back to.
In these scenarios, a developer will put the payload on the event itself. The downside is that you are handling much more data. And, Nair pointed out, you now have to consider the security constructs around what actually goes into the event and how it gets passed forward.
So if you’re a provider, he said, put some thought into which of these models matters to you. Is the information you’re passing to services downstream valuable enough to store in the payload itself? Or would you require them to talk back to you? It could be something as simple as a call-back URL in the event: “Hey, this is how you call back and talk to me if you need to, when you’re ready.”
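The two styles can be contrasted in a short sketch. This is an illustrative Python example, not a prescribed format; the field names and the callback URL shape are assumptions:

```python
def notification_event(source, event_type, resource_id, callback_base):
    """'Passive-aggressive' style: announce that something happened and
    include a call-back URL the consumer can use to fetch the details."""
    return {
        "source": source,
        "type": event_type,
        "resource_id": resource_id,
        # "this is how you call back and talk to me if you need to"
        "callback_url": f"{callback_base}/events/{resource_id}",
    }

def state_transfer_event(source, event_type, payload):
    """Opposite style: the consumer cannot talk back (e.g. an IoT device),
    so the full payload rides on the event itself."""
    return {
        "source": source,
        "type": event_type,
        "payload": payload,  # more data in flight; mind what goes in here
    }
```

The first keeps events small at the cost of coupling and return traffic; the second stays decoupled but carries (and must secure) the whole payload.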
Events don’t persist on their own, explained Nair, and event streams will probably be critical components moving forward. Although there are scenarios where they’re optional, queues and streams give you a durable and reliable way for events to be replayed, recreated, and otherwise moved forward.
“If you have state transfer and your event source doesn’t have a concept of state,” Nair said, “it needs to land somewhere.” The best solution is an event store in the middle. This is an important decision and not something to be taken lightly, he said.
On the plus side, an event store provides enhanced durability and retention. With an event store, events are available even if the event source is down. Since the events are still there, they can be revisited, replayed, and used to rehydrate production stores.
The quintessential scenario here, he said, is the event-sourcing model: changes are written as events to a durable store that you can later go back and replay.
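The replay-and-rehydrate idea can be shown with a toy event store. This is a minimal in-memory sketch of event sourcing, not any particular AWS service; the account-balance domain is an invented example:

```python
class EventStore:
    """Append-only log: events survive independently of the source and
    can be replayed later, even if the original producer is down."""
    def __init__(self):
        self._log = []

    def append(self, event):
        self._log.append(event)

    def replay(self):
        yield from self._log

def rehydrate_balance(store):
    """Rebuild (rehydrate) current state purely from the event history."""
    balance = 0
    for event in store.replay():
        if event["type"] == "credit":
            balance += event["amount"]
        elif event["type"] == "debit":
            balance -= event["amount"]
    return balance

store = EventStore()
store.append({"type": "credit", "amount": 100})
store.append({"type": "debit", "amount": 30})
print(rehydrate_balance(store))  # 70
```

Because the log is the source of truth, the derived state can be thrown away and rebuilt at any time by replaying the events.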
The downside is the need for additional storage. Data transfer and retention also add the complexity of another function, so there is a service cost as well.
There are two cases where an event store is not needed. The first is if you are using the S3 model, or another event source that has persistent storage. The second is when state is passed back directly. “This is a fancy way of saying: if it’s a synchronous invocation, don’t bother,” he said.
There are two basic event store models: streams and queues. The advantage to the streams model is that your events can be processed in order, and multiple consumers can be processing the same event list.
The downsides: you have to deal with sharding, or the distribution of your events across all the different ordered collections in your suite. Concurrency is restricted, and routing and filtering rules are more complex.
The queue acts as a durable store, Nair said. So in cases where a customer’s entire fleet is down, the unprocessed data is not lost. Because the events remain on the queue, you get better scaling and ease of use, and concurrent processing of events.
The downsides to consider are that the order of events is not guaranteed and the number of consumers is limited.
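The stream-versus-queue trade-off above can be made concrete with two toy models. This is an in-memory Python sketch of the semantics only (per-consumer offsets versus take-once delivery), not a real messaging system:

```python
from collections import deque

class Stream:
    """Ordered, replayable log: each consumer tracks its own offset,
    so multiple consumers can process the same events, in order."""
    def __init__(self):
        self._log = []
        self._offsets = {}  # consumer name -> next unread position

    def publish(self, event):
        self._log.append(event)

    def poll(self, consumer):
        offset = self._offsets.get(consumer, 0)
        batch = self._log[offset:]
        self._offsets[consumer] = len(self._log)
        return batch

class Queue:
    """Each event is delivered to exactly one consumer: easy scaling and
    concurrency, but no fan-out and no ordering guarantee across workers."""
    def __init__(self):
        self._items = deque()

    def publish(self, event):
        self._items.append(event)

    def take(self):
        return self._items.popleft() if self._items else None
```

With the stream, two consumers polling independently both see the full ordered history; with the queue, whichever worker takes an event removes it for everyone else.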
Nair said, “It’s valuable to consider: do I expose, say, an event stream or an event queue, which any consumer can then go and publish off of?”
The third decision regards the possible use of routers. The developer needs to think about which capabilities require a router to be present, Nair said. A lot of EDA capabilities are implicitly bound to the downstream actor, and in many cases service providers embody these capabilities in one form or another. Lambda allows you to run a pub/sub construct on top, which gives you the ability to securely associate multiple sources with multiple destinations.
In a world where events are flowing all over the place, he said, it’s essential to make sure that anyone emitting an event is authorized to do so, and that anyone consuming an event is authorized to do so. Also verify that whoever is emitting the event is who they say they are.
For example, he said, “you don’t want to have someone say ‘Here’s a hundred-thousand-dollar credit to my account’ and slip that into the event stream without anyone else knowing about it.” It’s critical to make sure the mapper itself can map securely between the two.
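The authorization check Nair describes can be sketched as a tiny router that refuses unauthenticated publishers. This is an illustrative Python model of the idea (hypothetical `allow`/`subscribe`/`publish` API), not Lambda's actual pub/sub mechanism:

```python
class Router:
    """Map event sources to destinations, rejecting publishers that are
    not authorized for the source they claim to be publishing as."""
    def __init__(self):
        self._routes = {}      # source -> list of handler callables
        self._publishers = {}  # source -> set of allowed publisher ids

    def allow(self, source, publisher):
        self._publishers.setdefault(source, set()).add(publisher)

    def subscribe(self, source, handler):
        self._routes.setdefault(source, []).append(handler)

    def publish(self, publisher, source, event):
        # Verify the emitter is who it says it is before routing:
        # an unknown publisher cannot slip an event onto this source.
        if publisher not in self._publishers.get(source, set()):
            raise PermissionError(f"{publisher} may not publish to {source}")
        for handler in self._routes.get(source, []):
            handler(event)
```

A real system would authenticate publishers cryptographically rather than by name, but the routing-time check, before fan-out to any destination, is the point of the sketch.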