Your Security Just Might Kill Your Serverless
Let me start with an anecdote. In the midst of a fascinating discussion with the security person in a large company that has embraced serverless, I asked her how it came about that the security organization doesn’t own the security controls for the application. “How do you guys let the developers own the IAM roles and VPC decisions?” I asked. Her answer was astonishing but also enlightening: “We tried, but the developers threatened they would all leave.”
My instinct, and perhaps yours, was to think, “Oh my God, these developers have really taken over the world.” It’s easy to attribute this to some sort of megalomania and turn it into a story about how hard it is to find a good developer these days. But that would be entirely missing the point, and the point is important.
The reality is far more illuminating. To understand it, you need to go back to the basics of serverless. There are several key values that companies tend to connect with when it comes to serverless, including lower cost of compute, simplified operations, and infinite scaling. Ultimately, though, the biggest gains tend to come from application velocity. The move to serverless tends to create DevOps environments that are agile to the extreme, environments where developers often release bug fixes and new features every few days or even hours.
This increase in application velocity is not lost on users of these applications, who have grown to expect new features and improvements to come at a staggering pace. Serverless helps deliver this pace, but it comes with some costs. Managing the process and ever-shifting mountains of code is becoming a challenge for many organizations. Monitoring and troubleshooting can also be a challenge.
Security is also at a crossroads. The developers in the story were not drunk on their own power. They were simply stating the obvious. The traditional way we did security, where developers wrote their code and packaged their workloads, and security operations put security controls around those workloads, just won’t work for serverless. Developers can’t possibly keep up with the hyper-accelerated velocity they themselves created if they need to wait on security to open ports or to approve changes to IAM roles and security groups for them.
Instead, where they can, they have grabbed the reins, and this leaves security organizations trying to play catch up and decide how to react. One option is to accept this fate, and try to focus on monitoring and alerting on issues, but we all know that security whack-a-mole is a lousy place to be. The other is to try and reclaim control, but if there is one rule of security, it’s to make sure you are aligned with the business, and slowing down developers is usually not aligned with the business.
So you’re damned if you do and damned if you don’t. Unless you try a third option, which is to embrace serverless for what it is. Find the balance where developers don’t own security, but they aren’t absolved from responsibility either. Redesign how security controls are applied, so that developers have their control, but also are prevented from doing things that create risk. Use tools that let security apply the controls that they need, but in a way that doesn’t interfere with developers’ ability to change their code rapidly. Create processes that make developers aware of the security risks that their code or configuration creates, in a way that helps them resolve those risks at the speed of serverless.
Perhaps an example will make this more concrete. Let’s take a simple case, but one that can have a significant impact. Serverless functions have a timeout. Developers can configure this timeout to anything from a tenth of a second to many minutes. As a developer, my instinct is often to set the timeout to the maximum, since I don’t pay for the time I don’t actually use. As a security operator, my instinct is to shrink this timeout to the minimum necessary, as that makes attackers’ lives harder if they attack our application: the shorter the timeout, the less time any malicious code has to run inside a hijacked invocation.
The old-world options are these: either developers own timeouts, and security people can only detect the problem later and put it in a report, or security people control timeouts, and developers must negotiate every timeout change as they modify their functions, which slows down the process.
The serverless option is to have tools that automatically identify this sort of discrepancy, and communicate with both the developer and the security operator. The developer gets all the information that supports making a better security decision, including what the problem is, how we know it’s a problem, why it creates a security risk, and how the problem can be mitigated. The security operator gets tools to track the problem and communicate with the developer about how it will be handled, along with, perhaps, the option to apply remediation from the security level if necessary, or reconfigure security around the problematic function until the problem is properly sorted out.
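As a rough illustration of the kind of check such a tool might run, here is a minimal Python sketch. Everything in it is an assumption made for the example: the names FunctionStats, Finding and audit_timeouts are hypothetical, the observed durations are presumed to come from whatever monitoring you already have, and the 2x headroom factor and one-second floor are arbitrary starting points rather than recommendations.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FunctionStats:
    """Per-function data, assumed to be gathered from config and monitoring."""
    name: str
    configured_timeout_s: float     # timeout the developer set
    observed_max_duration_s: float  # longest run seen in monitoring


@dataclass
class Finding:
    """What both the developer and the security operator get to see."""
    function: str
    configured_timeout_s: float
    suggested_timeout_s: float
    what: str
    why: str
    fix: str


def audit_timeouts(stats: List[FunctionStats],
                   headroom: float = 2.0,
                   floor_s: float = 1.0) -> List[Finding]:
    """Flag functions whose timeout far exceeds what they actually need."""
    findings = []
    for s in stats:
        # Leave headroom above the slowest observed run, never below the floor.
        suggested = max(floor_s, s.observed_max_duration_s * headroom)
        if s.configured_timeout_s > suggested:
            findings.append(Finding(
                function=s.name,
                configured_timeout_s=s.configured_timeout_s,
                suggested_timeout_s=suggested,
                what=(f"timeout is {s.configured_timeout_s:g}s but the longest "
                      f"observed run took {s.observed_max_duration_s:g}s"),
                why="a long timeout gives injected code more time to run",
                fix=f"lower the timeout to about {suggested:g}s",
            ))
    return findings


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    sample = [
        FunctionStats("resize-image", 900, 4.0),
        FunctionStats("send-email", 3, 2.0),
    ]
    for f in audit_timeouts(sample):
        print(f"{f.function}: {f.what}; {f.why}; {f.fix}")
```

The point of the sketch is the shape of the output rather than the threshold logic: each finding carries the what, the why and the fix, so the developer can act on it without a negotiation, and the security operator can track the same record.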
The old us-vs.-them world is gone, both because it was never that good to begin with and because it fundamentally doesn’t work in serverless. Embrace the new world, where each team owns its part of the security puzzle but has tools and processes that support collaborating on prevention, detection and remediation with its peers, and your organization will get the most out of the move to cloud native and serverless.