More Lessons from Hackers: How IT Can Do Better
Kelly Shortridge is an advocate for better resiliency in IT systems. The author of Security Chaos Engineering: Sustaining Resilience in Software and Systems and a senior principal engineer at Fastly in the office of the CTO spoke at this year’s Black Hat conference. She explained why attackers are more resilient and what IT organizations can do to become more resilient and responsive.
Recently, The New Stack looked at Shortridge’s recommendations to leverage Infrastructure as Code and the Continuous Integration/Continuous Delivery (CI/CD) pipeline to become more resilient. In this follow-up post, we’ll look at the final lessons IT can take from attackers to improve its security posture:
- Design-based defense
- Systems thinking
- Measuring tangible and actionable success
Design-Based Defense: Modularity and Isolation
“The solutions that actually help with this aren’t the ones we usually consider in cybersecurity or at least traditional cybersecurity. We want to design solutions that encourage the nimbleness that we envy in attackers, we want to design solutions that help us become the best ever-evolving defenders,” she said. “The less dependent it is on human behavior, the better it is.”
She created the ice cream cone hierarchy of security solutions to demonstrate how organizations should prioritize security and resilience mitigations. As an example of a design-based solution, she pointed to Kelly Long’s push to use HTTPS as the default for Tumblr’s user blogs.
“That’s a fantastic example of a design-based solution,” Shortridge said. “She knew that security should be invisible to the end users, so we shouldn’t put the burden of security on end users who aren’t technical. I think she’s really ahead of her time.”
Instead of offloading that work onto end users and peers, IT should try to automate security and use design-based defense when possible. That means deliberately designing in modularity. Modularity allows structurally or functionally distinct parts to retain autonomy during periods of stress and allows for easier recovery from loss, Shortridge explained. A queue, for instance, adds a buffer, and message brokers can replay messages and make calls between services non-blocking.
“Message brokers and queues provide a standardization for passing data around the system. It also provides a centralized view into it,” she said. “What you get here is visibility, you can see where data is flowing in your system.”
Modularity also supports an airlock approach, so that if an attack gets through, it won’t necessarily bring your system down. She demonstrated an airlock between two services that talk to each other with a queue in between. The queue allows you to take the receiving service offline and fix it while service A continues to send requests, which the queue handles, allowing the system to stay available and functioning until the fix is put into place.
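The buffering behavior described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the setup shown in the talk: the `BufferedLink` class, the "service B" label for the receiving service, and the request names are all hypothetical, and a real deployment would use a message broker rather than a `deque`.

```python
from collections import deque


class BufferedLink:
    """Hypothetical in-memory queue sitting between service A and service B."""

    def __init__(self):
        self.queue = deque()
        self.consumer_online = True
        self.processed = []

    def send(self, request):
        # Service A only enqueues; it never waits on service B.
        self.queue.append(request)
        self.drain()

    def drain(self):
        # Service B consumes in arrival order, but only while it is online.
        while self.consumer_online and self.queue:
            self.processed.append(self.queue.popleft())


link = BufferedLink()
link.send("req-1")

link.consumer_online = False   # service B is taken offline for a fix
link.send("req-2")             # service A keeps sending
link.send("req-3")
backlog = len(link.queue)      # requests held while B is down

link.consumer_online = True    # the fixed service B comes back
link.drain()                   # the backlog is processed in order
```

Service A never blocks or fails while service B is down; the cost is a backlog that has to be worked off once the fix is deployed.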
“Modularity, when done right, minimizes incident impact because it keeps things separate,” she said. “Modularity allows us to break things down into smaller components and that is much harder for attackers not only to persist if it’s ephemeral, it makes it harder for attackers to move laterally and gain widespread access in our system.”
Mozilla and UC San Diego have used this approach and have reported they no longer have to worry about zero-day attacks because these sandboxes of components give them time to roll out a reliable fix without taking the system down, she added.
Systems Thinking
Repeatedly, speakers at Black Hat said attackers are “system thinkers.” Shortridge reiterated this in her talk.
“Attackers [think] in systems, while defenders [think] in components, [which is] especially apparent when I talk to security teams, and thinking about how traffic and data flows between surfaces, that’s often overlooked,” Shortridge said. “We’re so focused as an industry on ingress and egress that we miss how services talk to each other. And by the way, attackers love that we missed this.”
Attackers tend to focus on one thing: your assumptions. You assume parsing the string will always be fast, or the message that shows up on this port will always be post-authentication, or an alert will always fire when the malicious executable appears. But will it really? Attackers will test your assumptions and then keep looking to see if you’re just a little wrong or really wrong, she said.
“We want to be fast, ever-evolving defenders, we want to refine our mental models continuously rather than waiting for attackers to exploit the difference between our mental models and reality,” she said. “Decision trees and resilience stress testing can help us do just that.”
Decision trees can help find the gaps in your security mitigations, she said, and force IT to examine the “this will always be true” assumptions before attackers do. Resilience stress tests, called chaos engineering in security circles, build upon decision trees, helping to identify where systems can fail.
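One way to picture how a decision tree exposes gaps is to model attacker actions as nodes and list every path that no mitigation covers. The following is a simplified sketch under stated assumptions, not Shortridge's own tooling: the `Step` type, the `unmitigated_paths` helper, and the example tree are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One attacker action, with any mitigations that apply to it."""
    action: str
    mitigations: list = field(default_factory=list)
    children: list = field(default_factory=list)


def unmitigated_paths(step, path=(), covered=False):
    """Return root-to-leaf paths on which no step has a mitigation."""
    path = path + (step.action,)
    covered = covered or bool(step.mitigations)
    if not step.children:
        return [] if covered else [path]
    gaps = []
    for child in step.children:
        gaps.extend(unmitigated_paths(child, path, covered))
    return gaps


# Hypothetical tree: two routes to the same goal, only one of them defended.
tree = Step(
    "attacker wants customer data",
    children=[
        Step(
            "call public API",
            mitigations=["authentication"],
            children=[Step("read customer data")],
        ),
        Step(
            "call internal service directly",
            children=[Step("read customer data")],
        ),
    ],
)

gaps = unmitigated_paths(tree)
```

Here `gaps` contains only the internal service-to-service route, the kind of path that is easy to miss when attention stays on ingress and egress.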
“Chaos engineering seeks to understand how disruptions impact the entire system’s ability to recover and adapt,” she said. “It appreciates the inherent interactivity in the system across time and space. So it means we’re stress testing at the system level, not the component level as you usually do. It forces you to adopt a systems mindset.”
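As a rough illustration of a system-level stress test, the sketch below injects a fault into one part of an alerting pipeline and checks whether the assumption "an alert will always fire" still holds for the system as a whole. Everything here is hypothetical and heavily simplified: the pipeline, the `detector` and notifier functions, and the event fields are stand-ins, and real chaos experiments run against real infrastructure.

```python
def run_pipeline(event, detector, notifier):
    """Hypothetical alert pipeline. Returns True only if an alert was delivered."""
    if not detector(event):
        return False
    try:
        notifier(event)
        return True
    except ConnectionError:
        # Detection succeeded, but nobody was told about it.
        return False


def detector(event):
    return event.get("type") == "malicious_executable"


def healthy_notifier(event):
    return None  # delivery succeeds


def failing_notifier(event):
    # Injected fault: the notification channel is unreachable.
    raise ConnectionError("notification channel down")


event = {"type": "malicious_executable", "host": "build-01"}

baseline = run_pipeline(event, detector, healthy_notifier)
under_fault = run_pipeline(event, detector, failing_notifier)

# The assumption under test: an alert is delivered in both conditions.
assumption_holds = baseline and under_fault
```

The detector works in both runs, so a component-level test would pass. Only the end-to-end check shows that the alert never arrives when the notification channel is down.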
Measuring Tangible and Actionable Success
Attackers have another advantage — they can measure success and receive immediate feedback on their metrics. Attacker metrics are straightforward: Do they have access? How much access do they have? Can they accomplish their goal? Security vendors, by contrast, often struggle to create lucid, actionable metrics — especially metrics that offer immediate feedback, she said.
“We want to be fast ever-evolving defenders, we need system signals that can inspire quick action, we need system signals that can inform change,” she said. “It turns out reliability signals are friends here, they’re really useful for security.”
IT security teams should learn and use the organization’s observability stack, she advised. Reliability signals can even help detect the presence of attackers, she added.
“Again, attackers monitor the system they’re compromising to make sure they’re not tipping off defenders, or tripping over any sort of alert thresholds. So in the resilience revolution, we want to collect system signals, too, so we can be fast and ever-evolving right back,” she said.
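To make the idea of reusing reliability signals concrete, here is a minimal sketch that flags time windows where an error rate departs sharply from its recent baseline. The function name, the three-sigma threshold, and the sample numbers are illustrative assumptions, not figures from the talk. In practice these values would come from the organization's existing observability stack.

```python
from statistics import mean, stdev


def unusual_windows(baseline, recent, sigmas=3.0):
    """Return indexes of recent windows whose value exceeds mean + sigmas * stdev."""
    threshold = mean(baseline) + sigmas * stdev(baseline)
    return [i for i, value in enumerate(recent) if value > threshold]


# Hypothetical per-minute error rates for one service.
baseline_error_rates = [0.010, 0.012, 0.011, 0.009, 0.010, 0.013]
recent_error_rates = [0.011, 0.010, 0.080, 0.012]

flagged = unusual_windows(baseline_error_rates, recent_error_rates)
```

A flagged window is not proof of an attack. It is a fast signal that something changed in the system and is worth a look, whether the cause turns out to be a bad deploy or an intruder probing a service.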