Using Machine Learning to Actively Secure Cloud Native Apps
Traditionally, system and application security has been viewed as playing defense: Encircle your castle with a moat and protect it from any and all invaders.
But the cloud makes the defensive, castle-and-moat approach to security obsolete. With more organizations than ever not only moving to the cloud but moving to multiple clouds — combined with on-premises servers and edge environments — there is no single “castle” to be defended.
Moreover, the castles are not only distributed geographically. When applications are built as containerized microservices and orchestrated by Kubernetes, those “castles” may also appear, multiply or disappear suddenly.
“Most of the first-generation, cloud native security solutions are failing,” noted Ratan Tipirneni, president and CEO of Tigera, a security and observability company.
Though such tools have been an improvement over legacy firewalls, they haven’t matched the scope of the problems with security on the cloud, Tipirneni told The New Stack.
In a distributed cloud environment, security is a “multidimensional problem,” he said, and early cloud-security vendors took a “unidimensional” approach.
“Most of these vendors have come from the roots of vulnerability management and image scanning,” Tipirneni said. That factor is important, he added. “But that’s only one of eight elements that you need to worry about.”
The eight elements organizations need to cover in a Kubernetes-run cloud system, he said, include:
- Finding the vulnerabilities in the images at build time.
- Detecting misconfiguration in the images: Is SSH enabled in the images? Is the privilege set at the wrong level?
- Detecting misconfiguration in the Kubernetes environment.
- Detecting vulnerabilities and malware at runtime.
- Preventing exfiltration of data.
- Integration with firewalls: Kubernetes has dynamic IPs, while most firewalls are designed for static IPs.
- Finding new zero-day threats.
- A mitigation strategy that uses microsegmentation to prevent malware and ransomware from spreading.
Another circumstance that has made previous security strategies obsolete is the broad and rapid adoption of open source software. Nearly all enterprises use some open source components in their infrastructure and applications, and those components may carry unknown vulnerabilities. (Witness the recent Log4j example.)
To meet these challenges, Tigera announced on Feb. 10 significant enhancements to Calico Cloud, its cloud native application protection platform (CNAPP). The new version of Calico Cloud constantly scans images for vulnerabilities and misconfigurations, providing real-time observability of Kubernetes clusters.
It also employs machine learning to create a threat-defense system and integrated security policy to immediately mitigate malware so it doesn’t spread to other containers.
Zero Trust, More Automation
For enterprises and other customers, Tipirneni said, “the problem statement has been redefined” by the move to cloud and the heavy reliance on open source software components.
Organizations already know they have vulnerabilities in their systems, he said, and are demanding three things from cloud security these days, which require a more active approach:
- Reduction in the attack surface through zero trust.
- Detection of known and unknown threats.
- A remediation strategy for handling the period between finding a vulnerability and implementing a fix. As Tipirneni said, “If I don’t have a mitigation strategy, I’m faced with a hard choice of either shutting down my service application or system — or opening myself up to some potential breach, and I have to face the consequences.”
The principles of zero-trust security are based on the assumption that an invader is already in your system. Zero trust focuses on protecting sensitive infrastructure and data (known as the “attack surface”) by requiring authentication and authorization of users to access parts of that surface, usually based on their role in the organization or team and for a limited period of time.
One key aspect of zero trust is automation. Because human beings are fallible — sometimes forgetful, unreliable, absent, too trusting, lazy and/or prone to taking shortcuts in order to meet deadlines — many of the current generation of tools to enable zero-trust architecture automatically generate authentication credentials and authorization for access, rotating credentials and enforcing time limits.
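To make the idea of automatically issued, rotated and time-limited credentials concrete, here is a minimal Python sketch. It is illustrative only and does not reflect how any particular zero-trust product works: the `CredentialIssuer` class, its method names and the 15-minute default lifetime are all assumptions made for this example.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    user: str
    role: str
    token: str
    expires_at: float  # seconds, on whatever clock the caller supplies


class CredentialIssuer:
    """Hypothetical issuer of short-lived, role-scoped tokens."""

    def __init__(self, ttl_seconds: float = 900.0):
        self.ttl_seconds = ttl_seconds  # assumed default: 15 minutes
        self._active: dict[str, Credential] = {}

    def issue(self, user: str, role: str, now: float) -> Credential:
        # Issuing again for the same user replaces (rotates) the old token.
        cred = Credential(user, role, secrets.token_urlsafe(32),
                          now + self.ttl_seconds)
        self._active[user] = cred
        return cred

    def is_authorized(self, token: str, required_role: str, now: float) -> bool:
        # Access needs a current token, the right role and an unexpired lease.
        for cred in self._active.values():
            if secrets.compare_digest(cred.token, token):
                return cred.role == required_role and now < cred.expires_at
        return False
```

Because no human chooses, shares or remembers the token, the shortcuts described above (reused passwords, forgotten revocations) have less room to creep in: access simply lapses when the time limit passes or the token is rotated.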
But this is only one half of a zero-trust strategy: the defensive approach. An active approach to cloud native security involves observability: finding vulnerabilities and misconfigurations in Kubernetes clusters, so that they can be remediated quickly.
Machine Learning for Threat Mitigation
When a threat is detected, mitigation must start immediately to protect the rest of the system, and it must do so in a way that doesn’t force outages that cripple your business.
The Log4j debacle in late 2021 underscored the need for automated threat mitigation, Tipirneni said. “There was a nerve-wracking two-week period when the vulnerability was discovered, and there was no fix,” he recalled. “And the sad part is, every hacker on the planet was going in weaponizing this and using the Log4j vulnerability to actually inject malware.”
This period, he said, brought DevOps teams some hard choices: “Do I shut down the service completely?”
Calico Cloud now uses machine learning to generate security policy recommendations at runtime. “You can isolate that [vulnerability], or isolate any other parts of critical infrastructure, and really keep your service running until you know there’s a long-term fix and remediation.”
Log4j, however, is now a known vulnerability. What’s really scary, Tipirneni noted, are the zero-day threats — the unknown threats that lurk in your system and may be hard to detect until it’s too late.
Tigera has used Extended Berkeley Packet Filter (eBPF) technology to build powerful probes that detect file system, system call, process-level and network-level behavior of each workload in a system, using that information to establish a baseline of behavior. When a workload deviates from that baseline, Tipirneni said, “we then treat those as indicators of compromise or indicators of attack.”
Having flagged the potential threat, Calico Cloud then gives users a choice of actions to take with the compromised container: quarantine it, pause it, generate an alert that requires human intervention or even terminate it.
“We perform these in milliseconds, so we have near real-time response,” Tipirneni said.
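The baseline-and-deviation idea described above can be sketched in a few lines of Python. This is not Tigera's implementation, which the article says relies on eBPF probes and machine learning. It is a deliberately simple stand-in that treats the baseline as a set of previously observed events per workload. The class and function names, the event format and the severity ordering of the responses are all assumptions made for this example.

```python
from collections import defaultdict
from enum import Enum


class Action(Enum):
    ALERT = "alert"
    PAUSE = "pause"
    QUARANTINE = "quarantine"
    TERMINATE = "terminate"


# Assumed ordering, from least to most disruptive.
SEVERITY = [Action.ALERT, Action.PAUSE, Action.QUARANTINE, Action.TERMINATE]


class BehaviorBaseline:
    """Remembers which (kind, detail) events each workload normally produces."""

    def __init__(self):
        self._seen = defaultdict(set)

    def learn(self, workload, kind, detail):
        self._seen[workload].add((kind, detail))

    def deviations(self, workload, events):
        # Anything not seen during learning counts as a deviation.
        known = self._seen.get(workload, set())
        return [event for event in events if event not in known]


def choose_action(deviations, policy):
    """Return the most severe action configured for any deviating event kind.

    Event kinds missing from the policy default to an alert.
    Returns None when there are no deviations.
    """
    chosen = None
    for kind, _detail in deviations:
        action = policy.get(kind, Action.ALERT)
        if chosen is None or SEVERITY.index(action) > SEVERITY.index(chosen):
            chosen = action
    return chosen


baseline = BehaviorBaseline()
baseline.learn("web", "syscall", "read")
baseline.learn("web", "network", "10.0.0.5:5432")

observed = [("syscall", "read"), ("process", "/bin/sh")]
suspicious = baseline.deviations("web", observed)
response = choose_action(suspicious, {"process": Action.QUARANTINE})
```

Here the unexpected shell process is the only deviation, so the hypothetical policy selects quarantine. A production system would need a far richer model of normal behavior to keep false positives manageable.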
Security Tools to Help ‘Shift Left’
Calico Cloud is designed to help developers take more control and responsibility for the security of the apps they build, Tipirneni said.
“There’s a little bit of a fallacy that developers don’t care about security,” he said. “But every developer I know cares deeply about security, it’s just that the current state of security tools prevents them from doing their job. And that’s why they’re trying to work around it.”
Workarounds are rife and develop alongside onerous procedures for gaining access to infrastructure, according to a survey released in January by strongDM.
Of the 600 DevOps professionals surveyed, 53% said it can take anywhere from hours to weeks for them to gain the access they need after making a request. No wonder 65% said they use team or shared logins, and 42% share SSH keys.
Tigera’s approach, Tipirneni said, aims to give developers more control over setting security policies when they build their apps. But it balances that with the needs of the security group within an organization.
Giving devs more control over security policy could sound “scary” to SecOps professionals, he acknowledged. But a balance of power is baked into Calico Cloud, he said, with a tiered security system.
“We enable the developer to use role-based access controls to configure a set of security policies, yet we give the security team the right and the control to configure security policies that can protect the entire enterprise’s infrastructure,” Tipirneni said. “So that even if the developer either accidentally or maliciously does something, the blast radius is limited.”
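One way to picture such a tiered arrangement is as ordered layers of rules, with the security team's layer evaluated before the developers' layer. The Python sketch below is an illustration under stated assumptions, not Calico Cloud's actual policy model or API: the `Rule` type, the "pass" action, the "*" wildcard, the fall-through on no match and the default-deny fallback are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    source: str       # label of the calling workload, or "*" for any
    destination: str  # label of the target workload, or "*" for any
    action: str       # "allow", "deny" or "pass" (defer to the next tier)


def _first_match(rules: list[Rule], source: str, destination: str) -> Optional[str]:
    for rule in rules:
        if rule.source in (source, "*") and rule.destination in (destination, "*"):
            return rule.action
    return None


def evaluate(tiers: list[list[Rule]], source: str, destination: str) -> str:
    """Walk the tiers in order; earlier tiers take precedence over later ones."""
    for rules in tiers:
        verdict = _first_match(rules, source, destination)
        if verdict in ("allow", "deny"):
            return verdict
        # "pass" or no matching rule: let the next tier decide.
    return "deny"  # assumed default when no tier allows the traffic


security_tier = [
    Rule("checkout", "payments-db", "pass"),  # developers may decide this one
    Rule("*", "payments-db", "deny"),         # nobody else reaches the database
]
developer_tier = [
    Rule("checkout", "payments-db", "allow"),
    Rule("frontend", "payments-db", "allow"),  # mistake: overridden above
    Rule("frontend", "checkout", "allow"),
]
tiers = [security_tier, developer_tier]
```

In this sketch the developer's mistaken `frontend` to `payments-db` rule never takes effect, because the security tier denies that traffic first. That is the sense in which the blast radius of an accidental or malicious change stays limited.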