DevSecOps: Security Automation in Enterprise DevOps
Another day, another portmanteau. DevSecOps — an expensive target on AdWords — tries to fit security into the DevOps process. It’s kind of silly because of course companies should be factoring security into their development, particularly when much of DevOps is about enterprises releasing applications faster.
Amazon Web Services’ Senior Solutions Architect Margo Cronin kicked off her talk at the European DevOps Enterprise Summit by saying that she personally doesn’t like the term DevSecOps.
The term DevSecOps “has always struck me like the last kid getting on the bus and there’s no seat available. We are treating security as an afterthought. Security has never been an afterthought with any customer I dealt with — in financial services or now at Amazon Web Services. I feel like the name doesn’t reflect the importance,” Cronin said.
In fact, with the new European regulations of GDPR, she says privacy by design and privacy by default are built right in.
“It nearly mandates you should be doing DevSecOps,” she said.
So, in the end, what is DevSecOps to Cronin? Security automation as a priority, early and often. And coming from Amazon, she speaks from security automation experience. After all, AWS released almost 1,500 key features and new services in 2017. Amazon.com performs a production deployment once every 11.6 seconds, to an average of 10,000 hosts simultaneously. How do AWS and Amazon.com do DevOps but still retain core security practices?
“If there was a security problem, it would make the evening news,” Cronin said.
In the end, you can call it DevSecOps or Rugged IT, as agile and kanban enthusiasts call it, but AWS’s name for it is pretty accurate: security automation at cloud scale.
Security Automation Evades Human Error
Cronin says that Amazon has embraced DevSecOps “so passionately” because it needed DevOps to move fast and, given its many partners, that meant automating security first in order to avoid these three human pitfalls:
- People make mistakes. Cronin asked the audience to imagine they were engineers working late at night, with a Severity One in production. “You have an IT C-level stakeholder telling you to get the service back online. You are on hour five of a Severity One call. You are on a Slack channel with 40 people, 38 of whom are not really contributing. You are on cup of coffee number seven. You make a change in production to resolve this issue.” She said that under these common circumstances, you are more prone to make an error than in a business-as-usual scenario, and maybe you forget to document the change and the next release overwrites the fix. “Humans make mistakes and when you’re under pressure you’re more likely to make mistakes.”
- People bend the rules. Then she shared a common, well-meaning case: people bend the rules in an effort to be helpful and to collaborate, like when you have scheduled a big release (and release party) and everyone’s ready to celebrate and it’s almost there, so you say: “We’re just going to get it out. We’ll do the release and fix that tomorrow. People will ask you to bend the rules from a place of goodness, but these create gaps in your product landscape.”
- People act with malice. “While attacks like DDoS are automated there is invariably a human behind the scenes instigating that attack.”
When a mistake is written into code or an automated process, it is repeated consistently, which creates patterns that are easily diagnosed.
Machines don’t make mistakes, bend the rules, or act with malice, which is why Cronin argues that automating security tasks must be your biggest priority for successful DevOps.
Four Steps to Enable Security Automation at Scale
Next, Cronin outlined four steps toward security automation at scale, noting that “It’s not exhaustive but four steps I’ve seen companies do that adds value to DevSecOps process.”
#1: Establish Your Level of Trust
For this, she uses what she calls her bar, or spectrum, of the trust you place in your cloud or on-premises provider. Cronin points out that large, distributed organizations with defined security processes and governance typically have low or zero trust. These low-trust companies want their own managed keys and their own hardware security modules. They are at one end of this spectrum.
At the other end of this spectrum lie startups and e-commerce platforms that use all the services of their cloud service provider.
“It doesn’t matter where you are in the spectrum, you can still get the value of the cloud, but the point of trust that you have is congruent to the amount of automation you need to implement,” she said.
Cronin gave the example of transferring trust with more automation. The zero-trust end could be a company deploying native Kubernetes and managing the master nodes (scaling and distributed consensus), the worker nodes, and all the security itself.
At the other end of the spectrum, she offered an example in which a customer with higher trust uses an AWS service called Amazon Elastic Kubernetes Service (EKS).
“The complexity of standing up your own Kubernetes control plane is simplified. Instead of running the Kubernetes control plane in your account, you connect to a managed Kubernetes endpoint in the AWS cloud. This endpoint abstracts the complexity of the Kubernetes control plane — your worker nodes can check into a cluster, and you can interact with your Kubernetes cluster through the tooling you already know and love. By default, [in this scenario with AWS] Kubernetes role-based access control [RBAC] is on, volumes are automatically encrypted, and AWS does the certificate management.”
Here Cronin is referencing a common concern around Kubernetes, which has RBAC turned off by default. This is what happened last spring to Tesla when someone was able to hack its control plane because nobody had enabled RBAC. By putting more trust into Amazon’s Elastic Kubernetes Service, she says, you get Kubernetes RBAC turned on automatically, native integration with AWS, and managed master nodes.
She says that “No matter where you are on the trust scale, plan to integrate security automation, but remember that creating this automation will also take DevOps team time.”
This involves mapping the tooling based on where you are on the illustrated trust bar. The lower the level of trust, the higher the level of security automation the DevOps team needs to implement. The higher the level of trust, the more the cloud provider can automatically manage and automate for you. This impacts how quickly you are going to release your minimum viable product. Also, the less trust, the more you have to plan your security ahead, like making sure your RBAC is on.
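For a low-trust team running its own control plane, "making sure your RBAC is on" can itself become an automated check. The following is a minimal sketch, not a method from Cronin's talk: it assumes you can obtain the kube-apiserver command-line arguments as a list of strings and inspects the `--authorization-mode` flag. The function name `rbac_enabled` is hypothetical, and a missing flag is deliberately treated as a failed check.

```python
def rbac_enabled(apiserver_args):
    """Return True if the --authorization-mode flag lists RBAC.

    apiserver_args is assumed to be the kube-apiserver argument list,
    e.g. as read from a static pod manifest (how you obtain it is up to you).
    """
    for i, arg in enumerate(apiserver_args):
        if arg.startswith("--authorization-mode="):
            modes = arg.split("=", 1)[1]
        elif arg == "--authorization-mode" and i + 1 < len(apiserver_args):
            modes = apiserver_args[i + 1]
        else:
            continue
        # The flag takes a comma-separated list such as "Node,RBAC".
        return "RBAC" in [m.strip() for m in modes.split(",")]
    # Flag absent: treat as a failed check rather than guessing.
    return False
```

A pipeline step could call `rbac_enabled(...)` and refuse to promote a cluster configuration for which it returns `False`.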
#2: Security by Design
Cronin contends that in DevSecOps, every team member feels the responsibility of a security owner; security is no longer a team in another building that is just a stakeholder to your project. Just as DevOps tears down the silos between developers and operations, the same must happen for security.
With DevSecOps, sprints can be based on security needs, breaking epics down into functional security stories. This security process used to take a couple of months with on-premises hosting, with complicated Waterfall epics. Now she says that, with any cloud service provider, you can spin up the Web app firewall in sections, which usually are:
- Identity and access management
- Logging and monitoring
- Incident response
- Infrastructure security
- Configuration and vulnerability analysis
- Securing the continuous integration and continuous deployment (CI/CD) pipeline
- Data protection
You then use the same dynamic CI/CD pipeline to roll out your security features as you would with the rest of your DevOps.
Security by design also means having security-related acceptance criteria. Continuing with the example of GDPR, which puts a lot of priority on users owning their own data: when a user logs into the system, you have to demonstrate how that data can be deleted and how you can actually port it over. Security automation has to test if that is possible and, ideally, document it all.
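A security-related acceptance criterion of this kind can be written as an automated check. The sketch below is illustrative only: `UserStore` is a hypothetical in-memory stand-in for the real system, and `check_acceptance_criteria` is an assumed name. It exercises the two behaviors mentioned above (exporting a user's data and deleting it) and returns a small report that could be stored as documentation of the run.

```python
import json


class UserStore:
    """Hypothetical in-memory stand-in for the system under test."""

    def __init__(self):
        self._records = {}

    def save(self, user_id, data):
        self._records[user_id] = dict(data)

    def export(self, user_id):
        # Portability: hand the user's data back in a machine-readable format.
        return json.dumps(self._records[user_id], sort_keys=True)

    def erase(self, user_id):
        # Deletion: remove everything held for this user.
        self._records.pop(user_id, None)

    def has_data(self, user_id):
        return user_id in self._records


def check_acceptance_criteria(store):
    """Run both checks against a store and return a report dict."""
    original = {"email": "a@example.com", "name": "Ada"}
    store.save("u1", original)

    # Criterion 1: the exported data round-trips to what was stored.
    portable = json.loads(store.export("u1")) == original

    # Criterion 2: after erasure, nothing remains for that user.
    store.erase("u1")
    erased = not store.has_data("u1")

    return {"portable": portable, "erased": erased}
```

In a real pipeline the same two assertions would run against a test deployment rather than an in-memory class, and a failing report would block the release.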
#3: Securing the Pipeline
Cronin says the example above and much of this security automation advice can be applied on-premises as well, covering both the security of your CI/CD pipeline (including access roles and hardening of build servers and nodes) and the security in it (including artifact validation and static code analysis).
She suggested enlisting the help of Git Secrets, a set of DevSecOps open source resources, and leveraging these tools and the cloud for such important CI/CD steps as:
- Authentication and validation in your repositories
- Logging across the entire environment
- Sending build reports to developers and stopping everything if the build fails
- Sending build reports to security and stopping everything if an audit or validation fails
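The last two steps in the list above amount to a gate: report to the right audience and halt on failure. As a rough sketch under stated assumptions, the code below pairs that gate with a very simple secret scan in the spirit of tools like Git Secrets. It is not that tool's implementation: the regular expression is a simplified, assumed pattern for strings that look like AWS access key IDs, and `scan_for_secrets` and `run_gate` are hypothetical names.

```python
import re

# Simplified, assumed pattern: "AKIA" followed by 16 uppercase letters or digits.
# Real scanners use broader pattern sets; this is for illustration only.
ACCESS_KEY_PATTERN = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def scan_for_secrets(files):
    """Scan a mapping of path -> text; return (path, line_number) findings."""
    findings = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if ACCESS_KEY_PATTERN.search(line):
                findings.append((path, number))
    return findings


def run_gate(build_ok, files):
    """Return (proceed, reports).

    reports is a list of (audience, message) tuples. The pipeline may
    proceed only when there is nothing to report.
    """
    reports = []
    if not build_ok:
        reports.append(("developers", "build failed"))
    findings = scan_for_secrets(files)
    if findings:
        reports.append(("security", f"possible secrets at {findings}"))
    return (not reports, reports)
```

The design choice here is that any report stops the pipeline, so a failed build and a failed validation are handled the same way but routed to different audiences.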
Cronin also spoke about Infrastructure as Code, moving away from the classic horizontal stages of Develop > Integration Test > User Acceptance Testing > Push to Production and towards the verticals of Networking, Security, Applications, Logging, and Monitoring.
“Then if you believe a part of a stage is compromised, you can tear it down so quickly and then recover. You could take down the UAT [user acceptance testing] security block and leave the rest of UAT intact,” she said.
Combined with the vertical stages above, this security automation makes your infrastructure highly immutable and reduces your blast radius.
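One way to picture the vertical blocks is as a grid of independently replaceable units, one per stage and vertical. The toy model below is only a sketch of that idea, not real Infrastructure as Code tooling: the stage names, the `generation` counter, and the functions `build_environment` and `replace_block` are all assumptions made for illustration.

```python
import copy

STAGES = ("dev", "integration", "uat", "production")
VERTICALS = ("networking", "security", "applications", "logging", "monitoring")


def build_environment():
    """Deploy every block at generation 1: one block per (stage, vertical)."""
    return {stage: {v: 1 for v in VERTICALS} for stage in STAGES}


def replace_block(env, stage, vertical):
    """Tear down one block and redeploy it from code as a new generation.

    Returns a new environment and leaves the original untouched, mirroring
    the idea of replacing infrastructure rather than patching it in place.
    """
    if stage not in env or vertical not in env[stage]:
        raise KeyError(f"unknown block: {stage}/{vertical}")
    updated = copy.deepcopy(env)
    updated[stage][vertical] += 1
    return updated
```

Calling `replace_block(env, "uat", "security")` changes only the UAT security block, which is the small blast radius Cronin describes.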
#4: Automate Responses
For security automation to work, you need to know what you are doing based on your log files. It all comes down to four questions surrounding your logs:
- When are you collecting logs?
- Why are you collecting logs?
- Where are you collecting logs?
- What are you doing based on your logs?
Take the example of someone switching off an AWS service. That action can send an automated event to your security team so they can look into the environment. It lets you decide whether the service was shut off by someone whose privileges are too high or whether it’s actually an event that needs looking into, and maybe servers need to be ring-fenced. Cronin pointed out how powerful logging has become and how logging in the cloud prevents more incidents.
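The decision described here can be sketched as a small triage function. This is an illustration under assumptions, not AWS's mechanism: events are modeled as plain dictionaries loosely shaped like CloudTrail records (`eventName`, `userIdentity.arn`), the watched event names are examples of "switching something off," and the function name `triage` and its three return labels are hypothetical.

```python
# Example event names for "someone switched a service off"; an assumed watch list.
WATCHED_EVENTS = {"StopLogging", "DeleteTrail"}


def triage(event, approved_principals):
    """Decide how to respond to a log event.

    event: dict loosely modeled on a CloudTrail record.
    approved_principals: set of ARNs expected to perform such actions.
    Returns one of three hypothetical labels.
    """
    if event.get("eventName") not in WATCHED_EVENTS:
        return "ignore"

    actor = event.get("userIdentity", {}).get("arn", "unknown")
    if actor in approved_principals:
        # A known person did it: check whether their privileges are too high.
        return "review-privileges"

    # Unknown actor: alert security and consider ring-fencing the servers.
    return "isolate-and-alert"
```

In practice the labels would map to concrete actions, such as opening a ticket for a privilege review or triggering an isolation runbook.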
In the end, when looking to automate security, it seems best to follow Cronin’s final words on the importance of the Sec in DevSecOps:
“If security is your most important job, you should look at automating those tasks and stories first, before anything else.”
Photos: European DevOps Enterprise Summit Twitter.