If you’re familiar with the way modern application orchestrators work, you know that a config file enables a developer to specify the “intent” of the application — the ideal operating conditions for the application, within reason. And if you know how security policy tends to work, you know it’s based on rules that are triggered by specific events. When a triggered event meets the criteria of a policy, the policy responds by executing a set of instructions.
Aporeto produces a service authorization platform — a way to govern the behavior of microservices sharing a platform, based upon policy. But as Aporeto CEO Dimitri Stiliadis admitted to The New Stack, some significant architectural changes need to be made to the entire microservices environment, including to the orchestrator — and the emerging habits of the people who manage it — if policy and intent are to finally come together.
Does the role of security in orchestration fall more under the purview of the applications developer or the security operator? Or should it eventually fall to someone in-between, whose role may not yet have been invented?
I don’t think security is something that belongs only to the developers, or only to the security operations teams. I think we have to understand that it’s more of a shared responsibility model. Because if application developers don’t think at all about security, or they don’t understand what security operations [teams] are doing, then obviously they are going to do things that try to bypass it. On the other hand, if security operations don’t understand what the applications are doing, then you end up with this game of Marco Polo: the security team is blindfolded, calling “Marco,” trying to follow the applications around.
I think that the answer to this is, we need a shared responsibility model. We need both the security teams to understand what applications are doing, and the applications teams to understand what security [teams] are doing. And that has been a big issue for the application deployment industry for the last 20, 30 years. Very often, the API — the interface between the applications, security, and operations teams — tends to be a spreadsheet. The security team will send a compliance spreadsheet, or a security spreadsheet, up to the applications team; the applications team will fill it out; then something changes, and they fill it out again.
By bringing the two teams onto the same platform — by giving security teams visibility into what applications are doing, and giving applications teams visibility into what the security rules are — they can work together. As part of what we are doing, we are trying to translate the spreadsheet into a real API.
The way we’re approaching this, we’re saying that the security team can do their basic job, and they can define a box, if you want. They can define essentially a set of rules that govern what is a proper deployment, at least from a security standpoint. And then applications developers are free to move within this box, within these business rules. And as long as they don’t violate any of these business rules, things are flowing smoothly.
Now, if they violate these rules, then what happens is, the application stops working. Somebody’s going to complain. So at this point, the applications developers need to see what security policy or what security decisions potentially impacted the application, or they can potentially see what they did that was a violation of the security policy, and essentially made the application stop working. This is “two-sided visibility,” if you want.
The second part is a big philosophical [issue], on the question that you asked: Do we delegate security to the system, or do we delegate security all the way to the applications? And there, I’m a little bit influenced by some old work that I’m not sure how many people have seen. It goes back to some of the development of Plan 9, and a concept called factotum. The argument there was that by delegating some important security functions — for example, the cryptographic functions — to the system and decoupling them from the applications, the security team can decide what the right cryptographic operations are, the right encryption algorithm to use. And they can update these at a much faster speed than updating all the applications.
[Author’s note: The “factotum” to which Stiliadis referred is part of the security architecture of Plan 9, the distributed operating system Bell Labs began building in the late 1980s as a successor to Unix, and which was later released as open source. In that architecture, factotum is an agent that holds the keys and represents the identity and interests of what we would call today the “security principal” — each individual user, including people and processes working on people’s behalf. Theoretically, placing heavy trust in factotum enabled the system to adopt a single protocol for security operations, while the functions behind that protocol remained adaptable to new purposes.]
Let me give you an illustration of this. If every application did its own implementation of encryption and cryptography functions, then if some library is wrong, in order to fix the problem, you have to go fix all the applications. If you decouple this and make it part of the system, then the applications are consuming these security services out of the system. If some library goes wrong, then by changing this particular library in the system, or by changing the service the system provides, then automatically, your applications are updated to the latest security standards.
So there is value in decoupling some of the security and the cryptography functions from the applications, because that allows you to keep up to speed with what is the right approach for the security functions, and not make more overhead for the development teams.
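Stiliadis’s decoupling argument can be sketched in a few lines of Python. This is a hypothetical illustration, not Aporeto’s implementation: the application depends only on a stable, system-owned interface, so the security team can rotate the algorithm behind it in one place.

```python
import hashlib

class DigestService:
    """System-owned cryptographic service; the algorithm behind the
    interface can be changed without touching any application code."""
    def __init__(self, algorithm: str = "sha256"):
        self._algorithm = algorithm

    def digest(self, data: bytes) -> str:
        # hashlib.new lets the platform pick the algorithm at runtime.
        return hashlib.new(self._algorithm, data).hexdigest()

# Application code consumes the service, never the algorithm itself.
def sign_record(service: DigestService, record: bytes) -> str:
    return service.digest(record)

# The security team upgrades the whole platform in one place:
legacy = DigestService("sha1")     # deprecated algorithm being retired
current = DigestService("sha512")  # rotated in without any app changes
```

Because `sign_record` never names an algorithm, swapping `legacy` for `current` in the platform updates every consumer at once — the point Stiliadis makes about fixing one library instead of every application.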
I would think, in order to have that ideal state you just described — where changing the library then updates the application — you have to have some agreement beforehand on a very generalized interface between the library and the application, so that whatever changes you make to the library, the application asks it to do the same job in the same way. In order to do what you’re looking for — a much more explicit interface between the security functions and the normal application functions — somebody has to take charge of the model-making. And I’m wondering which side of the fence that somebody is on: is that a developer, an operator, or some special, hybrid, half-Klingon, half-Vulcan we haven’t seen yet?
I don’t think it’s either/or. Let me give you a concrete example:
A security policy might be that, if an application has a high level of vulnerability, then either the developer has to patch it within seven days, or after seven days, it’s going to block access to the internet for this application. That is a business security policy.
Another business security policy can be: if I’m sending traffic between two availability zones in AWS or Google, or if I’m sending traffic between my data center and my deployment in Google, this traffic needs to be encrypted. These are very clear security policy requirements.
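Those two business rules can be expressed as data that a platform evaluates, rather than as rows in a spreadsheet. The following Python sketch is purely illustrative — the rule names and service fields are invented for the example:

```python
# Hypothetical encoding of the two policies Stiliadis describes:
# stale high-severity vulnerabilities lose internet access, and
# traffic that crosses availability zones must be encrypted.
SECURITY_POLICY = [
    {
        "rule": "block-internet-on-stale-vulns",
        "condition": lambda svc: (svc["vuln_severity"] == "high"
                                  and svc["days_unpatched"] > 7),
        "action": "block-internet",
    },
    {
        "rule": "encrypt-cross-zone-traffic",
        "condition": lambda svc: svc["src_zone"] != svc["dst_zone"],
        "action": "require-encryption",
    },
]

def required_actions(service: dict) -> list:
    """Return every enforcement action the policy triggers for a service."""
    return [r["action"] for r in SECURITY_POLICY if r["condition"](service)]

svc = {"vuln_severity": "high", "days_unpatched": 10,
       "src_zone": "us-east-1a", "dst_zone": "eu-west-1b"}
# required_actions(svc) → ["block-internet", "require-encryption"]
```

Because the rules are data, the security team can add or tighten one without redeploying any application — which is what makes the “box” enforceable by a system instead of a meeting.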
So a security team should be able to have a system define these types of requirements, and say, “Guys, as long as you’re not violating these things, I don’t really care what you are doing, you are good enough.” On the other hand, the applications team is building three microservices, and they know that this microservice needs to consume this other service from this place, and this other microservice needs to consume this other service from this place, and therefore I’m going to express this intent. In a Kubernetes environment, I’m going to express it in a Kubernetes policy; in another environment, I’m going to express it in some other policy. Because I know exactly what the application is doing, and as long as I’m telling you this, the application should be working.
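As one hypothetical rendering of that developer-side intent, here is how “only the checkout service may call the payments service” might be declared as a Kubernetes NetworkPolicy. The manifest is built as a plain Python dict for illustration (in practice it would be YAML applied to the cluster), and the service names and labels are invented:

```python
# Hypothetical example: a developer expresses the intent "only checkout
# consumes payments" as a Kubernetes NetworkPolicy manifest.
checkout_to_payments = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "allow-checkout-to-payments"},
    "spec": {
        # The policy governs the pods of the payments service...
        "podSelector": {"matchLabels": {"app": "payments"}},
        "policyTypes": ["Ingress"],
        # ...and admits ingress traffic only from checkout pods.
        "ingress": [
            {"from": [{"podSelector": {"matchLabels": {"app": "checkout"}}}]}
        ],
    },
}
```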
If you want information about my microservices — their interactions, what they are doing — that is essentially the intent of the application, expressed by the application developers. The business policy — “if you spread your application between data centers, you encrypt the traffic; if you have unpatched vulnerabilities, you don’t go to the Internet” — is essentially a security function, expressed by the security intent. Now, if you can put the two together in a system, and the system can apply the right policy to satisfy both requirements — and give an alarm to the developers if they are violating one of the security requirements, or give an alarm to the security team if the developers are violating the requirements — then everybody can move forward. We don’t have this finger-pointing where, “Oh, my application is not working, because you changed something,” or, “My security is violated because you didn’t do the right job.” That’s what we are trying to do here: bring these two functions together.
It sounds to me that what you’re describing is a kind of “just-in-time” arbitration. Maybe we don’t actually have to get the people together on every microservice/library interaction, and detail that interaction on some spreadsheet somewhere. Instead, why don’t we charge the developers with expressing very explicitly what they intend their application to do, and what they intend it to consume — a type of declaration? And then have the policy on the other side of the fence saying, “This is what a declaration should be allowed to do; this is what declarations cannot be permitted to do.” And then at the time of orchestration, when these things are being staged, then there can be something that says whether the intention that comes from X violates the expression that comes from Y. Am I on the same page with you?
Yes, you’re on the exact same page. Then because things are fluid and they change — like, a new vulnerability is found after the application is running — at runtime, somebody should also have the visibility of whether the intent is constantly met, whether the security policy is constantly met, or whether it’s being violated and we have to do something about it.
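That deploy-time arbitration can be sketched as a simple check of the declared intent against the security team’s “box.” All names below are hypothetical — this is an illustration of the idea, not Aporeto’s product:

```python
# The security team's "box": constraints any deployment must satisfy.
SECURITY_BOX = {
    "forbidden_destinations": {"internet"},
    "encrypt_cross_zone": True,
}

def violations(intent: dict) -> list:
    """Compare a developer-declared intent against the box and return
    human-readable reasons for any violation, visible to both teams."""
    problems = []
    for dest in intent["consumes"]:
        if dest in SECURITY_BOX["forbidden_destinations"]:
            problems.append(f"{intent['service']} may not reach {dest}")
    if (SECURITY_BOX["encrypt_cross_zone"]
            and intent["crosses_zones"] and not intent["encrypted"]):
        problems.append(
            f"{intent['service']} sends cross-zone traffic unencrypted")
    return problems

# A declaration that breaks both rules, caught at staging time:
intent = {"service": "reports", "consumes": ["internet", "billing"],
          "crosses_zones": True, "encrypted": False}
# violations(intent) lists both problems before anything stops working
```

Running the same check continuously at runtime — as new vulnerabilities surface or deployments change — is the “constant visibility” Stiliadis describes.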
So who is this somebody? A security operator? A CISSP? Or is this person essentially in the IT division, an operations professional?
I think all people should have visibility. Because who gets the call? Let’s assume I’m running an application, and something changes — because things change all the time. And then somebody will get the call if there is a failure. Something suddenly stops working. That’s the number one thing that people are afraid of — before they’re afraid of security or compliance or anything else, they’re afraid that something stops working.
So somebody gets the call when the thing stops working — usually an operations team, or someone like that. In this DevOps environment, this paging is very often also going to go to the developers, and it’s also going to go to the security team. Now, all of them should have the tools in place to see what is failing and why. “It’s failing because my API call between microservice A and microservice B is not working anymore.” Okay. Is it failing because a security policy got instantiated? Or maybe a microservice was identified as having a vulnerability? Or was identified at runtime as having been compromised? Or because somebody pushed an update to the microservice overnight and broke something?
Only by having the right visibility tools in place can people actually figure it out, and come to a fast decision. We’re again living in the time where we have DevOps teams, with developers and operations together; and that’s the whole concept. By providing the right tools for them, we can help them to resolve this paging. Nobody wants to be paged at two o’clock in the morning. That’s the whole point of this.
Aporeto is a sponsor of The New Stack.
Title image of construction workers is licensed under Creative Commons.