Confidential Compute on Azure with Kubernetes
Unless you’re extremely good at security, hyperscale cloud providers are probably better at security than your organization: they have more security expertise, they patch faster, they run background checks on admins, and they have strong operational security.
But there are still risks, whether from your own admins, attacks on cloud data centers or vulnerabilities in the guest or host OS — or just the fact that while data is encrypted at rest and in motion, it’s not usually encrypted while you’re using it.
Those risks might be enough to make the cloud unsuitable for extremely sensitive data and confidential workloads in regulated industries. Confidential computing keeps data encrypted even while it’s in use, in memory and during computation, so you stay in control of your data from the time it’s created until you delete it, and it’s never exposed to malicious insiders, admins or hackers, even if there are security vulnerabilities in the OS or hypervisor (assuming there are no bugs in the confidential computing stack itself).
That’s because the computation happens in a hardware-based trusted execution environment (TEE), which gives you verifiable assurance of data confidentiality and of data and code integrity. Memory and the data in it are encrypted, the code you run in the cloud is protected, and you can verify that it hasn’t been tampered with (the activity history is immutable and auditable too).
With confidential cloud, “data is in the control of the customer during its entire lifecycle, whether that’s at rest, in transit, or in use,” Vikas Bhatia, head of product for Azure Confidential Computing, explained at the recent Microsoft Ignite conference. “The cloud provider is outside the trusted computing base. The code that you’re running in the cloud is protected and verified by you, the customer, with remote attestation capabilities.”
“What we see today is our customers are looking to trust as little as possible,” he noted. “They want full control over the data lifecycle.”
Analyst firm Everest Group predicts that confidential computing will grow quickly and could become a standard for end-to-end security, particularly for the public sector and for enterprises in banking, financial services, insurance, healthcare, life sciences, defense and other regulated industries, or where critical infrastructure is involved.
Bhatia listed early Azure confidential computing users: regulated industries like telcos; healthcare teams working on disease diagnostics with data from multiple healthcare providers, in a confidential environment they can tear down completely when the research is complete; retail and advertising companies that want to do multiparty machine learning; and financial services organizations building anti-money laundering systems.
“This is enabling net new scenarios in confidential computing that were not possible before,” Amar Gowda from the Azure Confidential Computing team explained in a session. “This is allowing two different institutions that could not collaborate on data because it has PII [to bring it] into this environment. Now because of attestation and memory protection and integrity protection, you can rest assured that the data does not leave the boundaries [or get] in the wrong hands.”
Confidential Computing on Azure
Confidential computing starts with a hardware root of trust. Azure has confidential virtual machines using Intel SGX and AMD SEV-SNP (in preview this month), as well as NVIDIA A100 Tensor Core GPUs with Ampere-protected memory, which have a secure channel between trusted execution environments on both the CPU and the GPU (in limited preview for machine learning training and large-data AI workloads where confidentiality and integrity are key).
Intel SGX (Software Guard Extensions) lets you partition your app into an untrusted and a trusted region; the sensitive code goes into the trusted environment. Both VM memory and RAM are encrypted, and there’s Enclave Page Cache (EPC) memory specific to the application. With SGX the amount of encrypted memory used to be quite small, but the current DCsv3 generation of VMs has much more encrypted memory (up to 256GB) for large data workloads.
AMD SEV-SNP (Secure Encrypted Virtualization-Secure Nested Paging) offers hardware protection against malicious hypervisors as well as encrypted memory: the virtual machine memory is entirely encrypted and integrity protected with keys generated by the AMD CPU, which can be kept in Azure Key Vault or Azure Managed HSM (which itself relies on confidential computing), and you can choose to pre-encrypt the operating system disk. None of that needs code changes, so just deploying a workload onto a confidential DCasv5 or ECasv5 AMD EPYC VM makes an application confidential, with minimal performance difference.
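As a rough sketch of how little changes for the workload, here is what deploying onto one of those AMD confidential VM sizes might look like with the Azure CLI. The resource group, VM name and image URN are illustrative assumptions; `Standard_DC4as_v5` is one of the DCasv5 confidential sizes, and `VMGuestStateOnly` encrypts only the VM guest state (other values of that flag pre-encrypt the OS disk, as the article describes).

```shell
# Sketch: create an AMD SEV-SNP confidential VM and run a workload on it
# unchanged. Names (myResourceGroup, my-confidential-vm) are placeholders;
# check which confidential sizes and images are available in your region.
az vm create \
  --resource-group myResourceGroup \
  --name my-confidential-vm \
  --size Standard_DC4as_v5 \
  --image Canonical:0001-com-ubuntu-confidential-vm-focal:20_04-lts-cvm:latest \
  --security-type ConfidentialVM \
  --os-disk-security-encryption-type VMGuestStateOnly \
  --enable-vtpm true \
  --enable-secure-boot true \
  --admin-username azureuser \
  --generate-ssh-keys
```

The application itself is untouched: the memory encryption and integrity protection come from the VM’s security type, not from any change to the code running inside it.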
Not having to rewrite code to use confidential computing makes it easier for Microsoft to provide its own cloud services as confidential computing services (and offer commercial clouds like the new Microsoft Cloud for Sovereignty aimed at governments who want to use the public cloud).
Azure SQL has had an Always Encrypted option using Intel SGX for several years, but you can now run SQL Server IaaS on AMD confidential VMs, Azure Virtual Desktop can now run Windows 11 in the cloud on AMD confidential VMs, and a confidential computing version of Azure Databricks is likely to be announced this year.
If you’re using the open source Confidential Consortium Framework to build decentralized networks, it’s likely that there are multiple organizations involved: the new Azure Managed Confidential Consortium Framework (built on Azure confidential computing and currently in private preview) avoids one organization having to run the infrastructure for that network.
Azure Confidential Ledger, a secure tamper-proof blockchain-based ledger, uses CCF running on Azure Kubernetes Service (AKS). Graham Bury, a principal PM manager on the Azure confidential computing team, told The New Stack: “You can think about a lot of these services that Microsoft builds as managed PaaS services that just happen to run on Kubernetes leveraging confidential computing.”
But Microsoft also wants to enable confidential computing for customers who are building on Kubernetes.
Confidential computing can protect containers as well as VMs. Azure Container Instances (which is good for isolated container scenarios that don’t need orchestration, like machine learning and AI workloads or short-lived workloads you want to burst to the cloud securely) now has serverless confidential computing in limited preview (a public preview is due soon).
This doesn’t need changes to your container images and gives you a dedicated hypervisor with in-memory encryption per container group. It also has full guest attestation, so you can verify that the container is only running the components that you expect to run.
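Under those assumptions, running a container group on the confidential ACI offering might look something like the following sketch; the resource group and container name are placeholders, and the exact SKU name and flag availability depend on the preview in your subscription and region.

```shell
# Sketch: run an unmodified container image in an ACI confidential
# container group. The group gets a dedicated hypervisor with in-memory
# encryption; the image itself needs no changes.
# Names are illustrative placeholders.
az container create \
  --resource-group myResourceGroup \
  --name confidential-aci-demo \
  --image mcr.microsoft.com/azuredocs/aci-helloworld \
  --sku Confidential \
  --location eastus
```

The point the article makes holds here: the confidentiality comes from the SKU of the container group, not from anything in the container image.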
Microsoft says this is popular with data scientists deploying Python containers to use with Azure Machine Learning, which can be made confidential without code changes that might affect the model.
When you do need orchestration, you can get the same protection in AKS: each node on AKS is a virtual machine in a Virtual Machine Scale Set that you provision in Azure, and those VMs can be confidential VMs.
AKS has actually had confidential computing support since 2020, when Azure became the first cloud to use Intel SGX VMs to run containerized apps built with the Open Enclave SDK. But that meant changing those applications to partition them so the trusted code runs in the SGX enclave, or using third-party tools like Anjuna, Edgeless, Fortanix or SCONE that handle that for you. Signal, the secure messaging service, uses Intel SGX nodes in AKS to store user contact details where neither the admins at Signal nor Microsoft can view them.
Now Azure is the first cloud service to support AMD SEV-SNP confidential computing in Kubernetes. Because node pools can now use AMD confidential VMs, you can just lift and shift your containers into a confidential environment, or move an existing AKS cluster to a more secure state by adding a confidential node pool to it — and you don’t have to make the entire cluster confidential if you only need the extra protection on specific nodes where you’re processing sensitive data.
Confidential node pools work with the full AKS feature set: autoscaling, AKS add-ons, Azure CNI, Azure Defender for Containers and the rest. They use a customized Ubuntu 20.04 image (Microsoft is partnering with Canonical to make sure that all Azure confidential services will be supported on Ubuntu). ARM64 and Mariner images aren’t currently available, but Gowda said that Windows Server nodes will be coming soon.
You deploy a confidential node pool for AKS the same way you currently deploy node pools: use an ARM template, the Azure CLI or the portal to create a node pool and pick the VM size for it — just choose the DC-series or EC-series VMs that offer confidential compute. You don’t need to change the code that runs in the container or the container image: just edit the pod YAML spec to deploy to the confidential node pool (if you use node affinity, you can target the confidential computing node pools as an affinity).
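The steps above can be sketched as follows. The cluster and pool names and the `Standard_DC4as_v5` size are illustrative assumptions, and the pod here uses a simple `nodeSelector` on AKS’s built-in `kubernetes.azure.com/agentpool` node label rather than a full affinity rule; a sample image stands in for your real workload.

```shell
# Sketch: add a confidential node pool to an existing AKS cluster,
# picking a DCasv5 confidential VM size for the nodes.
# Names (myResourceGroup, myCluster, confpool) are placeholders.
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myCluster \
  --name confpool \
  --node-count 3 \
  --node-vm-size Standard_DC4as_v5

# Schedule a pod onto the confidential pool by selecting on the
# agent pool label -- the container image itself is unchanged.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: sensitive-workload
spec:
  nodeSelector:
    kubernetes.azure.com/agentpool: confpool
  containers:
  - name: app
    image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
EOF
```

Keeping the selector in the pod spec rather than the image is what makes the lift-and-shift story work: only the scheduling changes, so the same container can run on confidential and non-confidential pools in the same cluster.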
As well as the data-in-use protection of using memory encryption, you also have remote guest attestation so you can be confident your workloads are deployed in the environment you expect and that only what you put in those containers is running on that hardware.
Confidential virtual machine AKS node pools are generally available now, wherever the underlying AMD confidential VMs are: currently the East US, West US, North Europe and West Europe regions, with Southeast Asia coming soon and more regions planned “in the near future.”
Confidential Becomes Common
Not needing to rewrite code to use confidential computing in AKS makes this an appealing option for customers who have Kubernetes-based applications on their own infrastructure, where they’re unlikely to have confidential computing enabled because the server hardware to do this is very new.
They can now lift-and-shift applications they’ve not been comfortable taking to the cloud because of concerns about privacy, security compliance or data regulations, Bury suggested. “The aspiration is how can I move more of the workloads to the public cloud because I want it there by default: bring my whole container workload as is and I don’t even have to think about it because it’s encrypted memory.”
That applies to IT organizations running code build agents and code signing to harden their software supply chain, financial institutions dynamically spinning up containerized jobs in data processing pipelines, and telecommunications providers meeting Schrems II and other data regulations, he said. “As well, our internal Microsoft code signing services are onboarding to our confidential VM capabilities in AKS.”
If you want to run Kubernetes on Azure yourself, you can use confidential computing VMs to host it and manage the Kubernetes deployment yourself. Some customers do that with AKS-Engine or the Kubernetes Cluster API Provider for Azure, Bury noted, “but most of the customers we talk to look for us to bring confidential computing and added isolation directly to our managed AKS.”
In the long term, once the hardware is widely available, Microsoft expects confidential computing to go from a specialist requirement for things like multiparty data analytics that need “heroic” data protection to being as standard as encrypting data at rest and in transit.
“We’re expecting to see compute to evolve in general from computing in the clear to computing confidentially and both in the cloud and on the edge,” Bury said. “We do see it being much more general purpose over time. We expect to have confidential computing capabilities be pervasive across our infrastructure platform over time as we can make the hardware with those data protection capabilities pervasive. If we could update all of our hardware overnight and have these capabilities that would be fantastic!”
Confidential computing will also start to align with software developments intended to create more isolation in Kubernetes.
At KubeCon+CloudNativeCon this year, Microsoft announced an upcoming limited preview of support for Kata Containers in AKS, with a lightweight VM that runs a dedicated kernel per pod, using hardware virtualization extensions to create stronger workload isolation for network, memory and I/O. That promises higher security for different workloads on the same cluster, and Bury hinted that isolation could make sense as one of the ways that Microsoft contributes confidential computing concepts into the open source Kubernetes space.
“With Kata containers, we can see a unification of the open-source isolation technologies available across AKS and specifically leveraged for our confidential computing stack in AKS too,” Bury said.
“AKS and Kubernetes in general can benefit from things like Kata containers to bring that level of container isolation with that specially tuned kernel, and then we can think about what we do in confidential computing, where you have a specific piece of hardware that enables you to verify you’re running on that hardware with that added data protection, with memory encryption. It’s just a matter of time when we can get all of these things working together,” he suggested. “We can create more and more isolation and data security and protection while keeping that Kubernetes native experience.”
He also pointed to the confidential computing support in ACI as an example of what Azure wants to offer for Kubernetes. “How can we be very user-friendly with their containers having as much isolation and data protection [as possible] from a security and privacy point of view?”