Google’s New ‘Autopilot’ for Kubernetes

Like most Kubernetes releases and services, Google Kubernetes Engine (GKE) still requires a fair bit of manual assembly and tinkering to optimize for your needs. But recently, Google introduced GKE Autopilot, an easier take on managed Kubernetes. Its job: enable you to focus on your software while Autopilot manages the infrastructure. What does that really mean? Kelsey Hightower, one of Kubernetes’ best-known figures and a Google Cloud principal developer advocate, explained to The New Stack what’s up with Autopilot.
According to Google, with Autopilot’s launch, GKE users can choose between two modes of operation, each with its own level of control over GKE clusters and its own division of responsibilities. On one side, you have GKE Standard, which is how Kubernetes has worked on Google Cloud since the start. On the other, you have Autopilot.
“Autopilot represents many of the best practices advanced users land on when security and uptime are top priorities. That comes with tradeoffs,” Hightower explained. “Applications that require elevated privileges are blocked by default. This ensures GCP can offer SLAs at the application and cluster level.”
In short, Autopilot is GKE with additional guard rails, not training wheels. “Autopilot is a more managed Kubernetes and not a PaaS,” he said. The service comes with support for the same configuration objects as regular GKE — including deployments, jobs, config maps and secrets. This means most users can keep their existing workflows and high-level abstractions, such as Helm.
With Autopilot, all node management operations are eliminated. This maximizes your cluster efficiency and helps provide a stronger security posture.
Distros for Kubernetes
Hightower sees Autopilot following Linux’s historic path. “Kubernetes has entered the distro era, just like Linux did before,” he said.
Individual Linux distributions focus on a set of use cases. For instance, CoreOS (where Hightower worked prior to Red Hat acquiring CoreOS) focused on running containers and security. Ubuntu focused on consumer desktops, while Red Hat itself focused on enterprise server applications.
Upstream Kubernetes would be closer to something like Gentoo, which is super flexible, but as Hightower points out, “You are essentially rolling your own Kubernetes distro once you add ingress controllers and other IaaS integrations to make it work.”
In this scenario, Autopilot would be closer to CoreOS — hyper-focused on running Kubernetes workloads on top of a secure and fully automated operating system.
In terms of competition, Hightower continued, “GKE Autopilot competes with other managed Kubernetes offerings.”
But what about Google’s own Cloud Run? Isn’t that a competitor? No, said Hightower. “We offer many compute offerings to meet the needs of a broad customer base. Google Compute Engine serves the needs of customers looking to leverage virtualization and familiar IaaS primitives including firewalls, load balancers and virtual machines, to build their own application platforms.”
“Kubernetes is Kubernetes,” Hightower continued. “Autopilot is an opinionated configuration of GKE. People use Kubernetes as a layer on top of IaaS to build their own application platforms. Kubernetes offers a higher set of abstractions, but you’re still rolling your own platform here, even if Autopilot eliminates the need to think about the underlying Kubernetes infrastructure.”
Hightower added, “Cloud Run is different. Cloud Run is an application platform. Cloud Run integrates the best of Google Cloud Platform including IAM, service to service authentication and communication, and traffic management, while presenting an even higher level of abstractions for running containerized applications.”
In some ways, Autopilot improves GKE security. For example, Autopilot enforces container isolation and forbids privileged pods. Other security tools you may have always used, such as those for TLS certificate signing, are not yet supported.
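To make the ban on privileged pods concrete, here is a minimal Python sketch of a pre-flight check you might run on your own manifests before targeting Autopilot. The function name `find_privileged_containers` and the sample pod are hypothetical. The check only looks at the standard Kubernetes `securityContext.privileged` field and is not a description of how Autopilot itself enforces its policy.

```python
from typing import Any, Dict, List


def find_privileged_containers(pod_manifest: Dict[str, Any]) -> List[str]:
    """Return the names of containers in a Pod manifest that request privileged mode."""
    spec = pod_manifest.get("spec", {})
    # Regular and init containers can both carry a securityContext.
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    flagged = []
    for container in containers:
        security_context = container.get("securityContext") or {}
        if security_context.get("privileged") is True:
            flagged.append(container.get("name", "<unnamed>"))
    return flagged


# Hypothetical pod: one ordinary container and one that asks for privileged mode.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example"},
    "spec": {
        "containers": [
            {"name": "web", "image": "example.com/web:1.0"},
            {
                "name": "debug-agent",
                "image": "example.com/debug-agent:1.0",
                "securityContext": {"privileged": True},
            },
        ]
    },
}

print(find_privileged_containers(example_pod))
```

A workload flagged by a check like this would likely need rework, or a GKE Standard cluster, rather than Autopilot.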
Autopilot also prohibits unsafe practices, such as SSHing into the underlying nodes. Hightower argues that “eliminating ssh altogether raises the security posture of GKE Autopilot. There’s no way to log into the underlying servers, which means one less attack vector to worry about. The absence of ssh is also what makes other managed services including AppEngine, BigQuery, and Pub/Sub attractive offerings.”
As for TLS certificate signing, Hightower said it’s “an important part of the Kubernetes ecosystem, but without the right guard rails, that specific API can be abused to gain access to underlying infrastructure including nodes, which Autopilot works so hard to abstract away. Our plan is to add the proper guardrails to enable non-privileged identities to be minted via the TLS certificate API.”
As for application-layer secrets encryption, which enables you to use GCP’s Cloud Key Management Service (KMS) to encrypt Kubernetes secrets at rest, Hightower said, “This is a work in progress. Since an integration like this leverages an external service we got to make sure pricing is transparent and predictable. Both TLS certificates signing and application-layer secrets encryption are on the near-term Autopilot roadmap.”
One problem, for now anyway, is that you can’t convert Standard GKE clusters to Autopilot or vice versa, so there is no in-place path between the two modes. That’s true, said Hightower: “GKE Standard supports a wide range of cluster configurations and pretty much any Kubernetes workload and extension a customer wants to run. By design, Autopilot supports a very specific Kubernetes configuration. While I can see a tool to help customers make an assessment on the feasibility of migrating between the GKE Standard and GKE Autopilot, I don’t foresee a world where we do that with GKE itself.”
If GKE users “follow Kubernetes best practices and manage their Kubernetes configurations externally,” added Hightower, “this not only makes it easier to move between GKE Standard clusters for disaster recovery or blue/green upgrade patterns, it’s also a valid approach for moving Kubernetes workloads between both GKE Standard and GKE Autopilot.”
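As a rough illustration of what managing configurations externally can look like, the following Python sketch renders a Deployment manifest from code that lives outside any cluster. The helper `render_deployment`, the image name, and the resource request values are all hypothetical. Only the `apps/v1` Deployment fields themselves come from the standard Kubernetes API.

```python
import json
from typing import Any, Dict


def render_deployment(name: str, image: str, replicas: int = 2) -> Dict[str, Any]:
    """Build a Deployment manifest as plain data, independent of any one cluster."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Illustrative request values, not a recommendation.
                            "resources": {
                                "requests": {"cpu": "250m", "memory": "512Mi"}
                            },
                        }
                    ]
                },
            },
        },
    }


# Hypothetical application name and image.
manifest = render_deployment("hello-web", "example.com/hello-web:1.0")
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
print(manifest_json)
```

Because the manifest is generated and stored outside the cluster, the same JSON could be saved to a file and applied with `kubectl apply -f` against either a GKE Standard or a GKE Autopilot cluster, provided the workload stays within Autopilot's restrictions.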
“The best way to think of this is,” Hightower continued, “Autopilot builds on top of GKE with an opinionated configuration while continuing to support Kubernetes workloads. Autopilot puts some limitations and restrictions around those workloads to optimize for security and supportability.”
There is one set of restrictions you should be aware of before spending too much time on Autopilot. Your only operating system and container runtime choice is Google’s own in-house cloud Linux, Container-Optimized OS, with containerd. For now, those will remain your only choices. So, if your software stack is wedded to, say, Red Hat Enterprise Linux (RHEL) and Podman, Autopilot might not be a good choice.
The reason for this, Hightower explained, is that “Autopilot’s goal is to abstract away the concept of a node and choose the right underlying configuration to support the SLA promised to customers. Based on our experience with GKE Standard the current combination of Linux + containerd is the best way to achieve this. We’ll continue to learn and expand the number of Kubernetes workloads we can support, but whatever we do it must meet the security and SLA standards we set for Autopilot.”
Sound interesting? Autopilot is generally available today via the command line interface. Google is gradually rolling it out to the Google Cloud Console for all GCP regions. You can give it a try today with the free tier.
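For readers who want to try the command line route, this small Python sketch assembles the `gcloud container clusters create-auto` invocation without running it. The helper name, cluster name, and region are hypothetical placeholders. Check the current gcloud reference for the flags that apply to your project before executing anything.

```python
import shlex
from typing import List


def autopilot_create_command(cluster_name: str, region: str) -> List[str]:
    """Assemble (but do not run) the gcloud command that creates an Autopilot cluster."""
    return [
        "gcloud", "container", "clusters", "create-auto",
        cluster_name,
        "--region", region,
    ]


# Hypothetical cluster name and region; nothing is executed here.
command = autopilot_create_command("demo-autopilot", "us-central1")
print(shlex.join(command))
```

Printing the command first makes it easy to review before passing the list to something like `subprocess.run`.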