No More Forever Tokens: Changes in Identity Management for Kubernetes
Workloads running on Kubernetes need an identity that lets them connect to external services and APIs, as well as to other services in the cluster or workloads in different clusters. Identities also let CI/CD systems connect into clusters, and they're used for secrets management. That's all done with service accounts, but there are security and scalability issues with the way service accounts currently get tokens for identity.
The changes being made to improve security and manageability should be transparent to many applications. The goal is to create a workload identity system that can provide identities to apps with very little developer effort but also minimize the impact of any credential leaks so attackers can’t get broad access to the system if they do obtain a token. It will also be something that other identity management tools and projects can build on.
But as this new approach will become first the default and then the only option in future releases, it’s worth testing with the alpha release to find out if you’ll need to update permissions, change your cluster specifications or make more significant changes to your processes.
No More Forever Tokens
When a Kubelet starts a pod that will run as a service account, it requests a JSON Web Token from the Kubernetes apiserver. Currently, those JWTs are “forever” tokens; they don’t expire and are valid for as long as the service account exists. That means if credentials leak, replacing the tokens means creating a new service account and deleting all the secrets for the old service account. Service account signing key rotation is possible, but it’s not supported by client-go or automated by the control plane, so it’s not widely used.
When the Calico Container Networking Interface (CNI) plugin was found to be leaking credentials into logs, moving to new credentials by replacing service accounts was painful even for experienced Kubernetes operators.
Tokens are also stored as a Secret (which means every service account requires a Kubernetes Secret), and because of the way Kubernetes manages secrets, any component that has permission to see one of a service account's secrets can see all of them. An egress controller, for example, needs to read TLS secrets, but that means it can also read the service account credentials for every application in the cluster. The new tokens are handled by a projected volume instead.
Unlike the existing tokens, the new tokens don't rely on secrets and they're audience-bound to the apiserver, so you can set up a specific token for communicating with a secrets management tool like HashiCorp Vault (which uses tokens to integrate with Kubernetes) and be confident that nothing else can use that token.
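As an illustration, a pod spec can request an audience-bound token through a projected volume; in this sketch the pod name, service account, image and audience value are placeholders, while the `serviceAccountToken` projected volume source and its `path`, `audience` and `expirationSeconds` fields are the real API:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vault-client          # placeholder pod name
spec:
  serviceAccountName: app-sa  # placeholder service account
  containers:
  - name: app
    image: example/app:latest # placeholder image
    volumeMounts:
    - name: vault-token
      mountPath: /var/run/secrets/tokens
  volumes:
  - name: vault-token
    projected:
      sources:
      - serviceAccountToken:
          path: vault-token
          audience: vault          # only consumers validating this audience will accept the token
          expirationSeconds: 7200  # 2-hour lifetime; the Kubelet refreshes it automatically
```

A token leaked from this pod is only useful against a service that checks for the `vault` audience, which is what limits the blast radius of a credential leak.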
The other big change is that tokens will now automatically expire: app developers and pod specification authors can set token lifetimes, and the cluster admin can set a maximum lifetime for tokens across the whole cluster.
Those lifetimes aren't expected to be long; hours rather than days or years. The Kubelet handles refreshing tokens; it has a token manager that knows the lifetime of each token and starts requesting a new token from the apiserver when the token is past 50 percent of its time to live (or is over 24 hours old).
Tokens are also invalid once the pods they were provisioned for are no longer running, so revoking or rotating tokens without waiting for them to expire just means restarting the pod by redeploying the application or triggering a rolling update.
In the future, the tokens could also be hardened by being bound to a key that’s stored in a Trusted Platform Module (TPM) so they’re hard for attackers to get at.
There are still areas where more work on authentication needs to be done, such as the places where identity and access control for Kubernetes depend on add-ons and third-party tools. Kubernetes uses etcd as persistent storage for cluster data and stores secrets there as base64-encoded plaintext; although etcd has had an authentication mechanism since version 2.1, it's off by default for backward compatibility.
The latest release of Kubernetes, version 1.13, adds the option to encrypt API data in the etcd key-value store (used by Kubernetes to track pod information), but you have to enable it. “By default, the secrets in Kubernetes are not encrypted,” HashiCorp co-founder Mitchell Hashimoto told us, noting that the company’s Vault product is popular for this. “We get a lot of organizations asking us how do we make the apps our developers are building more secure by default. They want to shift behavior away from developers not knowing whether they should encrypt and just encrypting by default.”
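Enabling encryption at rest means pointing the apiserver at an encryption configuration file with the `--encryption-provider-config` flag. A minimal sketch, in which the key name and secret value are placeholders you'd generate yourself, looks like this:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1                 # placeholder key name
              secret: <base64-encoded-32-byte-key>  # placeholder; supply your own random key
      - identity: {}                     # fallback so existing plaintext secrets can still be read
```

Provider order matters: new writes use the first provider (`aescbc` here), while the trailing `identity` provider lets the apiserver keep reading secrets that were stored before encryption was turned on.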
Server authentication for cluster workloads is also handled by projects like Istio and SPIRE, which performs node and workload attestation using the SPIFFE framework and APIs, but there isn't a native way of doing that today. Expect to see node-scoped authorization based on the new tokens, but also extensions that SPIRE and Istio can hook into to take advantage of the new token flow.
“We want to enable secure, featureful identity solutions on top of Kubernetes that can leverage the chain of trust that Kubernetes sets up to provision workloads,” Mike Danese, chair of the Kubernetes Auth Special Interest Group (SIG), said at the KubeCon+CloudNativeCon 2018 event last year.
Changing identity and authentication options can cause problems for existing deployments, so the feature is being introduced gradually. It's in Kubernetes 1.13 as an alpha feature for testing, but it's turned off by default, with a cluster-level flag to opt in (unless you've already opted into taking all alpha features). In Kubernetes 1.15 it will likely be in beta and on by default, and when it goes GA (currently planned for Kubernetes 1.16), you'll no longer be able to opt out, so now is a good time to start testing what the impact of these new service accounts will be on your clusters.
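Opting in happens at the control plane via feature gates. A sketch of the relevant kube-apiserver flags for the 1.13-era alpha follows; the signing-key path and issuer URL are placeholders for your own cluster's values:

```
# Placeholder paths and issuer; enables the alpha bound-token volume flow
kube-apiserver \
  --feature-gates=BoundServiceAccountTokenVolume=true \
  --service-account-issuer=https://kubernetes.default.svc \
  --service-account-signing-key-file=/etc/kubernetes/pki/sa.key \
  --service-account-api-audiences=https://kubernetes.default.svc
```

Setting these flags on a test cluster is the quickest way to find out which workloads and client libraries will need updating before the feature becomes the default.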
The service account volume location and file structure don't change with the new token system, but the BoundServiceAccountTokenVolume feature gate migrates it to a projected volume, so if the cluster's PodSecurityPolicies only allow secret volumes you'll need to permit projected volumes for new pods. The file permissions on tokens are also much stricter, changing from 0644, which lets any user on the node read them, to 0600. If you're running an app as a user other than root, you may see file permission errors; use the fsGroup feature to add a supplemental group that gives the app permission to read the token. That changes the file permissions to 0640: the individual app that's running as an unprivileged user can still access the credential, but other apps on the node can't read the token.
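Both adjustments are configuration changes. In this sketch, the policy name, pod name, image and group ID are placeholders; the `projected` entry in the PodSecurityPolicy `volumes` list and the pod-level `fsGroup` are the pieces the migration actually needs:

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted            # placeholder policy name
spec:
  volumes:
    - secret
    - projected               # required for the new bound-token volumes
  # ...remaining required policy fields elided...
---
apiVersion: v1
kind: Pod
metadata:
  name: app                   # placeholder pod name
spec:
  securityContext:
    fsGroup: 2000             # supplemental group; the token file becomes 0640, group-readable
  containers:
    - name: app
      image: example/app:latest  # placeholder image
```

With the `fsGroup` set, the unprivileged container user reads the token through group membership rather than world-readable permissions.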
Tokens will need to be re-read periodically. Kubernetes client libraries handle re-reading tokens; the Go, Java and Python client libraries that are maintained as part of core Kubernetes will support this and just need updating, but any apiserver clients in the cluster that don't reload service account tokens will start failing an hour after deployment. Old client libraries will have problems, and pre-1.11 Kubelets won't run new pods that mount service account volumes once you opt in.
Injecting secrets into environment variables to pass credentials to applications isn’t supported with the new token flow, because those environment variables are too likely to end up in logs, potentially leaking the credentials.
There may be namespace- and service-level opt-outs to help with migrating to the new system, by marking service accounts that need to use legacy token provisioning, for example when you've exported a key from a service account to use in a CI/CD system. But don't rely on that opt-out sticking around for long; you will need to migrate to the new token system in time.