How to Lock Down and Secure Kubernetes Persistent Volumes
NetApp sponsored this post.
Ensuring that only authorized applications and users can access Kubernetes volumes provisioned by NetApp Trident is a paramount concern. It’s also one of the first deeper conversations we tend to have with anyone planning a deployment.
The good news is that Kubernetes and Trident work together to deliver highly secure persistence, as long as you follow these guidelines:
- Wall off access to volumes in Kubernetes by creating namespaces that define your trust boundaries.
- Prevent pods from accessing volume mounts on worker nodes by creating an appropriate Kubernetes pod security policy.
- Restrict volume access to appropriate worker nodes by specifying a security policy through Trident that is appropriate for each backend.
We are often asked about security — below are some of the most common questions. However, if you have others, please reach out to us using the comments for this post, Slack or any of our other communications channels.
Can access to a persistent volume be restricted to one pod/container?
Persistent Volumes (PVs) managed by Trident are created when a Persistent Volume Claim (PVC) is submitted by the application. This triggers Trident to create the volume on the storage system. However, PVs are global objects and PVCs belong to a single namespace. Only an administrator or Trident (because of the permissions granted to the service account it is using) is able to manage PVs.
Why is this important? Namespaces are logical boundaries for most resources in Kubernetes. They are a security domain, with the assumption that everything in a namespace can access everything else within it. However, a user or application is prevented from using resources in a different namespace. For example, a pod in one namespace cannot use a PVC in another, even if the user has access to both.
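To make the namespace boundary concrete, here is a minimal sketch that builds a namespace, a PVC and a pod as plain Python dictionaries and prints them as JSON. The names `team-a`, `app-data` and `trident-nas` are hypothetical, and the storage class is assumed to already exist and be backed by Trident. The point to notice is that the pod refers to its claim by name only, so the claim is always resolved within the pod’s own namespace.

```python
import json

# Hypothetical names used only for illustration.
NAMESPACE = "team-a"
CLAIM_NAME = "app-data"

namespace = {
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {"name": NAMESPACE},
}

# The claim lives inside the namespace. The storage class name is an
# assumption; substitute one that is backed by Trident in your cluster.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": CLAIM_NAME, "namespace": NAMESPACE},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
        "storageClassName": "trident-nas",
    },
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app", "namespace": NAMESPACE},
    "spec": {
        "containers": [{
            "name": "app",
            "image": "busybox",
            "command": ["sleep", "3600"],
            "volumeMounts": [{"name": "data", "mountPath": "/data"}],
        }],
        # A pod references a claim by name only. There is no namespace
        # field here, so the lookup happens in the pod's own namespace.
        "volumes": [{
            "name": "data",
            "persistentVolumeClaim": {"claimName": CLAIM_NAME},
        }],
    },
}


def as_manifest_list(*objects):
    """Wrap objects in a v1 List so they can be applied together."""
    return {"apiVersion": "v1", "kind": "List", "items": list(objects)}


manifest_json = json.dumps(as_manifest_list(namespace, pvc, pod), indent=2)
print(manifest_json)
```

The printed JSON can be piped to `kubectl apply -f -` in a test cluster. Moving the pod to a different namespace without also creating a claim there leaves the pod pending, because the claim name no longer resolves.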
Additionally, a PV that is already bound to one PVC cannot be bound to another, regardless of namespace. This means that even if a user attempts to craft a PVC that claims an existing PV from a different namespace, the attempt will fail. When using Trident, the PV and PVC are destroyed at the same time by default. This behavior can be changed so that PVs are retained, but a PV that was bound to a PVC and then released is not bound again automatically; it stays unavailable to new claims unless an administrator deliberately makes it available again.
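One way Kubernetes exposes the retain behavior is the `reclaimPolicy` field on a storage class. The sketch below is an illustration under assumptions rather than the configuration Trident requires: the class name is hypothetical, and the provisioner string shown is the one used by CSI-based Trident releases, so check the value that matches your installed version.

```python
import json

storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "trident-nas-retain"},  # hypothetical name
    # Assumption: a CSI-based Trident release. Older releases registered
    # under a different provisioner name.
    "provisioner": "csi.trident.netapp.io",
    "parameters": {"backendType": "ontap-nas"},
    # Keep the PV (and the data on the storage system) after the PVC
    # is deleted, instead of deleting both together.
    "reclaimPolicy": "Retain",
}

storage_class_json = json.dumps(storage_class, indent=2)
print(storage_class_json)
```

With a retained volume, the data outlives the claim, so cleaning it up or reusing it becomes an explicit administrator task.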
So, to answer the question: no, an individual PV/PVC cannot be limited to a single pod. However, PVCs are limited to a single namespace in the same way that other resources are.
Can a pod see other volumes mounted to a host, and/or see what storage is presented from the array?
If a user in a pod were to execute the “showmount -e” command, or the iSCSI equivalent, against the storage system providing volumes to the Kubernetes cluster, they would be able to see the list of exports. However, as stated above, they cannot gain access to another volume from inside a pod.
In order to mitigate this situation, the storage system volume access control policy, whether igroups, volume access groups or export policies, should be restricted to only nodes in the Kubernetes cluster. This prevents mounting the volume from hosts outside of the Kubernetes cluster and bypassing the security controls in place. Additionally, disable “showmount” functionality for the SVM.
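As a rough sketch of restricting access to cluster nodes only, the snippet below checks that every worker node address falls inside the CIDR blocks you intend to allow, then builds a Trident backend definition that carries those blocks. The node addresses, SVM name and management address are hypothetical. The `autoExportPolicy` and `autoExportCIDRs` fields are assumed to be available in your Trident release (they appear in the ONTAP NAS backend options of later versions), so verify them against the documentation for the version you run. Credentials are intentionally omitted.

```python
import ipaddress
import json

# Hypothetical worker node addresses and the blocks that should be allowed.
NODE_IPS = ["10.0.1.11", "10.0.1.12", "10.0.1.13"]
ALLOWED_CIDRS = ["10.0.1.0/24"]


def uncovered_nodes(node_ips, cidrs):
    """Return node IPs that fall outside every allowed CIDR block."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [
        ip for ip in node_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


missing = uncovered_nodes(NODE_IPS, ALLOWED_CIDRS)
if missing:
    raise SystemExit(f"nodes not covered by the allowed CIDRs: {missing}")

# Assumed backend fields; credentials are left out on purpose.
backend = {
    "version": 1,
    "storageDriverName": "ontap-nas",
    "managementLIF": "192.0.2.10",   # placeholder address
    "svm": "svm_k8s",                # hypothetical SVM name
    "autoExportPolicy": True,
    "autoExportCIDRs": ALLOWED_CIDRS,
}

backend_json = json.dumps(backend, indent=2)
print(backend_json)
```

Keeping the allowed blocks as narrow as the node network is what prevents hosts outside the cluster from mounting the exports directly.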
Can pods on the same node, but from a different namespace, gain access to a mounted volume?
No, with one exception: privileged containers. A process in a pod/container on a Kubernetes node cannot see resources on the system other than those it has been assigned. This is the core Linux namespace functionality used by all containers. As a result, a user or application in an unprivileged container poses no threat to other volumes by issuing fdisk or mount commands.
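Because privileged containers are the exception, it helps to know which pod specs ask for that access. The following is a minimal sketch of the kind of check a pod security policy enforces, written as a standalone function over a pod spec dictionary. It is an illustration, not a replacement for an admission control policy, and the function name `risky_settings` is hypothetical.

```python
def risky_settings(pod_spec):
    """List settings in a pod spec that weaken node-level isolation."""
    findings = []
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    for container in containers:
        security_context = container.get("securityContext") or {}
        if security_context.get("privileged"):
            findings.append(f"container {container['name']}: privileged")
    # hostPath volumes expose the node filesystem, which can include
    # the directories where other pods' volumes are mounted.
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            path = volume["hostPath"].get("path")
            findings.append(f"volume {volume['name']}: hostPath {path}")
    return findings


example_spec = {
    "containers": [
        {"name": "app", "image": "busybox"},
        {"name": "debug", "image": "busybox",
         "securityContext": {"privileged": True}},
    ],
    "volumes": [
        {"name": "kubelet", "hostPath": {"path": "/var/lib/kubelet"}},
    ],
}

for finding in risky_settings(example_spec):
    print(finding)
```

A spec that produces no findings here can still be unsafe in other ways; the check covers only the two settings discussed in this article.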
Will creating a volume with a specific UID and GID help protect the data?
No, it really won’t provide additional protection for Kubernetes-based applications. The assumption here is that the volume has the userID (uid) and groupID (gid) specified and the Unix permissions are set to something like “700” (Note: Trident does not support setting uid and gid but does allow Unix permissions to be customized). Additionally, the pod is using a security context that specifies matching uid and gid values.
Logically, this means that because the uid/gid of the process and the volume match, access is granted. If the uid/gid values don’t match, then, even if the volume were mounted, the pod would not be able to access the data. Kubernetes also enables the administrator to limit a namespace to specific uid and gid values to prevent a user in one namespace from attempting to use another namespace’s user information.
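The matching setup described here can be sketched as a pod whose security context runs its processes with the same uid and gid that own the volume. The values `1000`/`1000`, the namespace and the claim name are hypothetical, and the volume is assumed to have already been given that ownership by some means outside Trident.

```python
import json

# Hypothetical owner of the data on the volume.
VOLUME_UID = 1000
VOLUME_GID = 1000

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app", "namespace": "team-a"},
    "spec": {
        # Pod-level security context: every container process runs
        # with this uid/gid unless a container overrides it.
        "securityContext": {
            "runAsUser": VOLUME_UID,
            "runAsGroup": VOLUME_GID,
            "runAsNonRoot": True,
        },
        "containers": [{
            "name": "app",
            "image": "busybox",
            "command": ["sleep", "3600"],
            "volumeMounts": [{"name": "data", "mountPath": "/data"}],
        }],
        "volumes": [{
            "name": "data",
            "persistentVolumeClaim": {"claimName": "app-data"},
        }],
    },
}

pod_json = json.dumps(pod, indent=2)
print(pod_json)
```

Note that nothing in this spec proves the pod is entitled to uid 1000; the values are simply declared by whoever writes the pod definition.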
So, why doesn’t that protect the data? NFSv3 assumes the client has done the authentication/authorization. The values can also be arbitrarily specified and no validation is completed by the NFSv3 server. This means that any pod (in the same namespace) could use the uid/gid associated with the volume and gain access.
Kerberos could solve some of these issues because the NFS server participates in the authorization process, thus ensuring that only a validly authenticated user, with authorization, is accessing data. However, Kerberos is not supported by Kubernetes except for user authentication when using a proxy.
Security Happens at All Layers
A user who has, through whatever means, elevated their access on a Kubernetes node to full root has the ability to mount, manipulate and/or destroy resources in many ways. It is thus critically important to secure your cluster: both Kubernetes itself and the underlying host OS that Kubernetes runs on. Secure the cluster with the same rigor you would apply to a hypervisor management console or other critical systems. A good place to start is always the STIG (no, not that Stig) for your operating system.
Using namespaces to provide isolation between security domains, whether that be an application, a team, or something else, is “good enough” for many use cases. This is especially true when the host OS, Kubernetes and the storage system have been configured to limit access to the storage devices and additional metadata they might contain. If you want ultimate protection between applications deployed to Kubernetes, having multiple clusters with dedicated resources provides the most robust separation.
We know you will have more questions about things that concern you. We haven’t covered every possible scenario, and probably never will, so please use the comments below, reach out to us on our Slack team, file a GitHub issue or open a support case. We’re happy to help!
Feature image from Pixabay.