A Security Checklist for Cloud Native Kubernetes Environments
Kubernetes adoption continues to accelerate, with ever more contributions from the community. One important aspect of Kubernetes that initially kept early adopters at bay is security. Many doubted the security of containers and Kubernetes compared to VMs, and wrote containers off as a result. But slowly and steadily, people are coming to believe that containers and Kubernetes can be as secure as physical and virtual machines.
Security in Kubernetes is a practice and not just a feature. Security is a multidimensional problem that must be addressed from many different angles. These multiple dimensions of security in Kubernetes cannot be covered in a single article, but the following checklist covers the major areas of security that should be reviewed across the stack.
Security must be a first-class citizen of any organization’s DevOps process (often referred to as DevSecOps). With DevSecOps, security concerns are embedded as part of the DevOps pipeline from day one. DevSecOps practices enable automation of most security concerns and provides a series of security checks during the development process.
The open DevSecOps initiative spearheaded by Air Force Chief Software Officer Nicolas M. Chaillan is a great example of the adoption and importance of DevSecOps in an organization. The details were posted on the Air Force website so that they can be used by other organizations diving into DevSecOps. These DevSecOps practices offer organizations a pathway for embedding security safeguards into their Kubernetes infrastructure.
Security in Kubernetes can be defined along four areas: the infrastructure, the Kubernetes cluster itself, the containers, and the applications.
Securing the Infrastructure
Infrastructure-level security is often the most basic task, but also the biggest. Yet it is often overlooked during the development process. It's important to keep infrastructure security in mind while building applications, as it impacts how the applications need to be architected.
Infrastructure security itself has many dimensions:
1. Networking
Kubernetes deployments are mostly microservices, all talking to each other or communicating with external applications and services. It's important to limit network traffic to only what is necessary, while understanding that microservices can be ephemeral and move between nodes in a cluster. You'll need to consider several aspects of network design in order to develop a secure network.
- Isolation of control traffic: Kubernetes control-plane traffic must be isolated from the data-plane traffic — not just for security reasons but also to avoid data traffic impacting the Kubernetes control-plane traffic. Without isolation, traffic from the data plane might overshadow traffic from the control plane and cause temporary service disruptions.
- Isolation of storage traffic: Similarly, storage traffic needs to be isolated from regular data and control traffic, so that the infrastructure’s storage service does not consume or bring down your application network or vice versa.
- Network segmentation: Kubernetes hides the underlying infrastructure from users. Developers should keep this fact, as well as multitenancy, in mind when designing the network. Underlying networking infrastructure must support both Layer 2 VLAN-based segmentation and Layer 3 VXLAN-based segmentation, to isolate the traffic between various tenants or applications. Both segmentation techniques are useful depending on the requirement.
- Quality of Service: In shared networking infrastructure, noisy neighbors are a big problem. It’s important that the underlying networking infrastructure can guarantee a specified service level to each pod or tenant, while making sure the traffic of one pod is not impacting the other pods. Network virtualization techniques like SR-IOV can be helpful to provide virtualized isolation on shared infrastructure.
- Network Policies, firewalls and ACL: We will talk about application-level network access control in more detail later, but networks should have lower-level access control at the hardware level, as well as better control over the traffic in a shared environment.
2. Storage
Storage is a critical part of security for any organization. Hackers usually look for confidential data, such as credit card numbers or personally identifiable information (PII), kept in application storage. Developers using Kubernetes should consider the following forms of storage-level security.
- Self Encrypting Drives: One basic form of storage security is a self-encrypting drive. With these drives, encryption is offloaded to the disk itself, where data gets encrypted as it is written to the disk. This ensures that if someone gets physical access to the disk drive, they won’t be able to access the data.
- Volume encryption: In a shared infrastructure, Kubernetes CSI manages the lifecycle of the volume. This isolates the users from the underlying storage infrastructure. Volume Encryption ensures that individual volumes are secured against access from undesired elements.
- Quality of Service: In a shared storage infrastructure, an I/O-heavy application might impact the performance of other applications. It’s important that the underlying storage infrastructure has the capability to ensure guaranteed service level to each pod or tenant. Again, SR-IOV can be helpful to provide storage isolation at the PCI level with separate queues per tenant.
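As an illustration, volume encryption can often be requested through the storage class itself. The sketch below assumes the AWS EBS CSI driver, which supports an `encrypted` parameter; other CSI drivers expose encryption under different parameter names:

```yaml
# Hypothetical StorageClass requesting encrypted volumes.
# The "encrypted" parameter is specific to the AWS EBS CSI driver;
# check your own CSI driver's documentation for the equivalent.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: encrypted-gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```

Any PersistentVolumeClaim that references this storage class then gets an encrypted volume without the application needing to know about it.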
3. Host and Operating System (OS)
The next level of infrastructure is the physical or virtual host itself. Operators will want to secure the underlying foundation in the following ways:
- Harden the OS: Site Reliability Engineers (SREs) should secure the host OS following general security guidelines and harden the OS to avoid any further changes. SREs should also apply firewalls, port blocking, and other standard best practice security measures. Regular security updates and patches must be applied soon after they become available. Hackers and intruders often take advantage of known vulnerabilities.
- Enable Kernel security: Kernel security modules like SELinux and AppArmor define access controls for the applications, processes, and files on a system.
- Audit logging: Organizations using Kubernetes should implement audit logging not just to help monitor the systems, but also to help with debugging and finding the trails of security breaches.
- Rotate credentials: User credentials must be rotated frequently and must follow strict security guidelines to avoid being cracked or stolen.
- Lock down the nodes: Once nodes are provisioned and set up in the Kubernetes cluster, the OS should be stripped down. There is no need to install or configure anything new, other than patches and upgrades. All nodes must be locked down and accessible only to super admins.
- CIS conformance: The CIS (Center for Internet Security) provides a conformance test to ensure that all best practices have been implemented. Review your host set up and pass the conformance test to ensure compliance.
4. Host-Level Access Management
The weakest point for breaking into a Kubernetes cluster is the nodes themselves. Since Kubernetes largely isolates the user from the underlying nodes, it's important to control access to them.
- Strict access: Organizations should be careful to limit root/admin access to the node to a very limited, trusted set of users.
- Establish lockdown: Even for non-root users, direct logins by developers should be restricted; access should ideally be limited to the Kubernetes API server. To avoid any threat to the Kubernetes services running on the nodes, all nodes should be locked down.
- Isolate Kubernetes Nodes: Kubernetes nodes must be on an isolated network and should never be exposed to the public network directly. If possible, it should not even be exposed directly to the corporate network. This is only possible when Kubernetes control and data traffic are isolated. Otherwise, both streams of traffic flow through the same pipe, and opening access to the data plane means opening access to the control plane. Ideally, the nodes should be configured to only accept connections (via network access control lists) from the master nodes on the specified ports.
- Master nodes: Access to master nodes must be controlled by network access control lists, restricted to the set of IP addresses needed to administer the cluster.
Securing Kubernetes
With the infrastructure locked down, the next layer to secure is the Kubernetes installation itself. In a typical open source Kubernetes installation, many of these settings must be configured manually, since they are not enabled by default.
1. Secure etcd
etcd is the highly-available key-value store used as Kubernetes’ backing store for all cluster data. It holds all the states, secrets, and information of Kubernetes — which means securing etcd is very important.
- As mentioned previously, the nodes running etcd should be locked down with minimal access.
- Ideally drives containing the etcd data should be encrypted.
- Access to etcd must be limited to masters only.
- Ideally, etcd communication should be over TLS.
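Beyond TLS, the Kubernetes API server can encrypt secrets before writing them to etcd. A minimal sketch of such an encryption configuration (the key is a placeholder; the file is passed to the API server via the --encryption-provider-config flag):

```yaml
# Minimal encryption-at-rest configuration for secrets in etcd.
# The aescbc key below is a placeholder; generate a real one with:
#   head -c 32 /dev/urandom | base64
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>
      - identity: {}       # fallback so existing unencrypted data stays readable
```

With this in place, even someone with read access to the raw etcd data cannot recover secret values without the key.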
2. Securing Access to the Kubernetes Cluster
Kubernetes allows enterprises to use standard identity and access control solutions, but these need to be integrated with the environment and are not provided by default. Access controls can be broken down into the following components.
- Authentication: A user needs to be authenticated before they can access the Kubernetes API. Kubernetes provides various authentication modules — including Client Certificates, Passwords, Plain Tokens, Bootstrap Tokens, and JWT Tokens (used for service accounts). However, the actual user management and authentication is not part of Kubernetes. For production environments, organizations will need an external user management and authentication plugin or a Kubernetes platform that supports these capabilities. It is important to have integration with LDAP, Active Directory, or other identity provider solutions.
- Authorization: Once a user is authenticated (i.e. is allowed to connect to the Kubernetes cluster), the next step is authorization to determine access to the requested resources. Kubernetes supports multiple authorization modules, such as attribute-based access control (ABAC), role-based access control (RBAC), and Webhooks. RBAC is one of the most popular authorization plugins, as it allows granular control over the individual Kubernetes resources in a multitenant environment.
- Admission control: An admission control hook allows organizations to intercept and control Kubernetes requests, after the user is authenticated and authorized to access the requested resource. The best example of admission control is a resource quota, which lets organizations control resource consumption.
- Access to the Kubernetes API server must also be secured over TLS.
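As a sketch of RBAC in practice, the hypothetical Role below grants read-only access to pods in a single namespace, and the RoleBinding attaches it to one user (the namespace and user names are illustrative):

```yaml
# Role: read-only access to pods, scoped to one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
# RoleBinding: attach the Role to a single user.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: team-a
subjects:
  - kind: User
    name: jane                     # illustrative; supplied by your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Because the Role is namespaced, the same pattern scales to multitenant clusters: each tenant gets Roles and RoleBindings scoped to their own namespace.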
3. Security Policies
Kubernetes provides a few configurable policies that can be defined by the user. These should align to enterprise practices, but are not “on” by default.
- A Pod Security Policy is an admission control plugin that assures that pods are admitted only when following certain security guidelines. Policies that can be defined include limiting the creation of privileged pods, preventing containers from running as root or limiting the use of certain namespaces, networks or volumes.
- Network Policies are implemented by Container Network Interface (CNI) plugins, which control how groups of pods are allowed to communicate with each other and other network endpoints. It’s important to set network policies, because by default the pods are non-isolated (they accept traffic from any source).
- Kubernetes provides Quality of Service (QoS) guarantees for compute resources (CPU and memory) to avoid noisy neighbors or resource starvation problems, but it does not provide QoS for I/O (Storage and Networking). Hyper-converged platforms like Diamanti add support for QoS for I/O.
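Because pods are non-isolated by default, a common starting point is a default-deny NetworkPolicy per namespace, with explicit allow rules layered on top. A minimal sketch (namespace name is illustrative):

```yaml
# Deny all ingress and egress for every pod in the namespace;
# traffic must then be explicitly re-enabled with allow policies.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-a
spec:
  podSelector: {}        # empty selector matches all pods in the namespace
  policyTypes:
    - Ingress
    - Egress
```

Note that this only takes effect if the cluster's CNI plugin actually enforces NetworkPolicy.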
4. Workload Isolation and Multitenancy
In a multitenant environment, each tenant or tenant group must have a separate namespace to isolate the workload and data from each other. These separations and boundaries need to be supported by the CNI, CSI and authentication plugins, so that they are consistent across the stack.
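For example, each tenant can be given its own namespace with a ResourceQuota attached, so that a single tenant cannot exhaust shared cluster resources (the names and limits below are illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
---
# Illustrative per-tenant quota; tune the limits to your environment.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tenant-a
spec:
  hard:
    pods: "50"
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
```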
Securing Containers
Containers need to be secured both as they are being developed and while they are running. There are many great resources available for securing containers, but here are a few key elements:
1. Container Image Security
All running containers are based on an image file that can be downloaded from an open library like Docker Hub, or passed from one team to another. It is important to know where your images come from and what’s inside them. All of these initiatives should be part of the organization’s DevOps flow to automate and ensure image security.
- Image vulnerability scanning: Container images being built must be scanned for known vulnerabilities, with tools like Aqua, Twistlock, Sysdig and Clair. These tools parse through the packages and dependencies in the image, looking for known vulnerabilities.
- Image signing: Organizations should also enforce strict admission-control policies, to only admit images that are signed via corporate Notary. TUF and Notary are useful tools for signing container images and maintaining a system of trust for the content of containers.
- Limit privileges: Furthermore, organizations should avoid using a root user in a container image and prevent privilege escalation. Users inside of the containers must have the lowest level of operating system privilege necessary, in order to carry out the goal of the container.
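The least-privilege guidance above maps directly onto a pod's securityContext. A sketch (the image name is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nonroot-app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
      securityContext:
        runAsNonRoot: true                  # refuse to start if the image runs as root
        runAsUser: 10001
        allowPrivilegeEscalation: false     # block setuid-style privilege escalation
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]                     # drop all Linux capabilities
```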
2. Container Runtime
Container runtimes are the programs installed on the operating system that run containers. The majority of environments use Docker today, for which a CIS Benchmark is available. Seccomp can be used to reduce the attack surface, and newer runtimes like CRI-O come with additional built-in security features.
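For example, the default seccomp profile shipped with the container runtime can be applied through the pod's securityContext, cutting off a large set of rarely needed syscalls:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-default
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault                  # use the runtime's built-in seccomp profile
  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
```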
3. Running Containers
Many tools such as Twistlock, Aqua, and Sysdig also provide continuous monitoring and threat prevention for runtime vulnerabilities, by monitoring network and system calls. These tools also have the capability to intercept and block these unwanted calls or communications and enforce security policies.
Securing Applications
Finally, after securing the underlying infrastructure, Kubernetes, and containers, it is still important to secure the application itself.
1. Application Access
- TLS for Kubernetes Ingress: The most common practice for exposing your application outside the cluster is using an ingress controller like Envoy or NGINX. All external access to ingress controllers must be over TLS, and communication between the ingress controller and application containers should use TLS as well, although there are cases where that is not needed, depending on the network design and corporate security policies.
- Encrypt everything in transit: With the exception of a few cases, the default behavior should be to encrypt everything in transit. Even behind the corporate firewall it is advisable to encrypt network traffic between containers. Many service meshes like Istio and Linkerd provide an mTLS option to auto-encrypt the traffic within the Kubernetes cluster.
- Networking: Service meshes like Istio, Linkerd and Consul provide many Layer 7 networking features, allowing the restriction and control of traffic between multiple tenants.
- Ports: It’s important to only expose the ports on your application/containers that are absolutely essential for communication to that application.
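TLS termination at the ingress can be expressed in the Ingress resource itself. The sketch below assumes a certificate stored in a Kubernetes Secret named app-tls (the host and service names are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  tls:
    - hosts:
        - app.example.com
      secretName: app-tls         # TLS cert/key stored as a kubernetes.io/tls Secret
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-service # illustrative backend service
                port:
                  number: 443
```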
2. Application Hardening
Many DevSecOps practices should be built into the CI/CD pipeline to ensure that applications are secure and follow best practices. Some examples are:
- Analyze source code regularly to ensure it is following best practices to avoid vulnerabilities and threats. There are many tools available, like Veracode and Synopsys.
- Most developers rely on third-party applications and libraries to build their applications and microservices. Regularly scanning code dependencies for new vulnerabilities ensures that they are not a threat to the security of your application.
- Continuously test your application against common attacks, such as SQL injection and DDoS. There are various dynamic analysis tools available to assist here.
Security is always a top concern for organizations. Traditionally, though, security has been handled by a separate team working in its own silo, away from the development process. Developers usually focus on the application, and the security team gets involved only toward the end of the development cycle. This can derail deployments, as the security team holds up the process over development practices that ignored key security policies. This unhealthy interaction between security and development teams not only produces vulnerable software, it also leads to many last-minute bugs and unexpected delays in production.
In the new age of containers and Kubernetes, it is important to have robust automation of security practices; and security should be integrated into the development cycle from the beginning. DevSecOps is now the focus, as security becomes ingrained in the DevOps process. The challenge is that many of the items outlined in the above checklist must be manually configured across a multitude of domains. Missing just one of the items can put your entire application and company at risk.
Application security remains the responsibility of your developers. But the other security features relevant to infrastructure, platform, and Kubernetes can be addressed via a modern hyper-converged approach like the Diamanti platform. The Diamanti platform is a full-stack hardware and software platform for Kubernetes that builds in many of the security features mentioned in this post, alleviating the pain of implementing them yourself. This helps you easily set up your DevSecOps pipeline, so that you can focus on application development.