How to Cut Through a Thicket of Kubernetes Clusters
Kubernetes clusters and microservices have opened up a plethora of new possibilities for developing and running modern applications, and they bring many advantages including distributed architecture, increased redundancy, high availability and nondisruptive upgrades. But as with most things, those benefits come with challenges.
Due to the nature of Kubernetes and growing interest in the technology, there are more and more options for running it. Today, we can get Kubernetes in a public cloud or on premises. If we go to the public cloud, we can choose one of the many hyperscalers (or even more than one) that offer their own managed Kubernetes services. We also have a variety of choices on premises, such as VMware Tanzu, Red Hat OpenShift, Rancher and many more solutions.
But why not install Kubernetes clusters on our own from scratch so that we have everything under our own control? The reason is pretty simple: It’s time consuming and not an easy task. If you’ve never tried, I’d recommend checking out Kelsey Hightower’s “Kubernetes the Hard Way.”
Because of this, many organizations use either an offering from a hyperscaler or one of the on-premises options. For a mid- to enterprise-size environment, it’s common to go with a combination of the two. This multicloud approach helps avoid being locked in with a particular solution or vendor, and it’s also a way to build redundancy and resilience into an infrastructure.
As platform engineers, we need to manage and maintain dozens, hundreds or even thousands of Kubernetes clusters using different platforms and solutions — what is often described as Kubernetes cluster sprawl.
That might not sound too scary, until you start thinking about that management. How can you ensure that these clusters are conformant and follow security standards, especially if your organization is bound by security regulations?
Think about access, resource, security and network policy management, image restriction enforcement, as well as package and Kubernetes life-cycle management.
Defining a YAML manifest with some policies and applying it to a single cluster might not sound like a huge challenge, but doing it at scale, tens or hundreds of times, with different clusters needing slightly different policies, quickly gets more complicated.
This degree of management requires a mindset change, especially if you’ve got roots in more traditional infrastructure management and perhaps have only a couple of big hypervisor clusters hosting your virtual machines.
Current Challenge: Manage Diverse Kubernetes Clusters
So how should one go about managing all these clusters on different platforms? That’s a question I hear frequently from colleagues who are platform engineers, and it’s also a challenge I’ve been dealing with in my own organization.
Every Kubernetes cluster provides basic resources that can be used to define policies. Let's consider network policies for a moment. I can create a YAML manifest and apply it to any cluster with simple automation. Pretty easy, right? However, it's easier said than done.
First, I need a YAML manifest that defines the config. Not a big deal. Even if I'm not too confident writing YAML manifests, I can still use tools like the network policy editor provided for free as part of the Cilium project. But how do I group my clusters and ensure the proper YAML is applied to the correct cluster? And most importantly, how do I ensure that my clusters remain compliant with the configuration we previously defined?
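To give a sense of what such a manifest looks like, here is a minimal default-deny ingress NetworkPolicy in standard Kubernetes YAML (the `payments` namespace is a made-up example):

```yaml
# Illustrative example: block all ingress traffic to pods in the "payments"
# namespace unless another NetworkPolicy explicitly allows it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: payments      # hypothetical namespace for this example
spec:
  podSelector: {}          # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress              # no ingress rules listed, so all ingress is denied
```

Applying this to one cluster is a single `kubectl apply -f` away; the hard part, as noted above, is applying the right variant to the right cluster and keeping it that way.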
With network policies, at least we have an editor. What about other configurations, such as role-based access control (RBAC), security settings and so on? There are some dedicated tools and editors we can use, but that approach doesn't scale well unless we can assign a large team of people to take care of it.
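As a sketch of what raw RBAC configuration looks like, here is a standard read-only Role and matching RoleBinding (the namespace and group name are invented for the example):

```yaml
# Illustrative example: grant read-only access to pods in one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: payments            # hypothetical namespace
rules:
  - apiGroups: [""]              # "" means the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: payments
subjects:
  - kind: Group
    name: platform-viewers       # hypothetical group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Now imagine maintaining dozens of such manifests, with per-cluster variations, across hundreds of clusters. That is exactly the management burden the tooling below tries to address.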
Fortunately, key players in the Kubernetes landscape have noticed these challenges and provide more comprehensive solutions to help relieve the pain, save time, increase productivity and standardization, and decrease time to market. Some of the available options we can choose from today include:
- VMware Tanzu Mission Control
- Google Anthos Config Management
- Azure Arc for Kubernetes
- Rancher Server
- Red Hat Advanced Cluster Management for Kubernetes
And that’s not a complete list, which doesn’t make the choice any easier. How should leaders choose the best one for their organization? The decision might be partially based on personal and corporate preferences, or on the fact that they already use a specific vendor’s platform.
It should go without saying, but such a decision should always be made based on an organization’s specific set of requirements. We don’t buy a product just for the sake of buying a product or because it’s “nice and shiny.” 😊
However, when trying to identify the best solution for your environment, there are a few features that can be lifesavers, or at least time savers, that eventually affect the bottom line:
- Diverse Kubernetes cluster management: Try to avoid solutions that are limited to a single platform. Just because you use a single platform today doesn’t mean you won’t be using others next year.
- Policy-driven management: A product should provide a relatively straightforward way to define policies, preferably without requiring deep experience with YAML manifests. Some of the most useful configurations that can be managed via policies include, but are not limited to, network (firewall rules), security, image management, RBAC and resource quotas.
- Life-cycle management: Being able to easily upgrade your clusters to newer versions of Kubernetes at scale is important when you consider how frequently new releases become available.
- Package management: There are plenty of additional components you might need to get installed on your Kubernetes clusters. A feature that lets you install them remotely in a centralized way is a must-have.
- Cluster group management: Look for the ability to define various structures to group clusters and namespaces based on the environment type, its criticality, service-level agreement or any other factors applicable to your organization. This might be related to the concept of multitenancy, but it doesn’t have to be.
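To make the policy-driven point concrete, this is the kind of configuration such a platform would render and push to many clusters at once. A standard Kubernetes ResourceQuota manifest, for instance, looks like this (the namespace and limits are illustrative):

```yaml
# Illustrative example: cap the compute a single team's namespace can consume.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: payments      # hypothetical namespace for this example
spec:
  hard:
    requests.cpu: "10"     # total CPU requested by all pods in the namespace
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"             # maximum number of pods
```

A good management platform lets you express this once as a policy, target it at a group of clusters, and report on drift, rather than leaving you to apply and audit hundreds of such files by hand.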
Additional capabilities and features can be considered a plus, but in my opinion, these are the most important ones to look for to help streamline Kubernetes cluster management in multicloud environments, even at very large scale.