How to Build The Right Platform for Kubernetes

Kubernetes is an orchestrator. It’s how you deploy, network, load balance, scale and maintain containerized apps. And each of those workloads has its own architecture, whether that’s stateful or stateless, a monolith you’ve containerized or microservices connected by a service mesh, batch jobs or serverless functions.
But you also need to think about the architecture of your Kubernetes infrastructure itself: how you build the platform where Kubernetes runs.
Kubernetes is flexible enough to deploy almost any kind of application on almost any kind of hardware, in the cloud or elsewhere: in order to be both that generic and that powerful, it’s extremely configurable and extensible. That leaves you with a lot of architectural choices to make.
These choices include whether you make all the individual configuration choices yourself, follow the default options in tools like VMware Tanzu or Azure Arc that offer a more integrated approach to deploying and managing infrastructure, or go with a managed cloud Kubernetes service that still gives you choices about the resources you deploy but offers quick starts, reference architectures and blueprints designed for common application workloads.
Planning Kubernetes Resources
Your Kubernetes infrastructure architecture is the set of physical or virtual resources that Kubernetes uses to run containerized applications (and its own services), as well as the choices that you make when specifying and configuring them.
You need to decide what virtual machines (or bare-metal hardware) you need for the control plane servers, cluster services, add-ons, data store and networking components; how many nodes your clusters need; and how much memory and vCPU those nodes should have, based on the workload and resource requirements of your pods and services.
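Node sizing starts from what each pod asks for. Here’s a minimal sketch of per-container requests and limits; the pod name, image and numbers are all illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-server                 # illustrative name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.4.2   # hypothetical image
      resources:
        requests:                  # what the scheduler reserves on a node
          cpu: "250m"
          memory: "256Mi"
        limits:                    # hard ceiling enforced at runtime
          cpu: "500m"
          memory: "512Mi"
```

Summing the requests across the pods you expect to run, plus headroom for system daemons and any control plane components you host yourself, gives you a first estimate of node count and size.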
Autoscaling lets you adjust capacity up or down dynamically, but you need to have the underlying capacity available.
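A HorizontalPodAutoscaler, for example, scales pod replicas against a CPU target, but only within the node capacity your cluster (or a cluster autoscaler) can actually provide. The deployment name and thresholds here are hypothetical:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                      # hypothetical deployment to scale
  minReplicas: 2
  maxReplicas: 10                  # capped by what your nodes can actually hold
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas above 70% average CPU
```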
You also need to think about the best platform for hosting your Kubernetes clusters: infrastructure in your own data center, at the edge, with a hosting provider, or in a public, private or hybrid cloud. Some of that will be dictated by the needs of your workloads: if they’re primarily stateless (or if it’s easy to store that state externally), you can keep cloud costs down by using spot instances that are deeply discounted but might also be interrupted suddenly. You need to know something about the size, complexity and scalability of the applications you plan to run and the amount of control and customization you’ll need, as well as factoring in the performance, availability and cost of the resources you’ll be using.
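To return to the spot example: on a managed service, spot capacity usually arrives as a tainted node pool, and a workload opts in with a matching toleration. This sketch assumes the AKS-style spot taint; other providers use their own keys, and the workload names and image are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker               # hypothetical stateless workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      tolerations:
        - key: kubernetes.azure.com/scalesetpriority   # AKS spot node pool taint
          operator: Equal
          value: spot
          effect: NoSchedule
      containers:
        - name: worker
          image: registry.example.com/worker:2.0       # hypothetical image
```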
Originally, Kubernetes was built with the assumption that all the hardware it was running on would be fundamentally similar and effectively interchangeable, because it was developed to take advantage of the commodity servers common in cloud Infrastructure as a Service (IaaS).
But even in the cloud, different workloads still need very different resources and Kubernetes has evolved to support much more heterogeneous infrastructure: not just Windows nodes as well as Linux, but GPUs as well as CPUs, Arm processors as well as x86. There is even the option to use certain classes of Linux devices as nodes.
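You steer workloads onto the right class of hardware with node labels and extended resources. This sketch shows both mechanisms together, though they’re independent: a nodeSelector pinning a pod to Arm nodes via a well-known kubelet label, and a GPU request that assumes the NVIDIA device plugin is installed to expose nvidia.com/gpu. The pod name and image are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: training-job               # illustrative name
spec:
  nodeSelector:
    kubernetes.io/arch: arm64      # well-known label set by the kubelet
  containers:
    - name: trainer
      image: registry.example.com/trainer:1.0   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1        # extended resource from the NVIDIA device plugin
```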
If you’re using cloud IaaS for your Kubernetes virtual machines or a managed cloud Kubernetes service like AKS or EKS, you can choose the appropriate instances for your VMs. If you’re building your own Kubernetes infrastructure at the edge, you might pick Arm hardware or consumer-grade Intel NUCs to run a less demanding Kubernetes distribution like k3s in a restaurant or retail store, where you don’t have the facilities for data-center-grade hardware.
Depending on the Kubernetes distribution you choose, you may also need to think about the host OS you want and which container runtime you’re going to use. Will you run your own container registry or only pull images from public registries? Where will you store secrets? Using HashiCorp Vault or a managed key store from your cloud provider means you won’t have credentials in your deployment pipeline where they might leak.
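On the registry question: pods pulling from a private registry need credentials, and the standard pattern is a docker-registry secret referenced through imagePullSecrets. The registry URL and names below are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: internal-app
spec:
  imagePullSecrets:
    - name: registry-cred          # created with: kubectl create secret docker-registry ...
  containers:
    - name: app
      image: registry.internal.example.com/team/app:1.0   # placeholder private registry image
```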
Multi-Cluster K8s Infrastructure Architecture
You also need to think about possible failure: do you need highly available clusters that run multiple replicas of key control plane components or will you be running a multi-cluster architecture?
For smaller Kubernetes infrastructure, you can separate different workloads using namespaces: logical partitions that let you isolate and manage different applications, environments and projects on one cluster. But you can also use a single Kubernetes control plane to manage multiple clusters of nodes, putting workloads on distinct clusters for better security and performance.
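A namespace plus a ResourceQuota is often all it takes to stop teams on a shared cluster from starving each other; a minimal sketch with illustrative names and numbers:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-payments              # hypothetical team namespace
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-payments-quota
  namespace: team-payments
spec:
  hard:
    requests.cpu: "8"              # total CPU the namespace may request
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"                     # cap on pod count in the namespace
```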
If you have regulatory requirements or strict limits on what latency is acceptable, need to enforce different policies and permissions, or want to avoid a single point of failure for an application that requires zero downtime, this lets you orchestrate applications in different locations – including on different cloud providers – but still have one place to access that infrastructure. That simplifies migrating applications from cluster to cluster, whether that’s for scaling or disaster recovery, although it also introduces significant complexity.
Networking Your Kubernetes Infrastructure
You also need to plan your service discovery options and network topology, including the firewall and VPN connectivity, as well as the network plugins, DNS settings, load balancer and ingress controller for the cluster.
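Whichever ingress controller you choose, routing is described declaratively. Here’s a minimal Ingress sending one hostname to a backend service, assuming the NGINX ingress controller is installed; the hostname and service name are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  ingressClassName: nginx          # assumes the NGINX ingress controller is deployed
  rules:
    - host: app.example.com        # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web          # hypothetical backend service
                port:
                  number: 80
```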
Think about access management: you will need role-based access control (RBAC) to enforce fine-grained permissions and policies for your users and resources, and make sure you’re securing admin access. But you also need to manage machine identities for workloads that need access to existing data stores.
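RBAC covers both: a Role grants fine-grained verbs in a namespace, and a RoleBinding can attach it to a workload’s ServiceAccount (its machine identity) as easily as to a human user. All the names here are hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: config-reader
  namespace: team-payments
rules:
  - apiGroups: [""]                          # core API group
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list"]                   # read-only access
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-config-reader
  namespace: team-payments
subjects:
  - kind: ServiceAccount
    name: payments-app                       # the workload's machine identity
    namespace: team-payments
roleRef:
  kind: Role
  name: config-reader
  apiGroup: rbac.authorization.k8s.io
```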
Native Kubernetes user authentication uses client certificates: if you need centralized control and governance for user access, you will probably want to plug in your existing identity provider instead, typically over OpenID Connect (OIDC).
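On a self-managed control plane, that hookup is a set of kube-apiserver flags; managed services expose the same thing through their own configuration. A sketch of the relevant excerpt from a kube-apiserver static pod manifest, with a placeholder issuer and client ID:

```yaml
# Excerpt from a kube-apiserver static pod manifest (not a complete spec)
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --oidc-issuer-url=https://login.example.com   # placeholder IdP issuer
        - --oidc-client-id=kubernetes                   # placeholder client ID
        - --oidc-username-claim=email                   # map token claim to username
        - --oidc-groups-claim=groups                    # map token claim to groups
        # ...other flags unchanged
```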
Architect for Managing Kubernetes
Kubernetes is built to make it easy to scale applications: while you can make manual changes to individual settings like liveness and readiness probes, it’s really designed for declarative configuration management. You write configuration files in YAML (or use a tool that emits those for you) to tell Kubernetes how an application should behave, and Kubernetes handles making that happen.
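Probes are a good example of that declarative style: you state what healthy means and the kubelet does the checking. The endpoints, port and timings here are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: app
      image: registry.example.com/web:1.0   # hypothetical image
      livenessProbe:                # restart the container if this fails
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:               # withhold traffic until this passes
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
```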
Instead of tweaking settings, you should focus on automating for repeatability using Infrastructure as Code: define the configuration as version-controlled, auditable code and apply it as often as you need (or reapply it if there’s a problem), getting the same system every time.
Repeatable, immutable infrastructure where you treat clusters as cattle (rather than pets that you name and hug and care about individually) is what Kubernetes is designed for. Preparing for that is how you reduce the effort of ongoing management and actually operating containers in production.
You can extend this beyond application delivery to policy management and governance using a GitOps workflow with Flux or Argo CD, which deploys application updates and keeps clusters in the desired state all the way from bootstrapping to configuration updates.
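With Argo CD, for instance, the desired state lives in Git and an Application resource points the cluster at it. The repository URL and paths below are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-config
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/config.git   # placeholder repo
    targetRevision: main
    path: clusters/production                              # placeholder path
  destination:
    server: https://kubernetes.default.svc                 # the local cluster
    namespace: default
  syncPolicy:
    automated:
      prune: true        # delete resources removed from Git
      selfHeal: true     # revert manual drift back to the Git state
```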
You’ll also want to collect metrics and track performance: most workloads emit Prometheus metrics, but you’ll need to think about a monitoring dashboard and what logging you want to enable. And you’ll need to monitor your container infrastructure for threats and security risks, as well as making sure your VM hosts are appropriately hardened. Again, thinking about the tools and processes you’ll use for that while you’re planning your Kubernetes infrastructure architecture will make it easier to make sure you don’t miss anything.
Understanding Kubernetes Architecture
Putting all of this together isn’t trivial and you can learn a lot from how other Kubernetes users have structured their infrastructure architecture.
“You’re trying to acquire eight years of Kubernetes development before you can be productive with it. That’s too much to ask. You need an almanac that helps you navigate and avoid the icebergs,” cautioned Lachlan Evenson, former Kubernetes release lead and steering committee member. Evenson co-authored “Kubernetes Best Practices” with Kubernetes co-founder Brendan Burns as a companion guide that offers some of that.
But you should still expect to spend time figuring out what infrastructure architecture will best suit your particular workloads and acquiring the expertise to run it.