How Low Can Kubernetes Go? Rancher, Arm Team to Find Out
Rancher Labs has teamed up with Arm to bring Kubernetes management to Arm-based clusters running on edge and data center nodes. It’s basically a matter of figuring out how small Kubernetes can go, according to Sheng Liang, CEO and co-founder of Rancher Labs.
The move, combining Rancher Kubernetes Engine (RKE) and RancherOS with Arm Neoverse-based servers, provides enterprises a single way to manage Kubernetes in x86 clusters as well as their IoT deployments. The two companies are working to deliver the Kubernetes-based solution for Smart City projects in China.
IoT projects typically involve low-power edge devices that run in a camera or air-conditioning system. Edge nodes of 5 to 8 gigabytes function as IoT controllers, doing quite a lot of data processing to reduce the amount of data sent back to the data center. Users send applications out to these nodes, and those applications are updated fairly frequently. Users run these data aggregation and data ingestion apps on Kubernetes at the edge, then run larger Kubernetes clusters in the data center, Liang said.
Rancher Kubernetes Engine (RKE) is a lightweight installer that packages Kubernetes in Docker containers. It removes the dependencies that standard installers have on the underlying infrastructure and makes installing and upgrading a cluster fast and easy.
The biggest challenge in retooling for Arm servers has been that edge nodes often lack stable network connectivity. The nodes can’t form a cluster unless they’re right next to each other, and they typically aren’t, so they end up as single-node clusters, Liang said.
“Kubernetes is designed to have a fairly tightly coupled system. If it doesn’t hear from a node for a while – 10 minutes, 20 minutes — it’s designed to try to get that running again. We tried turning all that stuff off, but finally realized it’s actually easier to run all these as single-node clusters. I think we’re pushing the envelope a little bit to see how much smaller Kubernetes can shrink,” he said.
The work is ongoing for IoT use. It has involved resource optimizations, like taking out data center-oriented features such as load balancing.
“To be honest, Kubernetes really struggles on any node lower than 8 gigabytes,” he said. “If you have 8 gigabytes and it takes 2 to run Kubernetes, you still have 6 left. But if you start with 4 gigabytes, half the resources are in use before any useful workload is deployed. So we’re in the process of shrinking the footprint.”
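Liang's arithmetic can be made explicit with a small sketch. The 2-gigabyte overhead is the illustrative figure from his quote, not a measured value, and the function name `usable_fraction` is hypothetical.

```python
def usable_fraction(node_gb: float, overhead_gb: float = 2.0) -> float:
    """Share of a node's memory left for workloads after Kubernetes itself.

    overhead_gb defaults to the 2 GB figure quoted above; it is an
    illustrative assumption, not a benchmark result.
    """
    if node_gb <= 0:
        raise ValueError("node_gb must be positive")
    # Clamp at zero so a node smaller than the overhead reports no headroom.
    return max(node_gb - overhead_gb, 0.0) / node_gb


if __name__ == "__main__":
    for size in (8, 4):
        print(f"{size} GB node: {usable_fraction(size):.0%} left for workloads")
```

On those assumptions an 8-gigabyte node keeps three quarters of its memory for applications, while a 4-gigabyte node keeps only half.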
The Rancher Kubernetes-based platform for IoT and edge nodes will be generally available in early 2019. It includes RancherOS 1.5 and an Arm port of Rancher 2.1, introduced in October 2018. In November, the company announced availability for its Kubernetes platform on China’s three largest cloud providers.
Rancher also has made the monitoring tool Prometheus multi-tenant, something clients have been requesting for over a year, according to Liang.
“If you have multiple people [or groups] sharing a Kubernetes cluster, Prometheus itself is actually single tenant. So if you log into Prometheus, I’d be able to see all your stuff. People don’t like that,” he said. “A lot of people had each person starting their own Prometheus deployments, and that has challenges — the need to manage these things. And that’s wasteful. It caused a lot of duplication. It was just very, very resource-heavy.
“So we created a secure proxy in front of Prometheus. We didn’t fundamentally change Prometheus to make it multitenant, but we were able to capture all the queries into Prometheus’ time-series database, then depending on the person or the project it was for, we were able to instrument those queries so that if I belong to [certain] namespaces, I would only be able to see the metrics for those namespaces.”
It works this way:
- Rancher deploys a Prometheus operator into each new cluster.
- The cluster-wide Prometheus deployment is used to store cluster-level metrics such as node CPU and memory consumption, as well as project-level metrics collected from applications deployed by individual users.
- The project-level Grafana talks to Prometheus through a secure proxy, which instruments PromQL statements to ensure only the namespaces belonging to the user’s project are included in the query.
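The query rewriting in the last step can be illustrated with a minimal sketch. This is not Rancher's proxy code: the function name `enforce_namespaces` is hypothetical, and the sketch only handles a single plain selector such as `metric{label="value"}`. A real proxy would need a full PromQL parser to cover functions, operators and range vectors.

```python
import re
from typing import Sequence

# A single plain selector: metric name plus an optional {matchers} block.
_SELECTOR = re.compile(r'^\s*([a-zA-Z_:][a-zA-Z0-9_:]*)\s*(?:\{(.*)\})?\s*$')
# One label matcher, e.g. job="api" or pod=~"web-.*".
_MATCHER = re.compile(
    r'([a-zA-Z_][a-zA-Z0-9_]*)\s*(=~|!~|!=|=)\s*"((?:[^"\\]|\\.)*)"'
)
# Kubernetes namespace names are DNS labels, so they need no regex escaping.
_NAMESPACE = re.compile(r'^[a-z0-9]([-a-z0-9]*[a-z0-9])?$')


def enforce_namespaces(query: str, allowed: Sequence[str]) -> str:
    """Rewrite a plain selector so it can only match the allowed namespaces."""
    if not allowed:
        raise PermissionError("user has no namespaces in this project")
    for ns in allowed:
        if not _NAMESPACE.match(ns):
            raise ValueError(f"invalid namespace name: {ns!r}")

    parsed = _SELECTOR.match(query)
    if parsed is None:
        raise ValueError("this sketch only handles a single plain selector")
    metric, body = parsed.group(1), parsed.group(2) or ""

    # Reject anything in the braces that is not a well-formed matcher,
    # rather than silently dropping it.
    leftover = re.sub(r'[\s,]', '', _MATCHER.sub('', body))
    if leftover:
        raise ValueError("this sketch only handles a single plain selector")

    # Discard any namespace matcher the user supplied, then add the enforced one.
    kept = [
        f'{label}{op}"{value}"'
        for label, op, value in _MATCHER.findall(body)
        if label != "namespace"
    ]
    scope = 'namespace=~"' + "|".join(allowed) + '"'
    return metric + "{" + ",".join([scope] + kept) + "}"
```

For a user whose project owns `team-a` and `team-b`, a query for `http_requests_total{job="api",namespace="other"}` comes back as `http_requests_total{namespace=~"team-a|team-b",job="api"}`. Prometheus regex matchers are fully anchored, so the alternation matches only those exact namespace names.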