How CERN Accelerates with Kubernetes, Helm, Prometheus and CoreDNS
CERN, the European Organization for Nuclear Research, is known for its particle accelerator and for its experiments on subatomic particles, antimatter and other areas of particle physics. CERN is also where the World Wide Web (WWW) was created.
All told, CERN now manages over 500PB of data, more than half an exabyte. That total is expected to reach 5,000PB within a decade as a new accelerator goes online, said Ricardo Rocha, a staff researcher at CERN.
In this episode of The New Stack Analysts, we learn from Rocha how CERN is adapting in the next few years to manage 10x the data it manages now.
Alex Williams, founder and publisher of The New Stack, hosted the podcast with co-hosts Cheryl Hung, vice president of ecosystem at the Cloud Native Computing Foundation (CNCF), and Dave Zolotusky, senior staff engineer at Spotify.
Kubernetes plays a big part in CERN’s infrastructure. To manage its clusters, CERN runs its own on-premises private cloud built on OpenStack. By deploying its Kubernetes clusters on top of OpenStack, CERN has “an experience that is very similar to what public clouds also offer,” said Rocha. This involves integrating the Kubernetes clusters with the cloud provider for autoscaling and with CERN’s storage systems. OpenStack also helps CERN manage its legacy systems.
“What we offer our users is very much an experience of Kubernetes-as-a-service,” said Rocha.
While it might be tempting to let users migrate to Kubernetes with whatever tools and scripts they wish, Helm adds discipline and tighter control to the process. CERN offers internal training to help users migrate their workloads with Helm, Rocha explained.
“When we first had people being introduced to Kubernetes, it was very tempting just to get things running,” said Rocha. “But we needed more to make sure that things were maintainable… Helm came as a very good option.”
Helm is also playing a role in CERN’s adoption of GitOps and the ability to ensure that “well-structured definitions are kept in git,” while removing the need for users “to directly touch the clusters,” said Rocha. “This is the next step in our Helm pathway,” he added.
Adopting the right monitoring tool is also critical. CERN relies largely on Prometheus for its cluster metrics, and Prometheus is integrated with the clusters by default. Users can also add their own custom metrics. “This is pretty standard for Prometheus-based deployments,” Rocha said.
CERN’s infrastructure is centrally managed with few isolated tenants and data environments within its network layers. Its DNS infrastructure management is also centralized and managed by a dedicated team.
CERN’s DevOps team did struggle with its initial rollout of CoreDNS. “We had some issues making it scale to the amount of requests we would have inside the cluster. But we got to a point where it basically autoscaled to the amount of instances we needed,” said Rocha. “And, [the issues] basically disappeared. It’s not a subject we ever touch any longer, which is a good sign.”
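Rocha does not say which mechanism CERN used to autoscale CoreDNS. One common approach is to scale the number of DNS replicas in proportion to cluster size, and the sketch below illustrates that linear rule under stated assumptions. The function name, parameter names and default ratios are hypothetical and chosen only for illustration, not drawn from CERN’s configuration.

```python
import math


def proportional_replicas(
    nodes,
    cores,
    nodes_per_replica=16,    # assumed ratio: one DNS replica per 16 nodes
    cores_per_replica=256,   # assumed ratio: one DNS replica per 256 cores
    min_replicas=1,
    max_replicas=None,
):
    """Return a DNS replica count that grows linearly with cluster size.

    The count is the larger of the node-based and core-based estimates,
    clamped to the [min_replicas, max_replicas] range.
    """
    by_nodes = math.ceil(nodes / nodes_per_replica)
    by_cores = math.ceil(cores / cores_per_replica)
    replicas = max(by_nodes, by_cores, min_replicas)
    if max_replicas is not None:
        replicas = min(replicas, max_replicas)
    return replicas


if __name__ == "__main__":
    # A hypothetical 100-node, 800-core cluster.
    print(proportional_replicas(nodes=100, cores=800))
```

Taking the larger of the two estimates means a cluster with many small nodes and a cluster with a few very large ones both end up with enough DNS instances, while the upper bound keeps the replica count from growing without limit.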