Q&A: Building on 40 Years of Experience with Distributed Systems at Nokia

On the cloud native journey, there are general lessons and best practices that apply to nearly all companies, as well as industry-specific challenges. Cloud native journeys aren’t one-size-fits-all; the best way to handle storage, networking, security and even backups depends on the specifics of both the industry and the individual company.
We spoke with Gergely Csatári, senior open source specialist at telecommunication equipment giant Nokia, about the specific challenges telecom companies face as they adopt containers and Kubernetes, and how the industry is addressing them. Here’s what he had to say about best practices, both for other telecoms and for everyone else making the move to cloud native.
Can you give an overview of Nokia’s cloud native journey? When did it start, and how mature is it today? What have been the major turning points on the journey?
Nokia, like many other telecom infrastructure vendors, has a long history of implementing and using massively distributed systems. We started to build our own cluster management system back in the 1970s, which was based on proprietary hardware and software.
As network function virtualization (NFV) technology gained momentum in 2016, we began offering infrastructure solutions and virtual network functions (VNFs) for the ETSI NFV Management and Orchestration (MANO) standard, leveraging OpenStack. Today, Nokia’s CloudBand MANO solution runs in hundreds of cloud deployments, encompassing tens of thousands of servers, at more than one hundred service providers around the globe.
In late 2017, we started to explore Docker containers as a means to deliver the same software stack for every telecom operator regardless of the infrastructure they wanted to run their network on, including OpenStack, bare-metal servers, and public cloud platforms. Not long after getting started with Docker containers, we became impressed by the simplicity of label-based scheduling in Kubernetes and started to use it in pre-1.0 versions.
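For readers unfamiliar with label-based scheduling, here is a minimal sketch of the idea behind a Kubernetes `nodeSelector`: a Pod is only eligible for nodes whose labels contain every key/value pair in the selector. This is an illustration rather than Kubernetes or Nokia code; the node names, label keys, and helper functions are all hypothetical.

```python
from typing import Dict, List

Labels = Dict[str, str]


def matches(node_labels: Labels, node_selector: Labels) -> bool:
    """Return True when every key/value pair in the selector is on the node."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


def eligible_nodes(nodes: Dict[str, Labels], node_selector: Labels) -> List[str]:
    """Return the sorted names of the nodes that satisfy the selector."""
    return sorted(name for name, labels in nodes.items() if matches(labels, node_selector))


# Hypothetical cluster: node names and label keys are made up for illustration.
nodes = {
    "node-a": {"infra": "openstack", "dataplane": "sriov"},
    "node-b": {"infra": "bare-metal", "dataplane": "sriov"},
    "node-c": {"infra": "bare-metal"},
}

# Hypothetical workload that needs a bare-metal node with an SR-IOV data plane.
selector = {"infra": "bare-metal", "dataplane": "sriov"}

print(eligible_nodes(nodes, selector))  # ['node-b']
```

An empty selector matches every node, which mirrors how a Pod without a `nodeSelector` is not restricted by node labels.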
Our first cloud-native, Kubernetes-managed telecom product, the Nokia Telephony Application Server, went live in early 2018. Today, we develop all our products using cloud native principles, such as being infrastructure agnostic and using open APIs, and we continuously re-architect our portfolio to be Cloud Native Network Functions (CNFs).
Kubernetes and cloud native technologies were developed primarily for IT workloads and environments, and we found that their networking capabilities fall short of telecom needs. To address this, Nokia developed a proprietary networking plugin, which we later transformed into a CNI plugin and open sourced as project DANM (Damn, Another Network Manager!). It is available on GitHub.
Currently, we are rethinking the way telecom applications and application suites should be managed in the cloud native world. Our approach is to rely on the orchestration capabilities of Kubernetes instead of external entities and use only two levels of orchestration as opposed to the traditional MANO stack.
What have been the most surprising aspects of moving to cloud native? What has been surprisingly difficult? What has been surprisingly easy?
Using Kubernetes’ cluster management capabilities instead of our proprietary solution was surprisingly easy. On the other hand, we are a bit disappointed in the level of support for networking in Kubernetes. As telecom vendors, we need multiple interfaces per Pod and, to achieve this, we built CNI multiplexer capabilities into DANM.
The biggest challenge currently is the differences between Kubernetes instances, which can cause integration difficulties. In some cases they require changes to our deployment artifacts, which erodes the deploy-anywhere ethos of the cloud native world. The Cloud iNfrastructure Telco Task Force (CNTT) is working on the standardization of Kubernetes environments for telecom workloads, and we expect these problems to diminish with time.
What open source technology do you rely on? Can you give an overview of the tech stack (both open and closed source)?
We offer Nokia Container Services, our telecom-optimized Kubernetes distribution, as container infrastructure. It includes Kubernetes, DANM, CPU Pooler, and several other infrastructure projects.
On the application level, Nokia maintains a centrally managed software infrastructure catalog of more than 60 cloud native software components that our products are built on, including Prometheus and Kubeflow.
What advice do you have for other companies on the cloud native journey?
Always rely on upstream solutions and always introduce changes to the upstream projects instead of forking.
Feature image via Pixabay.