
Re-Imagining the Network for the Cloud Native Era

31 Dec 2019 9:00am, by

Vijoy Pandey
Vijoy Pandey is the VP and CTO of Cloud at Cisco, having joined Cisco in August 2018. Vijoy was previously at Google, where he held various leadership roles in the architecture, engineering and operations of Google's global data center networking footprint, Cloud networking, and its two global WANs. He also led the development of software and systems for intent-driven zero-touch automation, diagnostic telemetry, data analytics and ML/AI, and application-level awareness in the infrastructure. Prior to Google, Vijoy served in numerous CTO capacities, including CTO of Networking at IBM Cloud and at IBM Systems and Software Group and CTO of Blade Network Technologies, and led global engineering teams at Blade Network Technologies, Nortel and Alteon. Vijoy has led the industry’s automation and data analytics efforts for cloud-scale networks, and was instrumental in delivering many industry firsts, including the first intent-driven e2e automation framework at cloud scale, the first open source SDN controller, the first VM-aware switch, and the first low-latency HFT/HPC switch. He has a Ph.D. in Computer Science and holds over 60 patents in distributed systems and networking.

In the last few years, we have seen application architectures evolve dramatically and become cloud native. Monolithic applications are being broken down into microservices and serverless functions to greatly ease development and lifecycle management, increase the velocity of features, and improve the availability of the services offered.

The network has been slower to evolve, and has taken a few twists and turns on its journey to becoming cloud native and fitting seamlessly into the architectural, development, and deployment models of cloud native applications.

But why should an application developer even care about the network, and its evolution?

No one writes a monolithic, siloed, single-compute application anymore. Cloud native is a synonym for scaled-out distributed applications. And a well-behaved distributed system is synonymous with a capable, well-abstracted, highly available and secure network.

You cannot develop a successful cloud native application without paying attention to the characteristics of the network. While we all might want a network that is homogeneous, always available and infinitely secure, with zero latency and infinite bandwidth everywhere, the Fallacies of Distributed Computing, highlighted back in the early 1990s, are as valid as ever, and even more relevant now that applications are becoming cloud native.

That is not to say that those attributes of the network have not seen dramatic advancements since the 1990s. Advances in optics, wireless, switching and routing silicon technologies have ensured that in the last two decades bandwidth has increased a millionfold and latency has dropped by a factor of more than 40. With formal intent-based modeling of networks (IBN) and AI/ML-infused automation systems, availability is also creeping slowly and surely towards four nines (99.99%) and, in some environments, even towards five nines (99.999%).
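To make those availability targets concrete, the arithmetic below converts a number of nines into the downtime it allows per year. This is a minimal sketch: the function name and the 365-day year are assumptions made for illustration, not part of any particular tool.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines (4 -> 99.99%)."""
    unavailability = 10 ** (-nines)  # e.g. 4 nines -> 0.0001
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes/year")
```

Under these assumptions, four nines allows roughly 52.6 minutes of downtime a year, and five nines roughly 5.3 minutes.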

Even so, the heterogeneity of the network is here to stay due to cost, topology, and governance reasons, to name a few, and this heterogeneity will always imply that the fallacies remain true. So how can the network evolve to make distributed cloud native applications easier to develop, deploy and manage?

To understand that, let’s take a quick look at the transformation the network has been through in the last three decades, the problems it has been trying to solve so far, and how the problem space is expanding in the coming decade.

The Dawn

In the beginning, there was the box.

All infrastructure providers started as box builders: compute, storage, and networking. In the networking infrastructure arena, these boxes were tightly coupled hardware and software systems. We started connecting these systems to build larger and more capable networks and created well-defined protocols for these boxes to communicate with each other. As the complexity of the networks grew, so did the complexity of features and protocols embedded within these networking devices.

We tried stretching this complexity globally, creating building-sized data centers and connecting thousands of these data centers. These global networks became slow to operate, slow to monitor and slower to evolve.

Growing up for Planet-Scale Computing

The first wave of simplification started with Software Defined Networking (SDN), which, conceptually, tried to make network resources fungible so that software could interact with them more easily.

With the introduction of Intent-Based Networking (IBN), we started treating entire networks as singular systems, bringing in concepts such as declarative topology, configuration and policy management, real-time visibility via streaming telemetry, and artificial intelligence (AI) based closed-loop operations, helping us move towards a zero-touch network. These advances enabled incredible scale and planet-scale computing.
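The declarative, closed-loop idea can be sketched in a few lines: state the intended topology, compare it with what is observed, and apply only the difference until the two match. Everything here (the `Link` model, `plan`, `apply_changes`, the device names) is hypothetical and simulated in memory; it illustrates the pattern, not any real IBN product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Link:
    """One link in the topology (hypothetical model)."""
    src: str
    dst: str
    mtu: int = 1500


def plan(intent: set, observed: set) -> dict:
    """Diff the declared intent against the observed state."""
    return {
        "add": sorted(intent - observed),
        "remove": sorted(observed - intent),
    }


def apply_changes(observed: set, changes: dict) -> set:
    """Simulate pushing the changes; returns the new observed state."""
    return (observed - set(changes["remove"])) | set(changes["add"])


# Declared intent vs. what telemetry reports (both made up for this sketch).
intent = {Link("leaf1", "spine1"), Link("leaf2", "spine1", mtu=9000)}
observed = {Link("leaf1", "spine1"), Link("leaf2", "spine1", mtu=1500)}

changes = plan(intent, observed)
observed = apply_changes(observed, changes)
print(changes)
print(observed == intent)  # the loop has converged when this holds
```

In a real system the observed state would come from streaming telemetry and the loop would run continuously; here a single pass is enough to converge.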

The Virtualization Transition

About a decade ago, the transition to virtualization started taking place in the applications and compute space. There was a promise, and partial delivery, of better lifecycle management and agility through this transition. But in the race to get this transition complete, we all took some shortcuts. The mantra of the day was: Just Wrap It.

Large monolithic applications were wrapped in large monolithic virtual machines (VMs) and deployed. Organizationally, this was perfect. The application architectures didn’t change — a database administrator remained a database administrator. Therefore, the operational architectures and processes remained unchanged, which implied that the organization remained unchanged.

Similar shortcuts were taken in the networking space. We wrapped large monolithic networking functions into VMs. We used the same concepts of (v)Switches and (v)Routers, and the same mechanisms and protocols, such as V*LAN, subnets, and routing (BGP), to replicate everything from the physical networks in the virtual space.

And that was okay, since both virtual networks and physical networks are statistical, aggregate networks, carrying application traffic from any endpoint to any other. They are really one and the same.

Physical and virtual networks are both trying to solve the same problem space.

The Massive Transformation to Cloud Native

As cloud native architectures became pervasive because of the benefits outlined at the beginning of this article, new applications started life as microservices or serverless functions, while older monolithic applications were disaggregated into cloud native architectures.

The operational architectures and processes couldn’t sit still. A database administration operational function now explodes into operational teams for sets of microservices and functions that create the database. Therefore, the skill sets needed within an organization are also changing dramatically, changing an organization’s core structure.

Developers and cloud architects started selecting appropriate blends of on-premises software stacks and various public cloud APIs to deliver their cloud native applications and started driving infrastructure decisions.

Re-Imagining the Cloud Native Network

Unfortunately, the change doesn’t stop here. As the application’s components became thinner and thinner (microservices, functions), and geographically diverse (cloud regions, on-premises, across the globe), the connectivity problem for even a single application became much, much worse. A quick look at the service dependency graph of a cloud native application (e.g., the Monzo banking app) will give us a sense of the networking problem needing to be solved.
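As a rough illustration of how quickly this grows, the sketch below models a tiny service dependency graph and counts the connections behind a single entry point. The service names and call relationships are invented for this example and do not describe any real application.

```python
from collections import deque

# Hypothetical services and the services each one calls directly.
calls = {
    "api-gateway": ["accounts", "payments"],
    "accounts": ["ledger", "auth"],
    "payments": ["ledger", "fraud", "auth"],
    "ledger": ["db-proxy"],
    "fraud": ["db-proxy"],
    "auth": [],
    "db-proxy": [],
}


def reachable(graph: dict, start: str) -> set:
    """Every service `start` depends on, directly or transitively (BFS)."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        svc = queue.popleft()
        if svc not in seen:
            seen.add(svc)
            queue.extend(graph.get(svc, []))
    return seen


# Each direct call is a connection the network has to carry and secure.
direct_edges = sum(len(deps) for deps in calls.values())
print(direct_edges)
print(sorted(reachable(calls, "api-gateway")))
```

Even in this seven-service toy, one entry point depends on six other services over nine distinct connections; a production graph with hundreds or thousands of services multiplies that many times over.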

Additionally, this challenge sits on a different plane from the physical and virtual networks. This network, the cloud native network, is for the application developer. It has narrow and deep context and is less concerned with the rest of the traffic flowing through the network below. It follows the principles of simplified connectivity and relevant context, and uses the same activation models that are used in application development.

It’s time to rethink what connectivity, not networking, looks like to the application developer. It’s time for cloud native networking.

Image by enriquelopezgarre from Pixabay.
