DevOps Demands NetOps
Building applications that run on your own servers and networks used to mean firm demarcations between what was considered a developer task and what was left to the network team. The move to cloud platforms and cloud abstractions (even on your own hardware) has changed where those divisions fall. At the same time, network operations teams have been adopting new ways of working, in part to keep up with demands from developers who are used to controlling their application’s environment without having to wait for access to resources, whether that’s VMs, storage or connectivity.
In the cloud, you don’t touch a physical network switch because clouds like Amazon Web Services, Microsoft Azure and the Google Cloud Platform don’t let developers anywhere near that level of the hardware. But developers do set up all the networking for their applications: managing IP ranges, creating virtual networks and performing networking tasks that they want to automate and operationalize.
“More and more, cloud native developers are having to grapple with networks in a way they hadn’t before,” explains Nigel Kersten, vice president of ecosystem engineering at Puppet. “Previously I had a port for my services and I set that up, but everything else was someone else’s responsibility. Setting up VPCs and auto-scaling and doing failover was traditionally in the space of the network operations team; now developers regularly do those, but they don’t think of it as network work.”
In fact, he notes, thanks to cloud platforms “all modern developers are turning into distributed systems developers and building distributed systems has always required understanding failure modes and network partitions.”
For cloud providers, that abstraction is a deliberate choice. “What customers see are the virtual network abstractions; the APIs and the language bindings and the portal experience,” Yousef Khalidi, the corporate vice president of Azure Networking, explains.
“We have done work in the network to enable that abstraction. We use switches that are very dumb and fast and we put all the intelligence in the host; that enables us to keep the network simple, reliable and highly redundant and the virtual network abstraction can do the rest. We don’t use the physical network to implement virtual network abstractions: there’s no physical network ACLs or specific routing in the switches themselves; it’s all kept in the VNET level by design. The way the physical network gets into the picture is by as much as possible being out of the picture. I would say we have failed if the customers need to know much of anything about the physical network.”
These abstractions work both for developers and for network admins who are integrating their own network with the cloud, Andrius Benokraitis, principal product manager for Networking at Red Hat, told The New Stack.
“In the cloud, servers, network, and storage are all extremely interconnected. The nearly invisible boundaries have given rise to the cloud administrator, one who in most cases is responsible and accountable to ensure all areas of the workloads are provisioned, configured, and sized appropriately for critical workloads. Cloud administrators, typically trained in the server or storage world previously, have adopted tools that can span typically isolated physical or virtual technology silos,” Benokraitis said.
Even inside enterprises, DevOps has helped to push network operations teams to adopt more integrated ways of working just to keep up.
DevOps Demands NetOps
The shift to software-defined networking in the enterprise as part of private cloud is only just beginning and is often tied to the deployment of new data centers (or to refitting existing facilities). Combined with standards for connecting to and managing existing physical network devices (SSH, NETCONF, and open APIs) and the availability of virtual instances of the network operating systems, this creates the environment for NetOps, Benokraitis says.
“With these, network professionals can now simulate and test changes in a virtual environment prior to making changes to production infrastructure. Prior to this, testing was often done in production, leading to many more outages and downtime.” That’s key to managing networks in a way that keeps up with both business and technical demands. Because the network is mission critical, changes have been slow and handled manually, which has made it fragile; that has to change.
“When IT organizations adopted agile and DevOps practices on the server and application deployment side, the pressure for the network to be as flexible became much more apparent,” Benokraitis points out. “Enterprise network engineers are under extreme pressure from the customers they serve to provide services on an on-demand basis due to its dependency on the rest of the organization. For example, waiting three to six weeks for a new VLAN when the application development team needs it immediately now brings new challenges to IT leaders.”
The move to Kubernetes and microservices increases the pressure on developers to work with networking abstractions. Networking is harder when the environment isn’t static, whether that’s autoscaling to meet demand or restarting services to deal with failures, but service meshes and proxies like Istio and Envoy and the new Network Service Mesh project simplify development by providing abstractions and building blocks. “The idea is to encapsulate a bunch of these hard problems like load balancing, HTTP/2, gRPC and so on, to do that in one place and allow that to work alongside any application language,” explains Lyft’s Matt Klein, the author of Envoy.
Decoupling the data and control planes allows developers to offload networking concerns like encryption and authorization with centralized policy management. “Developers don’t have to build encryption into every application,” said Thomas Graf, longtime Linux kernel developer and the founder of the Cilium project; “the biggest element service mesh brings is to allow, say, cluster-wide encryption of all traffic without having to dedicate a developer to that.”
That’s very different from thinking about IP addresses and physical network concepts, because the network is being used differently from when you’re building a monolithic application. “No one wants to see IP addresses any more; they want to see service names and pod labels and have persistent naming of services across different clusters,” Graf suggests. “The network is becoming an app messaging bus.”
These kinds of fundamental changes will need the same kind of cultural changes that DevOps did. Teams will need to include both developers and networking experts, with an equally hybrid mix of skills. Network admins need to learn scripting and get comfortable with tools that help them deal with declarative states using rules and filters, which is a big cultural shift from the step-by-step procedural runbook approach they’re used to, Kersten notes. “It requires a separation of concerns and the organization needs to move to a state of constant auditing of the network. I need to be able to pull the attributes of every VPC and analyze what’s in or out of compliance. The best approach is one where you set guidelines for developers [around networking] then trust but verify.”
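To make the “trust but verify” idea concrete, here is a minimal sketch of what a declarative audit might look like. It is an illustration only, not a tool that Kersten or Puppet describes: the VPC records, attribute names (`flow_logs`, `open_ports`) and the two rules are all hypothetical, and a real audit would pull those attributes from a cloud provider’s inventory API rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical VPC attribute records. In practice these would be pulled
# from a cloud provider's inventory API, which is not shown here.
VPCS = [
    {"id": "vpc-a", "cidr": "10.0.0.0/16", "flow_logs": True, "open_ports": [443]},
    {"id": "vpc-b", "cidr": "10.1.0.0/16", "flow_logs": False, "open_ports": [22, 443]},
]


@dataclass(frozen=True)
class Rule:
    """A declarative guideline: a name plus a predicate over one VPC record."""
    name: str
    check: Callable[[dict], bool]  # returns True when the VPC complies


# Hypothetical guidelines a network team might set for developers.
RULES = [
    Rule("flow-logs-enabled", lambda vpc: vpc["flow_logs"]),
    Rule("no-open-ssh", lambda vpc: 22 not in vpc["open_ports"]),
]


def audit(vpcs, rules):
    """Return {vpc_id: [names of violated rules]} for non-compliant VPCs only."""
    report = {}
    for vpc in vpcs:
        failed = [rule.name for rule in rules if not rule.check(vpc)]
        if failed:
            report[vpc["id"]] = failed
    return report


if __name__ == "__main__":
    for vpc_id, failed in audit(VPCS, RULES).items():
        print(f"{vpc_id} is out of compliance: {', '.join(failed)}")
```

The rules describe the desired state rather than the steps to reach it, which is the shift away from procedural runbooks that Kersten describes. Running the same audit on a schedule is one way to approximate the constant auditing he calls for.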