When Web Scale Isn’t Enough: Containers in the Telecommunications Space

The Docker container was a solution to a scale problem: Web services workloads, especially microservices, could not scale up in the traditional manner. Containers provide a simpler, commodifiable solution: isolate the functions in those workloads, scale those isolated units, and often reorganize the underlying network to suit the new scale.
Yet there is a set of systems whose workloads are so large that containers must still be customized to accommodate the demand: those of Internet-based telecommunications.
Call to Action
Digium, based in Huntsville, Alabama, produces software-based telecommunications systems, such as VoIP exchanges and unified communications (UC) consoles, built on an open source Session Initiation Protocol (SIP) platform called Asterisk. In much the same way Docker Inc. has become the commercial champion of the now-standardized containerization platform, Digium is the acknowledged champion of Asterisk.
At AstriCon, an Asterisk users’ convention held last October, two engineers for a call center platform provider called Avoxi, one of Digium’s main partners, introduced developers, some for the first time, to Docker. Engineering Product Development Director Leif Madsen and CTO Darren Sessions presented the container platform in the context of other open source initiatives we also haven’t talked much about here: names like Kamailio, Calico, and Sippy.
These are the elements of a new stack in the telecommunications world, a stack that is remaking the telecom industry and telcos the way Docker has remade Web services. But it’s a different stack, and at AstriCon, Sessions and Madsen explained why, using the kind of graph that the folks who created Hadoop first showed the world when explaining the conundrum that relational databases presented them.
Docker doesn’t scale… at least, not the way a containerization platform would need to scale if it were to handle VoIP traffic. And in a manner reminiscent of those Hadoop folks who changed the world a decade ago, Sessions and Madsen are working on a solution.
“When we talk about all these new and exciting technologies, we have to keep in mind that these technologies were built for Web scale,” said Sessions very plainly. By that, he didn’t mean “Web scale” the way marketers tend to project it, as if the Web were the biggest thing ever: bigger than a Lockheed Martin C-5M, bigger than Donald Trump’s ego, bigger than Scott Fulton’s pile of handwritten notes.
He meant Web scale is “way too small.”
“It just means that these projects were designed to scale Web services specifically,” the CTO explained. “When we talk about using applications outside of that realm, the dynamics of how these Web-scaling technologies are configured to support these non-Web-based applications, changes dramatically.”
The Web, you see, is designed to scale things with a common context. A Uniform Resource Locator was designed to point a route through a server to something the client urgently needs. Hypertext was originally supposed to be a fabric threaded with textual, lexical contexts for a Web made of ideas.
When the Web transformed into a Web of functions, URLs still made sense because functions in applications still share common contexts. But context is a dimension that does not apply to traffic.
The flow of traffic is a different beast. And while a Docker container appears to be a superb vehicle for a unit of traffic, the network upon which Docker containers are exchanged simply is not.
“It is virtually impossible to use Asterisk with any of these platforms out-of-the-box in a production environment,” said Sessions, “without a significant amount of work.”
“When we talk about scaling and Docker containers,” he continued, “networking is also a critically important consideration. Networking containers isn’t as easy as you think. The problem Docker was initially conceived to solve was a problem with scaling Web services. And to that end, Docker, CoreOS, and a lot of these really cool, Web-scale projects present a lot of bias to solving Web-scale problems. The obvious problem for us is that we’re VoIP users.”
Another Dimension
Since the turn of the century, there has been an open source SIP router project. Through the years, intellectual property rights holders have compelled the project’s stewards to change its name more than once. Today, two major branches of the original project are OpenSIPS and Kamailio (a Hawaiian word, pronounced “ka-ma-ee-li-o”). Asterisk, meanwhile, is a telephony engine, whose workload unit is not a function or a record but a call.
More specifically, in telephony, a call is considered a coordinated session of signals. Kamailio is designed to process signals in such a way that it abstracts the complexity of the IP network from the client, presenting instead the appearance of processing signals in the way that telephony experts would expect.
When the notion of an Internet-based (IP-based) network handling telephone calls was first proposed, those who objected cited the real-time nature of telephony, and noted that IP packets were designed to be received out-of-sequence and then reassembled—a process they said would take too long. Internet speeds accelerated to such an extent in the intervening time that latency dwindled down to a negligible factor.
But that was before SIP was considered as a medium for delivering conferencing traffic. If you understand the nature of a telephone call, you know that it’s two-way. On a wired circuit, a conference is handled by crossing several two-way sessions into a hub. For SIP sessions, you need a conferencing server. The introduction of such a server multiplies the complexity of IP-based connections because each party in a session negotiates the parameters for the connection with the other party before traffic is actually exchanged.
Kamailio is a SIP server designed to perform the major roles involved in facilitating IP-based calls. Here is where you begin to have some sympathy for the weight of the job. In the circuit-switched telephone days, location was a fixed thing. When someone placed a call, the system knew which telephone was doing the placing. Internet Protocol is explicitly designed not to care about such a thing when forwarding packets from place to place.
So SIP has to set up a proxy server to pretend to represent the client even when it doesn’t really care about the client’s situation (sort of like a divorce attorney). It handles the exchange of traffic, and then filters the appropriate feedback through to the client, giving that client the illusion of communication.
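To make the location problem concrete, here is a minimal sketch of the bookkeeping a SIP registrar has to do: map a stable address to wherever the client happens to be reachable right now, and forget that mapping when it expires. The class name, the addresses, and the time-to-live values are all hypothetical, chosen for illustration; this is not how Kamailio itself is implemented.

```python
import time
from dataclasses import dataclass


@dataclass
class Binding:
    contact: str       # where the client can be reached right now
    expires_at: float  # time after which the binding is considered stale


class LocationService:
    """Toy registrar: maps a stable SIP address to a current contact."""

    def __init__(self):
        self._bindings = {}

    def register(self, address, contact, ttl, now=None):
        # A client announces (or refreshes) where it can currently be reached.
        now = time.time() if now is None else now
        self._bindings[address] = Binding(contact, now + ttl)

    def lookup(self, address, now=None):
        # The proxy asks where to send a call for this address.
        now = time.time() if now is None else now
        binding = self._bindings.get(address)
        if binding is None or binding.expires_at <= now:
            return None  # unknown or expired: the call cannot be routed
        return binding.contact


service = LocationService()
alice = "sip:alice@example.com"
service.register(alice, "sip:alice@203.0.113.7:5060", ttl=3600, now=0)
print(service.lookup(alice, now=10))    # first known contact
service.register(alice, "sip:alice@198.51.100.20:5060", ttl=3600, now=20)
print(service.lookup(alice, now=30))    # the client has moved
print(service.lookup(alice, now=4000))  # the binding has expired
```

The `now` parameter exists only so the example behaves the same on every run; a real service would read the clock.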
Multiply that process by several orders of magnitude, and you get an idea of what the Internet now has to deal with when handling voice traffic.
Re-Scale
Avoxi’s solution to the problem of scaling Asterisk traffic is twofold. As call traffic increases, it can’t simply multiply the number of containers that include Asterisk servers, not without exponentially increasing the number of SIP proxies handling all the simultaneous connections.
So Madsen, Sessions, and their team created another proxy layer, exclusively to manage the Asterisk REST Interface (ARI), the API that represents the call-handling logic. A message bus can be situated in front of the ARI layer to coordinate the calls from any number of caches, and it is these caches that can scale up as traffic scales up.
This does not necessarily mean the number of Asterisk servers must scale up, however. How far that number needs to grow may be determined by the flow of procedure calls between the ARI proxy layer and the servers.
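The idea of scaling the call-handling side separately from the servers can be sketched with an in-process queue standing in for the message bus. Everything here is an assumption made for illustration: the server names, the event shape, and the rule for picking a server are hypothetical, and a production system would use a real message broker and real ARI calls rather than a Python `queue.Queue`.

```python
import queue
import threading

ASTERISK_SERVERS = ["asterisk-1", "asterisk-2"]  # hypothetical fixed pool


def handler(name, bus, results, lock):
    """Consume call events from the bus until a None sentinel arrives."""
    while True:
        event = bus.get()
        if event is None:
            break
        # Illustrative rule only: choose a server from the call ID.
        server = ASTERISK_SERVERS[event["call_id"] % len(ASTERISK_SERVERS)]
        with lock:
            results.append((event["call_id"], name, server))


def run(num_handlers, num_calls):
    """Push call events through the bus with a given number of handlers."""
    bus = queue.Queue()
    results = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=handler, args=(f"handler-{i}", bus, results, lock))
        for i in range(num_handlers)
    ]
    for worker in workers:
        worker.start()
    for call_id in range(num_calls):
        bus.put({"call_id": call_id, "type": "call_started"})
    for _ in workers:
        bus.put(None)  # one sentinel per handler so each one exits
    for worker in workers:
        worker.join()
    return results


# Four handlers share the work, yet the server pool stays at two.
handled = run(num_handlers=4, num_calls=20)
print(len(handled), "calls handled across", len(ASTERISK_SERVERS), "servers")
```

Raising `num_handlers` changes how many consumers pull from the bus without touching `ASTERISK_SERVERS`, which is the separation the two-layer design is after.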
With standard Docker-oriented networking, each container is mapped to its own port on the host’s IP address, usually paired with some type of proxy or load balancer. The network address translation (NAT) this typically involves makes it infeasible for containers to handle calls on the same network at scale.
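One commonly cited reason NAT and SIP get along badly is that a call's session description (SDP) carries the media address inside the message body, where address translation of the packet headers does not reach. The sketch below assumes a hand-written SDP body, a container address of 172.17.0.5, and a public address of 203.0.113.10; all three are invented for illustration.

```python
import ipaddress


def sdp_connection_address(sdp):
    """Return the IPv4 address from the first SDP connection line (c=)."""
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            return line.split()[2]
    return None


def media_address_matches(sdp, public_ip):
    """True if the address advertised for media is the one peers can reach."""
    return sdp_connection_address(sdp) == public_ip


# What an Asterisk instance inside a NATed container might advertise.
SDP_FROM_CONTAINER = "\r\n".join([
    "v=0",
    "o=- 1 1 IN IP4 172.17.0.5",
    "s=call",
    "c=IN IP4 172.17.0.5",
    "t=0 0",
    "m=audio 10000 RTP/AVP 0",
])

advertised = sdp_connection_address(SDP_FROM_CONTAINER)
print("advertised media address:", advertised)
print("private address:", ipaddress.ip_address(advertised).is_private)
print("matches public address:", media_address_matches(SDP_FROM_CONTAINER, "203.0.113.10"))
```

The remote party would try to send audio to the private address in the `c=` line, which it cannot reach. A Web server behind the same NAT never hits this, because HTTP does not embed its own address in the payload this way.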
So within the Asterisk layer, Avoxi replaced the typical Docker networking scheme with CoreOS’ Flannel. This creates a kind of virtual network in which each container is assigned an exclusive IP address on a /24 subnet, as opposed to a port number. It then applies Project Calico as a virtual networking layer, so that the container environment can be compatible with Kubernetes while at the same time integrating the container network with the cloud-based network without the need for yet another overlay.
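The addressing scheme is easy to picture with Python's standard `ipaddress` module. This sketch assumes a cluster-wide range of 10.1.0.0/16 from which each host is handed its own /24, with containers then drawing addresses from their host's subnet. The range and host names are made up, and the code imitates only the arithmetic, not Flannel's actual allocation logic.

```python
import ipaddress
from itertools import islice

CLUSTER_NETWORK = ipaddress.ip_network("10.1.0.0/16")  # assumed example range


def allocate_host_subnets(hosts):
    """Give each host the next unused /24 out of the cluster range."""
    subnets = CLUSTER_NETWORK.subnets(new_prefix=24)
    return {host: next(subnets) for host in hosts}


def container_addresses(subnet, count):
    """Return the first `count` usable addresses in a host's subnet."""
    return [str(ip) for ip in islice(subnet.hosts(), count)]


leases = allocate_host_subnets(["host-a", "host-b"])
for host, subnet in leases.items():
    print(host, subnet, container_addresses(subnet, 2))
```

Every container ends up with an address of its own, so two Asterisk instances can both listen on the standard SIP port without a port-mapping layer in between.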
With so many Asterisk containers in such an environment, how does the system keep track of where all the SIP endpoints are located? That was the question one AstriCon participant asked.
Kamailio, explained Avoxi’s Madsen, acts as the distribution point with which all the SIP clients make contact. “Then your container network is really hidden from the outside world. There’s really no knowledge of that container network being outside, so your distribution of the calls from the proxy then gets distributed amongst your internal network. And that’s why you need either an overlay network or something like Calico, so that you’re moving that NAT aspect, because, by default, Docker’s going to try to utilize NAT for the containers—which works great if you’re running Apache or NGINX, but with Asterisk, you want to avoid that.”
The back-end network, the one with all the Asterisk servers, can remain a traditional container network, said Madsen. Kamailio then treats the endpoints on the other side as though they were on a traditionally virtualized network.
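The distribution step Madsen describes can be sketched as a function that picks an internal Asterisk address for each incoming call. Hashing on the call's identifier, as assumed here, keeps every message belonging to one call on the same server. The backend addresses and call IDs are hypothetical, and this is an illustration of the general technique rather than Kamailio's own dispatching code.

```python
import hashlib

# Hypothetical Asterisk containers, each with its own internal IP address.
BACKENDS = ["10.1.0.2:5060", "10.1.0.3:5060", "10.1.1.2:5060"]


def pick_backend(call_id, backends):
    """Map a call ID to one backend, the same one every time."""
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


for call_id in ["a84b4c76e66710", "f81d4fae7dec11", "3848276298220188511"]:
    print(call_id, "->", pick_backend(call_id, BACKENDS))
```

Clients only ever see the proxy's address; the list in `BACKENDS` never leaves the internal network. A plain modulo hash like this reshuffles calls whenever the pool changes size, so a real deployment would need to account for calls already in progress.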
Avoxi’s engineers realized that call handlers, and the servers managing the call handlers, must be scaled separately, and that the network between them must be designed specifically for that arrangement. But just as was the case with big data, they could only come to this realization when the immensity of the workloads they faced ballooned out of proportion with their perspective of the job.
They redefined the job, and in so doing, changed their notion of scale and dimension.