Triangulating Your Data for Intelligent Global Traffic Management
Triangulation, a method for determining a location by forming triangles to it from known points, is a central concept in navigation, surveying, 3D optics, astronomy, and more. In an abstract sense, it is also one of the primary concepts underlying the promise of big data analytics. By pulling together data streams from multiple vectors and using algorithms to collate, sift, and mine the data, we can uncover fresh insights, hidden relationships, and recurring patterns — all of which can be analyzed to develop predictive intelligence.
When it comes to developing comprehensive network intelligence, it’s essential to synthesize data from multiple perspectives in order to create intelligent routing methods. It is not sufficient to know that systems are available — we need to know what the end user is experiencing. But even these two perspectives do not give us the whole picture. For example, how do you know what resources are being used to provide that high-quality experience? Are they near capacity? How much are they costing you? How do you know if CapEx resources are being used optimally, or if your providers’ performance meets SLA standards?
Given the complexities of hybrid IT and the nonstop growth of Internet traffic, it is now impossible to monitor and control for all potential complications (not to mention optimizations) without automated intelligence. Especially when it comes to multicloud app and media delivery, that intelligence should be based on a triad of data perspectives.
Real-time systems health checks are the first point of the data triad for intelligent traffic routing. They begin with accurate, low-latency, geographically dispersed synthetic monitoring, which reliably, and in real time, answers the question: is the server up and available? High-latency monitoring can take precious minutes to notice that a system is down; avoiding it takes one of the biggest headaches for any DevOps team off the table.
“On/Off” confidence, however, is necessary but insufficient: to route traffic effectively, one must also know the current health of the servers that are available. And where local load balancers (LLBs) and Application Delivery Controllers (ADCs) were once able to handle incoming requests on their own, modern infrastructure and delivery require smarter, more distributed, and non-proprietary solutions.
When LLBs are not dynamically and centrally controlled, it can take an unacceptable amount of time to switch over when a cluster goes down. There is a subtler risk as well: a system that is working fine may be approaching its resource limits, and a simple On/Off measurement won’t know this. Without this key piece of information, a load balancer can send so much traffic to the near-capacity resource that it goes down, potentially setting off a domino effect as traffic floods other working resources.
LLBs running without real-time intelligence, then, are susceptible to slowdowns, micro-outages, and cascading failures, especially if hit with a DDoS attack or an unexpected surge. There are also times when it’s necessary to change your standard resource model: updates, repairs, natural disasters, and app or service launches.
Without scriptable load balancing, you have to dedicate significant time to shifting resources around — and problems mount quickly if someone takes down a resource but forgets to make the proper notifications and preparations ahead of time. Dynamic global load balancers (GLBs) use real-time system health checks to detect potential traffic or resource problems, route around them, and send an alert before failure occurs so that you can address the root cause before it becomes a fire drill.
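To make the difference between an On/Off check and a health-aware check concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's API: the Endpoint fields, the routable function, and the 85% threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    is_up: bool          # result of the latest synthetic availability check
    utilization: float   # fraction of capacity in use, 0.0 to 1.0


# Hypothetical threshold: treat 85% utilization or more as "near capacity".
NEAR_CAPACITY = 0.85


def routable(endpoints, threshold=NEAR_CAPACITY):
    """Split endpoints into those safe to route to, plus alert messages."""
    eligible, alerts = [], []
    for ep in endpoints:
        if not ep.is_up:
            alerts.append(f"{ep.name}: down, removed from rotation")
        elif ep.utilization >= threshold:
            # Up, but close to its limit: route around it before it fails.
            alerts.append(
                f"{ep.name}: near capacity ({ep.utilization:.0%}), routing around"
            )
        else:
            eligible.append(ep)
    return eligible, alerts


endpoints = [
    Endpoint("us-east", True, 0.40),
    Endpoint("eu-west", True, 0.92),
    Endpoint("ap-south", False, 0.0),
]
eligible, alerts = routable(endpoints)
print([ep.name for ep in eligible])
print(alerts)
```

A pure On/Off check would keep eu-west in rotation here. A production system would also need a fallback for the case where every endpoint is near capacity, which this sketch leaves out.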
The second point of the data triad is Real User Measurements (RUM), which provide information about Internet performance at every step between the client and the clouds, data centers, or CDNs hosting your application. This data should be crowd-sourced by collecting metrics from thousands of Autonomous System Numbers (i.e., ISP networks), delivering billions of RUM data points each day. This kind of traffic intelligence can’t be gathered from your own system (unless you’re Google-sized). Even if you have millions of users each day, you only have decently deep measurements from a few hundred ASNs. Community-sourced intelligence is necessary to see what’s really going on in the far-flung reaches of your growing application universe.
Community intelligence is just as important for monitoring the experience of big, messy pools of users as it is for the mysterious pockets of users on the edges of your network. Many countries have thousands of ISPs (e.g., Brazil, Russia, Canada, Australia). Most likely, these areas are important to your global delivery needs and business success. Excellent user experience data is particularly important where there are so many individual peering agreements and technical relationships, representing myriad causes for variable performance.
Combined with Server Health intelligence, RUM intelligence ensures we route traffic to servers that are up and running, are not about to fall over, and are demonstrably providing great service to end users.
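As a rough sketch of how these two perspectives might be combined, the Python below picks the healthy endpoint with the lowest median RUM latency for a given user network. The function name, the endpoint names, and the sample values are hypothetical, and median latency is just one plausible way to summarize RUM data.

```python
from statistics import median


def pick_endpoint(healthy, rum_latency_ms):
    """Choose the healthy endpoint with the best observed user latency.

    healthy: names of endpoints that passed their health checks.
    rum_latency_ms: endpoint name -> latency samples (ms) reported by
        real users on the requesting user's network.
    """
    candidates = {
        name: median(samples)
        for name, samples in rum_latency_ms.items()
        if name in healthy and samples
    }
    if not candidates:
        # No RUM data for any healthy endpoint: fall back to the first one.
        return healthy[0] if healthy else None
    return min(candidates, key=candidates.get)


healthy = ["cdn-a", "cloud-b"]
rum = {
    "cdn-a": [120, 130, 110],
    "cloud-b": [80, 95, 90],
    "dc-c": [40, 45, 50],   # fastest, but it failed its health check
}
choice = pick_endpoint(healthy, rum)
print(choice)
```

Note that dc-c is ignored despite having the best latency, because it is not in the healthy set: user experience data only ranks endpoints that server health data has already cleared.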
Which brings us to the third point of the data triad. What more could there be if you are able to dynamically control systems and user experience from global to granular levels? As long as everything is up and running and users are happy, what more is there to worry about?
Quite a bit, actually. As in, quite a bit of money. Along with systems and user experience, optimizing spend is fundamental to business outcomes. Cloud overflow expenses can mount quickly. If you can’t feed cost and resource usage data into your global load balancer and automated application delivery, you won’t get traffic routing decisions that are as good for the bottom line as they are for QoE.
DevOps is increasingly responsible for business decisions in areas like cost control, product lifecycle optimization, resource planning, responsible energy use, and cloud vendor management. It’s time to put all your big data streams (e.g., software platforms, APM, NGINX, cloud monitoring, SLAs, and CDN APIs) to work producing stronger business results. By combining third-party data with real-time systems and user measurements, you can define your application delivery rules to prioritize data center utilization before expensive cloud bursting or to track green energy usage by your cloud, hosting, and CDN providers.
Every company has its own contextual business and performance priorities. Automating app delivery from the cloud or data center that makes the most sense (based on user experience, cloud cost/SLA, APM data, and more) is especially vital for the delivery of modern applications in a multicloud world. The added control layer provides the comprehensive visibility and application delivery control required to achieve cloud agility, performance, and scale while staying in line with business objectives and budget constraints.
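A delivery rule of the "data center first, cloud burst second" kind could look something like the following Python sketch. Everything in it is an assumption made for illustration: the Platform fields, the 80% utilization ceiling, the 150 ms latency budget, and the per-gigabyte prices.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    name: str
    kind: str              # "datacenter" (already paid for) or "cloud" (pay per use)
    utilization: float     # fraction of capacity in use, 0.0 to 1.0
    cost_per_gb: float     # marginal delivery cost, hypothetical units
    p50_latency_ms: float  # median latency seen by real users


def choose_platform(platforms, max_latency_ms=150.0, dc_ceiling=0.80):
    """Prefer owned data centers with headroom; burst to the cheapest cloud."""
    acceptable = [p for p in platforms if p.p50_latency_ms <= max_latency_ms]
    datacenters = [
        p for p in acceptable
        if p.kind == "datacenter" and p.utilization < dc_ceiling
    ]
    if datacenters:
        # Spread load toward the least utilized data center.
        return min(datacenters, key=lambda p: p.utilization)
    clouds = [p for p in acceptable if p.kind == "cloud"]
    if clouds:
        return min(clouds, key=lambda p: p.cost_per_gb)
    return None  # nothing meets the latency budget


platforms = [
    Platform("dc-1", "datacenter", 0.55, 0.00, 60.0),
    Platform("cloud-a", "cloud", 0.20, 0.09, 50.0),
    Platform("cloud-b", "cloud", 0.30, 0.05, 70.0),
]
normal = choose_platform(platforms)
platforms[0].utilization = 0.90   # the data center fills up
burst = choose_platform(platforms)
print(normal.name, burst.name)
```

With headroom in the data center, traffic stays on dc-1 even though cloud-a is slightly faster. Once dc-1 crosses the ceiling, the rule bursts to cloud-b, the cheaper of the two clouds that meet the latency budget.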
We’re past the point where big data-driven intelligence only exists in the realm of bleeding-edge experiments or marketing demographic research. Global traffic management executed by customizable algorithms and real-time decision intelligence is an essential protection against subpar performance and business disaster. All types of digital businesses are under pressure to avoid the service attrition of micro-outages, the embarrassment and loss of major outages, and the wrath of audiences ambushed by video streaming failures. Moreover, there are budget constraints, regulatory compliance, and talent shortages to contend with. By harnessing the triangulated intelligence of server, user experience, and business health data on a global load balancing platform, digital companies can continuously monitor and optimize the services at the heart of their enterprise.
Feature image by Sarah Ann Loreth via Unsplash.