Meet Scale and SLA Demands in Real-Time Data Architectures
Companies have gathered years of data into an incredible store of information, with new data streaming in at skyrocketing rates.
But in the race to put all this data to work amid the explosion of artificial intelligence, machine learning and real-time applications, many organizations hit two significant hurdles: operating at fluctuating and ever-growing scale, and meeting service-level agreements (SLAs) that demand faster performance for instant, real-time decision-making.
It’s easy to build something that works well for a few people or a small amount of data. But ensuring operation at high scale is now table stakes. There’s just no slow roll on scale. Every company is racing to build applications and deliver real-time transactions and exceptional user experiences to millions of users as quickly as possible. And with that come demanding SLAs for performance, uptime and service delivery.
The world is impatient for information and answers. Any lag or downtime now has an immediate impact on operational profit, revenue generation and customer satisfaction. The risks are high.
Let’s look at the various levers that are required for delivering high throughput at scale for demanding real-time applications, including massive parallelism, indexing and interoperability.
Multiple levels of parallelism are a fundamental tenet of a modern data architecture at scale. On the platform side, you need a divide-and-conquer strategy: divide the data space into multiple partitions that can be processed independently, and use separate database clusters for edge and core systems. All of these pieces should operate independently yet stay aligned with one another.
This means running multiple nodes in a cluster as one integrated distributed system, and using multiple, concurrently accessible drives within each node, so that parallel execution threads run both within a node and across the nodes of the cluster.
Beyond the data-processing side of your data architecture, you also need parallel architecture in query layers and streaming pipeline layers. Furthermore, in addition to scale-out across multiple nodes and multiple drives within nodes, it is also important to scale up on a single node by using multiple threads with CPU and NUMA (non-uniform memory access) pinning to take full advantage of modern processors.
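The divide-and-conquer idea can be sketched in a few lines of Python. This is a minimal illustration, not any particular platform's implementation: the partition count, the function names and the use of SHA-256 for key hashing are all assumptions made for the example, and threads stand in for what would be separate drives or nodes in a real cluster.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 8  # hypothetical value chosen for the example


def partition_of(key: str) -> int:
    """Map a key to a partition with a stable hash.

    Python's built-in hash() is salted per process, so it is avoided here.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def partition_records(records: dict) -> list:
    """Split the data space into independent partitions."""
    partitions = [dict() for _ in range(NUM_PARTITIONS)]
    for key, value in records.items():
        partitions[partition_of(key)][key] = value
    return partitions


def partial_sum(partition: dict) -> int:
    """Work done on one partition; it reads no other partition's data."""
    return sum(partition.values())


def parallel_total(records: dict) -> int:
    """Process every partition concurrently, then combine the partial results."""
    partitions = partition_records(records)
    with ThreadPoolExecutor(max_workers=NUM_PARTITIONS) as pool:
        return sum(pool.map(partial_sum, partitions))


records = {f"user{i}": i for i in range(1000)}
total = parallel_total(records)
```

Because each partition is processed without reference to the others, the same structure applies whether the workers are threads on one machine or nodes in a cluster. In CPython, threads will not speed up CPU-bound work like this sum; the sketch only shows the shape of the approach.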
In distributed systems processing large amounts of data, it can be a struggle to have enough network bandwidth for all the data to go back and forth between the edge, the core and all your clouds, given the inherently distributed nature of today’s data deployments. Queries that scan every document or every entry in a table take a long time and eat bandwidth and compute resources, which hurts performance.
To solve this problem, companies often spend lavishly on more capacity and compute. But in reality, the answer to better bandwidth and compute performance lies in combining massively parallel query processing with secondary indexing, which reduces, and in some cases eliminates, the data that needs to be moved between system components.
A secondary index is a data structure that locates all the records in a namespace, or in a set within it, based on a field value in the record. When any indexed value within a record is updated, the secondary index is automatically (and atomically) updated.
Secondary indexes can take a while to build the first time, but keeping them current as records change takes little additional effort. Queries that use the secondary index become faster, in some cases by several orders of magnitude.
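To make the data structure concrete, here is a toy secondary index in Python. It is a sketch under stated assumptions: the class name `IndexedStore`, its methods and the sample records are invented for illustration and do not correspond to any real product's API, and the code is single-threaded, so it does not show how a real database keeps the update atomic under concurrent writes.

```python
from collections import defaultdict


class IndexedStore:
    """Toy record store with one secondary index on a single field."""

    def __init__(self, indexed_field: str):
        self.indexed_field = indexed_field
        self.records = {}              # primary key -> record
        self.index = defaultdict(set)  # field value -> set of primary keys

    def put(self, key: str, record: dict) -> None:
        """Write a record and keep the secondary index in step with it."""
        old = self.records.get(key)
        if old is not None and self.indexed_field in old:
            old_value = old[self.indexed_field]
            self.index[old_value].discard(key)
            if not self.index[old_value]:
                del self.index[old_value]
        self.records[key] = dict(record)
        if self.indexed_field in record:
            self.index[record[self.indexed_field]].add(key)

    def query(self, value) -> list:
        """Look up matching records through the index, without a scan."""
        keys = self.index.get(value, ())
        return [self.records[k] for k in sorted(keys)]

    def scan(self, value) -> list:
        """Baseline for comparison: inspect every record."""
        return [
            self.records[k]
            for k in sorted(self.records)
            if self.records[k].get(self.indexed_field) == value
        ]


store = IndexedStore("city")
store.put("u1", {"name": "Ana", "city": "Lima"})
store.put("u2", {"name": "Bo", "city": "Oslo"})
store.put("u3", {"name": "Cy", "city": "Lima"})
store.put("u1", {"name": "Ana", "city": "Oslo"})  # update moves u1 in the index
```

Both `query` and `scan` return the same records, but `query` touches only the entries the index points to, while `scan` reads every record in the store. That difference is what grows into orders of magnitude on large data sets.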
You can further speed up the performance of secondary indexes with specific configurations, including some present in the Aerospike platform. Secondary indexes can be stored in dynamic random access memory (DRAM) for fast lookups. They are co-located with the primary index entries on every node in the cluster, thus enabling a secondary index query to execute in parallel on all nodes at the same time.
Each secondary index contains references only to records local to its node, which allows storage and lookup optimizations that yield efficient storage utilization as well as extremely fast lookup and update times.
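The combination of node-local indexes and parallel query execution can be sketched as a scatter-gather pattern. The code below is an illustrative model only: `Node`, `owner` and `cluster_query` are hypothetical names, CRC32 is an arbitrary choice for placing keys on nodes, threads stand in for network calls to real nodes, and the sketch handles inserts only (it does not re-index a record whose indexed value changes).

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Node:
    """One cluster node holding its own records and a local secondary index."""

    def __init__(self, name: str, field: str):
        self.name = name
        self.field = field
        self.records = {}                    # primary key -> record
        self.local_index = defaultdict(set)  # field value -> local keys only

    def insert(self, key: str, record: dict) -> None:
        # Insert-only sketch: re-indexing on update is omitted.
        self.records[key] = record
        self.local_index[record[self.field]].add(key)

    def local_query(self, value) -> list:
        """Answer the query from this node's index, touching only matches."""
        keys = self.local_index.get(value, ())
        return [self.records[k] for k in sorted(keys)]


def owner(nodes: list, key: str) -> Node:
    """Pick the node that stores a key (hypothetical placement rule)."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def cluster_query(nodes: list, value) -> list:
    """Send the query to every node at once and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        per_node = list(pool.map(lambda node: node.local_query(value), nodes))
    return [record for chunk in per_node for record in chunk]


nodes = [Node(f"node{i}", "status") for i in range(3)]
for i in range(300):
    key = f"order{i}"
    record = {"id": key, "status": "late" if i % 50 == 0 else "ok"}
    owner(nodes, key).insert(key, record)

late = cluster_query(nodes, "late")
```

Only the matching records leave each node, so the data crossing the network is proportional to the size of the result rather than to the size of the data set.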
As data “democratizes” to more people, partners and applications, and increasingly needs to meet real-time demands, all the old, centralized technology stacks are shifting to new models that increasingly demand more speed and stability at scale.
The most successful digital transformation efforts use an augmentation strategy where they retain the old systems for certain applications, but introduce new systems like Aerospike for new applications. This way many of the compliance and reporting mechanisms (decades old) can continue unaffected, while the organization uses the new systems for speeding up the building of new apps that are critical to improving overall user experience and staying competitive in a fast-changing world.
Most of what we’ve examined is at the data platform (or network) level. But no data platform is an island. Each operates within a new stack of real-time software that spans intake (streaming with Kafka), AI/ML processing (with Spark) and querying through a range of SQL engines (such as Presto) and other standards for data access.
As you build to scale, you’ll want to understand how these critical related systems handle performance at scale.
Companies face ever-greater pressure to put more applications in the hands of users. It’s tempting, and common, to build to the minimum viable product, but I hope you’ll start building systems for handling high scale from the very beginning and avoid the pain and penalties of re-platforming and missing SLAs due to demand spikes.