5 Key Questions for App-Driven Analytics
Data that powers applications and data that powers analytics typically live in separate domains of the data estate. This separation exists mainly because the two serve different strategic purposes for an organization.
Applications engage customers, while analytics deliver insight. The two classes of workloads have different requirements — read and write access patterns, concurrency, latency — so organizations typically deploy purpose-built databases for each use case and duplicate data between them.
As distinct as these systems are, they’re also highly interdependent in today’s digital economy. Application data is fed into analytics platforms, where it’s combined and enriched with other operational and historical data, supplemented with business intelligence (BI), machine learning (ML) and predictive analytics, and sometimes fed back to applications to deliver richer experiences.
Picture, for example, an e-commerce system that segments users by demographic data and past purchases and then serves relevant recommendations when they next visit the website.
The process of moving data between the two types of systems is here to stay. But today, that’s not enough. The current digital economy, with its seamless user experiences that customers have come to expect, requires that applications also become smarter, autonomously taking intelligent actions in real time on our behalf.
Along with smarter apps, businesses want insights faster so they know what is happening “in the moment.”
To meet these demands, we can no longer rely on only copying data out of our operational systems into centralized analytics stores. Moving data takes time and creates too much separation between application events and analytical actions.
Instead, analytics processing must “shift left” to the data source — to the applications themselves. We call this shift application-driven analytics. And it’s a shift that both developers and analytics teams need to be ready to embrace.
Defining Required Capabilities
Embracing the shift is one thing; having the capabilities to implement it is another. In this article, we will break down the capabilities required to implement application-driven analytics into the following five critical questions for developers:
- How do developers access the tools they need to build sophisticated analytics queries directly into their application code?
- How do developers make sense of voluminous streams of time series data?
- How do developers create intelligent applications that automatically react to events in real time?
- How do developers combine live application data in hot database storage with aged data in cooler cloud storage to make predictions?
- How can developers bring analytics into applications without compromising performance?
1. How do developers access the tools they need to build sophisticated analytics queries directly into their application code?
To unlock the latent power of application data that exists across the data estate, developers rely on the ability to perform CRUD (create, read, update and delete) operations, sophisticated aggregations and data transformations.
The primary tool for delivering these capabilities is an API that allows them to query data any way they need, from simple lookups to building more sophisticated data-processing pipelines. Developers need that API implemented as an extension of their preferred programming language to remain “in the zone” as they work through problems in a flow state.
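As a rough illustration of what such an API feels like, the sketch below expresses a document-style aggregation pipeline directly in Python. The pipeline syntax mirrors the shape a document database driver would accept, but the tiny evaluator here is a hypothetical stand-in, not any real driver.

```python
# A minimal, in-memory sketch of a document-style aggregation pipeline.
# The $match/$group stage syntax mimics document-database conventions;
# the evaluator is purely illustrative, not a real database driver.

def run_pipeline(docs, pipeline):
    """Apply $match and $group (with $sum) stages to a list of dicts."""
    for stage in pipeline:
        if "$match" in stage:
            criteria = stage["$match"]
            docs = [d for d in docs
                    if all(d.get(k) == v for k, v in criteria.items())]
        elif "$group" in stage:
            spec = stage["$group"]
            key_field = spec["_id"].lstrip("$")
            sum_name, sum_field = next(
                (name, expr["$sum"].lstrip("$"))
                for name, expr in spec.items() if name != "_id"
            )
            groups = {}
            for d in docs:
                groups[d[key_field]] = groups.get(d[key_field], 0) + d[sum_field]
            docs = [{"_id": k, sum_name: v} for k, v in groups.items()]
    return docs

orders = [
    {"status": "shipped", "region": "EU", "total": 40},
    {"status": "shipped", "region": "US", "total": 75},
    {"status": "pending", "region": "EU", "total": 10},
    {"status": "shipped", "region": "EU", "total": 60},
]

# Revenue per region for shipped orders only.
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$region", "revenue": {"$sum": "$total"}}},
]
result = run_pipeline(orders, pipeline)
```

Because the pipeline is a plain data structure in the host language, developers can compose, reuse and test stages like any other application code.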
Alongside a powerful API, developers need a versatile query engine and indexing that returns results in the most efficient way possible. Without indexing, the database engine needs to go through each record to find a match. With indexing, the database can find relevant results faster and with less overhead.
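The difference can be made concrete with a toy comparison. Real database indexes (typically B-trees) are far more sophisticated; a plain Python dict stands in for an equality index here.

```python
# Toy illustration of index lookup vs. full collection scan.
# A dict keyed on "sku" stands in for an equality index.

records = [{"sku": f"SKU{i}", "qty": i % 10} for i in range(10_000)]

def scan(records, sku):
    """No index: examine every record until a match, counting comparisons."""
    comparisons = 0
    for r in records:
        comparisons += 1
        if r["sku"] == sku:
            return r, comparisons
    return None, comparisons

# With an index: build once, then look up in O(1) on average.
index = {r["sku"]: r for r in records}

hit, cost = scan(records, "SKU9000")
indexed_hit = index["SKU9000"]
```

The scan touches 9,001 records to find the match that the index returns in a single lookup; on a live database, that overhead is paid on every unindexed query.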
Once developers start interacting with the database systematically, they will need tools that can give them visibility into query performance so they can tune and optimize. Examples include monitoring tools that provide real-time server and database metrics, identifying performance issues and making recommendations, such as index and schema suggestions to further streamline database queries.
2. How do developers make sense of voluminous streams of time series data?
Time series data is typical in many modern applications. Internet of Things (IoT) sensor data, financial trades, clickstreams and logs enable businesses to surface valuable insights. Developers need the ability to query and analyze this data across rolling time windows while filling any gaps in incoming data. They also need a way to visualize this data in real time to understand complex trends.
Another key requirement is a mechanism that automates the management of the time series data life cycle. As data ages, it should be moved out of hot storage to avoid congestion on live systems; however, there is still value in that data, especially in aggregated form, to provide historical analysis.
So, organizations need a systematic way of tiering that data into low-cost object storage to maintain their ability to access and query that data for the insights it can surface.
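The windowing and gap-filling described above can be sketched in a few lines. The one-minute window width and the gap-fill policy (carrying the last observed average forward) are illustrative choices for this example, not behavior prescribed by any particular database.

```python
# Sketch: aggregate sensor readings into fixed one-minute windows and
# fill empty windows by carrying the last observed average forward.

def window_average(readings, start, end, width=60):
    """readings: list of (epoch_seconds, value). Returns one avg per window."""
    buckets = {}
    for ts, value in readings:
        if start <= ts < end:
            bucket = (ts - start) // width
            buckets.setdefault(bucket, []).append(value)

    out, last = [], None
    for b in range((end - start) // width):
        if b in buckets:
            last = sum(buckets[b]) / len(buckets[b])
        out.append(last)  # gap-fill: repeat the previous window's average
    return out

# Windows 1 and 3 (60-120s and 180-240s) receive no readings.
readings = [(0, 10.0), (30, 20.0), (130, 40.0)]
averages = window_average(readings, start=0, end=240)
```

A dashboard reading these gap-filled windows sees a continuous series even when a sensor briefly stops reporting.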
3. How do developers create intelligent applications that automatically react to events in real time?
Modern applications must continuously analyze data in real time as they react to live events. They need access to data changes as they occur and a way to automatically execute application code in response, allowing developers to build reactive, real-time, in-app analytics.
Dynamic pricing in a ride-hailing service, recalculating delivery times in a logistics app due to changing traffic conditions, triggering a service call when a factory machine component starts to fail or initiating a trade when stock markets move — these are just a few examples of in-app analytics that require continuous, real-time data analysis.
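The factory-machine example above boils down to a simple pattern: register application code against a stream of change events. The in-memory dispatcher below is a hypothetical stand-in; in production the handler would be wired to a database change stream or a message queue.

```python
# Minimal sketch of reacting to data-change events in application code.
# The dispatcher is an illustrative stand-in for a database change
# stream or message-queue subscription.

class ChangeDispatcher:
    def __init__(self):
        self._handlers = []

    def on_change(self, handler):
        """Register application code to run on every change event."""
        self._handlers.append(handler)

    def emit(self, event):
        for handler in self._handlers:
            handler(event)

alerts = []
dispatcher = ChangeDispatcher()

# React to machine telemetry: flag components running hot.
dispatcher.on_change(
    lambda e: alerts.append(e["component"]) if e["temp_c"] > 90 else None
)

dispatcher.emit({"component": "bearing-7", "temp_c": 95})
dispatcher.emit({"component": "bearing-2", "temp_c": 60})
```

The handler fires the moment the event arrives, so the service call can be triggered while the component is failing rather than after a nightly batch job notices.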
4. How do developers combine live application data in hot database storage with aged data in cooler cloud storage to make predictions?
Data is increasingly distributed across different applications, microservices and even cloud providers. Some of that data consists of newly ingested time series measurements or orders placed in your e-commerce store, residing in hot database storage. Other data sets consist of older data archived in lower-cost cloud object storage.
Organizations must be able to query, blend and analyze fresh data arriving from microservices and IoT devices alongside cooler data residing in object stores, APIs and third-party data sources — in ways not possible with regular databases.
The ability to bring all key data assets together is critical for understanding trends and making predictions, whether that’s handled by a human or as part of a machine learning process.
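Conceptually, federating across tiers means running one query over both stores and presenting a single result set. In this sketch, plain lists stand in for the live database and the object-store archive; a real platform would hide the tiers behind one query API.

```python
# Sketch of federating a query across a "hot" store (recent documents)
# and a "cold" archive (aged documents), returning one merged result.
# Both tiers are plain lists here for illustration.

hot_store = [{"order_id": 3, "total": 120}, {"order_id": 4, "total": 80}]
cold_archive = [{"order_id": 1, "total": 50}, {"order_id": 2, "total": 90}]

def federated_query(predicate):
    """Run the same filter over both tiers and merge the results."""
    return [doc for tier in (hot_store, cold_archive)
            for doc in tier if predicate(doc)]

# A single query spans live orders and archived history.
big_orders = federated_query(lambda d: d["total"] >= 80)
lifetime_revenue = sum(d["total"] for d in federated_query(lambda d: True))
```

Whether the consumer is an analyst or an ML feature pipeline, it sees one data set rather than two systems to reconcile.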
5. How can developers bring analytics into applications without compromising performance?
Live, customer-facing applications need to serve many concurrent users while ensuring low, predictable latency, and they need to do it consistently at scale. Any slowdown degrades customer experience and drives customers toward competitors. In one frequently cited study, Amazon found that just 100 milliseconds of extra load time cost them 1% in sales. So, it’s critical that analytics queries on live data don’t affect app performance.
A distributed architecture can help enforce isolation between the transactional and analytical sides of an application within a single database cluster. You can also use sophisticated replication techniques to move data to systems that are totally isolated but look like a single system to the app.
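The isolation idea can be sketched as a routing decision: replica-aware drivers typically let the application tag a query by workload so analytics traffic lands on a dedicated node. The node names and routing policy below are illustrative assumptions, not any specific driver's API.

```python
# Sketch of routing analytics reads away from the transactional primary.
# Node names ("primary", "analytics_node") and the routing policy are
# illustrative assumptions, not a real driver API.

class Cluster:
    def __init__(self):
        self.queries = {"primary": [], "analytics_node": []}

    def query(self, q, workload="transactional"):
        # Transactional traffic hits the primary; analytical traffic is
        # isolated on a dedicated replica so it cannot slow the app down.
        node = "primary" if workload == "transactional" else "analytics_node"
        self.queries[node].append(q)
        return node

cluster = Cluster()
cluster.query("insert order 42")  # app write goes to the primary
served_on = cluster.query("aggregate revenue by region",
                          workload="analytical")
```

Because the routing happens inside the driver, the application still sees one logical cluster while the heavy aggregation never competes with customer-facing reads and writes.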
The Bridge to App-Driven Analytics
As application-driven analytics becomes pervasive, a developer data platform is necessary to unify the core data services needed to make smarter apps and improve business visibility.
A developer data platform bridges the traditional divide between transactional and analytical workloads in an elegant and integrated data architecture, acting as a single platform managing a common data set for both developers and analysts. It minimizes data movement and duplication, eliminating data silos, reducing architectural complexity and unlocking analytics faster on live operational data. The final, critical requirement is that it does all this while meeting the most demanding needs for resilience, scale and data privacy.
To learn more, read Application-Driven Analytics: Defining the Next Wave of Modern Apps.