Deliver Search and Analytics at the Speed of Transactions
Disruptive forces are changing the speed of business. The combination of a data-driven culture, customer expectations and digital disruptors is forcing businesses to move fast and be more agile or risk losing customers. It’s not enough to play catch-up. Companies need to overtake digital disruptors.
At the same time, digitization, the Internet of Things, social media and mobile devices are creating a flood of data in real time. And it’s just getting bigger. IDC estimates that there will be 175 zettabytes of data by 2025, and that nearly 30% of all data created will be real-time (compared to 15% in 2017).
Unfortunately, much of this data is locked in slow, disk-based databases that don’t support real-time, consumer-grade experiences. These transactional databases are well suited for order entry, financial transactions, CRM and retail sales, but they aren’t performant for search and analytics. Up until now, data warehouses have been the go-to solution, but they tend to be too slow because new data must be prepared, loaded and analyzed in batches. Too much time and money is sunk into data modeling and copying data around, and by the time that work is done, the questions you need your data to answer will have changed, and the existing data and ETL won’t support the new questions. You need an integrated view that breaks down these silos and includes consumer-generated and mobile data at the same time.
Build a New Data Stack for Search and Analytics
Today, streaming technologies make it easy to capture and store event data, but processing that data is a different challenge entirely. Rather than flowing into traditional warehouses, streaming data is better processed by modern search and analytics systems.
Most enterprises have hundreds of applications, and most of them are sources of critical business and customer data. To preserve performance and scalability, these transactional systems are kept separate from analytical systems. That leaves you choosing between slow delivery of timely analytics and operational insights, or upgrading to an entirely new translytical/hybrid transactional-analytical processing (HTAP) stack. But can you add operational analytics without ripping out your existing data infrastructure? By reimagining your data stack using the principles of caching, you can now take a low- or non-disruptive modernization approach that keeps your data in separate systems, thanks to in-memory architecture.
One way to approach this is by leveraging a product like our own RediSearch, which transforms Redis into a powerful in-memory search and analytics platform. It allows you to quickly create indexes on data and uses an incremental indexing approach for rapid index creation and deletion. The indexes let you query your data at lightning speed, perform complex aggregations and filter by properties, numeric ranges and geographical distance.
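As an illustrative sketch only (this assumes RediSearch 2.x, where indexes track hashes by key prefix; the `products` index and its field names are hypothetical), creating an index and running text, range, geo and aggregation queries looks roughly like this from redis-cli:

```
# Define an index over hashes whose keys start with "product:"
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA name TEXT SORTABLE price NUMERIC SORTABLE location GEO

# New or updated hashes are indexed incrementally as they are written
HSET product:1 name "wool scarf" price 25 location "-122.41,37.77"

# Full-text search combined with a numeric range filter
FT.SEARCH products "scarf" FILTER price 10 50

# Filter by geographical distance (5 km around a point)
FT.SEARCH products "*" GEOFILTER location -122.41 37.77 5 km

# Aggregation: count documents grouped by a property
FT.AGGREGATE products "*" GROUPBY 1 @name REDUCE COUNT 0 AS count
```

Because indexing is incremental, each HSET above is reflected in query results immediately rather than waiting for a batch rebuild.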
RediSearch supports full-text indexing and stemming-based query expansion in multiple languages. It provides a rich query language that can perform text searches as well as complex structured queries. Furthermore, you can enrich search experiences by implementing auto-complete suggestions using “fuzzy” searches.
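As a small hedged example (the suggestion dictionary name `autocomplete` and the entries are hypothetical), RediSearch’s suggestion commands can back a type-ahead box, with FUZZY tolerating small typos in the prefix:

```
FT.SUGADD autocomplete "real-time analytics" 100
FT.SUGADD autocomplete "real-time search" 80

# "reall" is a misspelled prefix; FUZZY still matches both suggestions
FT.SUGGET autocomplete "reall" FUZZY MAX 5
```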
Most importantly for the purposes of analytics, these capabilities enable RediSearch to support real-time responses to business moments. This can allow a business using this architecture to maintain an up-to-date picture of customer behaviors, competitors’ prices, shipments or other fast-changing data so it can be analyzed in real time. The key to all these scenarios is that data isn’t being used for insights after the fact. Instead, new data is processed immediately and consumed by live applications to take actions.
The New Real-Time Architecture
To take advantage of this architecture, developers can start with a small set of transactional data sources that can be used for key business insights. You can maintain consistency of the search and analytics layer using write-through or write-behind strategies. Additionally, use eviction strategies such as least-recently-used (LRU), least-frequently-used (LFU) or time-to-live (TTL) to keep the size of the data under control. This decoupling of search and analytics from the operational database not only offloads the production system, but also allows you to scale out and distribute search indexes to radically improve performance.
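To make the caching idea concrete, here is a minimal, self-contained Python sketch of a write-through cache with LRU eviction. The class name and the plain dict standing in for the operational database are hypothetical; in a real deployment the cache layer would be Redis sitting in front of the system of record.

```python
from collections import OrderedDict

class WriteThroughLRUCache:
    """Sketch: write-through cache with LRU eviction over a backing store."""

    def __init__(self, backing_store, capacity):
        self.store = backing_store   # system of record (write-through target)
        self.cache = OrderedDict()   # order of keys tracks recency of use
        self.capacity = capacity

    def _evict_if_full(self):
        if len(self.cache) > self.capacity:
            # Drop the least-recently-used entry; the store keeps its copy
            self.cache.popitem(last=False)

    def write(self, key, value):
        # Write-through: update the system of record synchronously,
        # then refresh the cached copy.
        self.store[key] = value
        self.cache[key] = value
        self.cache.move_to_end(key)
        self._evict_if_full()

    def read(self, key):
        if key in self.cache:
            # Cache hit: refresh recency and serve from memory
            self.cache.move_to_end(key)
            return self.cache[key]
        # Cache miss: fall back to the store and repopulate the cache
        value = self.store[key]
        self.cache[key] = value
        self._evict_if_full()
        return value

# Tiny demo: evictions trim the cache but never lose data in the store
db = {}
cache = WriteThroughLRUCache(db, capacity=2)
cache.write("a", 1)
cache.write("b", 2)
cache.write("c", 3)   # "a" is evicted from the cache, but stays in db
```

A write-behind variant would queue the `self.store[key] = value` update asynchronously instead, trading a small consistency window for lower write latency on the hot path.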
This approach is ideal for time-sensitive operational decisions on real-time and near-real-time data, searching both structured and unstructured data (full-text search) to uncover non-obvious insights. Its high-performance search and analytics can also improve the quality of non-real-time tactical and customer-experience scenarios. Key use cases include:
- Real-time apps where every second counts, like stock trading, fraud detection or patient health monitoring.
- Mobile apps where interactions need to be fast. Mobile applications are demanding data from multiple stacks in real time to deliver a 360-degree view of the customer, product, employee or business.
- Microservices and connected data apps where integrated business data is critical. ETL fails to deliver real-time changes, but this architecture can overcome this by delivering a real-time, trusted view of critical business data, ensuring that the source of information is accurate.
In industries such as telecom, financial services, retail and healthcare, organizations are experiencing explosive growth in the workload they must sustain due to the general trend toward digitalization, mobile apps, the IoT and other factors. The extremely competitive nature of these businesses requires support for ultra-high-end requirements on top of the low-cost, commodity-hardware-based settings at the core of hyperscale computing.
Application leaders involved in global-class digital business initiatives should strategically adopt the hyperscale architecture for their large-scale, high-performance and high-availability applications for public and private cloud deployments.
Businesses today are a series of real-time events. But what separates the good from the great is how they capture and operationalize that data. Good businesses use data to make informed decisions over time. Great businesses operationalize data to automatically take actions in real time.