Isima Takes on Data Management Ingest to Insight

10 Aug 2020 9:31am, by

Over the past couple of decades, data management has grown ever more complex, but the available “solutions” haven’t kept up, according to Isima, which just launched bi(OS), an end-to-end platform to help organizations quickly ingest and process data for real-time insights.

The companies that have effectively harnessed data in real time have been vendors such as Google and Facebook, with deep pockets and plenty of staff. That’s only a dream for most organizations, according to Darshan Rawal, CEO and co-founder of Isima, who maintains that the practice of cobbling together an array of technologies (RabbitMQ, Kafka, Spark, Cloudera, Hadoop, SAS, Informatica) isn’t getting companies closer to making that dream a reality.

In a blog post, co-founder Monish Suvarna explained what they saw in the market:

Some of the complexity came from integrating many different good products instead of thinking end to end. We saw layer after layer of point solutions that solved one problem well, but made the complete solution a lot more complicated. Each point solution was easy to try and use, but the entire system deployed at scale needed an army. The end result of this complexity was that Enterprises had no real ROI to show after years of huge investments in Big Data.

In contrast, PharmEasy, an ePharmacy in India, put one analyst on employing bi(OS) to make improvements in supply chain operations. Its experiments included using artificial intelligence/machine learning to determine customers whose orders should move up in priority. It reported that observability on that front was reduced from six hours to 15 minutes.

Another issue was determining the status of orders, a process that was producing KPIs for improvement, but only after four to six hours, too late for meaningful remediation. Using bi(OS) enabled the team to combine multiple data sources and drill down by city, state, SKU, etc., to look for abnormal delays on specific orders in real time.
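To make the drill-down idea concrete, here is a minimal sketch in plain Python. It does not use the bi(OS) API, which the article does not describe. The record layout, the sample values, and the names `join_sources` and `abnormal_delays` are all hypothetical. Two toy sources are joined on an order ID, and an order is flagged when its delay exceeds a multiple of the median delay for its group.

```python
from collections import defaultdict
from statistics import median

# Hypothetical source 1: order attributes keyed by order ID.
ORDERS = {
    1: {"city": "Mumbai", "sku": "A1"},
    2: {"city": "Mumbai", "sku": "A1"},
    3: {"city": "Mumbai", "sku": "B2"},
    4: {"city": "Mumbai", "sku": "B2"},
    5: {"city": "Pune", "sku": "A1"},
    6: {"city": "Pune", "sku": "B2"},
    7: {"city": "Pune", "sku": "B2"},
}

# Hypothetical source 2: observed delay in minutes keyed by order ID.
DELAYS_MIN = {1: 30, 2: 35, 3: 40, 4: 400, 5: 20, 6: 25, 7: 22}


def join_sources(orders, delays):
    """Combine the two sources into one record per order."""
    return [
        {"order_id": oid, **attrs, "delay_min": delays[oid]}
        for oid, attrs in orders.items()
        if oid in delays
    ]


def abnormal_delays(records, dimension, factor=3.0):
    """Return IDs of orders whose delay exceeds factor x their group's median."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[dimension]].append(rec)

    flagged = []
    for members in groups.values():
        group_median = median(m["delay_min"] for m in members)
        for m in members:
            if m["delay_min"] > factor * group_median:
                flagged.append(m["order_id"])
    return sorted(flagged)


records = join_sources(ORDERS, DELAYS_MIN)
by_city = abnormal_delays(records, "city")
by_sku = abnormal_delays(records, "sku")
```

The median is used here as an assumption because a single extreme delay does not distort it the way it would a mean. A production system would likely use a more careful baseline.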

Its warehouse initially sent out orders based solely on expected delivery times, while bi(OS) enabled further customer segmentation, a feature it rolled out for 30 million customers in one week. Then, in days, it wrote logic to reprioritize orders every 30 minutes.
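The reprioritization step might look something like the following sketch. This is an illustration only, not PharmEasy's logic: the segment names, their ranking, and the `Order` fields are hypothetical. Orders are sorted first by customer segment and then by expected delivery time. The 30-minute schedule is left out, and a scheduler would simply call `reprioritize` on the open orders at each interval.

```python
from dataclasses import dataclass

# Hypothetical customer segments; a lower rank is dispatched first.
SEGMENT_RANK = {"priority": 0, "regular": 1}


@dataclass
class Order:
    order_id: int
    segment: str
    expected_delivery_min: int


def reprioritize(orders):
    """Order the queue by segment rank, then by expected delivery time."""
    unknown_rank = len(SEGMENT_RANK)  # unrecognized segments go last
    return sorted(
        orders,
        key=lambda o: (
            SEGMENT_RANK.get(o.segment, unknown_rank),
            o.expected_delivery_min,
        ),
    )


open_orders = [
    Order(1, "regular", 60),
    Order(2, "priority", 120),
    Order(3, "regular", 30),
]
queue_ids = [o.order_id for o in reprioritize(open_orders)]
```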

Rethinking Everything

The Isima team’s experience with big data dates to before that phrase was coined and before the days of Hadoop. Its core team has been building scale-out relational databases, high-performance storage and applied AI for companies including Microsoft, AWS, Cloudera, Tibco, BlueArc, Drobo, DataStax and D. E. Shaw.

Vice president of engineering Pradeep Madhavarapu was lead developer for Microsoft’s SQL Server engine and among the creators of the Amazon Aurora database. Rawal built a Cassandra clone in the early 2000s, years before Cassandra was open sourced (2008). He and Alfredo Tamura, vice president of strategic sales, worked with telcos in Japan to ensure messaging remained reliable in the event of a tsunami. Suvarna’s experience includes a decade with private equity company Intellectual Ventures, raising a fund of $650 million for early-stage technologies in software, medical IT and materials.

The Palo Alto, California-based team built an all-in-one platform covering ingestion, storage, processing, visualization, and utilization of data. They built the SQL relational database for bi(OS) in-house, and it’s tightly integrated with the system’s data ingest and analytics capabilities. They added a data catalog to help users identify interesting data sets and to help with data governance.

Rawal has described it as a data plane, rather than a pipeline, combined with AI and business intelligence (BI) capabilities. APIs and SDKs were added to make it easy for developers to build data functionality into their applications, and for data scientists to tinker with the data. It’s also been described as taking the best from data warehouse and data lake models.

“We went to operating system books of the ’70s and ’80s, and said, ‘Why do people do caching? Let’s remove caching. Why do we have master-slave architectures of these databases? Why can’t we remove a Change Data Capture coming from a replica? Why do we have queues? … Three decades ago, we had the MQ series. Then we had RabbitMQ and now we have Kafka. But it’s still a queue.

“Then on top of it, we start getting these caching technologies … and you are just creating more and more layers of the same. … What enterprises need is a build product, which allows them to work and build applications in weeks. Right? That’s, that’s what we did.”

Isima competes with an array of startups focused on improving data management — and gaining real-time insight from data — including Dremio, Kasten, Incorta, and Quantexa.

Qubole’s Prateek Shrivastava has written about getting data lakes right and Splice Machine CEO Monte Zweben about operationalizing cloud data lakes.

Analyst Roy Chua, founder and principal at AvidThink, finds the Isima concept interesting.

Data analytics for enterprises hasn’t always delivered on its promises. From Hadoop to Apache Spark, not to mention Hive, Cassandra, TensorFlow, Kafka and other waves of approaches, from building data lakes to data pipelines, many enterprises have sought ROI from big data investments only to be disappointed. Part of the problem has been knowing which approaches to use for what purposes (batch ETL, streaming ETL, machine learning, deep learning) and how to support use cases like ad hoc queries.

Two things piqued Chua’s interest in Isima: the pedigree of its founders, who had played roles at DataStax, Cloudera and AWS, and their holistic approach to the big data problem, Chua said. “I’ve heard from many organizations, from enterprises to telcos, that building large data lakes has left them in situations where they are storing more and more data but are not necessarily getting insights from that data. The dream use case of discerning some deep insight from analyzing a large pool of long-term data is seldom realized. The Isima team’s observation that the faster data can be processed and acted on, the more valuable the analysis, is something that resonates with what I’m hearing from the market.”

Their approach of converging multiple data analytics use cases (ad hoc queries by users, deeper analysis and ML by data scientists, and programmatic API access for developers) is ambitious and interesting, he said.

“Isima’s workflow is focused around fast ingest and data cleansing and enrichment capabilities on load, and quickly defining interesting features for business intelligence. With a short feedback loop to observe results in real-time or near real-time, Isima’s contention is that time-to-value for data analytics is much faster. They have a promising start with early trials across a range of verticals including telcos, financials and pharmaceutical retail, and their converged approach is one that I believe is worth watching.”

PharmEasy will join the Isima team for a webinar on Aug. 25.

Image by Anja from Pixabay.
