Acryl Data Unveils Data Observability Capabilities, Adds Funding
Yesterday, Acryl Data announced the launch of Acryl Observe, a data observability module for its flagship Acryl Cloud offering. The company also received $21 million in Series A funding. Acryl Cloud is a data management platform positioned atop the DataHub Project, an open source metadata platform, data catalog, and control plane for data.
Acryl Observe is currently in private beta for use by what Acryl Data CEO Swaroop Jagadish termed “early design partners.” The funding was provided by 8VC partner Bhaskar Ghosh, Sherpalo Ventures founder Ram Shriram, and Vercel CEO Guillermo Rauch.
The addition of the data observability layer to Acryl Data’s stack enables organizations to uniformly access data governance, observability, and data management capabilities in a single solution. Traditionally, governance and data observability “have been needlessly seen as separate problems,” Shriram pointed out. “Business users, at the end of the day, look for a unified reliability indicator and, by bringing governance and data observability together, the technical and the business users come together much more.”
Moreover, because Acryl Data provides these capabilities for decentralized architectures such as data mesh, organizations can monitor, validate, and improve their data products at the pace of contemporary business.
The Data Observability Engine
Acryl Data enables users to monitor their data health and detect incidents with low latency. According to Acryl Data CTO Shirshanka Das, “This is different from classical approaches because this is more real-time, event-oriented streaming metadata. We’re getting every single Spark job that is running. You’re getting continuous data profiling insights.” This capability becomes more useful still when paired with the automation built into the data observability layer.
The solution’s data discovery mechanisms employ machine learning to scrutinize historical data patterns and establish a baseline for what healthy data looks like. The results form the basis for automatically generated data contract suggestions for specific datasets, which users can modify or supplement with business logic. “With shift-left approaches, that is the central and hardest problem to solve because otherwise, data producers are going to always be lagging,” Das reflected. “So, we help them by suggesting what responsibilities they should be signing up.”
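To make the baseline-then-suggest idea concrete, here is a minimal sketch in Python. It is not Acryl Observe's implementation: a simple mean plus-or-minus k standard deviations band stands in for the machine learning the article describes, and every name here (SuggestedAssertion, suggest_contract, violations, the metric names) is hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class SuggestedAssertion:
    """A proposed bound on one profiled metric (hypothetical structure)."""
    metric: str
    lower: float
    upper: float


def suggest_contract(history: dict[str, list[float]], k: float = 3.0) -> list[SuggestedAssertion]:
    """Derive a suggested bound per metric from historical profile values.

    Assumption: "healthy" means within k population standard deviations
    of the historical mean. A real system would likely use richer models.
    """
    suggestions = []
    for metric, values in sorted(history.items()):
        if len(values) < 2:
            continue  # not enough history to form a baseline
        center = mean(values)
        spread = pstdev(values)
        suggestions.append(SuggestedAssertion(metric, center - k * spread, center + k * spread))
    return suggestions


def violations(contract: list[SuggestedAssertion], observed: dict[str, float]) -> list[str]:
    """Return the names of metrics whose observed value falls outside its bound."""
    return [
        a.metric
        for a in contract
        if a.metric in observed and not (a.lower <= observed[a.metric] <= a.upper)
    ]


# Historical profiles for one dataset (made-up numbers for illustration).
history = {
    "row_count": [100, 110, 90, 100],
    "null_fraction": [0.01, 0.02, 0.01, 0.02],
}
contract = suggest_contract(history)
broken = violations(contract, {"row_count": 40, "null_fraction": 0.02})
```

In this sketch the suggested bounds are only a starting point. A data producer would review them and tighten, relax, or extend them with business rules before accepting them as a contract.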
The Data Control Plane
Coupling these data observability capabilities with the metadata management features of DataHub proves mutually beneficial. The tandem provides a simplified architecture for addressing the increasingly distributed data paradigms characterizing modern organizations investing in data fabric and data mesh approaches. It also delivers a rich metadata foundation replete with business definitions, semantic clarity, and rules with which to contextualize and monitor the data via the data observability capabilities. “The metadata control plane gives us that fire hose of events that we’re able to continuously monitor and detect, whether it’s a data quality incident or a divergence in a certain distribution,” Das remarked.
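As a rough illustration of watching an event stream for "a divergence in a certain distribution," the Python sketch below compares each incoming histogram against a stored baseline using the Population Stability Index (PSI). The choice of PSI, the event shape, the function names, and the 0.2 threshold (a common rule of thumb) are all assumptions made for illustration. The article does not say which technique Acryl uses.

```python
import math
from typing import Iterable, Iterator


def psi(expected: list[int], actual: list[int], eps: float = 1e-6) -> float:
    """Population Stability Index between two histograms with matching buckets."""
    if len(expected) != len(actual):
        raise ValueError("histograms must have the same number of buckets")
    e_total, a_total = sum(expected), sum(actual)
    if e_total == 0 or a_total == 0:
        raise ValueError("histograms must not be empty")
    score = 0.0
    for e, a in zip(expected, actual):
        p = max(e / e_total, eps)  # clamp to avoid log(0)
        q = max(a / a_total, eps)
        score += (q - p) * math.log(q / p)
    return score


def detect_incidents(
    events: Iterable[dict],
    baselines: dict[str, list[int]],
    threshold: float = 0.2,
) -> Iterator[tuple[str, float]]:
    """Yield (dataset, score) for each event whose histogram drifts past the threshold.

    Each event is assumed to look like {"dataset": str, "histogram": list[int]}.
    Datasets without a stored baseline are skipped.
    """
    for event in events:
        baseline = baselines.get(event["dataset"])
        if baseline is None:
            continue
        score = psi(baseline, event["histogram"])
        if score > threshold:
            yield event["dataset"], score


# Made-up baseline and event stream for illustration.
baselines = {"orders": [50, 30, 20]}
events = [
    {"dataset": "orders", "histogram": [50, 30, 20]},   # matches baseline
    {"dataset": "orders", "histogram": [10, 20, 70]},   # shifted distribution
    {"dataset": "unknown", "histogram": [1, 1, 1]},     # no baseline, skipped
]
incidents = list(detect_incidents(events, baselines))
```

Because detect_incidents is a generator, it can consume an unbounded stream and emit incidents as events arrive rather than waiting for a batch to finish.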
DataHub’s data catalog and metadata platform is the substrate for merging business understanding of data with technical characteristics to inform the data contracts upheld by the data observability layer. Acryl Data’s stack relies on several components that underlie the data control plane, including a key-value store, a Kafka integration for data streams, and an Elastic integration to make contents almost instantly searchable. The metadata itself is connected via a knowledge graph for a heightened understanding of connections, implications, and use of data elements.
For organizations with data assets decentralized across multiple clouds, repositories, and tools, or simply those that have assigned ownership of data to their respective business domains, this approach is timely. Either way, users can analyze their metadata to optimize the process of creating reusable data products, then monitor those products for reliability and data health with the data observability module.
“You can look at the existing technical lineage graph and then say, ‘Oh, these things belong as a single data product,’” Das commented. “Or, you can come in with a clean opinion of what a data product is and you can define it using your favorite declarative language. We have GitHub Actions and things like that to translate and provision that into Acryl Cloud.”
Acryl Data plans to channel its recent funding into better understanding and servicing its community of users. “We’re investing in our core community,” Jagadish acknowledged. “The 7,500 plus people that we have in our community give us advantages. We learn at scale and improve our product rapidly.”
The company also seeks to democratize, if not evangelize, the utilitarian data control plane it’s championing. “In terms of the control plane vision, we are executing on the vision there,” Jagadish said. “There are some concrete and practical use cases we are tackling with data contracts and data mesh.”