Sundeck Launches Query Engineering Platform for Snowflake
Sundeck, a new company led by one of the co-founders of Dremio, recently launched a public preview of its eponymous SaaS “query engineering” platform. The platform, which Sundeck says is built for data engineers, analysts and database administrators (DBAs), will initially work with Snowflake’s cloud data platform. Sundeck will be available free of charge during the public preview; afterward, the company says it will offer “simple” pricing, including both free and premium tiers.
Sundeck (the product) is built atop an Apache-licensed open source project called Substrait, though it offers much additional functionality and value. Sundeck (the company) has already closed a $20M seed funding round, with participation from venture capital firms Coatue, NEA and Factory.
What Does It Do?
Jacques Nadeau, formerly CTO at Dremio and one of its co-founders, briefed The New Stack and explained in depth how Sundeck query engineering works. Nadeau also described a number of Sundeck’s practical applications.
Basically, Sundeck sits between business intelligence (BI)/query tools on the one hand, and data sources (again, just Snowflake, to start) on the other. It hooks into the queries and can dynamically rewrite them. It can also hook into and rewrite query results.
One immediate benefit of the query hook approach is that it lets customers optimize the queries with better SQL than the tools might generate. By inspecting queries and looking for specific patterns, Sundeck can find inefficiencies and optimize them on-the-fly, without requiring users, or indeed BI tools, to do so themselves.
Beyond Query Optimization
More generally, though, Sundeck lets customers evaluate rules and take actions. The rules can be based on the database table(s) being queried, the user persona submitting the query or even properties of the underlying system being queried. This lets Sundeck do anything from imposing usage quotas (and thus controlling Snowflake spend) to redirecting queries to different tables or a different data warehouse, rejecting certain high-cost queries outright, reducing or reshaping a result set, or kicking off arbitrary processes.
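To make the rules-and-actions idea concrete, here is a minimal Python sketch of how a query hook of this general kind might evaluate rules against an incoming query. It is purely illustrative and is not Sundeck's API: the names `QueryContext`, `Rule`, `Outcome` and `evaluate`, the cost threshold, the `analyst` persona and the `sales`/`sales_summary` tables are all hypothetical, and the table redirect uses naive string replacement where a real system would presumably work on a parsed query plan.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class QueryContext:
    """What a hook might know about an incoming query (all fields hypothetical)."""
    sql: str
    user: str
    tables: List[str]
    estimated_cost: float  # assumed cost estimate, arbitrary units


@dataclass
class Outcome:
    action: str          # "pass", "rewrite" or "reject"
    sql: Optional[str]   # SQL to forward to the warehouse, or None when rejected
    reason: str = ""


@dataclass
class Rule:
    name: str
    matches: Callable[[QueryContext], bool]
    apply: Callable[[QueryContext], Outcome]


def evaluate(ctx: QueryContext, rules: List[Rule]) -> Outcome:
    """Return the outcome of the first matching rule; otherwise pass the query through."""
    for rule in rules:
        if rule.matches(ctx):
            return rule.apply(ctx)
    return Outcome("pass", ctx.sql)


RULES = [
    # Reject queries whose (assumed) cost estimate exceeds a quota.
    Rule(
        name="reject_expensive",
        matches=lambda c: c.estimated_cost > 100.0,
        apply=lambda c: Outcome("reject", None, "estimated cost exceeds quota"),
    ),
    # Redirect one persona from a detail table to a summary table.
    # Naive text replacement, for illustration only.
    Rule(
        name="redirect_analysts",
        matches=lambda c: c.user == "analyst" and "sales" in c.tables,
        apply=lambda c: Outcome(
            "rewrite", c.sql.replace("sales", "sales_summary"), "redirected"
        ),
    ),
]

if __name__ == "__main__":
    ctx = QueryContext("SELECT region FROM sales", "analyst", ["sales"], 5.0)
    print(evaluate(ctx, RULES))
```

In this sketch the first matching rule wins, so rule order encodes priority; a production system could just as plausibly chain several actions per query.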
In effect, Sundeck takes the call-and-response pipeline between client and database and turns it into an event-driven service platform, with a limitless array of triggers and automated outcomes. But that’s not to say Sundeck does this in some generic compute platform-like fashion. Instead, it’s completely contextual to databases, using Snowflake’s native API.
With that in mind, we could imagine other applications for Sundeck, including observability/telemetry analytics, sophisticated data replication schemes and even training of machine learning models, using queries and/or result sets as training or inferencing data. Data regulation compliance, data exfiltration prevention, and responsible AI processes are other interesting applications for Sundeck. Apropos of that, Sundeck says its private result path technology ensures data privacy and that its platform is already SOC 2-certified.
In the Weeds
If all of this sounds a bit geeky, that would genuinely seem to be by design. Sundeck’s purpose here is to give a user base that already works at a technical level access to the query pipeline, which heretofore has largely been a black box. This user audience is already authoring sophisticated data transformation pipelines with platforms like dbt, so why not let them transform queries as well?
It’s no surprise that Sundeck is a product that lives deep in the technology stack. After all, Nadeau previously led similarly infrastructural open source projects like Apache Arrow, which provides a unified standard for storing columnar data in memory (and which Nadeau says is an important building block in Snowflake’s platform), and Apache Drill, which acts as a SQL federated query broker. The rest of the 15-person Sundeck team has bona fides similar to Nadeau’s, counting 10 Apache project management committee (PMC) leaders, and even co-founders of Apache projects like Calcite and Phoenix, among its ranks.
Sunny Forecast on Deck?
If data is the lifeblood of business, then query pathways are critical arteries in a business’ operation. As such, being able to observe and intercept queries, then optimize them or automate processes in response to them, seems like common sense. If Sundeck can expand to support the full array of major cloud data warehouse and lakehouse platforms, query engineering could catch on and an ecosystem could emerge.