
Better Data Prep and the Importance of Metric Grain

For many of our customers, metric grain has been a pivotal feature for combatting a common issue — data distraction.
Sep 9th, 2022 8:35am by Joel T. McKelvey

Data prep is a problem that plagues almost every organization trying to understand and surface actionable insights from business-critical data. If you or your organization has experienced this, the good news is that you’re not alone.

However, in order for organizations to use business and decision intelligence tools to comprehensively understand and extract value from their data, that data must first be thoughtfully prepared.

When done correctly, data prep can reap a wealth of business benefits, such as enabling advanced analytics and decision intelligence tools to perform rapid, in-depth analysis to spur smarter decision-making.

Start with Your Metrics

Joel T. McKelvey
Joel is vice president of product and partner marketing at Sisu, the AI- and ML-powered decision intelligence engine that analyzes data at machine scale. A former product manager at Google and leader of product marketing at Looker, he has an extensive background in data and analytics, including business intelligence, database and data storage, and analytics deployment models.

Before starting data prep, the data team needs to consider goals — specifically, what the business is seeking to measure and what types of insights might be needed. To begin, focus on defining metrics and selecting the correct metric grain. Doing so sets your data team up for success to quickly analyze complex data, surface key drivers of change and turn those insights into actionable business decisions.

The structure of data often proves to be a challenge for data teams. Using traditional business intelligence tools, they are accustomed to manually combing through millions of rows and columns of complex data, requiring analysts to spend an exorbitant amount of time surfacing insights.

Thanks to advances in the modern data stack, there are now numerous tools available that leverage artificial intelligence, machine learning and natural language processing against these datasets to convert raw, unstructured and siloed data into structured, well-defined data. Tools such as AI/ML-powered analytics can provide programmatically defined measures of data relevance, based on the metrics the data team defines.

In May, Sisu launched its first integration with dbt Labs to support our customers with metric definitions in dbt Cloud’s metric metadata layer. Data governance has traditionally been a static process — cut off from the constantly changing reality of a business, which can negatively affect the efficiency, trustworthiness and impact of data initiatives.

Metric configurations must be consistent and current across an organization to ensure accurate and thorough analysis that meets business needs. However, this typically manual process takes up valuable time from data teams and leaves room for human error to trickle in. With Sisu’s dbt integration, customers are able to leverage the metrics they’ve already created in dbt to run more comprehensive, efficient and faster machine-learning-powered analyses directly within the Sisu Decision Intelligence Engine.

The Challenge of Metric Granularity

Metric grain refers to what each row of data represents in a set, so if a data team doesn’t choose the proper input when analyzing data, their analyses will surface inaccurate results. This is what we like to call GIGO: garbage in, garbage out. Selecting the right grain for your metric can be difficult, especially when trying to ensure you’ve captured all the relevant dimensions for accurate decision-making.
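A toy example (not drawn from any customer data) makes the grain-mismatch problem concrete: computing an “average order value” metric from rows that are already aggregated to customer grain, rather than order grain, silently skews the result.

```python
from statistics import mean

# Hypothetical order-level data: (customer, order_value).
orders = [
    ("alice", 10), ("alice", 10), ("alice", 10),  # three small orders
    ("bob", 100),                                  # one large order
]

# Correct: "average order value" is defined at order grain,
# so each order row contributes equally.
true_aov = mean(value for _, value in orders)  # (10+10+10+100)/4 = 32.5

# Wrong grain: the same data pre-aggregated to one row per customer,
# then averaged. Alice's three orders now count as one row.
per_customer_avg = {"alice": 10.0, "bob": 100.0}
wrong_aov = mean(per_customer_avg.values())  # (10+100)/2 = 55.0

# Same raw data, different grain, different answer: garbage in, garbage out.
```

The raw numbers are identical in both cases; only the grain of the input rows differs, yet the metric shifts from 32.5 to 55.0.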

Granularity often becomes a roadblock for businesses: the grain an analysis demands depends on the metric you’re exploring, and the dataset must be prepared at that grain before the analysis can run — a determination that is tedious to make manually.

Let’s say you’re a store owner keeping track of your week-over-week sales performance data, and your smallest grain size is a week. This means that a week is as precise as your data gets (your data is preaggregated to weekly grain), making it impossible to analyze how hourly and daily performance affect weekly performance. Of course, you can easily find the average sales of each day through your weekly sales data, but you will never be able to look at an individual day’s performance, limiting what you can learn and do with your data.
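The store example above can be sketched in a few lines of Python (the figures are made up for illustration):

```python
# Finest grain available: one row per day of sales (hypothetical data).
daily_sales = {
    "Mon": 120, "Tue": 95, "Wed": 140, "Thu": 110,
    "Fri": 210, "Sat": 300, "Sun": 180,
}

# Pre-aggregating to weekly grain collapses seven rows into one.
weekly_total = sum(daily_sales.values())  # 1155

# From the single weekly number you can still derive a daily *average*...
avg_daily = weekly_total / 7  # 165.0

# ...but the individual days are gone: nothing in the weekly row tells you
# that Saturday alone did 300 in sales. That detail is unrecoverable once
# the data is stored only at weekly grain.
```

If only `weekly_total` is retained, no amount of downstream analysis can reconstruct `daily_sales` — which is exactly why preserving the finer grain matters.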

For grain, finer is generally better when conducting comprehensive analyses. Ensuring that your organization has analysis-ready datasets that preserve as much grain as possible protects against unexpected or unpredictable future analyses. To learn more about best practices, check out Sisu’s datasheet on choosing the right grain for your KPI metric.

Once you understand the grain of your dataset, have created the right metric and have defined your datasets to support the necessary metric grain, you can turn your focus to iterating quickly on existing datasets and metrics to determine the root cause behind changing key metrics, such as weekly sales performance. This is where Sisu’s metric grain feature can come in handy.

Using Sisu’s metric grain feature, data teams can uncover the best key driver results faster and adjust metric granularity depending on new questions being asked — a manual process that would take analysts weeks to accomplish with traditional BI tools. Sisu’s ML algorithms can also detect, validate and model the dataset’s primary keys and surface them as your grain in metric setup, allowing analysts to avoid preparing datasets at all grains upfront. Users can also adjust granularity to produce new aggregated datasets in metric setup, removing the fuss of manually scaling your data based on independent metric needs.
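Sisu’s feature itself is proprietary, but the core operation it automates — deriving whatever coarser grain a new metric needs from a single fine-grained dataset, instead of preparing a separate table per grain — can be sketched with stdlib Python (all names and data here are hypothetical):

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (week, day, region, sales).
# Keeping the finest grain lets us derive any coarser grain on demand.
rows = [
    ("W1", "Mon", "US", 100), ("W1", "Mon", "EU", 80),
    ("W1", "Tue", "US", 120), ("W2", "Mon", "US", 90),
    ("W2", "Tue", "EU", 60),  ("W2", "Tue", "US", 110),
]

def aggregate(rows, grain):
    """Re-aggregate sales to the requested grain, given as a tuple of
    column indices into each row (0=week, 1=day, 2=region)."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in grain)
        totals[key] += row[3]  # column 3 is the sales measure
    return dict(totals)

# The same dataset answers questions at three different grains,
# with no upfront prep of separate per-grain tables.
by_week = aggregate(rows, grain=(0,))
by_week_region = aggregate(rows, grain=(0, 2))
by_week_day = aggregate(rows, grain=(0, 1))
```

The point of the sketch is the asymmetry: coarsening is always one aggregation away, while going the other direction — as in the weekly-sales example earlier — is impossible.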

For many of our customers, metric grain has been a pivotal feature for combatting a common issue — data distraction. For example, an online marketplace started out with 5.5 billion transaction records and several thousand columns of data. Using Sisu, they were able to cut that initial amount down to the 250 million most relevant records, eliminating hundreds of columns and a significant amount of unnecessary noise.

They quickly identified the key drivers affecting their most important metrics and saved considerable time and resources. This, in turn, helped their data team avoid hundreds of hours of manual data combing and investigation, and the speed at which insights were surfaced made them all the more actionable. We’ve seen customers, such as Overstock and HomeToGo, report up to 80% acceleration in data investigation and speed to insights.

Using tools like Sisu and dbt to thoughtfully prepare and understand the underlying causes affecting key business metrics, data analysts can answer important business questions from stakeholders more thoroughly, and in some cases, uncover key drivers of change that may not have been visible from business context alone. In this new era of harnessing data to make informed business decisions, using tools that allow your data to work for you will be what separates your business from the competition.

TNS owner Insight Partners is an investor in: HomeToGo.