
Data Models: A Key Step on Your Data Journey

2 Sep 2020 10:27am, by Shawn Johnson

Once you’ve decided to migrate your organization’s ever-expanding volumes of data to a cloud data warehouse (CDW), getting more business value from that data requires you to keep track of data sources and their data flow. The way to do that is to build a data model inside your CDW so you can find your data assets quickly and leverage them efficiently.

What Is a Data Model?

A data model describes your entire enterprise from a data point of view. It’s a representation of the data objects in your CDW, the associations between these objects, and the rules you’ve created to manage them. Building a data model ensures that you can identify and generate the outputs you actually need.

Benefits of Having a Data Model

A CDW is a live, evolving representation of the processes that generate data to drive your business. Creating a data model lets you see and simplify the complex data flows in your CDW so your data team and business users can:

  • Understand how both relational and non-relational data flow through the environment
  • Highlight business rules
  • Speed time to market for long-term data assets
  • Identify ways to improve scalability and performance
  • Create visual documentation to show how actual data elements and business processes relate to one another
  • Promote data democratization

Shawn Johnson
Shawn Johnson is a Solution Architect at Matillion, a leading provider of data transformation for cloud data warehouses. Shawn is an experienced System Analyst with a demonstrated history of working in the financial services industry. He is skilled in Oracle Database, Requirements Analysis, Agile Methodologies, Extract, Transform, Load (ETL), and Databases. He has a strong information technology background with a Master's degree focused in Computer Science from Regis University.

Types of Data Models

A conceptual data model describes concepts, rules, and processes that support the business, identifying the data elements that drive them and tracking business events and their related performance. A conceptual model does not involve process flow or the characterization of data types.

A logical data model applies the conceptual data model to describe the steps of a specific process as it relates to other business processes within the organization, including relationships between concepts and the data attributes the process generates. Logical data models are typically split by subject area.

A physical data model is the actual representation of the data elements of a process, grouped by entity for storage in a database management system (DBMS). The model contains all entities and relationships, data types, primary and foreign keys, indexes, and any other DBMS feature that would be included in the schema build. A physical data model derives from the conceptual and logical models, making it a third opportunity to review the build requirements. Unlike the previous models, it also includes the entities and relationships that exist to support performant storage and retrieval of enterprise data.
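To make the idea of a physical model concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a real warehouse. The `customer` and `sale` tables, their columns, and the index name are hypothetical, chosen only to show entities, data types, keys, a foreign key, and an index in one schema build.

```python
import sqlite3

# Hypothetical schema: table, column, and index names are illustrative only.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT UNIQUE
);
CREATE TABLE sale (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    sale_date   TEXT NOT NULL,      -- ISO 8601 date string
    amount      NUMERIC NOT NULL
);
-- Index that supports retrieving sales by customer.
CREATE INDEX idx_sale_customer ON sale(customer_id);
"""


def build_schema(conn: sqlite3.Connection) -> None:
    """Create the tables, keys, and index described by the physical model."""
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)


conn = sqlite3.connect(":memory:")
build_schema(conn)

# List the entities that now exist in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

A production CDW would use its own DDL dialect and feature set, so treat this as an illustration of what a physical model records rather than a template.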

Making Sense of Data Objects

To optimize your CDW investment, you need to make the data meaningful to the end user. To do this, you need to turn your data into relevant data objects — that is, groups of data points that are ready for consumption by another system. To create a data object called “customer,” for example, you will gather every attribute you’ve collected about your customers, such as name, contact information, and purchase history, and normalize them into a format that can be shared across systems.
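As a rough illustration of the "customer" example, the sketch below gathers attributes from two hypothetical sources (a CRM row and a list of orders) and normalizes them into one shareable object. The field names, the source layouts, and the `build_customer` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Customer:
    """A hypothetical 'customer' data object ready for other systems."""
    customer_id: str
    name: str
    email: str
    purchases: list = field(default_factory=list)


def build_customer(crm_row: dict, orders: list) -> Customer:
    """Combine raw attributes and normalize them into a common format."""
    return Customer(
        customer_id=str(crm_row["id"]),
        # Collapse repeated whitespace and use consistent capitalization.
        name=" ".join(crm_row["full_name"].split()).title(),
        # Emails are compared case-insensitively, so store them lowercased.
        email=crm_row["email"].strip().lower(),
        # Keep only the purchases that belong to this customer.
        purchases=[o["sku"] for o in orders if o["customer_id"] == crm_row["id"]],
    )


# Illustrative raw inputs; a real CDW would read these from source tables.
crm_row = {"id": 7, "full_name": "  ada   LOVELACE ", "email": " Ada@Example.COM "}
orders = [{"customer_id": 7, "sku": "A-100"}, {"customer_id": 8, "sku": "B-200"}]

customer = build_customer(crm_row, orders)
print(asdict(customer))  # a plain dict that is easy to serialize and share
```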

A data model streamlines your ability to identify where data is coming from and how best to use it to create and refine data objects. You can then sort those objects into datasets for analysis that deliver the insight to drive strategic decisions. The data objects you create will depend on your business goals and the specific actions your data model will drive. For example, you might use a sales database to generate data objects such as customers, sales, and inventory, then slice and dice combinations of data objects across different attributes such as date, store location, and geographic region to determine where your sales are best and how you might replicate that success in other stores.
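The "slice and dice" step can be sketched as grouping sales records by any combination of attributes and totaling the amounts. The records, store names, and the `total_by` helper below are invented for illustration; in practice this would more likely be a SQL `GROUP BY` running inside the CDW.

```python
from collections import defaultdict

# Made-up sales records standing in for rows from a sales database.
sales = [
    {"date": "2020-08-01", "store": "Denver", "region": "West", "amount": 120.0},
    {"date": "2020-08-01", "store": "Boston", "region": "East", "amount": 80.0},
    {"date": "2020-08-02", "store": "Denver", "region": "West", "amount": 200.0},
    {"date": "2020-08-02", "store": "Seattle", "region": "West", "amount": 50.0},
]


def total_by(rows, *attributes):
    """Sum sales amounts for each distinct combination of the given attributes."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[a] for a in attributes)
        totals[key] += row["amount"]
    return dict(totals)


by_region = total_by(sales, "region")
by_store_and_date = total_by(sales, "store", "date")

# The store with the highest total is a candidate to learn from.
best_store = max(total_by(sales, "store").items(), key=lambda kv: kv[1])
print(by_region)
print(best_store)
```

Keying the totals on a tuple of attribute values lets the same helper answer questions at any grain, from a single region down to a store on a given date.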

Using a Data Model to Create an Effective CDW Build

With a data model to give you a better understanding of all your data sources and their structures, you can tailor your environment to support scalable solutions that manage costs over time while delivering higher value to data users and the organization as a whole. Leveraging the flexibility, scalability, and agility of cloud resources lets you engineer the various data layers in a CDW to optimize your resources, time, dollars, and effort.

The Role of Data Models in Data Democratization

Building a corporate data model that covers all of your database management systems gives you a deeper understanding of where your data is, where it comes from, how it’s used, and the relationships within it. This gives your organization the confidence to introduce and expand data self-service, making more data available to more users in less time and ensuring that the data is curated so that users draw only on what is relevant to their needs.
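One common way to curate data for self-service is to publish a view that exposes only the columns a group of users needs. The sketch below again uses Python's sqlite3 module as a stand-in for a CDW; the `customer` table, its rows, and the `marketing_customer` view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT,
    email       TEXT,
    region      TEXT
);
INSERT INTO customer VALUES (1, 'Ada', 'ada@example.com', 'West');
INSERT INTO customer VALUES (2, 'Grace', 'grace@example.com', 'East');

-- Curated view: hypothetical marketing users see regions, not contact details.
CREATE VIEW marketing_customer AS
    SELECT customer_id, region FROM customer;
""")

cursor = conn.execute("SELECT * FROM marketing_customer ORDER BY customer_id")
columns = [d[0] for d in cursor.description]  # column names exposed by the view
rows = cursor.fetchall()
print(columns, rows)
```

A real deployment would pair views like this with the warehouse's own access controls, which this sketch does not attempt to model.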

Feature image via Pixabay.
