The Benefits and Drawbacks of DataOps in Practice

12 Nov 2021 7:00am

DataOps does what it is supposed to but may introduce new challenges, according to our analysis of the raw data from a survey conducted on behalf of data.world and DataKitchen.

  • Collaboration on data modeling and management is twice as likely to be “very effective” at companies with mature DataOps practices compared to those where the practice has only been partially implemented.
  • The stress data engineers face can be overstated. Leaders of Data & Analytics teams face a much heavier burden due to the evolving data ecosystem.

Two-thirds of the study participants in August’s Data Engineering 2021 Survey develop, maintain, and optimize data systems to make data available for analysis. Thirty-seven percent said their company has mostly or fully implemented DataOps into their data processes and another 41% had a partial or emerging practice.

Collaboration between different groups on data management and building data models is very effective at only 23% of companies that have partially rolled out DataOps practices. The success rate doubles for mature DataOps practices, with 46% of respondents claiming very effective collaboration. Furthermore, when asked about cutting-edge tools, people at these companies are twice as likely to be dismissive of the technologies if they are integrated into the full DataOps lifecycle.

Integrating technologies and processes and enabling communication can have unintended consequences. For example, at companies with mature DataOps practices, 36% of respondents say data governance policies make their day-to-day jobs very difficult, while only 12% of those at firms at an earlier stage in the DataOps process feel this way. It appears that DataOps means companies are enforcing rules, which is not always fun.

46% of Mature DataOps Practices Collaborate Very Effectively Across Groups, 2x More Than Partial Adopters of DataOps

Respondents who reported having fully implemented DataOps are more likely than others to always get data requests with unreasonable expectations. Over a third of assignments given to team leaders responsible for data analytics and data management cannot be completed given the features and functions that were asked for. Data engineers and enterprise architects are given more realistic tasks, partially because they get to define the scope of the projects.

Almost everyone in the DataOps workflow faces unreasonable deadlines, but only 19% say unplanned work always disrupts their work-life balance. Leaders of Data and Analytics teams (36%) and those at mature DataOps practices (30%) were more likely to always face disruption and the consequent risk of burnout.

Data Engineers and Architects Can Fulfill Requests for Features More Often Than Other Job Roles

Although unreasonable expectations are often a burden, enterprise architects and data engineers have more circumscribed job responsibilities.

DataOps and the Job Market

Here is a red flag for employers that are worried about retaining and attracting talent: 43% of those at companies with mature DataOps practices are very likely to depart for another data engineering job in the next 12 months. That’s on par with the 44% of management-level respondents looking for an exit. Yet, excluding management, only 17% of those surveyed are very likely to leave.

We do not put too much weight on the “somewhat likely” responses to this question because almost everyone employed today is at least considering taking another job. In another study conducted this year by SlashData, only 10% of developers surveyed said that nothing could motivate them to change employers, and the usual mix of compensation, benefits and work-life balance is a catalyst for job movement. Another problem is toxic work environments, and this doesn’t need to be the result of gender or racial discrimination. Oftentimes the problem is that no one claims responsibility for problems or that certain groups are blamed when things go wrong.

The data engineering team regularly gets blamed when things go wrong with the company’s data analytics. In fact, 21% say this always happens and another 42% believe it occurs often. Sometimes the blame may be deserved, but when it is undeserved, quitting is an attractive option. Unsurprisingly, when data engineering is always blamed for Data/Analytics problems, 65% are very likely to leave.

How likely are you to leave your current company for another data engineering job in the next 12 months?

Unplanned work, unreasonable requests and blame for things out of your control. That’s enough to make a lot of people wish their job came with a therapist to help manage the stress. In fact, 99% of the group strongly or somewhat agreed with that statement at companies where data engineers are always blamed. In contrast, only 49% feel that way when data engineers are not scapegoats. These days that counts as good news.

DataOps in Other Studies

According to a survey of 525 practitioners working at enterprises with more than 1,000 employees, conducted by 451 Research on behalf of Immuta in May 2021, 41% have a DataOps strategy that is either ingrained in the company culture or has been fully operationalized and is delivering value. The data governance vendor Immuta was busy sponsoring research this year, as it also worked with Gradient Flow to produce the 2022 Data Engineering Survey Report, based on 372 respondents contacted between June and August 2021. This report used an identical measure of DataOps maturity and found that an almost identical 40% were either at an optimized or accelerated (mature) state. The only difference is that small companies have been less likely to begin putting a DataOps strategy in place.

How would you characterize your organization’s level of maturity with respect to ‘DataOps,’ defined as applying agile and automated approaches to managing data (similar to how DevOps applies agile and automated approaches to manage software delivery)?

“DataOps Dilemma: Survey Reveals Gap in the Data Supply Chain” is based on responses from 525 enterprise practitioners in the US, Canada, UK, Germany and France in May 2021. Participants worked at organizations with more than 1,000 employees that used some form of cloud data platform.

Favorite Technologies at Mature DataOps Companies

The 2022 Data Engineering Survey Report compared technology adoption based on how far along companies were on their DataOps journey. The products chosen by companies that have embraced DataOps were significantly different in several cases. It is worth investigating further to determine if these technologies will coalesce into a new stack that will be used more broadly if DataOps processes become more popular. Here are two takeaways:

  • Integration Tools: Dataform is used by 25% of the mature DataOps cohort. Purchased by Google a year ago, it provides a framework for managing SQL-based data operations in BigQuery, Snowflake, and Redshift. Airbyte and AWS Glue are also used by 20% of the mature DataOps group.
  • Workflow Management and Orchestration: Usage of Prefect increases as DataOps adoption becomes more widespread. Thirty-five percent of mature DataOps companies are using Prefect, which is not resting on its laurels as it just released Orion, its second-generation workflow orchestration engine. The CNCF-hosted Argo Workflows is used by 22% of the most mature DataOps companies. With Luigi and Dagster also being used, there still appear to be choices about how to orchestrate MLOps pipelines.

What tools does your organization use to integrate (extraction, loading, ingestion, transformation, modeling, data management) for BI, analytics and data science?

What tools does your organization use for workflow management and orchestration?

The feature image is an illustration from Ascend.io’s DataOps.dev.