
How to Build a Data Science Enablement Team

Fostering a productive relationship with your data science team is essential. One way to do it is through the creation of a Data Science Enablement Team.
Jun 6th, 2022 6:12am by Audrey Reznik

As a developer exploring new ways to take advantage of MLOps in your data environment, you may find that overcoming cultural and communication hurdles can be just as challenging as determining which technologies to use.

That’s because building and deploying data-driven applications requires more than knowledge of popular open source projects and tools. It also requires the ability to communicate and collaborate openly and regularly with your data science team.

That may not be something you’re accustomed to doing. However, fostering a productive relationship with your data science team isn’t just possible, it’s essential. One way to do it is through the creation of a Data Science Enablement Team (DSET).

What Is a Data Science Enablement Team?

An enablement team is a type of Center of Excellence built specifically to encourage and facilitate communication and collaboration — in this case, between developers and data scientists. In the data science enablement team, developers and data scientists work together throughout development to ensure data models are viable for applications.

A DSET helps create a governance model for building data-driven applications and moving them from data ingestion to delivery while bringing data scientists into the development fold.

The result is a win for both parties. You can achieve faster development cycles, and your data science colleagues can see their models put to good use. Everyone involved benefits from learning from each other, including discovering new ways to optimize workflows and work more efficiently.

How to Set Up Your Own Enablement Team

Audrey Reznik
Audrey is a senior principal software engineer on the Red Hat Cloud Services OpenShift Data Science team, focusing on managed services, AI/ML workloads and next-generation platforms. She has worked in the IT industry for over 20 years in roles ranging from full-stack development to data science. She is passionate about data science and, in particular, the current opportunities with ML and open source technologies.

Setting up your own enablement team doesn’t have to be complicated. All it really takes is a desire for you and your development team to build a bridge between your group and your data scientists.

Since all work within an MLOps process flows through development, you’ll serve as the focal point for your organization’s DSET. Your goal will be to help your data scientists put their models into action.

In addition to working on your own AI-enabled applications, you’ll scope out data development projects within your company and offer your services to those projects. That could involve anything from showing data scientists how to use GitHub and teaching them how to perform peer-to-peer coding to just being there to answer questions and talk through how to turn their projects into deployable solutions.

By offering to help, you set the stage to bring data closer to the applications you’re building. Models and applications can be built in conjunction with each other, which can result in a more seamless, agile and secure development process.

How to Work with Data Scientists

This approach upends the way you would typically work with your organization’s data scientists. Normally, that work happens in silos: the scientists build their models and send them to you to insert into your intelligent applications. Then everyone moves on to the next project.

Break down these silos by reaching out to your data scientist colleagues. That could come in the form of a simple direct message with an offer to assist or something more personal, like a coffee or lunch-and-learn invitation.

However you decide to approach them, there are a few things to keep in mind as you commence your collaboration:

Empathize with data scientists’ work processes.

Data scientists may use processes and tools you’re unfamiliar with, and those processes may not initially jibe with your own. For instance, data scientists may not think twice about emailing you code as Jupyter notebooks. Or they might use different versions of Python to create base images, with no two images in sync with each other.

Consider offering alternatives to help them improve their workflows (and make your life a bit easier). For example, help them organize what they’re working on by setting up a JupyterHub instance or a Git repository. Making their jobs easier will help build the relationship.
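One small, low-friction way to tackle the mismatched Python versions mentioned above is a check that anyone can paste into the first cell of a notebook or run in a build step. The following is only a sketch: the pinned version and the function names are hypothetical, and your team would choose its own.

```python
import sys

# Hypothetical team-wide pin; pick whatever version your base images share.
REQUIRED_PYTHON = (3, 10)


def python_matches(required=REQUIRED_PYTHON, actual=None):
    """Return True when the interpreter's major.minor equals the pinned version."""
    if actual is None:
        actual = tuple(sys.version_info[:2])
    return tuple(actual[:2]) == tuple(required)


def describe_version(required=REQUIRED_PYTHON, actual=None):
    """Build a readable message for a notebook cell or a CI log."""
    if actual is None:
        actual = tuple(sys.version_info[:2])
    if python_matches(required, actual):
        return "OK: Python %d.%d" % (required[0], required[1])
    return "Mismatch: expected Python %d.%d, found %d.%d" % (
        required[0], required[1], actual[0], actual[1])
```

Calling `describe_version()` with no arguments reports on the interpreter that is currently running, which makes a drifted environment visible before a model is handed over.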

Encourage them to be more developer-friendly (without asking them to become developers).

Most data scientists don’t want to become software developers any more than you probably want to become a data scientist. But bringing them into the DSET isn’t about getting them to learn more about software development — it’s about helping both you and them become more cognizant of the processes you both adhere to. So, while you’re empathizing with their work patterns, get them to understand how adopting some of your processes can help them in their daily workflows.

For example, talk to them about the need for reproducibility and source control, and how both of these things are important to the long-term viability of their models. Help them understand how to make their work have long-lasting value.
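A concrete example can make the reproducibility conversation easier. The sketch below, which uses only the Python standard library, records what someone would need to rerun a training job later: the interpreter version, the random seed, a hash of the training data and the pinned package versions. The function names and manifest fields are illustrative assumptions, not an established format.

```python
import hashlib
import platform
import random


def build_manifest(seed, training_data, packages):
    """Collect the facts needed to rerun a training job later.

    seed: integer used to seed random number generators
    training_data: raw bytes of the training set (hypothetical input)
    packages: mapping of package name to pinned version string
    """
    return {
        "python": platform.python_version(),
        "seed": seed,
        "data_sha256": hashlib.sha256(training_data).hexdigest(),
        "packages": dict(sorted(packages.items())),
    }


def seeded_sample(seed, population, k):
    """Draw a sample that is identical on every run with the same seed."""
    rng = random.Random(seed)  # local generator, no global state touched
    return rng.sample(population, k)
```

Serializing the manifest to JSON and committing it to the same repository as the model code ties the two ideas together: source control holds the recipe, and the manifest says exactly which ingredients were used.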

Establish a regular cadence of communication.

Hold regular, brief chats: if not daily, then at least once a week. Consider these meetings the enablement team version of a scrum, where everyone runs down the projects they’re working on, provides status updates and asks questions. It’s also a good idea to set up a team Slack channel or SharePoint site that allows everyone to access materials, including tutorials and demo content.

You can even start up a grassroots network to showcase the work your enablement team is doing. For example, a former colleague of mine once created a Python Interest Network, which we used to display the work our team was doing with on-premises and public cloud solutions. It gave everyone the chance to share ideas and learn from each other.

Don’t forget to support your communication and collaboration with an underlying technology environment that allows everyone to easily access the tools they need to complete their projects. Whether it’s Python, Jupyter, Apache Kafka or a similar solution, make them openly available. Don’t let lack of access to technology prevent your newly established DSET from achieving its goals.

What Happens Next?

While creating a DSET isn’t necessarily hard, it may take some time before your relationship-building efforts start to yield efficiency gains. There will likely be some form of a feeling-out period that everyone goes through as people learn each other’s work habits and get to know each other.

The recommendations I mention here come from experience. In my job prior to working at Red Hat, I was a data scientist. Because of my previous developer experience, I helped my fellow data scientists deploy their models. I was then asked to put together an enablement team consisting of developers and MLOps engineers who would work with the scientists to help them navigate source control, containerization, AI/ML governance, model deployment and intelligent application development.

It worked extraordinarily well, to the point that we had other parts of the organization wanting to learn how we enabled our data scientists. But our success didn’t happen overnight. It took time, and it will take you time too.

The goal is to begin seeing improvements within a few months. Ideally, you’ll see reduced production times and faster deployment cycles — things that are easily tracked, benchmarked and reported to your manager.
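If you want a number to report, one option is the lead time between a model being handed off and that model running in production. The sketch below assumes you can pull those two timestamps from your own ticketing or deployment records. The function names are hypothetical, and the median is just one reasonable summary statistic.

```python
from datetime import datetime
from statistics import median


def lead_time_days(handoff, deployed):
    """Days between a model handoff and its production deployment."""
    return (deployed - handoff).total_seconds() / 86400


def median_lead_time(records):
    """Median lead time in days over (handoff, deployed) datetime pairs."""
    return median(lead_time_days(h, d) for h, d in records)
```

Computing the same figure for the months before and after the enablement team started gives you a simple before-and-after benchmark.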

If you hit your numbers, you’ll have done two exceptionally important things. First, you’ll have successfully shown the value of data science to your organization. Second, you’ll have integrated the power of AI into your application development.

And, hopefully, you’ll have made some new friends along the way.

TNS owner Insight Partners is an investor in: Pragma.