What’s the most challenging stage of the machine learning (ML) lifecycle? Data gathering and cleaning has traditionally been the most time-consuming aspect of data scientists’ and analytics practitioners’ jobs. But as solutions have popped up to address this issue, the bottleneck has moved to creating and deploying ML models.
The jury is still out on how many stages there are in the standard ML lifecycle, but getting started is clearly not the problem. Executives are throwing money at projects. Although many projects have failed at the proof-of-concept phase, others have succeeded in identifying real business goals and establishing data science teams.
Building and evaluating machine learning models is the core stage of the ML lifecycle. But according to Algorithmia’s “2021 Enterprise Trends in Machine Learning,” once a use case is defined, it takes 66% of organizations more than a month to develop an ML model. For 64% of organizations, it takes at least another month to deploy that model. Per the report, most data scientists spend at least 25% of their time deploying models. The machine learning engineers surveyed are often deploying into test environments as well as into production. When assessing additional studies, it is important to realize that respondents may not understand the distinction between these two types of deployment.
Once a model is served into production, it is monitored not only for DevOps and IT-related performance metrics, but also to make sure its accuracy doesn’t degrade over time. Model retraining, audits, security and governance tracking, and live A/B tests are all iterative steps that can feed back into earlier stages of the lifecycle.
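To make the idea of watching for accuracy degradation concrete, here is a minimal Python sketch. It is an illustration only, not something drawn from the Algorithmia report or from any particular ML platform: the `AccuracyMonitor` class, its `baseline`, `window` and `tolerance` parameters, and the retraining rule are all hypothetical names and assumptions. It keeps a rolling window of labeled predictions and flags the model for retraining when rolling accuracy falls below the baseline by more than the tolerance.

```python
from collections import deque


class AccuracyMonitor:
    """Hypothetical monitor: tracks rolling accuracy of a deployed model."""

    def __init__(self, baseline, window=100, tolerance=0.05):
        # baseline: accuracy measured at evaluation time (an assumed input)
        # window: how many recent labeled predictions to keep
        # tolerance: allowed drop below baseline before flagging
        self.baseline = baseline
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)

    def record(self, predicted, actual):
        # Store whether the prediction matched the label that arrived later.
        self.outcomes.append(predicted == actual)

    def rolling_accuracy(self):
        # Returns None until at least one labeled outcome has been recorded.
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Flag only when accuracy has dropped past the tolerated margin.
        acc = self.rolling_accuracy()
        return acc is not None and acc < self.baseline - self.tolerance


if __name__ == "__main__":
    monitor = AccuracyMonitor(baseline=0.90, window=10, tolerance=0.05)
    for predicted, actual in [(1, 1), (0, 0), (1, 0), (1, 1)]:
        monitor.record(predicted, actual)
    print(monitor.rolling_accuracy(), monitor.needs_retraining())
```

In practice the true labels often arrive well after the prediction is served, so a check like this would typically run on a delay, and its retraining flag would feed back into the earlier lifecycle stages described above.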
At the end of the day, data scientists analyze and understand data to influence decisions. In recent years, they have been less likely to “waste” their time acquiring or cleaning data, and have become more proficient at writing their own software to automate deployments and workflows. If ML platforms and popular projects can abstract away some of this work, then data scientists and machine learning engineers can spend more time creating value for their organizations.
The New Stack is a wholly owned subsidiary of Insight Partners. TNS owner Insight Partners is an investor in the following companies: Real.