Transfer learning is the practice of taking an existing model and applying it to a different problem. You might be in the middle of training a model when the business problem shifts. Now you have a model that has been trained on a specific dataset, and you need to adapt it to handle the new problem. For example, you might be working on an image classifier for parts in a manufacturing plant, and the number of parts you have to classify increases.
Your existing model has already learned how to classify some of those images, so you don’t have to restart the entire training process. With more experimentation, you can take the original model and generalize it to handle other types of classification. This commonly happens when machine learning engineers get updated information from the product team or more feedback from stakeholders on what the model should be capable of.
Why We Would Need to Expand a Model
For example, let’s say we had a Natural Language Processing (NLP) model that interprets the context of a user’s words, and we’ve since added more words for it to handle. Here are a few reasons we would want to apply transfer learning to a pre-trained model.
- It saves time and resources compared to training a new deep learning model for related tasks
- You can find a number of pre-trained models built on industry-accepted datasets, which gives you a known baseline for quality
- It lets you get as much value as possible from your existing models
- You can prevent your production models from going stale
As we learn new things from our users and get a better understanding of what matters to them, we need to update our models. That doesn’t mean we take all of the insights gained from the old model and throw them away. We add on to them to factor in the new information that matters.
Transfer learning is also extremely useful when you don’t have much data to train a new model with. Applying it to a pre-trained model can help you build models that would otherwise be out of reach for lack of initial data. How?
There are a number of pre-trained models you can choose from. Here are some of the more popular ones for image classification and NLP applications.
NLP pre-trained models
Image classification pre-trained models
You would fine-tune one of these models to fit the specific type of project you are working on. This could involve freezing the weights of the first few layers of the pre-trained model or using a smaller learning rate to train it. Either way, you take advantage of all the data these models have been trained on and carry those insights over to your own model.
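To make the idea of freezing layers more concrete, here is a minimal sketch in plain Python. It is a toy, not how any particular framework does it: the "pre-trained" layer, the dataset, and every name in it (FROZEN_LAYER, extract_features, fine_tune, and so on) are hypothetical and exist only for illustration. The first layer keeps its weights fixed while a new output layer is trained on a handful of examples with a small learning rate.

```python
# Toy illustration only: a hand-rolled two-layer model, not a real framework API.

# Hypothetical "pre-trained" first layer: (weight, bias) pairs for two ReLU units.
# It is stored as an immutable tuple and never updated, which is what "frozen" means here.
FROZEN_LAYER = ((1.0, 0.0), (-1.0, 0.0))


def extract_features(x, frozen_layer=FROZEN_LAYER):
    """Run the frozen layer: one ReLU activation per (weight, bias) pair."""
    return [max(0.0, w * x + b) for w, b in frozen_layer]


def predict(x, head):
    """Apply the trainable head (a linear layer) to the frozen features."""
    feats = extract_features(x)
    return sum(w * f for w, f in zip(head["weights"], feats)) + head["bias"]


def mean_squared_error(data, head):
    """Average squared difference between predictions and targets."""
    return sum((predict(x, head) - y) ** 2 for x, y in data) / len(data)


def fine_tune(data, head, learning_rate=0.05, epochs=200):
    """Stochastic gradient descent that updates only the head's parameters."""
    for _ in range(epochs):
        for x, y in data:
            feats = extract_features(x)
            error = predict(x, head) - y
            # Gradient of the squared error with respect to each head weight.
            for i, f in enumerate(feats):
                head["weights"][i] -= learning_rate * 2 * error * f
            head["bias"] -= learning_rate * 2 * error
    return head


# A deliberately tiny dataset for the new task: y = 2 * |x| + 1.
new_task_data = [(-1.0, 3.0), (-0.5, 2.0), (0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]

head = {"weights": [0.0, 0.0], "bias": 0.0}  # new, untrained output layer
loss_before = mean_squared_error(new_task_data, head)
fine_tune(new_task_data, head)
loss_after = mean_squared_error(new_task_data, head)

print(f"loss before fine-tuning: {loss_before:.4f}")
print(f"loss after fine-tuning:  {loss_after:.4f}")
```

In a real project the frozen layers would come from a downloaded pre-trained model rather than two hand-picked weights, and a deep learning framework would handle the gradient updates. The division of labor is the same, though: the reused layers supply the features and only the new layers learn from your small dataset.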
There is a wide range of use cases where you can use transfer learning with pre-trained models.
- You might want to build a custom sentiment analyzer to figure out what your customers are saying in reviews, but you don’t have enough data yet. This is when you could use a pre-trained model to get some sentiment insights on a much smaller dataset.
- You could be responsible for building an object identification system for the cameras on some street intersections. It could take months to build an accurate model from scratch when you could fine-tune an existing model to look for specific objects, like cars, pedestrians, and bicyclists.
- You may work on a project where you need to determine whether an email is from a human or a bot. This might involve looking at the context of the words in the email to see if the message makes sense. Starting from a pre-trained NLP model could shorten the time it takes to build an accurate model.
There are other areas where you can make use of pre-trained models. Just be cautious with the sources you use, and make sure that any pre-trained model you work with has gone through extensive testing so that its accuracy claims hold up.
There are a lot of benefits to building your models from scratch, but in some cases, you can save a lot of time and resources by starting with a pre-trained model. Then you spend time fine-tuning the model to fit the specific needs of your project. You can find pre-trained models for NLP projects and image classification projects fairly easily and if you look around, you may find some good ones for other types of projects.
Use the tools that are already available so that you can focus on the more important parts of the project, like getting meaningful results from the model in production. As you run training experiments to see how to best fine-tune your model, make sure you’re using a tool that will let you go back and reproduce those experiments.
Sometimes you stumble across a really good model, but you weren’t paying close attention to the changes you were making. A tool like DVC makes it easy to track all of the changes you make for each experiment so you can come back to them anytime.