5 Essential Pieces of the Deep Learning Puzzle
Deep learning is quickly becoming one of the most sought-after applications of computer science, and it’s no wonder why. Deep learning allows computers to learn and make decisions based on massive datasets. It enables companies to attack otherwise impossible problems in domains like speech recognition, computer vision, natural language processing, and more.
Yet many companies struggle to solve the deep learning puzzle and to capitalize on its full potential. It can be expensive to build and maintain, it requires specialized expertise, and it is still a fairly new concept to the private sector. As companies start to explore this powerful technology, there are different ways to make sure they get the most out of it.
Companies that want to implement deep learning need five pieces in place to make the most of it.
1: A Way to Quantify Success
First, before any modeling effort starts, whether in deep learning, more traditional machine learning, or data science in general, it is important to quantify a measure of success. This process involves determining the metrics that these models and efforts will track and be judged by. Ideally, there is a way to trace the performance of a model directly back to business value. As a fraud detection model gets more accurate, you may lose less money to fraud. As a recommendation model gets better, you may sell more products. This is harder when there are many steps between a specific model and explicit business value, but it is an extremely important step.
Without a metric tied to business value, it is impossible to judge whether you are making progress or whether the model has any value at all. Even where the metric seems straightforward, defining it can be difficult. In the case of fraud, it is important to make sure the definition of accuracy matches what the business needs. Catching 99 percent of fraudulent transactions sounds good, but if the 1 percent that slips through consists of losses of $10 million or more, the result is very bad for the business. The measure of success is always domain specific and will differ for every business and application.
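The fraud example can be made concrete with a small sketch in Python. It assumes a hypothetical Transaction record and made-up amounts (neither comes from a real system) and compares two definitions of "how much fraud was caught": one that counts transactions and one that counts dollars.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical record used only for this illustration.
    amount: float    # dollar value of the transaction
    is_fraud: bool   # ground-truth label
    flagged: bool    # what the model predicted


def count_recall(transactions):
    """Fraction of fraudulent transactions the model flagged."""
    fraud = [t for t in transactions if t.is_fraud]
    if not fraud:
        return 0.0
    return sum(1 for t in fraud if t.flagged) / len(fraud)


def dollar_recall(transactions):
    """Fraction of fraudulent dollars the model flagged."""
    fraud = [t for t in transactions if t.is_fraud]
    total = sum(t.amount for t in fraud)
    if total == 0:
        return 0.0
    return sum(t.amount for t in fraud if t.flagged) / total


# Made-up data: 99 small frauds are caught, one very large fraud is missed.
transactions = [Transaction(100.0, True, True) for _ in range(99)]
transactions.append(Transaction(10_000_000.0, True, False))

print(f"count-based recall:  {count_recall(transactions):.2%}")
print(f"dollar-based recall: {dollar_recall(transactions):.2%}")
```

On this made-up data the model catches 99 percent of fraudulent transactions but only about 0.1 percent of the fraudulent dollars. Which of the two numbers matters is a business decision, and that is the reason to settle on the metric before modeling starts.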
Failing to define these metrics, or relying only on standard academic measures, can lead to wasted time, worse business outcomes, and models built with no real purpose. Without the ability to measure the performance of these methods and relate it to business value, businesses can waste a great deal of time and money building what are effectively very expensive algorithmic toys.
2: An Abundance of Clean Data
The number of applications and Internet of Things (IoT) devices in the enterprise means there’s no shortage of data. But many datasets can’t be used for deep learning because they’re too small, unstructured, or untrustworthy. Machine learning models trained on incomplete, unstructured, or inaccurate data will at best fall short of the best outcomes, and at worst they can actively hurt the application they are applied to.
For example, imagine you’re training a neural network to classify images as dogs or cats. To do that, the neural network must be trained on many images that are labeled with the correct answer. Given enough images of different types of dogs and cats, the network will eventually learn the patterns that distinguish them. If you have only a few images, if only a single breed is included, or if the data is mislabeled, the resulting model will not generalize as intended. Companies must ensure they have enough data that is clean and clearly labeled to improve the accuracy of their deep learning models. This step must be achieved before any training or learning is possible.
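Some of these data problems can be caught with a simple audit before any training run. The following is a minimal sketch in Python, assuming the dataset is available as (image path, label) pairs. The function name audit_labels, the file names, and the threshold are hypothetical choices made for this illustration.

```python
from collections import Counter


def audit_labels(samples, expected_labels, min_per_class):
    """Return a list of human-readable problems found in a labeled dataset.

    samples: list of (image_path, label) pairs; a label may be None.
    expected_labels: set of label strings the classifier should learn.
    min_per_class: smallest acceptable number of examples per label.
    """
    problems = []
    labels = [label for _, label in samples]

    # Samples with no label at all cannot be used for supervised training.
    missing = sum(1 for label in labels if label is None)
    if missing:
        problems.append(f"{missing} sample(s) have no label")

    # Labels outside the expected set are usually typos or stale categories.
    unknown = sorted(
        {l for l in labels if l is not None and l not in expected_labels}
    )
    if unknown:
        problems.append(f"unexpected label(s): {unknown}")

    # Classes with too few examples will be learned poorly.
    counts = Counter(l for l in labels if l in expected_labels)
    for label in sorted(expected_labels):
        if counts[label] < min_per_class:
            problems.append(
                f"class '{label}' has {counts[label]} example(s), "
                f"fewer than the required {min_per_class}"
            )
    return problems


# Made-up dataset with a missing label, a typo, and an underrepresented class.
samples = [
    ("img_001.jpg", "dog"),
    ("img_002.jpg", "dog"),
    ("img_003.jpg", "cat"),
    ("img_004.jpg", None),
    ("img_005.jpg", "dgo"),
]
problems = audit_labels(samples, {"dog", "cat"}, min_per_class=2)
for problem in problems:
    print(problem)
```

A check like this only finds structural problems. It cannot tell that a photo of a cat was labeled "dog", or that every dog in the set is the same breed. Those issues still need human review or a held-out sample with verified labels.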
3: A Deep Learning Framework
Deep learning frameworks are the libraries and programming models upon which developers build deep learning applications. Frameworks are necessary for effective deep learning. Without them, most companies can’t realistically do things like image and speech recognition or language understanding. However, building a framework from scratch is expensive and time-consuming.
Luckily, tools like TensorFlow, MXNet, and Caffe2 provide ready-to-use, open source deep learning frameworks that are easy to implement and can run on cloud infrastructure. They come with example models, such as natural language processing and image recognition systems. If a company has the data to train the model, it’s fairly simple to plug in the data without having to know how the underlying framework was built. When choosing a deep learning framework, companies should consider factors like the support resources available when development issues arise, ease of use and deployment, and how much customization the framework allows. This step must be achieved before you can build your first model.
4: A Robust Cloud Infrastructure
Deep learning requires extreme computational resources. Traditional IT infrastructure is often inadequate for deep learning because it can’t process the large volumes of data required to drive insights. GPU-enabled cloud infrastructure on platforms like Amazon Web Services and Microsoft Azure has made it easier than ever for companies to build and scale deep learning pipelines. GPU-based infrastructure can offload the massively parallel, compute-intensive portions of an application to the GPU while the remainder of the code stays on the CPU, enabling some applications to run up to 50 times faster. This step must be achieved before you can scale your model to production-level datasets and enterprise applications.
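How much of that speedup an application actually sees depends on how much of its runtime can be moved to the GPU. The sketch below applies Amdahl’s law, a standard formula for this situation, in Python. The 50x figure is reused from the text purely as an example input, and the offloaded fractions are assumptions rather than measurements of any real workload.

```python
def overall_speedup(parallel_fraction, gpu_speedup):
    """Amdahl's law: speedup of the whole program when only the
    offloaded fraction of its original runtime is accelerated.

    parallel_fraction: share of runtime moved to the GPU (0.0 to 1.0).
    gpu_speedup: how many times faster that share runs on the GPU.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / gpu_speedup)


# Assumed fractions of runtime that can be offloaded to the GPU.
for fraction in (0.50, 0.90, 0.99):
    speedup = overall_speedup(fraction, 50.0)
    print(f"{fraction:.0%} offloaded -> {speedup:.1f}x overall")
```

Under these assumptions, an application with 90 percent of its runtime offloaded still ends up less than 10 times faster overall, because the part left on the CPU dominates. The headline number is reached only when nearly all of the work is parallel.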
5: A Way to Optimize the Models
Once you have the data, the frameworks are in place, and the hardware infrastructure is robust, the final step toward reaching deep learning’s true potential is model optimization and hyperparameter tuning. For any deep learning model, researchers need to set dozens of hyperparameters before they can input data to train the model. These hyperparameters include things like the number of layers in the network, the number of nodes per layer, and the learning rate. If the initial training run performs poorly, researchers will often tune a few knobs and give it another go, or they may rely on expensive, brute-force methods like grid search. These approaches to tuning deep learning models are time intensive and offer no guarantee of good results.
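To see why grid search gets expensive, consider a minimal sketch in Python over the three hyperparameters named above. The validation_loss function is a toy stand-in invented for this example. In a real project each call would be a full training run, and the candidate values would come from the team’s own judgment.

```python
import itertools
import math


def validation_loss(num_layers, nodes_per_layer, learning_rate):
    """Toy stand-in for a full training run (an assumption for illustration).
    A real objective would train the model and return a validation metric."""
    return (
        (num_layers - 4) ** 2
        + (math.log2(nodes_per_layer) - 7) ** 2
        + (math.log10(learning_rate) + 3) ** 2
    )


# Hypothetical candidate values for each hyperparameter.
grid = {
    "num_layers": [2, 4, 8],
    "nodes_per_layer": [64, 128, 256],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def grid_search(objective, grid):
    """Evaluate every combination in the grid.

    Returns (best_params, best_loss, number_of_trials).
    """
    names = list(grid)
    best_params, best_loss, trials = None, math.inf, 0
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = objective(**params)
        trials += 1
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss, trials


best_params, best_loss, trials = grid_search(validation_loss, grid)
print(f"{trials} training runs, best setting: {best_params}")
```

Three candidate values for each of three hyperparameters already means 27 training runs. The count multiplies with every hyperparameter added, so twelve hyperparameters with three values each would need 531,441 runs. The grid can also miss the best setting entirely when it falls between the chosen values.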
To make this easier for developers, data scientists, and researchers, software-as-a-service solutions like SigOpt, Amazon SageMaker, and Google Hypertune exist to take the guesswork out of hyperparameter tuning. These optimization tools sit on top of models and observe their inputs and outputs to help researchers identify where they need to adjust their systems. Optimizing machine learning algorithms and deep learning frameworks can provide significant improvements in accuracy, performance, and efficiency. This step allows you to get the most business value out of your deep learning investment.
Take These Steps
Deep learning is still gaining a foothold among large companies, and it can be puzzling to know where to start. There are many tools available to ensure companies can capitalize on all deep learning has to offer. The technology will continue to transform how businesses operate, and as it does, companies will need to have these pieces in place to ensure they don’t get swept away by the competition.
By following these steps, companies can make sure their models are aimed in the right direction and their deep learning efforts are possible, scalable, and optimal.
Feature image via Pixabay.