This post is one in a series of tutorials and analysis exploring the fields of machine learning and artificial intelligence. Check back on Fridays for future installments.
Suppose we want to buy the best web camera available on the market. In real life, we’d look at several product reviews describing the qualities of the model we are considering. For example, if the reviews mostly consist of words like “good,” “great,” and “excellent,” we’d conclude that the webcam is a good product and proceed to purchase it. If instead the reviews contain phrases like “bad,” “not good quality,” or “poor resolution,” we’d conclude that it is probably better to look for another webcam. So you see, the reviews help us perform a “decisive action” based on the “pattern” of words that exist in the product reviews.
Hence, the reviews written by buyers who already purchased the webcam will influence other buyers, and their product reviews, in turn, will influence future purchases. Thus, a pattern connects the people who already purchased the product and its future buyers.
Machine learning tries to encode this human decision-making process into algorithms.
Moving on from the example, let us look at the conditions that must be met before applying machine learning to a problem.
- A pattern must exist in the input data that would help to arrive at a conclusion. For instance, if we concluded the product reviews are random and do not offer any meaning, then it would be difficult to arrive at a decision by using them. To solve a problem with machine learning, the machine learning algorithm must have a pattern to infer from.
- There must exist an ample amount of data (examples, samples) to apply machine learning to a problem. If there are no product reviews for the webcam, it will be difficult to arrive at a decision whether or not to buy the product.
- The behavior of the problem must be difficult to describe with a mathematical expression that we formulate ourselves. Hence, machine learning is used to derive meaning from the data and perform “structured learning” to arrive at a mathematical approximation that describes the behavior of the problem.
Hence, if the above three conditions are not met, it will be futile to apply machine learning to the problem. But if all three conditions are fulfilled, then we are good to proceed.
Machine Learning Components
Now, let us look at some of the components of machine learning, based on the product purchasing problem above. There are the product reviews, which serve as input data to the machine learning algorithm. There is the output, or the decision of whether the webcam is worth purchasing based on its reviews. Then there is the structured learning component, performed by the machine learning algorithm to understand the pattern in the input data and produce the output.
The expression that the machine learning algorithm formulates is called “the mapping function,” and it serves as an approximation of the “target function.” The algorithm formalizes an expression that maps the input data to an output. In our example, a good review maps (or corresponds) to the output “buy the webcam,” and a bad review maps to the output “do not buy the webcam.”
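To make the idea of a mapping function concrete, here is a minimal sketch in Python. The reviews, the labels, and the word-scoring rule are all assumptions made up for illustration, not a method prescribed by this post: each word gets a score from the labeled reviews, and a new review maps to a decision based on the sum of its word scores.

```python
from collections import Counter

# Hypothetical labeled reviews (made up for illustration): 1 = buy, 0 = do not buy.
REVIEWS = [
    ("great picture and excellent sound", 1),
    ("good value and great quality", 1),
    ("poor resolution and bad microphone", 0),
    ("bad quality and poor build", 0),
]

def learn_word_scores(examples):
    """Score each word: +1 per appearance in a 'buy' review, -1 otherwise."""
    scores = Counter()
    for text, label in examples:
        for word in text.lower().split():
            scores[word] += 1 if label == 1 else -1
    return scores

def mapping_function(review, scores):
    """Map a review to a decision using the learned word scores."""
    # Words never seen in the labeled reviews contribute 0.
    total = sum(scores[word] for word in review.lower().split())
    return "buy the webcam" if total > 0 else "do not buy the webcam"

scores = learn_word_scores(REVIEWS)
decision_good = mapping_function("excellent camera with great picture", scores)
decision_bad = mapping_function("poor sound and bad picture", scores)
print(decision_good)
print(decision_bad)
```

Real systems use far more data and more careful scoring, but the shape is the same: labeled examples go in, and a function from review to decision comes out.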
The target function is always unknown to us because we cannot pin it down mathematically. This is where the magic of machine learning comes in, by approximating the target function.
How Machine Learning Learns a Target Function
Hence, a machine learning algorithm performs a learning task so that it can predict an output (Y) when it is given new examples of input samples (x).
Y = f(x)
As you can see, we do not know any properties of the target function f. What is its form? Is it linear or non-linear? So we use machine learning to approximate this function by learning from examples of inputs (x) and their known outputs (Y). If we knew the properties of f, there would be no need to learn from data and use machine learning. Instead, we could use the target function directly by solving the equation. But in the product review example, the behavior of the target function cannot be described using an equation, and therefore machine learning is used to derive an approximation of it. The target function captures the relationship between product reviews and decisions by mapping each kind of product review input to an output.
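As a small numeric illustration of approximating an unknown f from examples, consider the sketch below. It assumes, purely for illustration, that a straight line is a reasonable form for f and fits it by ordinary least squares. The sample values are made up, and the helper names `fit_line` and `approximate_f` are hypothetical.

```python
# Hypothetical samples (made up for illustration). They happen to follow
# Y = 2x + 1, but the fitting code below never uses that knowledge.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

def fit_line(xs, ys):
    """Estimate the slope and intercept of a line by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(xs, ys)

def approximate_f(x):
    """The learned approximation of the unknown target function."""
    return slope * x + intercept

prediction = approximate_f(5.0)  # an input that was not among the samples
print(slope, intercept, prediction)
```

The learned line stands in for f on inputs we have not seen. Product reviews are not this tidy, which is exactly why the form of f there has to be approximated rather than written down.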
To do this, the machine learning algorithm makes certain assumptions about the target function and starts the estimation with a hypothesis. The hypothesis may be revised along the way, since the target function is unknown. To arrive at a function that approximates the target function well, the algorithm iterates over the hypothesis until it produces the best estimate of the output. Starting from a hypothesis helps the algorithm arrive at a good approximation in a shorter period, rather than leaving it to figure out the whole thing by trying endless computations, which would take very long to arrive at a prediction.
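One classic way to iterate over a hypothesis is the perceptron learning rule, sketched below. It is swapped in here as an example and is not necessarily the algorithm this series has in mind. The assumption about the target function is that a weighted sum of words can separate good reviews from bad ones; the training reviews are made up for illustration.

```python
# Hypothetical training data (made up for illustration): +1 = buy, -1 = do not buy.
EXAMPLES = [
    ("great picture excellent sound", 1),
    ("good quality great value", 1),
    ("poor resolution bad microphone", -1),
    ("bad quality poor build", -1),
]

def predict(weights, bias, text):
    """Current hypothesis: the sign of a weighted sum over the review's words."""
    score = bias + sum(weights.get(word, 0.0) for word in text.split())
    return 1 if score > 0 else -1

def train_perceptron(examples, max_epochs=20):
    """Revise the hypothesis each time it misclassifies a training example."""
    weights, bias = {}, 0.0
    epochs_used = 0
    for epoch in range(1, max_epochs + 1):
        epochs_used = epoch
        mistakes = 0
        for text, label in examples:
            if predict(weights, bias, text) != label:
                mistakes += 1
                # Nudge the bias and the word weights toward the correct label.
                bias += label
                for word in text.split():
                    weights[word] = weights.get(word, 0.0) + label
        if mistakes == 0:
            break  # the hypothesis now agrees with every training example
    return weights, bias, epochs_used

weights, bias, epochs_used = train_perceptron(EXAMPLES)
print(epochs_used, predict(weights, bias, "great excellent"))
```

Each pass through the data is one iteration of the hypothesis. The loop stops as soon as the hypothesis agrees with all the examples, instead of searching blindly through every possible function.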
Hence, the objective of all machine learning algorithms is to estimate a predictive model that best generalizes to a particular type of data. To solve a problem with machine learning, it is therefore imperative to have a large number of examples from which the learning algorithm can understand the system’s behavior, so that it can generate similar predictions when it is presented with new examples of data. Although the learning task is not easy, things will become clearer with a better understanding of the different components of machine learning and how they interact with each other. In subsequent posts, we will look at how machine learning algorithms can be used to solve real-world problems.