What Is Machine Learning? An Introduction
When most people hear “Machine Learning”, they tend to picture robots taking over the world, as seen in sci-fi movies. However, Machine Learning is not just a futuristic fantasy; it is already here. In fact, it has been around for decades in specific applications that have affected all of our lives. The first ML application to go mainstream and affect millions of users was the spam filter, which took over the world back in the 1990s. The spam filter qualifies as a Machine Learning application, and it has learned so well that you rarely need to flag an email as spam anymore. In this article, we will give an introduction to Machine Learning and explore the following:
- What is Machine Learning?
- Why Use Machine Learning?
- Traditional Programming VS Machine Learning
- Resources & References
What Is Machine Learning?
Machine Learning is the art and science of programming computers so they can learn from data.
Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.
Arthur Samuel, 1959
Here’s a slightly more engineering-oriented definition of Machine Learning:
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Tom Mitchell, 1997
For example, your spam filter is a Machine Learning program that can learn to automatically flag spam emails given examples of spam emails (flagged by users) and examples of regular (non-spam) emails. The examples that the system uses to learn, spam and non-spam alike, are called the training set. Each training example is called a training instance (or sample). In this case, the task T is to flag spam for new emails, the experience E is the training data, and the performance measure P needs to be defined; for example, you can use the ratio of correctly classified emails. This particular performance measure is called accuracy, and it is often used in classification tasks.
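To make the performance measure P concrete, here is a minimal sketch of accuracy as the ratio of correctly classified emails. The function name and the five example labels are made up for illustration; they are not taken from any real filter.

```python
def accuracy(predicted, actual):
    """Fraction of emails whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for five emails: True means "spam".
actual_labels = [True, False, False, True, False]
predicted_labels = [True, False, True, True, False]

# Four of the five predictions match the true labels.
print(accuracy(predicted_labels, actual_labels))  # 0.8
```

Here the third email is a regular email that was wrongly flagged as spam, so the filter gets 4 out of 5 right.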
Why Use Machine Learning?
Let’s consider a simple application: building a spam filter without Machine Learning, that is, using traditional programming techniques:
- First, you would take a close look at what spam emails look like, then decide which patterns or words tend to recur in them. For example, you might notice that phrases such as “credit card”, “for free” and “offer” tend to show up in the subject.
- Then you would write a detection algorithm for each of the patterns you noted, and an email would be filtered as spam if a number of these patterns were detected in it.
- Finally, you would test your program and repeat the first two steps until it is good enough.
Since this problem is quite complex, this program will most likely consist of a long list of complex rules that will be hard to maintain.
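The rule-based approach described above can be sketched in a few lines. This is only an illustration: the phrase list comes from the example above, while the names `SPAM_PHRASES`, `MIN_MATCHES` and `is_spam`, and the threshold of two matches, are assumptions chosen for the sketch.

```python
# Hand-written rules: phrases noticed in the subjects of spam emails.
SPAM_PHRASES = ["credit card", "for free", "offer"]
# Assumed threshold: flag the email when at least this many phrases match.
MIN_MATCHES = 2


def is_spam(subject):
    """Return True when enough hand-picked phrases appear in the subject."""
    text = subject.lower()
    matches = sum(phrase in text for phrase in SPAM_PHRASES)
    return matches >= MIN_MATCHES


print(is_spam("Special offer: get a credit card for free"))  # True
print(is_spam("Meeting notes for Tuesday"))                  # False
```

Every new pattern means another entry in the list or another special case in the function, which is how such a program grows into a long list of rules.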
ML-Based Programs
On the other hand, a program based on Machine Learning techniques automatically learns which words, phrases and patterns are good predictors of spam by comparing spam emails with non-spam emails. Such a program will be much shorter, easier to maintain and most likely more accurate.
Traditional (rule-based) programming techniques may be circumvented by spammers who notice that their emails containing the phrase “For free” are automatically filtered as spam. They may simply change it to “4 free” to evade the filter. To maintain your software, you would then have to update your rules to cover “4 free” emails as well. This means that you will have to update the software frequently to keep up with new methods and patterns.
In contrast, a spam filter based on Machine Learning can automatically detect that “4 free” has become unusually frequent in emails flagged as spam, and will start flagging such emails without any intervention from you.
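As a rough illustration of this adaptation, the sketch below learns a weight for each word from labeled subjects and is simply retrained when users flag new emails. It is a simplified word-count score with add-one smoothing, loosely in the spirit of naive Bayes, and not the method of any particular spam filter. The function names and the tiny example data are invented. The sketch looks at single words only, so it picks up the token “4” rather than the full phrase “4 free”.

```python
import math
from collections import Counter


def train(emails):
    """Count, per word, how many spam and non-spam emails contain it."""
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for text, flagged_as_spam in emails:
        words = set(text.lower().split())
        if flagged_as_spam:
            spam_counts.update(words)
            n_spam += 1
        else:
            ham_counts.update(words)
            n_ham += 1
    return spam_counts, ham_counts, n_spam, n_ham


def word_weight(word, model):
    """Smoothed log-ratio: positive leans spam, negative leans non-spam."""
    spam_counts, ham_counts, n_spam, n_ham = model
    p_spam = (spam_counts[word] + 1) / (n_spam + 2)
    p_ham = (ham_counts[word] + 1) / (n_ham + 2)
    return math.log(p_spam / p_ham)


def spam_score(text, model):
    """Sum of the weights of the distinct words in the text."""
    return sum(word_weight(w, model) for w in set(text.lower().split()))


# Hypothetical training set: (subject, flagged as spam by users?).
initial = [
    ("get a credit card for free", True),
    ("special offer for free", True),
    ("lunch at noon today", False),
    ("notes for the meeting", False),
]
# Spammers switch to "4 free"; users flag the new emails.
newly_flagged = [
    ("credit card 4 free", True),
    ("special offer 4 free", True),
]

before = train(initial)
after = train(initial + newly_flagged)

print(word_weight("4", before))  # 0.0: never seen, so neutral
print(word_weight("4", after))   # positive: now leans toward spam
```

No rule was edited between the two runs; the only change is the extra flagged examples in the training set.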
Where does Machine Learning shine?
Machine Learning also shines in any area where problems have no known algorithm or are too complex for traditional programming. As an example, consider speech recognition. If you want to write a program that can distinguish the words “Time” and “Space”, you might notice that the word “Time” starts with a high-pitch sound (“T”), so you could hardcode your software to measure high-pitch sound intensity and use that to distinguish the two words. Obviously, this approach will not scale to thousands of words spoken by millions of people with different accents and in noisy environments; it would fail miserably.
The best approach today is to write an algorithm that learns by itself, given many example recordings of each word, and that keeps updating and adapting to change.
Finally, working on Machine Learning applications can help you (as a human) learn. If we consider the spam filter application, inspecting the ML algorithm will reveal the list of words and phrases that it believes are the best predictors of spam. Sometimes this inspection will reveal unsuspected correlations that we as human beings may very easily overlook. This process will likely lead to a better understanding of the problem at hand.
To summarize this introduction to Machine Learning, we should use ML for:
- Problems for which existing solutions require a lot of revisiting or long lists of rules: one Machine Learning algorithm can often simplify the code, perform better and be much easier to maintain.
- Complex problems for which there is no good solution at all using a traditional approach: the best Machine Learning techniques can find a solution.
- Fluctuating environments: a Machine Learning system can adapt to new data.
- Getting insights about complex problems and large amounts of data.