What Is Machine Learning and Why Does It Matter?

Any conversation that touches on these modern technology buzzwords sees people swapping machine learning terms interchangeably. But when you need to answer the following questions, you really do need some understanding of what these terms mean.

Are machine learning and AI the same thing? What is the difference between machine learning and deep learning? What does big data have to do with all of this? What is Python? Does machine learning mean self-awareness? And most importantly, does this mean I am going to lose my job to a machine?

Let’s unpack some of these terms and then start to work out what machine learning means for us in a practical sense.

Let’s try some definitions

Machine learning is an application of artificial intelligence (AI) that allows machines to have access to information and data so that they can learn and improve from experience. The learning part means that they do this without being explicitly programmed for each new task. Machine learning focuses on the development of computer programs that can access data and then use the data to learn for themselves.

Artificial intelligence is the intelligence displayed by computer systems performing tasks that usually require human intelligence. This includes things like speech recognition and language translation, pattern recognition, visual perception, and decision making.

Deep learning is a branch of machine learning that uses layered (deep) neural networks to learn features and patterns directly from data, rather than relying on hand-crafted rules for one specific task. That is what allows a machine to learn to recognize facial features or perform speech recognition.

Big data is what you get when you combine all the data points that we generate with our actions, transactions, and movements, along with other data collected about natural phenomena. With millions upon millions of data points, it is impossible for humans to process the data into anything meaningful, but through computational algorithms, analysis can be done to reveal patterns, trends, and associations. This information is then used for rational decision-making.


Data mining is simply the process of applying various algorithms to a large set of data in order to extract meaningful information.

Python is an open source computer programming language, often used to build machine learning programs. Open source means that it is free to use, including commercially. Other computer languages used for machine learning include Java, R, C++, and Scala.

So what is an algorithm? An algorithm is a set of rules or a defined process used to solve a problem or perform a calculation. People use algorithms whenever there is a particular, recurring decision process to follow, but it is best thought of in terms of mathematical rules or computer decision-making hierarchies.

Machine learning algorithms are the initial sets of rules set up to guide a computer in addressing a particular problem. When deep learning takes place, the computer learns beyond the initially defined algorithm.
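To make the idea of "a set of rules" concrete, here is a toy, hand-written rule in Python for flagging a message. The keyword list and threshold are invented purely for illustration; machine learning would learn these kinds of rules from examples rather than have a person type them in.

```python
# A toy hand-written algorithm: a fixed set of rules for flagging a message.
# The keyword list and threshold are made-up values for illustration only.

SUSPICIOUS_WORDS = {"winner", "free", "urgent", "prize"}

def flag_message(text: str, threshold: int = 2) -> bool:
    """Return True if the message contains enough suspicious words."""
    words = text.lower().split()
    hits = sum(1 for word in words if word in SUSPICIOUS_WORDS)
    return hits >= threshold

print(flag_message("You are a winner of a free prize"))  # True
print(flag_message("Meeting moved to 3pm"))              # False
```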


Nobel Prize winner in Economics Herbert Simon said that "Learning is any process by which a system improves performance from experience. Machine learning is concerned with computer programs that automatically improve their performance with experience." So the same learning process that we experience as humans is what machines are doing when the right systems are in place.

The process

Machine learning algorithms have three parts:

  • The Output
  • The Objective function
  • The Input

For example, we can use machine learning to identify spam emails.

The Output is a categorization of certain emails as spam and others as legitimate.

The Objective Function is to correctly classify as high a percentage of the emails as possible.

And the Input is the huge database of emails that have already been labeled, and relabeled where necessary, as either spam or not.

It is worth remembering these elements of the machine learning algorithm because this is such a fundamental approach to problem-solving. We want a result. We work out what a good result looks like. And then we work through huge numbers of examples to learn what does and does not look like that result. That means there is no secret to machine learning, nothing suspicious about it. The machine is just learning.
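As a rough sketch of how those three parts might look in Python (using scikit-learn, which the article does not name, and a tiny made-up data set), the Input is the labeled emails, the Output is the predicted labels, and the Objective Function is the accuracy score:

```python
# A minimal sketch of the spam example, using scikit-learn (an assumption; the
# article does not name a specific library). The inline data set is invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

# The Input: emails that have already been labeled as spam (1) or legitimate (0).
emails = [
    "win a free prize now", "urgent claim your reward",
    "meeting agenda attached", "lunch tomorrow at noon",
]
labels = [1, 1, 0, 0]

# Turn the text into numeric features the algorithm can work with.
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(emails)

# Learn from the labeled examples.
model = MultinomialNB()
model.fit(features, labels)

# The Output: a spam / not-spam label for new emails.
new_emails = ["claim your free prize", "agenda for tomorrow"]
predictions = model.predict(vectorizer.transform(new_emails))
print(predictions)  # e.g. [1 0]

# The Objective Function: the share of emails classified correctly.
print(accuracy_score(labels, model.predict(features)))  # accuracy on the training data
```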

We can see the flow of machine learning illustrated below.

Machine learning workflow

Machine Learning versus the Experts

No, machine learning does not necessarily mean that we no longer need experts in various fields. Initially, an expert, with their training, experience, and insight, can analyze individual situations and work out why something is happening.


Programmers take that expert insight, devise an algorithm, and apply it to the large data set. Results are produced, and those results are then refined to get better, more accurate results in the next round. Machine learning means that the algorithm uses these initial settings and compares its results to what the human experts expected, in order to produce a better set of results the next time around.

The system requires both the initial expert and the programmer who can get the machine to learn.

What we will see happen is that people with less expertise and less creativity will find their roles taken over by efficient and effective machine learning programs. We do not need skilled workers merely to process routine and commercially unprofitable legal questions, human resources procedures, and financial transactions. Where machine learning or artificial intelligence can make more decisions, more accurately and efficiently and at a much lower cost than human intervention, it will make sense to replace those lower-end roles with computers.

Compare the human brain to the machine

When we have to learn a classification, or work out how to describe one, it is relatively easy for us. We learn some of these classifications as infants, before we even have words. But it is very difficult for a computer to learn how to classify objects into different categories. Its learning is slower and requires huge amounts of data to work out what does or does not belong to a set.

On the other side of the equation, once the machine has learned how to classify data, it is so much more efficient than a human brain at implementing that learning and automating that process. Just think of how many millions of results a simple Google search can deliver in a few seconds. That is beyond the scope of any number of human beings.

Why is machine learning a topic of interest?

Why has machine learning erupted as a subject now? Mostly because of the huge volumes of data that we create and have access to. It is impossible for a human to process that data into anything meaningful, so machine learning is applied to come up with valid results.

Let’s have a quick look at the data part of big data. The traditional mental model for these elements is the DIKW pyramid, which stands for Data, Information, Knowledge, and Wisdom. This model has been expanded to include a few more elements and is probably best thought of not as a pyramid, but as steps toward deeper understanding.

#1 Noise

Our devices, such as smartphones, GPS units in our cars, health apps and wearables, and computers, all produce noise. They signal our position, speed, heart rate, and more. Knowing about the noise on its own is relatively useless. We might be able to identify one person’s position via GPS, or that person’s heart rate while running, which is useful for them, but in a broader sense, and after the moment has passed, it is relatively worthless information.

#2 Data

That noise, collected into sets, makes up data. A man’s heart rate over time is useful data for his doctor, and a vehicle’s GPS coordinates over time mean that a company knows how its vehicle assets are being used.

Still, it is next to impossible for a human to manually process these noisy data points and data sets.

#3 Information

Put together in patterns, data becomes information. We can assess a person’s health by combining his data with other data sets and comparing him to other men of the same age, weight, and height. We can work out the fuel costs, wasted time, routes, and efficiencies of an entire fleet of vehicles.

#4 Knowledge

That information, combined with other information, experience, and training, becomes useful knowledge that health professionals and insurers can use to build actuarial tables, or that logistics teams can use to redesign delivery routes.

#5 Wisdom and Vision

This is the top level, where the quantitative data is mixed with a little gut feeling, some intuition and creativity, and makes for insightful and visionary business decisions. This is definitely the human end of the equation, well past machine learning, and, it must be said, it is not infallible. But when it is right, it is the best decision for the company or industry, and business school case studies are taught on it for years afterward. In fact, business school case studies are often written on the errors of human intuition as well.

#6 Computational power

The increase in computational power exploding across the tech world is another good reason for machine learning to be gaining in popularity. Computers simply could not process as much information in the past as they do today. One frequently cited example is that your latest smartphone has more computational power than all the computers NASA had when it sent people to the moon.

One explanation for this increase in computational power, and its implication for machine learning, is Moore’s Law. In a 1965 paper, Gordon Moore, co-founder of Fairchild Semiconductor and Intel, observed that the number of transistors in integrated circuits doubles roughly every two years. The physical building blocks of the units doing the processing keep getting smaller and more power-efficient, and are therefore faster. Doubling every two years means exponential growth. And Moore didn’t even factor in quantum computing, which promises phenomenally faster computation for certain problems.
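A quick back-of-the-envelope calculation shows how powerful that doubling is. The starting figure below is only a rough, illustrative number for an early-1970s chip, not precise historical data:

```python
# A rough illustration of doubling every two years over five decades.
# The starting count is an approximate figure for an early-1970s chip.
transistors = 2_300
for year in range(1971, 2021, 2):
    transistors *= 2          # Moore's observation: double roughly every two years
print(f"{transistors:,}")     # on the order of tens of billions after ~50 years
```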

#7 Industry involvement

Very often when new advances are made in technology, it takes a while before industries work out how to use them. As more industries started experimenting with algorithms based on their needs and the information that they wanted to generate, the algorithms became more sophisticated, more accessible to more players and more widely adopted. The successful applications of these studies meant that even more industries wanted to see what results they could generate and adoption became more widespread.


Where do we see applications of machine learning?

Banking, telecoms, and retail offer prime examples of good applications of machine learning. Algorithms can identify prospective customers based on past behavior. They can identify good and bad payers and high credit risks. They can reduce instances of fraud. Well-designed machine learning can also reduce customer churn and keep a company’s profitability high. One algorithm designed by a banker in the UK reportedly even identifies potential terrorist activity based on the banking behavior of clients. Needless to say, this particular algorithm has not been made public, but it has resulted in some preemptive arrests.

In medicine, machine learning can screen, diagnose, and give a prognosis for illness and recovery. One accidental discovery revealed that certain health-related searches by middle-aged men correlated strongly with those men discovering, within a few weeks, that they had prostate cancer.

Security is also a key machine learning success story. Beyond the dramatic facial recognition and tracking of ‘individuals of interest’ that we see in movies, DNA, signatures, and other biometric data can all be mined to identify or rule out security risks.

And then, of course, the internet and the computing world rely on machine learning for troubleshooting wizards, handwriting and speech recognition and conversion to text, spam filtering, recommendations and search results, and so much more. These are all evidence of our daily reliance on machine learning.

A few points to remember when developing a machine learning approach

If your data set is small, it is still worth using old-fashioned models to make decisions. Data mining and machine learning are designed for very large volumes of data points; otherwise, you are wasting your time developing a process.

Your results are only as good as your data. Poorly recorded, incomplete, or biased data will give you bad results, and you will use those results to make inaccurate decisions.

One example of how biased data can affect results comes from the healthcare industry. After the tragedy of babies born without limbs to women who had taken a drug called Thalidomide for nausea, the United States FDA (Food and Drug Administration) effectively barred testing drugs on pregnant women, since there was no way of knowing how a new drug might harm a fetus. To make their lives easier and avoid any potentially pregnant woman, drug companies used fewer and fewer women in their testing programs. As a result, the effects of drugs tested on men, with their physique and size, often do not carry over to women, who have a different metabolism and body size. Many millions of data points collected on results in men, fed into a machine learning algorithm, will produce no useful results for women if women are not represented in the data set.

Other data quality issues mean that humans need to spend time cleansing the data before feeding it to the computers. Bad data, anomalies, and other errors often creep into data sets, and they need to be eliminated before they confuse the model.
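As a small, hypothetical sketch of what that cleansing step can look like in Python (using pandas, which the article does not name, with invented column names and values):

```python
# A sketch of basic data cleansing: drop missing values and obvious anomalies.
# The column names and readings are made up purely for illustration.
import pandas as pd

df = pd.DataFrame({
    "heart_rate": [72, 75, None, 300, 68],   # None = missing, 300 = obvious anomaly
    "age":        [34, 35, 36, 37, 38],
})

# Drop rows with missing values, then filter out implausible readings.
cleaned = df.dropna()
cleaned = cleaned[(cleaned["heart_rate"] > 30) & (cleaned["heart_rate"] < 220)]
print(cleaned)
```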


Errors in results are likely, but they are usually caused by poor application or transmission of data into the machine learning algorithm, bias in data collection, a poorly defined problem at the start of the exercise, or some other human error. All results should be tested, and skepticism about results is a healthy approach. If the problem is not well defined and the data set is poorly selected, your machine learning results could become a self-fulfilling prophecy.

Machine learning can make huge progress in analyzing data and producing results, but that does not automatically convert into changes in an industry or fixes for fundamental problems. Changes in policy, procedure, legislation, legacy systems, and more are what turn meaningful results into a visionary future.

Some examples of companies using machine learning

It should come as no surprise that some of the most popular online social media networks rely heavily on machine learning to process the millions of pieces of content and interactions their users upload daily.

Yelp is a site that allows users to rate and share reviews of restaurants and other entertainment venues. The written reviews help new guests decide whether or not to visit, but reviewers also upload images of the venues. These images need to be sorted, categorized, and labeled, and with the huge quantity of images to process, it makes sense to use machine learning for the categorization.

Chatbots on sites like Facebook, and customer service interfaces on everything from banking to university sites, are a very useful feature for companies and serve up a decent experience for the user too. Initially, a chatbot is fed with the queries that users are expected to pose. The algorithm in the text chat allows the bot to point the user in the right direction, offer up the correct URL or form to complete, process certain instructions, and request necessary information or documentation. At a certain point, the chatbot will encounter a question that it cannot satisfy. Just like a junior staff member, it will then escalate the situation to a human who has that insight, but during that correspondence the chatbot will learn the more appropriate response or new solution and incorporate it into its repertoire. The next time it encounters that problem, it has learned how to respond. Machine learning at its best. Chatbots can eliminate the need for human engagement on the most frequent, routine queries, freeing humans for the conversations that really need them.
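A toy sketch of that escalate-and-learn loop might look like the Python below. The canned responses and simple keyword matching are invented for illustration and are nothing like a production chatbot, which would use far richer language models and intent classifiers:

```python
# A toy escalate-and-learn loop: answer known questions, escalate unknown ones,
# and remember the human's answer for next time. All responses are invented.
responses = {
    "reset password": "You can reset your password on the account settings page.",
    "opening hours": "We are open 9am to 5pm, Monday to Friday.",
}

def ask_human(question: str) -> str:
    """Stand-in for escalating to a human agent."""
    return f"A human agent will follow up about: {question}"

def answer(question: str) -> str:
    key = question.lower()
    for known, reply in responses.items():
        if known in key:
            return reply
    # Unknown question: escalate, then remember the answer for next time.
    human_reply = ask_human(question)
    responses[key] = human_reply
    return human_reply

print(answer("How do I reset my password?"))  # handled by the bot
print(answer("Do you ship to Canada?"))       # escalated to a human
print(answer("Do you ship to Canada?"))       # now answered from the learned response
```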

News feeds, on Twitter in particular, are shaped by machine learning. Algorithms based on the user’s apparent interests and the likelihood of a post fitting that taste are only part of what pushes particular posts to the top of our news feeds. The popularity of the poster and the immediate engagement with that post by other viewers also play a part.
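A hypothetical weighted score of that kind might look like the sketch below. The weights and signal names are invented for illustration and are not Twitter’s actual formula:

```python
# A made-up feed-ranking score combining a few signals, each scaled 0-1.
def rank_post(interest_match: float, poster_popularity: float, early_engagement: float) -> float:
    """Combine signals into a single ranking score (weights are illustrative)."""
    return 0.5 * interest_match + 0.2 * poster_popularity + 0.3 * early_engagement

posts = {"post_a": rank_post(0.9, 0.4, 0.7), "post_b": rank_post(0.3, 0.9, 0.2)}
print(sorted(posts, key=posts.get, reverse=True))  # higher-scoring posts rise to the top
```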

Pinterest, the popular image curation platform, actually purchased a machine learning company so that it could integrate the technology into its site. Pinterest uses machine learning to monetize advertising, reduce spam, moderate content, and produce more meaningful and engaging newsletters.


Of course, search engines like Google and Baidu (a Chinese search engine) incorporate machine learning in their search algorithms, but in other areas as well. Baidu is of interest because of its investment in language and speech recognition and the generation of genuine-sounding human voices. That must be particularly challenging in a tonal language like Chinese, where subtle differences in pronunciation change meaning. But generating realistic voices means the machine has learned to distinguish the cadence, pronunciation, and other subtleties of speech. It means huge strides for voice search applications and points to a time when we will interact with the internet through spoken instructions instead of typed text. Of course, we can already do that with tools like Siri and Alexa, but this level of machine learning means we will be able to do it in a much more sophisticated way in the future.

There you have it

Imagine a time when you know that data has been collected on your topic and sits in a huge database. You will be able to speak to your computer, tell it to access that data and give you a historical perspective of the last decade based on two or three criteria of interest to you, and then to extrapolate that data one, two, or five years into the future. Within seconds, you will have data tables, graphs, and analysis in front of you to work with or to present to an audience. The layers of machine learning in that exercise are already being developed; this scenario is not science fiction, it is only waiting to become commercially viable.
