What is Machine Learning?

Arthur Samuel gave an informal definition of machine learning:

The field of study that gives computers the ability to learn without being explicitly programmed.

Tom Mitchell gives a more modern definition:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Let’s explain it with the example of playing checkers:

  • E is the experience of playing many games of checkers.
  • T is the task of playing checkers.
  • P is the probability that the program will win the next game.

Supervised learning is defined on Wikipedia as the machine learning task of inferring a function from labeled training data.

Let’s explain it with an example.

Let’s say you want to predict house prices. You collect data on the size and price of a number of houses, and you end up with a chart like this:

[Image: chart of house prices plotted against house size]

So how can machine learning help you with that?

Given this data, let’s say you have a friend who owns a 750 sqft house and you want to learn the best sale price for it. One way to do this is to fit a straight line through the data, like this:

[Image: the same chart with a straight line fitted through the data]

In this case the house may be sold for around $150k.
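To make the straight-line idea concrete, here is a minimal sketch in Python using NumPy's polyfit. The sizes and prices are made up for illustration (the real data lives only in the chart), so the estimate will not match the chart exactly. The names fit_line and predict_line are hypothetical helpers, not part of any library.

```python
import numpy as np

# Hypothetical training data, invented for this sketch:
# house sizes in sqft and sale prices in thousands of dollars.
sizes = np.array([500.0, 600.0, 800.0, 1000.0, 1200.0, 1500.0])
prices = np.array([110.0, 125.0, 160.0, 190.0, 230.0, 280.0])


def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def predict_line(size, slope, intercept):
    """Read the price off the fitted line for a given size."""
    return slope * size + intercept


slope, intercept = fit_line(sizes, prices)
estimate = predict_line(750.0, slope, intercept)
print(f"Estimated price for a 750 sqft house: ${estimate:.0f}k")
```

The labeled examples (size paired with price) are the "experience" here, and the fitted line is the function inferred from them.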

You can also use a different learning algorithm that fits a polynomial (a quadratic function) instead of a straight line. Let’s see this in action as well:

[Image: the same chart with a quadratic curve fitted through the data]

And now, according to our new function, the price of the house is around $200k. In the following posts we will discuss which function works better in which situations. For now, both are reasonable learning algorithms.
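The quadratic version can be sketched the same way by asking polyfit for a degree-2 polynomial. Again, the data is invented for illustration and fit_quadratic is a hypothetical helper name, so do not expect the number to match the chart.

```python
import numpy as np

# Hypothetical training data, invented for this sketch:
# house sizes in sqft and sale prices in thousands of dollars.
sizes = np.array([500.0, 600.0, 800.0, 1000.0, 1200.0, 1500.0])
prices = np.array([110.0, 125.0, 160.0, 190.0, 230.0, 280.0])


def fit_quadratic(x, y):
    """Least-squares fit of y = a * x**2 + b * x + c, returned as a callable."""
    coefficients = np.polyfit(x, y, deg=2)
    return np.poly1d(coefficients)


curve = fit_quadratic(sizes, prices)
estimate = curve(750.0)
print(f"Quadratic estimate for a 750 sqft house: ${estimate:.0f}k")
```

The only change from the straight-line fit is the degree of the polynomial, and that choice alone is enough to move the predicted price.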

In this example we had a dataset in which the right answer (the price) was given for each house size, and the goal of the algorithm was to produce more of these right answers for sizes it has not seen. This is called a regression problem. In our example, the regression problem is predicting a continuous-valued output, which is the price.

There is another type, called a classification problem. In a classification problem you assign each example to one of a given set of categories. The goal is to predict the category of examples in the test data as accurately as possible.

A real-world example would be hacked Adobe accounts. Let’s say you want to write software that examines customer accounts and decides, for each account, whether any information has been leaked.

We can set the values that we want to predict to 0 and 1, where 0 means that the account has not been hacked and 1 means that it has. Now we need an algorithm that predicts one of these discrete values for each account. Because there is only a small number of discrete values, we can treat this as a classification problem.
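Here is one way the 0/1 prediction could look in code. This is only a sketch: the post does not name an algorithm, so logistic regression trained by gradient descent is my assumption, and the single feature (failed login attempts per account) and its values are invented for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic(x, y, lr=0.5, steps=2000):
    """Fit a one-feature logistic regression with plain gradient descent."""
    mu, sigma = x.mean(), x.std()
    z = (x - mu) / sigma  # standardize the feature so training is stable
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(w * z + b)       # predicted probability of class 1
        w -= lr * np.mean((p - y) * z)
        b -= lr * np.mean(p - y)
    return w, b, mu, sigma


def predict(x, model):
    """Return 1 (hacked) or 0 (not hacked) for each input value."""
    w, b, mu, sigma = model
    z = (np.asarray(x, dtype=float) - mu) / sigma
    return (sigmoid(w * z + b) >= 0.5).astype(int)


# Hypothetical labeled data: failed login attempts and whether the account was hacked.
failed_logins = np.array([0.0, 1.0, 2.0, 3.0, 8.0, 9.0, 10.0, 12.0])
hacked = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = train_logistic(failed_logins, hacked)
print(predict([1.0, 11.0], model))
```

The model computes a probability internally, but the final output is one of two discrete labels, which is what makes this classification rather than regression.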

A regression problem produces output variables that take continuous values.

A classification problem produces output variables that take discrete labels.

Unsupervised learning is a way to approach problems with little or no idea of what the result should look like. When we start working on a problem with a bunch of data, we cannot fully predict the result. Instead, we derive structure from the data without knowing the effects of its variables or attributes. Clustering is one way to find such structure, but unsupervised learning is not limited to clustering. What defines it is that there is no feedback based on the prediction results, because the right answers are mostly unknown.
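As a small illustration of finding structure without labels, here is a sketch of k-means clustering, one common unsupervised technique. The points and the initial centroid guesses are made up for this example, and k_means is a hypothetical helper written here, not a library function.

```python
import numpy as np


def k_means(points, initial_centroids, iterations=20):
    """Group points around k centroids; no labels are ever provided."""
    centroids = np.array(initial_centroids, dtype=float)
    for _ in range(iterations):
        # Distance from every point to every centroid, shape (n_points, k).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)  # index of the nearest centroid
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) > 0:           # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids


# Hypothetical unlabeled data with two visible groups.
points = np.array([
    [1.0, 1.2], [0.8, 1.0], [1.3, 0.9],
    [8.0, 8.2], [8.4, 7.9], [7.7, 8.1],
])

labels, centroids = k_means(points, initial_centroids=[[0.0, 0.0], [10.0, 10.0]])
print(labels)
print(centroids)
```

Notice that the algorithm never sees a "right answer". The cluster numbers it returns are arbitrary names for groups it found on its own.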

As an example, suppose a doctor, over years of experience, forms associations in his mind between patient characteristics and the illnesses those patients have. When a new patient shows up, the doctor uses that patient’s characteristics, such as symptoms, family medical history, physical attributes, and mental outlook, to associate possible illnesses based on what he has seen before in similar patients. This is not the same as the rule-based reasoning used in expert systems. In this case we would like to estimate a mapping function from patient characteristics to illnesses.