### Machine Learning - Types of Learning - Classification Explained

Welcome back to another post in the Machine Learning - Types of Learning series. In the last post we read about the regression type of learning and its various methodologies. If you have not come across that post yet, please read it first. After that, this interesting post will seem even more interesting 😎

For those who did not read the previous post, a short recap: we are going through each type of learning methodology in machine learning that makes machines learn.

The following types of learning can be identified:

- Regression - Covered last week
- Classification - This is our topic today
- Clustering
- Rule association
- Ensemble
- Reinforcement
- Deep Learning

Well, let's dive deep into classification.

### Classification:

Before we see how classification works, let me tell you in plain English what classification means in general. Wikipedia says that classification is the process of assigning anything in the world to some category. For example, we see a tiger and a lion and we say they belong to the cat family. We see a car on the road and we say, "Hey, it's a Jaguar!"

So the basic question here is: why do we, as humans, do classification? The human brain is capable of classifying the things we see into known categories. This helps us identify things faster later on. We use this mechanism to remember the faces of our friends, relatives, or strangers.

### Human way of classification:

So how does a baby start learning to classify the things it sees in the world as it grows? The experience and knowledge of the parents are passed on to the child through teaching. Yes, we teach children: this is fire, this is water, this is milk, this is a liquid, this is a solid. A solid has the characteristics of being strong, hard, etc. In this way we give the child our knowledge and experience. Later, when the child sees a new object, it will try to work out whether it is a solid or a liquid based on its characteristics.

Fine, you might be wondering why this guy is talking about the human way, which we already know. Yes, we already know it, but I need to recall it because we are going to relate it to the computer / machine way of learning.

### Machine way of classification:

Imagine the machine as a baby: we humans need to teach it the knowledge and methods to classify the objects it comes across in any form. To do this, we feed the machine past data that we collected from the real world. We then tell the machine to use some mechanism that helps it choose the right curve or method to segregate the data.

There are various methods to do this, but in general we group them into the following broad categories: linear classification, non-linear classification, distance based classification, rule based techniques and probability based techniques. Don't worry, we will understand each of these below.


### Linear classification:

Well, imagine that we have some historical data. For example, let's assume that doctors conducted a study on the relationship between cancer and possible causes like alcohol, smoking, etc. Each record in this data has a label that says whether the patient has cancer or not. When the data is in the form of a table, it is hard to understand what it contains. So let's try to visualise the same data as a graph, taking only two factors: alcohol and smoking.

In this graph we can see green dots representing that when the smoking and alcohol habits are high, the chances of cancer are high. Okay, what can we do with this data now? How will it help if I teach the machine to do classification on it? I mean, what value will I get? Imagine that the doctor feeds in the details of how much a patient drinks or smokes; with those values, the machine can alert the doctor to go for an early diagnosis of cancer.

Fine, with the visual data we can see that there are two groups, yes and no, and that they are separated from each other. Now, with this data, as a human I can draw a line in between to separate them.

When I draw a line in between to separate the groups, we call this method linear classification. Any separation based on a linear equation, such as a line or a plane, comes under the linear models of classification.
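To make the idea of "drawing a line" concrete, here is a minimal sketch in Python. The weights and the bias are made-up numbers chosen only for illustration; in a real model the machine would learn them from the past data. The function name `linear_classify` and the 0 to 10 scale for the habits are my own assumptions.

```python
def linear_classify(smoking, alcohol, w_smoking=1.0, w_alcohol=1.0, bias=-10.0):
    """Classify a patient with a straight-line boundary.

    The line is w_smoking * smoking + w_alcohol * alcohol + bias = 0.
    All three numbers are hypothetical, not learned from real data.
    """
    score = w_smoking * smoking + w_alcohol * alcohol + bias
    # Points on the positive side of the line go to the "yes" group.
    return "yes" if score > 0 else "no"


print(linear_classify(8, 6))  # heavy smoker and drinker
print(linear_classify(2, 3))  # light smoker and drinker
```

With these assumed numbers the first patient falls on the "yes" side of the line and the second on the "no" side.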


### Non-linear classification model:

By this time you will have guessed that non-linear models use non-linear equations, like a circle, parabola, ellipse, or sine and cosine curves, to separate the categories.
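As a small sketch of a non-linear boundary, the snippet below uses a circle: anything inside the circle goes to one category and anything outside goes to the other. The center and radius are hypothetical values picked for illustration, and `circle_classify` is just a name I made up.

```python
def circle_classify(x, y, center=(5.0, 5.0), radius=3.0):
    """Classify a point by whether it lies inside a circular boundary.

    The boundary is (x - cx)^2 + (y - cy)^2 = radius^2, which is not a
    linear equation. The center and radius here are assumed values.
    """
    dx = x - center[0]
    dy = y - center[1]
    inside = dx * dx + dy * dy <= radius * radius
    return "inside" if inside else "outside"


print(circle_classify(5, 6))  # close to the center
print(circle_classify(9, 9))  # far from the center
```

No single straight line could separate "inside" from "outside" here, which is why a non-linear equation is needed.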

### Distance Based Classification:

Wondering what this might be? :-) Fine, assume that we plotted the two categories on the graph. This is the knowledge we have in hand now. If any new data comes in, we find the distance from the new point to each category and assign the point to the nearest category. The distance can be calculated in several ways: between the new point and the nearby points of each category, between the new point and the center point of each category, or by drawing a circle around each category and measuring the distance from the new point to the circle's circumference.

When we say distance, mathematicians have defined many kinds of distance, such as:
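Two of the options described above can be sketched in a few lines of Python: voting among the nearest known points, and comparing against the center point of each category. The training points are invented for illustration, and the function names `knn_classify` and `centroid_classify` are my own. The sketch uses the ordinary straight-line distance via `math.dist` (available from Python 3.8).

```python
import math
from collections import Counter

# Hypothetical labelled points: (smoking, alcohol) and the known category.
training = [
    ((1.0, 2.0), "no"),
    ((2.0, 1.0), "no"),
    ((2.0, 3.0), "no"),
    ((7.0, 8.0), "yes"),
    ((8.0, 7.0), "yes"),
    ((9.0, 9.0), "yes"),
]


def knn_classify(point, data, k=3):
    """Assign the category that wins the vote among the k nearest points."""
    nearest = sorted(data, key=lambda item: math.dist(point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def centroid_classify(point, data):
    """Assign the category whose center point is closest to the new point."""
    groups = {}
    for coords, label in data:
        groups.setdefault(label, []).append(coords)
    # The center of a category is the average of its points on each axis.
    centroids = {
        label: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for label, pts in groups.items()
    }
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


print(knn_classify((8.0, 8.0), training))
print(centroid_classify((3.0, 3.0), training))
```

On this toy data both approaches agree, but on messy real data they can give different answers, which is one reason to try more than one.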

- Euclidean distance
- Manhattan distance
- Mahalanobis distance
- Hamming distance, etc.


This may seem like Greek and Latin for now; as we go on with the blog posts in this series, these will become clear. For now, just think of them as methods to calculate the distance between two data points. If you ask me why we have so many distances, the answer is that not every method suits every situation, so we use a trial and error approach to find which one gives a good result.
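For the curious, three of these distances are simple enough to sketch right away in plain Python. The function names are my own. Mahalanobis distance is left out here because it also needs the covariance of the data, which is better explained in a later post.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Distance when you can only move along the axes, like city blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(euclidean((0, 0), (3, 4)))      # 5.0
print(manhattan((0, 0), (3, 4)))      # 7
print(hamming("karolin", "kathrin"))  # 3
```

Notice that the same two points are 5 apart by one measure and 7 apart by another, so the choice of distance can change which category is "nearest".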

### Rule based:

This method is like the if-else that we write in code. But here, rather than writing explicit if and else statements, we identify lots of decision boundary rules from the existing data (meaning values in the past data at which the category changes significantly). For example, if the patient's smoking habit is between this number and that number, and he also drinks alcohol, then he might be vulnerable to cancer. This approach is mainly called the decision tree approach.
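Here is a minimal sketch of how one such rule could be found from past data instead of being written by hand. It tries every observed smoking value as a cut-off and keeps the one that misclassifies the fewest known patients. This is only a single rule (a one-level tree), not a full decision tree algorithm, and the data and the name `best_threshold` are invented for illustration.

```python
def best_threshold(values, labels):
    """Find the cut-off for the rule: predict "yes" when value > cut-off.

    Every distinct observed value is tried as a candidate, and the one
    with the fewest mistakes on the known data is returned.
    """
    best, best_errors = None, len(values) + 1
    for candidate in sorted(set(values)):
        errors = sum(
            (v > candidate) != (label == "yes")
            for v, label in zip(values, labels)
        )
        if errors < best_errors:
            best, best_errors = candidate, errors
    return best


# Hypothetical past data: smoking level and whether the patient had cancer.
smoking = [1, 2, 3, 6, 7, 9]
labels = ["no", "no", "no", "yes", "yes", "yes"]

threshold = best_threshold(smoking, labels)


def rule_classify(smoking_value):
    """Apply the learned if-else rule to a new patient."""
    if smoking_value > threshold:
        return "yes"
    return "no"


print(threshold)          # 3
print(rule_classify(8))   # yes
print(rule_classify(2))   # no
```

A decision tree repeats this kind of search on several attributes, one after another, to build up a chain of if-else rules.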

### Probability based:

In mathematics, the one field that we all hated in our childhood days is probability. Probability helps us identify the chances of an event occurring. In our case, the event is a patient being vulnerable to cancer, given that he is a smoker or a drinker. Finding the probability given a certain condition, like smoking or drinking, is called conditional probability. Under conditional probability we have a famous formula from Bayes, called Bayes' theorem. We use this concept to find the category in classification problems based on probability. Let me explain these methods in my next post, after covering the basics of the mathematics; otherwise it would be of no use.

So, based on the above categories, below are the well-known methods that exist to teach the machine to classify:
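Until then, here is a tiny taste of Bayes' theorem in Python, which says P(A|B) = P(B|A) * P(A) / P(B). The patient counts are completely made up for illustration, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def bayes_posterior(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Hypothetical counts: 100 patients, 10 with cancer, 30 smokers,
# and 6 patients who both smoke and have cancer.
p_cancer = Fraction(10, 100)
p_smoker = Fraction(30, 100)
p_smoker_given_cancer = Fraction(6, 10)

p_cancer_given_smoker = bayes_posterior(p_smoker_given_cancer, p_cancer, p_smoker)
print(p_cancer_given_smoker)         # 1/5
print(float(p_cancer_given_smoker))  # 0.2
```

In this invented example, knowing that a patient smokes raises the chance of cancer from 10 in 100 to 1 in 5, which matches counting directly: 6 of the 30 smokers have cancer.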

- Logistic Regression based classification - Linear Model
- K-Nearest-Neighbors classification technique - Distance based classification approach
- Support Vector Machine (SVM) - Linear model based approach
- Kernel SVM - Non-linear technique that maps data that is not linearly separable into a space where it can be separated linearly
- Naive Bayes - Probability based technique
- Decision Tree classification - Rule based technique

So, readers, I guess by this time you have got a very good idea of classification based learning in the machine learning field. I will be explaining the various methodologies (algorithms) and the categories in detail in future posts, so follow my blog, share with friends (sharing is caring 😇) and stay connected. Until then, thanks for reading!
