Supervised Learning: In-Depth

Saurav Kumar
4 min read · Jul 6, 2020

In supervised learning, the model is given labelled data, part of which is used for training and part for testing. Once it has been trained with a Machine Learning algorithm, the model is given new data and uses what it learned from the labelled examples to predict the outcome.

As the name suggests, there is a supervisor or teacher who trains the model: the labels in the training data play that role.

This is task-driven.

Supervised learning works as follows:

  1. Give instructions (training)
  2. Give assignments (testing)
  3. Finish up the assignments
  4. Evaluate the submission (error evaluation)

How is it done?

1. We pick the data, select an algorithm that suits the problem statement, and train the machine.

2. The job of the model is to predict the labels (dependent variables) from the given features (independent variables).

3. Based on the output, calculate the accuracy on both the training data and the testing data. A model is generally preferred when both accuracies are high and the difference between them is small; a large gap usually means the model is overfitting. To find where the difference is smallest, plot the training accuracy against the testing accuracy.

4. If the accuracy is not good, we should make certain changes:

(a) Change the dataset split (that is, the training and testing sizes)

(b) Increase the number of samples for training

(c) Change the algorithm
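The adjustments in step 4 can be tried out in a few lines. This is only a minimal sketch, not the author's code: it assumes scikit-learn's bundled copy of the iris data (`load_iris`), and the two candidate algorithms, the test sizes and the `random_state` values are picked here purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hypothetical candidates: any classifier with fit/score would work here.
candidates = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "tree": DecisionTreeClassifier(random_state=0),
}

results = {}
for test_size in (0.2, 0.3, 0.4):  # (a) change the train/test split
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=0, stratify=y
    )
    for name, model in candidates.items():  # (c) change the algorithm
        model.fit(X_train, y_train)
        train_acc = model.score(X_train, y_train)
        test_acc = model.score(X_test, y_test)
        results[(name, test_size)] = (train_acc, test_acc)
        print(f"{name:5s} test_size={test_size}: "
              f"train={train_acc:.3f} test={test_acc:.3f} "
              f"gap={abs(train_acc - test_acc):.3f}")
```

A combination with high test accuracy and a small gap is the one to keep. Option (b), adding more training samples, needs more data than this small set provides, so it is not shown.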

A Machine Learning model created with supervised learning can be of 2 types:

  1. Classification-based model
  2. Regression-based model

If the model has to predict a category, a classification-based model is used.

If the model has to predict a numerical value, a regression-based model is used.

For example, consider a business. If you want to know whether it will make a profit or a loss, you can use a classification-based model, but if you want to know how many sales you made in the past or are going to make in the future, you have to use a regression-based model.
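To make the business example concrete, here is a small sketch with made-up numbers. The advertising-spend feature, the six data points and the choice of `KNeighborsClassifier` and `LinearRegression` are all assumptions for illustration; the article does not specify any of them.

```python
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical feature: advertising spend for six past months.
ad_spend = [[1], [2], [3], [4], [5], [6]]

# Classification: the label is a category (0 = loss, 1 = profit).
outcome = [0, 0, 0, 1, 1, 1]
classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(ad_spend, outcome)
predicted_class = classifier.predict([[7]])[0]

# Regression: the label is a number (units sold).
units_sold = [10, 20, 30, 40, 50, 60]
regressor = LinearRegression()
regressor.fit(ad_spend, units_sold)
predicted_sales = regressor.predict([[7]])[0]

print("profit or loss:", "profit" if predicted_class == 1 else "loss")
print("expected sales:", round(predicted_sales, 1))
```

The same features feed both models; only the type of label decides which kind of model fits the question.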


Supervised learning always needs:

  1. Features (independent variables)
  2. Labels (dependent variables)

Challenges:

Irrelevant input features during training could give inaccurate results.

Accuracy suffers when impossible, unlikely or incomplete values are included in the training data.
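One simple way to guard against the second challenge is to filter such rows out before training. The sketch below uses NumPy and a few invented measurements; the rule "every value must be present and positive" is an assumption that suits length measurements and would need adapting for other data.

```python
import numpy as np

# Hypothetical measurements in cm: sepal length, sepal width.
X = np.array([
    [5.1, 3.5],
    [4.9, np.nan],   # incomplete: a value is missing
    [-4.7, 3.2],     # impossible: a length cannot be negative
    [6.3, 3.3],
])
y = np.array([0, 0, 0, 2])

# Keep only rows where every value is present and strictly positive.
complete = ~np.isnan(X).any(axis=1)
positive = (X > 0).all(axis=1)
valid = complete & positive

X_clean, y_clean = X[valid], y[valid]
print(X_clean)
print(y_clean)
```

The labels are filtered with the same mask so that features and labels stay aligned.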

Here is an example of a classification-based supervised learning model built on the iris data set:

In[1]: loaded the data

In[2]: defined features (independent variables)

In[3]: defined labels (dependent variable)

In[4]: Check how many labels are there

In[5]: mapped the different labels
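The first five cells might look roughly like the sketch below. It is a reconstruction under assumptions, not the author's notebook: it loads scikit-learn's bundled iris data with `load_iris` rather than the author's file, and it rebuilds species names such as "Iris-setosa" only so that the mapping step has something to map.

```python
import numpy as np
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled iris data stands in for the author's file.
iris = load_iris()

# Features (independent variables): four measurements per flower.
X = iris.data

# Labels (dependent variable) as species names, e.g. "Iris-setosa".
species = np.array(["Iris-" + iris.target_names[i] for i in iris.target])

# How many distinct labels are there, and how many samples of each?
names, counts = np.unique(species, return_counts=True)
print({str(n): int(c) for n, c in zip(names, counts)})

# Map each species name to an integer code the model can work with.
label_map = {str(name): code for code, name in enumerate(names)}
y = np.array([label_map[str(name)] for name in species])
print(label_map)
```

With this mapping, code 0 stands for Iris-setosa, which matches the prediction reported in In[12].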

In[6]: Imported train_test_split (this will split the data set for training and testing)

In[7]: Counted the number of training data

In[8]: Counted the number of testing data
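A sketch of the split and the two counts, again assuming the bundled iris data. The `test_size=0.3` and `random_state=42` values are illustrative choices; the article does not say which values were used.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing (assumed split, see above).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print("training samples:", len(X_train))
print("testing samples:", len(X_test))
```

Fixing `random_state` makes the split repeatable, so accuracy numbers can be compared between runs.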

In[9]: Imported a classifier-based model (KNeighborsClassifier)

In[10]: Defined the model

In[11]: Trained the model

In[12]: Predicted a label (got 0 which is mapped to Iris-setosa)

In[13]: Calculated training accuracy

In[15]: Calculated testing accuracy
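Cells In[9] to In[15] could be sketched as follows. The `n_neighbors=5` setting, the split parameters and the sample flower are assumptions made for this illustration; `score` returns the fraction of correctly predicted labels.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

# Define and train the model (n_neighbors is an assumed value).
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Predict one flower: sepal length, sepal width, petal length, petal width (cm).
sample = [[5.1, 3.5, 1.4, 0.2]]
predicted = model.predict(sample)[0]
print("predicted label:", predicted, "->", iris.target_names[predicted])

# Accuracy on the data the model has seen and on the held-out data.
train_accuracy = model.score(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"training accuracy: {train_accuracy:.3f}")
print(f"testing accuracy:  {test_accuracy:.3f}")
```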

In[17]: Created a loop for the model that trains, tests and calculates accuracy

In[19]: plotted a graph between train accuracy and test accuracy.
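The article does not say what the loop varies, so the sketch below assumes it loops over the number of neighbours `k`, records both accuracies for each value, and plots them with matplotlib. The range of `k`, the split parameters and the output file name `knn_accuracy.png` are all hypothetical.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Train, test and score one model per value of k (assumed loop variable).
k_values = list(range(1, 16))
train_scores, test_scores = [], []
for k in k_values:
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    train_scores.append(model.score(X_train, y_train))
    test_scores.append(model.score(X_test, y_test))

# The k with the smallest gap between training and testing accuracy.
gaps = [abs(tr - te) for tr, te in zip(train_scores, test_scores)]
best_k = k_values[gaps.index(min(gaps))]
print("smallest train/test gap at k =", best_k)

plt.plot(k_values, train_scores, marker="o", label="train accuracy")
plt.plot(k_values, test_scores, marker="o", label="test accuracy")
plt.xlabel("n_neighbors (k)")
plt.ylabel("accuracy")
plt.legend()
plt.savefig("knn_accuracy.png")  # hypothetical file name
```

Where the two curves run close together at a high accuracy, the model generalises well; where they drift apart, it is worth changing the settings.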

Hope you got at least a little bit of a grasp on how to approach a supervised machine learning problem after finishing all the steps above. If you have any doubts, suggestions or corrections do mention them in the comment section and if you like this article show a bit of appreciation by sharing it with others and by following me.

Happy coding😊
