Error Metrics in Machine Learning

Krishna
Jan 29, 2022 · 7 min read


We humans can never be 100% perfect; however hard we try, we may get up to 99% good but never 100%. If we humans are not 100% perfect, how could a machine be? The interesting part is that a machine can be better than a human if it is trained well, because humans cannot do the same thing repeatedly, while a machine never gets bored of doing the same thing again and again; each time it is trained it learns a new element and stores it in its memory. In general, this is what machine learning means: we train the machine with algorithms from which it learns and tries to remember the hidden patterns. But how can we say the machine is doing its job correctly? This is where error metrics are used, to find out how well the machine is doing.

Error: In general, an error is a mistake. From an ML perspective, an error is a misclassification or a wrong prediction with respect to the actual classification or value.

ML algorithms can be used for prediction (regression) or classification, and a few can be used for both. In all cases, each time the machine is trained on the training data, an error metric should be used to calculate the error, which tells us how well the machine is performing its job. For regression tasks we calculate the mean square error (MSE), the mean absolute error (MAE), and the root mean square error (RMSE); there are a few more, but these are the metrics used most often. For classification we use accuracy, precision, recall, and the f1 score. What do all these mean? How do we implement them and what can we get from them? We shall discuss.

Predictions

Consider an example of house sale prices. If we are asked to predict the upcoming year's sale prices, we first need the previous years' data. We then split the data into training and testing sets and train the model on the training data. Now it is very important to understand how well our model has performed, and to assess the model we use some error metrics.

  1. Mean Square Error (MSE)
  2. Root Mean Square Error (RMSE)
  3. Mean Absolute Error (MAE)

These metrics give us information about the performance of our model: if the error is large, the model is performing poorly and needs to be tuned; if the error is small, the model is performing well. Let's understand how these errors work.
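Before going through each metric, here is a minimal sketch of the workflow described above, assuming scikit-learn and NumPy and using made-up synthetic data in place of real house prices (all names and numbers are purely illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Toy stand-in for "previous years' house sale data": one feature (size) and a price.
rng = np.random.default_rng(0)
size = rng.uniform(50, 200, size=(200, 1))               # house size in square metres
price = 3000 * size.ravel() + rng.normal(0, 20000, 200)  # noisy linear relationship

# Split into training and testing, train on one part, predict on the unseen part.
X_train, X_test, y_train, y_test = train_test_split(
    size, price, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
predicted = model.predict(X_test)
actual = y_test  # these "actual" values are what the error metrics below compare against
```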

Mean Square Error (MSE):

It is also used as a cost function; we often come across this term when tuning parameters. Mean square error is very easy to understand and implement: it tells us how close a regression line is to a set of data points by taking the difference between each data point and the regression line, squaring it, and averaging over all points. Statistically, it is represented as below,

MSE = (1/n) * Σ(actual - predicted)**2

Actual = the true values from the testing (unseen) data.
Predicted = the values predicted by the model trained on the training data.
n = Total number of data points

Squaring the difference removes the negative signs; for instance, a difference of '-2' becomes positive once squared. But this comes at a cost: squaring also inflates large errors and puts the result in squared units of the target, which makes it harder to interpret. To overcome this there is another metric, the root mean square error.
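As a rough sketch (using NumPy, with made-up numbers), MSE could be computed like this; scikit-learn also ships a ready-made mean_squared_error in sklearn.metrics:

```python
import numpy as np

def mse(actual, predicted):
    """Mean Square Error: the average of the squared differences."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean((actual - predicted) ** 2)

# Tiny made-up example: four actual values and four predictions.
print(mse([3, 5, 2.5, 7], [2.5, 5, 4, 8]))  # 0.875
```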

Root Mean Square Error (RMSE):

Once we calculate the MSE, taking the square root of that value gives us the RMSE. The squaring in MSE eliminates the negative values, and the square root in RMSE brings the error back to the same units as the target, undoing the inflation caused by squaring.

RMSE = sqrt((1/n) * Σ(actual - predicted)**2)

Actual = the true values from the testing (unseen) data.

Predicted = the values predicted by the model trained on the training data.

n = Total number of data points
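A minimal sketch of RMSE, reusing the same made-up values as in the MSE sketch:

```python
import numpy as np

def rmse(actual, predicted):
    """Root Mean Square Error: the square root of the MSE, in the target's own units."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.sqrt(np.mean((actual - predicted) ** 2))

# Same made-up values as in the MSE sketch.
print(rmse([3, 5, 2.5, 7], [2.5, 5, 4, 8]))  # sqrt(0.875) ≈ 0.935
```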

Mean Absolute Error (MAE):

Let's first try to understand what absolute error is: it is the absolute difference between the measured value and the actual value. For instance, if you weigh something on a weighing machine and it reads 100 kg, but you know the actual weight of the item is 98 kg, the difference between them gives us the absolute error.

AE = |100 - 98|,

AE is 2 in this case.

In general AE = |X1 - X| (Note: the modulus is to avoid negative values).

The Mean Absolute Error(MAE) is the average of all absolute errors,

MAE = (1/n) * Σ|X1 - X|

X1 = the measured value,

X = the actual value (in some places X is written in lower case too)

n = Total number of data points

MAE measures the average magnitude of the error for continuous variables.
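A minimal sketch of MAE with the same made-up values as before (scikit-learn's mean_absolute_error in sklearn.metrics does the same thing):

```python
import numpy as np

def mae(actual, predicted):
    """Mean Absolute Error: the average of the absolute differences |X1 - X|."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean(np.abs(actual - predicted))

# Same made-up values as before: the absolute errors are 0.5, 0, 1.5 and 1.
print(mae([3, 5, 2.5, 7], [2.5, 5, 4, 8]))  # (0.5 + 0 + 1.5 + 1) / 4 = 0.75
```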

We might wonder which metric is best. In general, MSE is widely used because squaring the difference punishes large errors much more heavily, while MAE treats every error equally and is therefore less sensitive to outliers. The right choice depends on how costly large errors are for the problem at hand.

I hope this gives an intuition about the widely used metrics for measuring error in regression problems: MAE, MSE, and RMSE. Now let's see how to tell whether a classification algorithm is performing well or not.

Classification

In these times of the corona outbreak it is very important to test whether a person has the virus or not. For this we use different tools (machines) that are trained on a set of symptoms and data from people who have those symptoms; based on the training data, the machine classifies whether a person has the virus or not. To tell whether the model used by the tool has classified the patients correctly, we use metrics such as accuracy, precision, recall, and the f1 score.

Accuracy:

It is a measurement of how well the model performed overall, generally calculated as the sum of True Positives and True Negatives divided by the sum of True Positives, True Negatives, False Positives, and False Negatives.

Accuracy = (TP + TN) / (TP + TN + FP + FN).

This tells us how well the model is classifying the data; based on this measure we can decide whether the model needs to be improved.
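A small sketch of the accuracy formula, using hypothetical covid-test counts (the numbers are made up for illustration):

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = correctly classified samples / all samples."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical covid-test counts: 80 true positives, 90 true negatives,
# 10 false positives, 20 false negatives (200 people in total).
print(accuracy(tp=80, tn=90, fp=10, fn=20))  # 170 / 200 = 0.85
```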

True Negatives:

When our model correctly predicts the negative class. For instance, in the covid classification example, if the model infers that a person tested negative and he/she really is negative, then that is a True Negative.

True Positives:

When our model correctly predicts the positive class. For instance, in the covid classification example, if the model infers that a person tested positive and he/she really is positive, then that is a True Positive.

Whenever we work on a classification problem we plot the confusion matrix, which compares the actual classifications vs the predicted classifications in an N x N matrix (where N is the number of classes).

Confusion Matrix:

It is an N x N matrix that tells us how successfully our model is classifying the data into the different classes. For a binary problem it has four cells in total: True Positives, True Negatives, False Positives (the model predicted positive but the case is actually negative), and False Negatives (the model predicted negative but the case is actually positive). One axis holds our predicted labels and the other the actual labels.
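A minimal sketch of building a confusion matrix, assuming scikit-learn and a handful of made-up labels:

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for a binary covid classifier: 1 = positive, 0 = negative.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# scikit-learn puts actual labels on the rows and predicted labels on the columns.
cm = confusion_matrix(actual, predicted)
tn, fp, fn, tp = cm.ravel()
print(cm)                                  # [[3 1]
                                           #  [1 3]]
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")  # TN=3 FP=1 FN=1 TP=3
```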

Precision:

It tells us, out of everything the model predicted as the positive class, how much was actually positive; in other words, how many of the predicted positives are True Positives.

Precision = TruePositives / (TruePositives + FalsePositives)

Recall:

It is a measure of how well our model identifies the actual positives. Precision only looks at the true positives among all positive predictions; recall also accounts for the positives the model missed (the False Negatives).

Recall = TruePositives / (TruePositives + FalseNegatives)
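A small sketch of both formulas, reusing the hypothetical counts from the accuracy example:

```python
def precision(tp, fp):
    """Of everything predicted positive, the fraction that was actually positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Of everything actually positive, the fraction the model found."""
    return tp / (tp + fn)

# Same hypothetical counts as in the accuracy sketch: TP=80, FP=10, FN=20.
print(precision(tp=80, fp=10))  # 80 / 90  ≈ 0.889
print(recall(tp=80, fn=20))     # 80 / 100 = 0.80
```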

All the above metrics range from 0 to 1 (0% to 100%); whenever we build a classification model we use all of them together to judge its performance.

f1 Score:

It is measured as,

f1-score = 2*((precision*recall)/(precision+recall))

It is the harmonic mean of precision and recall, and it conveys the balance between the two.
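A tiny sketch tying it together, using the precision and recall values from the previous sketch:

```python
def f1(precision, recall):
    """f1 score: the harmonic mean of precision and recall."""
    return 2 * (precision * recall) / (precision + recall)

# Using the precision and recall from the previous sketch.
print(f1(0.889, 0.80))  # ≈ 0.84
```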

Conclusion:

These metrics help by letting us know how many people are truly positive or negative. For instance, if a person has some symptoms related to the virus and decides to take a test, and the model predicts that the person is negative while in reality the person is positive, imagine the consequences: how many people could be affected just by believing a false report. To avoid such outcomes, the machine used to test people for the virus should be trained very well.

In the same manner, the error metrics for regression problems help us understand, for example, what tomorrow's sales could be. But we also need to consider a few things, such as the environment, pandemics, epidemics, and so on; these are things the machine cannot understand. Its job is to predict what could happen tomorrow based on yesterday's data. For instance, if a model is trained on advertisement data to predict tomorrow's sales and a natural calamity completely disrupts sales tomorrow, that is not the fault of the machine; it cannot handle that kind of situation. So, depending too much on machines and believing them blindly can also lead to problems.
