
What Is a Loss Function in Machine Learning?



Most machine learning algorithms work by minimizing or maximizing an “objective function”. Loss functions are the objective functions that are meant to be minimized. In the artificial intelligence literature they are also often called “cost functions”. A loss function measures how far a model's predictions are from the actual values, so it lets us evaluate how well the model predicts.

Loss Functions in Machine Learning

There are two main types of loss functions in machine learning:

  • Classification loss functions
  • Regression loss functions
Figure 1. Overview of different loss functions in machine learning

Loss Functions in Regression Models

In this section, we discuss the loss functions used in regression models.

Mean Squared Error (MSE)

MSE is one of the best-known and most commonly used loss functions in regression analysis. It is the mean of the squared differences between the predicted and actual values.

Figure 2. Overview of the MSE loss function

Because MSE squares the errors, the loss is a parabola when plotted against the prediction error. Its advantages are that it is easy to understand and that, as a function of the error, it has a single minimum. Its main disadvantage is that it is not robust to outliers, because squaring amplifies large errors.
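As a minimal sketch of the definition above, MSE = (1/n) Σ (y_i − ŷ_i)², the following plain-Python function computes the loss for two equal-length lists. The function name and the sample numbers are illustrative assumptions, not part of the original article; the second call shows how a single outlier dominates the result.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative data (assumed for this example).
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 7.0]            # small errors only
y_pred_outlier = [2.0, 5.0, 4.0, 17.0]   # last prediction is off by 10

print(mean_squared_error(y_true, y_pred))          # 1.25
print(mean_squared_error(y_true, y_pred_outlier))  # 26.25
```

One error of 10 raises the loss from 1.25 to 26.25, which is the outlier sensitivity described above.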

Mean Absolute Error (MAE)

Mean Absolute Error, or MAE, is another widely used loss function. Like MSE, it is based on the difference between the predicted and actual values, but it ignores the direction (sign) of that difference: MAE is the mean of the absolute differences between the predicted and actual values. Its advantages include intuitiveness, ease of use, and robustness to outliers. Its main disadvantage is that it is not differentiable where the error is zero, so gradient-based optimization has to rely on subgradients at that point.

Figure 3. Overview of the MAE loss function
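The sketch below computes MAE = (1/n) Σ |y_i − ŷ_i| and one valid subgradient with respect to each prediction. The function names and data are assumptions made for illustration. At an error of exactly zero any value in [−1/n, 1/n] is a valid subgradient; this sketch picks 0.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae_subgradient(y_true, y_pred):
    """One subgradient of MAE with respect to each prediction.

    Uses sign(prediction - target) / n, choosing 0 where the error is zero.
    """
    n = len(y_true)

    def sign(x):
        return (x > 0) - (x < 0)

    return [sign(p - t) / n for t, p in zip(y_true, y_pred)]


# Illustrative data (assumed): the last prediction is an outlier.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 17.0]

print(mean_absolute_error(y_true, y_pred))  # 3.25
print(mae_subgradient(y_true, y_pred))      # [-0.25, 0.0, 0.25, 0.25]
```

The outlier contributes a subgradient of the same magnitude as any other non-zero error, which is why MAE is less sensitive to outliers than MSE.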

Huber Loss

The Huber loss is quadratic for errors smaller than a threshold δ and linear for larger errors. It is therefore less affected by outliers than MSE and, unlike MAE, it is differentiable everywhere, which makes minimization straightforward. Because δ changes the shape of the loss function, choosing the right value is a sensitive and difficult task. If δ is very large, every error falls in the quadratic region and the Huber loss behaves like MSE (up to a constant factor of 1/2 in the common definition). As δ approaches zero, it behaves like MAE scaled by δ.

Figure 4. Overview of the Huber loss function
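A minimal sketch of the Huber loss, assuming the common definition: 0.5·e² when |e| ≤ δ and δ·(|e| − 0.5·δ) otherwise, averaged over the samples. The function name, the default δ, and the data are illustrative assumptions.

```python
def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2              # quadratic region
        else:
            total += delta * (err - 0.5 * delta)  # linear region
    return total / len(y_true)


# Illustrative data (assumed): the last prediction is an outlier.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 17.0]

print(huber_loss(y_true, y_pred, delta=1.0))    # 2.875
print(huber_loss(y_true, y_pred, delta=100.0))  # 13.125
```

With δ = 100 every error is in the quadratic region, so the result is half the MSE of the same data (26.25 / 2). With δ = 1 the outlier is penalized only linearly.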

Loss Functions in Classification Models

Binary Cross Entropy

In binary classification problems there are two classes. Binary cross-entropy compares each predicted probability with the actual class label, which is either 0 or 1. Each prediction is penalized according to how far its probability is from the actual label, so confident wrong predictions are penalized heavily, and the resulting score shows how close the predictions are to the actual values.

Figure 5. Overview of the binary cross-entropy loss function
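The following sketch assumes the standard form of binary cross-entropy, BCE = −(1/n) Σ [y_i·log(p_i) + (1 − y_i)·log(1 − p_i)], where p_i is the predicted probability of class 1. The function name, the clipping constant `eps` (used only to avoid log(0)), and the sample values are assumptions for illustration.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy for labels in {0, 1} and probabilities in [0, 1]."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Illustrative labels and predicted probabilities (assumed).
labels = [1, 0, 1, 0]
good = [0.9, 0.1, 0.8, 0.2]   # mostly confident and correct
bad = [0.1, 0.9, 0.2, 0.8]    # confident and wrong

print(binary_cross_entropy(labels, good))
print(binary_cross_entropy(labels, bad))
```

The second call returns a much larger loss than the first, reflecting the heavy penalty on confident wrong predictions.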

Categorical Cross Entropy

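Categorical cross-entropy is used in multi-class classification problems, including softmax regression. For each sample it is the negative logarithm of the probability that the model assigns to the true class, and the loss is averaged over the samples.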

Figure 6. Overview of the categorical cross-entropy loss function
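As a sketch under the usual definition, CCE = −Σ_c y_c·log(p_c) per sample with one-hot targets y and class probabilities p, the code below turns raw scores into probabilities with softmax and averages the loss over a small batch. All names, the `eps` clipping constant, and the sample scores are assumptions made for this example.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_targets, probabilities, eps=1e-12):
    """Mean categorical cross-entropy over a batch of one-hot targets."""
    losses = []
    for target, probs in zip(one_hot_targets, probabilities):
        losses.append(-sum(t * math.log(max(p, eps)) for t, p in zip(target, probs)))
    return sum(losses) / len(losses)


# Illustrative batch of two samples with three classes (assumed).
logits = [[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]
targets = [[1, 0, 0], [0, 0, 1]]

probs = [softmax(row) for row in logits]
print(categorical_cross_entropy(targets, probs))
```

For the second sample the scores are equal, so each class gets probability 1/3 and that sample contributes log(3) to the average.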


