Various Types of Regularization


Machine Learning and Deep Learning models use regularization techniques to prevent overfitting. Before delving into the regularization methods, it’s crucial to understand the concept of overfitting.

To illustrate, imagine you are preparing for a final exam and your professor has provided you with 100 sample questions. If your preparation is limited to memorizing these 100 questions and you are unable to answer questions that differ slightly from these 100 sample questions, your understanding of the material would be considered overfitted. In the world of algorithms, overfitting occurs when an algorithm is only able to accurately predict the data it has learned from the training set but is unable to accurately predict and classify new data that deviates from the training set. This can be visualized by a graph line that tries to fit the data as accurately as possible but is not suitable for real-world scenarios.

To avoid overfitting, the model is constrained through regularization techniques such as L1, L2, and Dropout, which either penalize large weights or inject noise during training. These methods are commonly used in Deep Learning models, such as Artificial Neural Networks (ANNs).

Figure 1. Overview of underfitting, overfitting, and ideal balance.

L1 and L2 Regularization

Although L1 and L2 differ mathematically, both serve to mitigate the overfitting problem.

L1 regularization adds an L1 penalty equal to the absolute value of the magnitude of the coefficients, which restricts their size. Lasso regression is one algorithm that implements this method. Because the penalty can drive weights exactly to zero, L1 regularization tends to produce sparse models. In L1 regularization, the regression coefficients are determined by minimizing the L1 loss function:

Loss = Σᵢ (yᵢ − ŷᵢ)² + λ Σⱼ |wⱼ|

Here λ ≥ 0 controls the strength of the penalty: larger values shrink the weights wⱼ more aggressively.
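As a rough sketch of the loss above in plain Python (the function names and the penalty-strength parameter `lam` are illustrative, not from the article):

```python
def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)

def l1_loss(y_true, y_pred, weights, lam):
    """Sum of squared errors plus the L1 penalty."""
    sse = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    return sse + l1_penalty(weights, lam)
```

For example, with a perfect fit the loss reduces to the penalty alone: `l1_loss([1.0, 2.0], [1.0, 2.0], [0.5, -0.5], 0.1)` returns 0.1, i.e. 0.1 × (|0.5| + |−0.5|).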
L2 regularization, on the other hand, adds an L2 penalty proportional to the square of the magnitude of the coefficients. This method is implemented in algorithms such as Ridge regression and Support Vector Machines (SVMs). Unlike L1 regularization, L2 regularization shrinks the weights toward zero without setting them exactly to zero. L2 regularization consists of minimizing the L2 loss function:

Loss = Σᵢ (yᵢ − ŷᵢ)² + λ Σⱼ wⱼ²
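The L2 loss can be sketched the same way; only the penalty term changes from an absolute value to a square (again, the names are illustrative):

```python
def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weight values."""
    return lam * sum(w ** 2 for w in weights)

def l2_loss(y_true, y_pred, weights, lam):
    """Sum of squared errors plus the L2 penalty."""
    sse = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    return sse + l2_penalty(weights, lam)
```

Because the squared penalty grows smoothly near zero, its gradient vanishes as a weight approaches zero, which is why L2 shrinks weights without zeroing them out, while L1's constant-magnitude gradient can push them all the way to zero.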

The Dropout regularization method works by randomly setting a fraction of a layer's units to zero at each training step. By temporarily removing nodes from each layer, Dropout prevents units from co-adapting and reduces overfitting, as illustrated in the figure below.
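A minimal sketch of "inverted" dropout in plain Python, assuming a drop probability `p` and survivor scaling by 1/(1 − p) so the expected activation is unchanged (these names and the scaling convention are assumptions, not from the article):

```python
import random

def dropout(activations, p, training=True, rng=random):
    """Inverted dropout: zero each unit with probability p during training,
    and scale the surviving units by 1/(1-p). At inference (training=False)
    the activations pass through unchanged."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]
```

With `p = 0.5`, each surviving activation is doubled and the rest are zeroed, so on average the layer's output magnitude matches the no-dropout case.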

Figure 2. Dropout layers.

