The History of AI (part 2)


CNNs, transfer learning, and neural networks are common terms in the field of artificial intelligence. In this article, we will explore the history of these concepts and provide a chronological list of when they were first introduced to the world of AI.

  • Back-propagation (1986). Source
    Neural networks are trained using a technique called backpropagation, which propagates the error (or loss) backward through the network to compute how much each weight contributed to it; the weights are then adjusted in the direction that reduces the loss at each iteration or epoch. Through this repeated fine-tuning of the weights, the neural network becomes more accurate and reliable in its predictions.
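    A minimal NumPy sketch of a single backpropagation step for a tiny two-layer network (the layer sizes, toy data, and learning rate are arbitrary illustrations, not taken from the original paper):

    import numpy as np

    # Toy data: 8 samples with 3 features, regression targets.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 3))
    y = rng.normal(size=(8, 1))

    W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
    lr = 0.1

    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    y_hat = h @ W2 + b2
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass: apply the chain rule from the loss back to every weight.
    d_y_hat = 2 * (y_hat - y) / len(X)
    dW2 = h.T @ d_y_hat
    db2 = d_y_hat.sum(axis=0)
    d_h = d_y_hat @ W2.T * (1 - h ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0)

    # Gradient-descent update: nudge each weight to reduce the loss.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2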

 

  • Convolution layer (1989). Source
    The convolution layer is a fundamental component of convolutional neural networks (CNNs). First introduced in 1989, it allows CNNs to automatically learn a large number of filters in parallel, which are specific to a given training dataset. This capability is one of the most innovative aspects of CNNs and is key to their success in image and video recognition tasks.
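    A toy NumPy sketch of the convolution operation itself, a "valid" cross-correlation of a single-channel image with one filter; in a real CNN the filter values are learned from the training data rather than fixed by hand:

    import numpy as np

    def conv2d(image, kernel):
        # Slide the kernel over the image and take a dot product at each position.
        kh, kw = kernel.shape
        oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        for i in range(oh):
            for j in range(ow):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    image = np.random.rand(28, 28)               # toy single-channel input
    edge_filter = np.array([[1.0, 0.0, -1.0],
                            [1.0, 0.0, -1.0],
                            [1.0, 0.0, -1.0]])   # hand-written here; learned in a CNN
    feature_map = conv2d(image, edge_filter)     # shape (26, 26)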

 

  • Recurrent Neural Networks (1994, 1997). Source, Source
    Recurrent Neural Networks (RNNs) are a type of artificial neural network in which connections between nodes form cycles. These cycles let information from earlier time steps influence how later inputs are processed, giving the network a form of memory and making RNNs one of the first neural network architectures suited to sequential data, such as natural language and speech. The 1994 and 1997 works cited above were key milestones in their development.
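    A minimal NumPy sketch of a vanilla RNN unrolled over a short sequence (all sizes and inputs are arbitrary illustrations); the hidden state carried from one step to the next is the "cycle" that gives the network memory:

    import numpy as np

    rng = np.random.default_rng(0)
    seq = rng.normal(size=(10, 3))        # 10 time steps, 3 input features
    W_x = rng.normal(size=(3, 5)) * 0.1   # input-to-hidden weights
    W_h = rng.normal(size=(5, 5)) * 0.1   # hidden-to-hidden weights (the recurrent cycle)
    b = np.zeros(5)

    h = np.zeros(5)                       # hidden state acts as memory
    for x_t in seq:
        h = np.tanh(x_t @ W_x + h @ W_h + b)   # current input combined with previous state
    print(h)                              # final state summarizes the whole sequence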

 

  • Long short-term memory (LSTM) (1997). Source
    Long short-term memory (LSTM) is a type of recurrent neural network (RNN) introduced in 1997. LSTMs have feedback connections and a gated cell state, which allow them to process entire sequences of data rather than single data points, making them well suited to tasks such as speech recognition and machine translation. They perform exceptionally well on a wide range of sequence problems and remain widely used in natural language processing and speech recognition.
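    A sketch of one step of a standard LSTM cell in NumPy (sizes and inputs are arbitrary illustrations); the gates control what the cell forgets, stores, and exposes:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    n_in, n_hid = 3, 4
    x_t = rng.normal(size=n_in)                        # input at time t
    h_prev, c_prev = np.zeros(n_hid), np.zeros(n_hid)  # previous hidden and cell state

    # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
    W = {k: rng.normal(size=(n_in + n_hid, n_hid)) * 0.1 for k in "ifog"}
    b = {k: np.zeros(n_hid) for k in "ifog"}

    z = np.concatenate([x_t, h_prev])
    i = sigmoid(z @ W["i"] + b["i"])   # how much new information to let in
    f = sigmoid(z @ W["f"] + b["f"])   # how much of the old cell state to keep
    o = sigmoid(z @ W["o"] + b["o"])   # how much of the cell state to expose
    g = np.tanh(z @ W["g"] + b["g"])   # candidate values

    c_t = f * c_prev + i * g           # updated cell state (long-term memory)
    h_t = o * np.tanh(c_t)             # new hidden state (the cell's output)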

 

  • ReLU activation function (2011). Source
    The ReLU (Rectified Linear Unit) activation function is a widely used activation function in neural networks that rose to prominence in deep learning around 2011. It is a piecewise linear function that outputs 0 for negative input values and returns positive input values unchanged. It has become the default activation function in many neural networks because it is computationally efficient and often leads to faster training and better performance than other activation functions.
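    The whole function fits in one line of NumPy:

    import numpy as np

    def relu(x):
        # 0 for negative inputs, the input itself for positive inputs.
        return np.maximum(0, x)

    print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))   # -> [0.  0.  0.  1.5 3. ]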

 

  • Feature dropout (2012). Source
    Feature dropout is a regularization technique introduced in 2012 that helps to reduce overfitting in neural networks. It works by randomly omitting, or “dropping out”, half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only useful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful given the combinatorially large variety of internal contexts in which it must operate. Feature dropout gave significant improvements on many benchmark tasks and set new records at the time for speech and object recognition.
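    A NumPy sketch of the 2012 formulation, dropping each hidden feature detector with probability 0.5 on a training case (the activations here are a made-up stand-in for a hidden layer):

    import numpy as np

    rng = np.random.default_rng(0)
    activations = rng.normal(size=(1, 16))         # hidden activations for one training case
    mask = rng.random(activations.shape) < 0.5     # keep each feature detector with p = 0.5
    dropped = activations * mask                   # omitted detectors contribute nothing
    # At test time all units are kept; the 2012 paper halves the outgoing weights instead.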

 

  • Transfer learning (2012). Source
    Transfer learning is a machine learning technique that allows knowledge acquired from solving one problem to be applied to a related but different problem. Although the idea is older, it gained major traction around 2012 and has since become a popular and important topic in machine learning research. The main idea is to take a model pre-trained on a related task and use it as the starting point for training on a new task; this way, far less data and computation are needed than when training a model from scratch.
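    A sketch with PyTorch/torchvision, assuming a recent torchvision release (the 10-class head and the hyperparameters are arbitrary examples): a backbone pre-trained on ImageNet is frozen and only a new classification head is trained on the new task:

    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(weights="IMAGENET1K_V1")   # knowledge gained on ImageNet

    for param in model.parameters():                   # freeze the pre-trained weights
        param.requires_grad = False

    model.fc = nn.Linear(model.fc.in_features, 10)     # new head for the new 10-class task

    # Only the new head is trained on the target dataset.
    optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)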

 

  • R-CNN (2013). Source
    R-CNN (Region-based Convolutional Neural Network) is a machine learning model that was specifically designed for object detection in computer vision. It uses a combination of a convolutional neural network (CNN) and a region proposal algorithm to detect objects in an image.
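    A conceptual Python sketch of the pipeline only; the helpers below are simplified stand-ins (the real R-CNN generates roughly 2,000 selective-search proposals, warps each region to a fixed size, and classifies it with a trained CNN):

    import numpy as np

    def propose_regions(image, n=5):
        # Stand-in for selective search: random candidate boxes (x, y, w, h).
        rng = np.random.default_rng(0)
        h, w = image.shape[:2]
        return [(int(rng.integers(0, w - 32)), int(rng.integers(0, h - 32)), 32, 32)
                for _ in range(n)]

    def cnn_classify(patch):
        # Stand-in for a CNN classifier applied to a warped region.
        return "object", float(patch.mean())

    def rcnn_detect(image, threshold=0.5):
        detections = []
        for (x, y, w, h) in propose_regions(image):
            patch = image[y:y + h, x:x + w]        # crop the proposed region
            label, score = cnn_classify(patch)     # classify each region independently
            if score > threshold:
                detections.append(((x, y, w, h), label, score))
        return detections

    print(rcnn_detect(np.random.rand(128, 128)))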

 

  • Adam Optimizer (2014). Source
    The Adam optimizer is a deep learning optimization algorithm first introduced in 2014. It combines momentum with per-parameter adaptive learning rates, and it has been widely adopted in deep learning applications such as natural language processing and computer vision.
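    A NumPy sketch of the standard Adam update rule applied to a toy quadratic loss (the learning rate and iteration count are arbitrary):

    import numpy as np

    def adam_step(theta, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
        m = b1 * m + (1 - b1) * grad          # first moment: momentum-like running average
        v = b2 * v + (1 - b2) * grad ** 2     # second moment: per-parameter scaling
        m_hat = m / (1 - b1 ** t)             # bias correction for the first steps
        v_hat = v / (1 - b2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        return theta, m, v

    theta = np.array([1.0, -2.0])             # parameters of the toy loss f(theta) = ||theta||^2
    m, v = np.zeros_like(theta), np.zeros_like(theta)
    for t in range(1, 201):
        grad = 2 * theta                      # gradient of the toy loss
        theta, m, v = adam_step(theta, grad, m, v, t)
    print(theta)                              # moves toward the minimum at [0, 0]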

 

  • Dropout (2014). Source
    Dropout is a regularization technique for reducing overfitting in neural networks, described in detail in a 2014 paper. It works by randomly setting input units to zero during each training step at a specified rate (the “dropout rate”). This reduces the network’s reliance on any single feature and forces it to learn redundant representations, which in turn improves generalization. Dropout is widely used in deep learning and has proven effective at preventing overfitting and improving model performance.
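    A sketch of the “inverted” dropout variant used by most modern frameworks (the 0.5 rate is just an example); surviving activations are scaled up during training so that no rescaling is needed at test time:

    import numpy as np

    def dropout(x, rate=0.5, training=True, rng=np.random.default_rng(0)):
        if not training or rate == 0.0:
            return x                            # at test time, every unit is used
        mask = rng.random(x.shape) >= rate      # keep each unit with probability 1 - rate
        return x * mask / (1.0 - rate)          # scale survivors to preserve the expected value

    activations = np.ones((2, 8))
    print(dropout(activations, rate=0.5))       # roughly half the entries are 0, the rest are 2.0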

 

  • YOLO (2015). Source
    You Only Look Once (YOLO) is a state-of-the-art object detection system, introduced in 2015. It uses a single convolutional neural network to directly predict bounding boxes and class probabilities in a single pass, making it significantly faster than traditional object detection systems. YOLO has been widely adopted in various applications such as self-driving cars, video surveillance, and video analysis.
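    A NumPy sketch of the core idea: decoding a YOLO-style output tensor, an S x S grid in which each cell predicts B boxes (x, y, w, h, confidence) plus C class probabilities, all produced in one forward pass (the tensor here is random noise standing in for a real network's output):

    import numpy as np

    S, B, C = 7, 2, 20
    output = np.random.rand(S, S, B * 5 + C)    # stand-in for the network's single-pass output

    detections = []
    for row in range(S):
        for col in range(S):
            cell = output[row, col]
            class_probs = cell[B * 5:]
            for b in range(B):
                x, y, w, h, conf = cell[b * 5:(b + 1) * 5]
                score = conf * class_probs.max()           # class-specific confidence
                if score > 0.6:
                    detections.append((row, col, (x, y, w, h), int(class_probs.argmax()), score))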

 

  • Batch normalization (2015). Source
    Batch normalization is a technique for improving the stability and efficiency of training deep neural networks. It normalizes a layer’s inputs using the mean and variance of the current mini-batch, then scales and shifts the result with learned parameters. This reduces internal covariate shift, the change in the distribution of a layer’s inputs during training, and leads to faster convergence and improved model performance.
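    A NumPy sketch of batch normalization over a mini-batch of activations (gamma and beta are the learned scale and shift parameters; the input batch is a made-up example):

    import numpy as np

    def batch_norm(x, gamma, beta, eps=1e-5):
        mean = x.mean(axis=0)                  # per-feature mean over the batch
        var = x.var(axis=0)                    # per-feature variance over the batch
        x_hat = (x - mean) / np.sqrt(var + eps)
        return gamma * x_hat + beta            # learned scale and shift

    batch = np.random.randn(32, 10) * 5 + 3    # activations with a shifted, spread-out distribution
    normalized = batch_norm(batch, gamma=np.ones(10), beta=np.zeros(10))
    print(normalized.mean(axis=0).round(3), normalized.std(axis=0).round(3))   # ~0 and ~1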

 

  • Attention (2017). Source
    The attention mechanism was originally proposed to aid in memorizing long source sentences in neural machine translation. Since then it has become one of the most influential ideas in the field of Deep Learning and is now applied to a wide range of problems.
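    A NumPy sketch of scaled dot-product attention, the core computation behind modern attention layers (the query, key, and value matrices are arbitrary illustrations):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def attention(Q, K, V):
        d_k = K.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)        # how relevant each key is to each query
        weights = softmax(scores)              # attention weights sum to 1 for each query
        return weights @ V                     # weighted mixture of the values

    rng = np.random.default_rng(0)
    Q = rng.normal(size=(4, 8))                # 4 queries (e.g. target tokens)
    K = rng.normal(size=(6, 8))                # 6 keys    (e.g. source tokens)
    V = rng.normal(size=(6, 8))                # 6 values
    print(attention(Q, K, V).shape)            # (4, 8)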