Aiex.ai

What is a Generative Adversarial Network (GAN)?

In 2014, Ian J. Goodfellow and his co-authors introduced the Generative Adversarial Network (GAN). GANs are used for unsupervised learning tasks in machine learning. Two models, known as the generator and the discriminator, are trained together to discover and learn the patterns in the input data.

Figure1. Overview of GAN papers, published in 2014. Source

Main components of a GAN

There are two essential components in a GAN that work in tandem to improve the network:

Generator: The Generator typically starts from random noise, usually sampled from a Gaussian distribution, as its input. Its objective is to turn that noise into images that look as close as possible to real ones.

Discriminator: Tasked with distinguishing fake images from real ones, the Discriminator examines the images produced by the Generator and judges whether they look real enough. It learns to do this by comparing images from the dataset with images produced by the Generator.

In short, a GAN strives to create images that look as natural as possible. These images can often deceive humans, and even the Discriminator that was trained alongside the Generator, into thinking they are real.

Figure2. Main components of GAN.
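
To make the two roles concrete, here is a minimal Python sketch, not a real image model. It assumes one-dimensional numbers in place of images, and the `generator` and `discriminator` functions, together with their fixed weights, are hypothetical names and values chosen only for illustration.

```python
import math
import random


def generator(z, weight=2.0, bias=1.0):
    """Map a noise value z to a fake sample (illustrative linear model)."""
    return weight * z + bias


def discriminator(x, weight=1.5, bias=-1.0):
    """Return the estimated probability that sample x is real (logistic model)."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


random.seed(0)
z = random.gauss(0.0, 1.0)     # Gaussian noise fed to the Generator
fake = generator(z)            # the Generator's output
score = discriminator(fake)    # the Discriminator's belief that it is real, in (0, 1)
print(f"noise={z:.3f} fake={fake:.3f} score={score:.3f}")
```

In a real GAN both functions are neural networks and their weights are learned, but the interface is the same: noise goes in, a sample comes out, and the Discriminator returns a probability that the sample is real.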

 

The figure below shows an example of a GAN being trained on handwritten digits (MNIST). As training progresses, the generated images start as noise and gradually change until they resemble handwritten digits.

Figure3. Training a GAN on the MNIST dataset.

The story of the police and the counterfeiter

The story of the police officer and the counterfeiter is commonly used to explain GANs. The main characters of this story are a police officer (the Discriminator), who searches for counterfeit money, and a counterfeiter (the Generator), who makes fake money.

At first, the counterfeiter knows nothing about how money is supposed to look, so he creates terrible forgeries. The officer is equally bad at recognizing fake money.

Next, the officer is told that a specific dollar bill is fake. He is then shown a real dollar bill and asked how it differs from the fake one. To tell the two apart, the officer looks for new details. He might notice, for example, that authentic money has a picture of a person while the fake money does not. As a result, he becomes better at distinguishing real money from fake money.

In the next step, the counterfeiter is told that his bills are being rejected as fake and that they need to be improved.

This cycle repeats thousands of times until both networks become experts at this back-and-forth game. Over time, the Generator produces near-perfect counterfeits, and the Discriminator becomes a master detective that spots the slightest anomalies.
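
The back-and-forth in this story can be sketched as an alternating training loop. The example below is a toy under stated assumptions: the real data are numbers drawn from a Gaussian with mean 4, the Generator is the linear map a*z + b, the Discriminator is a logistic model, and the Generator uses the common non-saturating loss. All function names, parameters, and hyperparameters are illustrative and not taken from any library.

```python
import math
import random


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def d_loss(d, real, fake):
    """Discriminator loss: -mean log D(real) - mean log(1 - D(fake))."""
    w, c = d
    on_real = -sum(math.log(sigmoid(w * x + c) + 1e-12) for x in real) / len(real)
    on_fake = -sum(math.log(1.0 - sigmoid(w * x + c) + 1e-12) for x in fake) / len(fake)
    return on_real + on_fake


def g_loss(g, d, noise):
    """Generator loss (non-saturating form): -mean log D(G(z))."""
    (a, b), (w, c) = g, d
    return -sum(math.log(sigmoid(w * (a * z + b) + c) + 1e-12) for z in noise) / len(noise)


def d_step(d, real, fake, lr):
    """One gradient step for the officer: score real bills high, fakes low."""
    w, c = d
    grad_w = grad_c = 0.0
    for x in real:                       # push D(x) toward 1 on real samples
        p = sigmoid(w * x + c)
        grad_w += -(1.0 - p) * x / len(real)
        grad_c += -(1.0 - p) / len(real)
    for x in fake:                       # push D(x) toward 0 on fake samples
        p = sigmoid(w * x + c)
        grad_w += p * x / len(fake)
        grad_c += p / len(fake)
    return (w - lr * grad_w, c - lr * grad_c)


def g_step(g, d, noise, lr):
    """One gradient step for the counterfeiter: make D(G(z)) larger."""
    (a, b), (w, c) = g, d
    grad_a = grad_b = 0.0
    for z in noise:
        p = sigmoid(w * (a * z + b) + c)
        grad_a += -(1.0 - p) * w * z / len(noise)
        grad_b += -(1.0 - p) * w / len(noise)
    return (a - lr * grad_a, b - lr * grad_b)


def train(steps=2000, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    g = (1.0, 0.0)                       # Generator parameters (a, b)
    d = (0.0, 0.0)                       # Discriminator parameters (w, c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]   # assumed real data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [g[0] * z + g[1] for z in noise]
        d = d_step(d, real, fake, lr)    # the officer learns first
        g = g_step(g, d, noise, lr)      # then the counterfeiter improves
    return g, d


if __name__ == "__main__":
    g, d = train()
    print("generator (a, b):", g)
    print("discriminator (w, c):", d)
```

Each pass through the loop is one round of the story: the officer updates first, then the counterfeiter responds. A model this small is not guaranteed to settle down, and its parameters may keep oscillating, so treat the loop as an illustration of the update order rather than a recipe for good samples.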

Types of GANs

1- Vanilla Generative Adversarial Networks

The original network introduced by Ian Goodfellow and his co-authors, often called the vanilla GAN.

2- Deep Convolutional Generative Adversarial Networks

Convolutional neural networks are used for both the generator and the discriminator. For example, the network Nvidia developed generates face images that are hard to distinguish from real ones.

3- Conditional Generative Adversarial Networks

This type of network makes it possible to generate a specific type of data. Consider the dataset of handwritten digits 0 to 9 (MNIST): an ordinary GAN trained on it generates images of random digits. In a conditional GAN, we can define a condition by feeding an additional input C (for example, the digit label), so that the network produces only the result we are looking for.

Figure4. Difference between GAN and Conditional GAN.
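
One common way to feed the condition C, and the one assumed in this sketch, is to encode the label as a one-hot vector and append it to the input of both networks. The helper names and the 4-dimensional noise vector below are hypothetical and serve only to show the idea.

```python
import random

NUM_CLASSES = 10  # digits 0 to 9, as in MNIST


def one_hot(label, num_classes=NUM_CLASSES):
    """Encode an integer class label as a one-hot list."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(noise, label):
    """Build the conditional Generator's input: noise followed by condition C."""
    return list(noise) + one_hot(label)


def discriminator_input(sample, label):
    """The Discriminator receives the same condition alongside the sample."""
    return list(sample) + one_hot(label)


rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(4)]   # illustrative 4-dimensional noise
x_in = generator_input(z, label=7)            # ask the Generator for a 7
print(len(x_in), x_in[4:])
```

Because the label is part of the input, changing `label` while keeping the same noise asks the Generator for a different digit, which is the difference from an ordinary GAN that the figure illustrates.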

 

Discussion

One of the pioneers of artificial intelligence, Yann LeCun, called the GAN the most interesting machine learning idea of the last ten years. Researchers have achieved great success with GANs since their introduction in 2014. The following are some high-quality fake images of faces generated using a GAN.

Figure5. Fake images of faces generated using GAN. Source

In addition to generating images of faces, the GAN has countless other applications. Various GAN-generated images can be found online. Among them are animal images, paintings in the style of famous painters, caricatures, and so on. The following are some of the applications of the GAN.

  • High-resolution image synthesis
  • Text-to-image conversion
  • Video generation
  • Face synthesis
  • Image inpainting
  • Image-to-image translation (pix2pix)
  • Domain Transfer Network (DTN)
  • Texture synthesis
  • Face aging
  • Image blending
  • Generating 3D objects
  • Creating anime characters
  • Pose-guided person image generation
  • PixelDTGAN
  • Super-resolution

Figure6. Some applications of GAN: a) face aging, b) texture synthesis, c) Domain Transfer Network (DTN), d) CycleGAN.

References

1. Goodfellow, Ian, et al. “Generative adversarial networks.” Communications of the ACM 63.11 (2020): 139-144.

2. Mirza, Mehdi, and Simon Osindero. “Conditional generative adversarial nets.” arXiv preprint arXiv:1411.1784 (2014).

3. Creswell, Antonia, et al. “Generative adversarial networks: An overview.” IEEE signal processing magazine 35.1 (2018): 53-65.

4. Gui, Jie, et al. “A review on generative adversarial networks: Algorithms, theory, and applications.” IEEE Transactions on Knowledge and Data Engineering (2021).

5. Mao, Xudong, and Qing Li. Generative adversarial networks for image generation. Springer Singapore, 2021.

6. Razavi-Far, Roozbeh, Ariel Ruiz-Garcia, and Vasile Palade. “An Introduction to Generative Adversarial Learning: Architectures and Applications.” Generative Adversarial Learning: Architectures and Applications. Springer, Cham, 2022. 1-6.
