
Class: - B.E.D.S.

Semester: - VIII
Subject: Advanced Artificial Intelligence A.Y: - 2024-25
Experiment - 3

Deep Convolutional GAN (DCGAN)

Aim: Implement a Deep Convolutional Generative Adversarial Network (DCGAN) model for an image-based dataset.

Theory:

What are GANs?


Generative Adversarial Networks (GANs) are one of the most interesting ideas in computer science
today. Two models are trained simultaneously by an adversarial process. A generator ("the artist") learns
to create images that look real, while a discriminator ("the art critic") learns to tell real images apart from
fakes.

During training, the generator progressively becomes better at creating images that look real, while the
discriminator becomes better at telling them apart. The process reaches equilibrium when the
discriminator can no longer distinguish real images from fakes.
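As an illustration, a minimal sketch of one adversarial training step is given below, assuming a TensorFlow/Keras implementation. The generator, discriminator, and the two optimizers are placeholders assumed to be defined elsewhere, and the batch size and noise dimension are illustrative constants.

# A sketch of one adversarial training step (TensorFlow/Keras assumed).
# `generator`, `discriminator`, `generator_optimizer` and
# `discriminator_optimizer` are assumed to be defined elsewhere;
# BATCH_SIZE and NOISE_DIM are illustrative constants.
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
BATCH_SIZE, NOISE_DIM = 256, 100

@tf.function
def train_step(real_images):
    noise = tf.random.normal([BATCH_SIZE, NOISE_DIM])
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)
        real_output = discriminator(real_images, training=True)
        fake_output = discriminator(fake_images, training=True)

        # The generator wants its fakes classified as real (label 1).
        gen_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
        # The discriminator wants reals labelled 1 and fakes labelled 0.
        disc_loss = (cross_entropy(tf.ones_like(real_output), real_output)
                     + cross_entropy(tf.zeros_like(fake_output), fake_output))

    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))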

Here we demonstrate this process on the MNIST dataset. A series of images produced by the generator as it is trained for 50 epochs shows the progression: the images begin as random noise and increasingly resemble handwritten digits over time.
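A minimal data-preparation sketch for this demonstration (TensorFlow/Keras assumed) is given below; pixel values are scaled to [-1, 1] so that real images match the tanh output range of the generator.

# Loading and preprocessing MNIST for GAN training (TensorFlow/Keras assumed).
import tensorflow as tf

(train_images, _), (_, _) = tf.keras.datasets.mnist.load_data()
train_images = train_images.reshape(-1, 28, 28, 1).astype("float32")
train_images = (train_images - 127.5) / 127.5   # scale pixels to [-1, 1] for a tanh generator

BUFFER_SIZE, BATCH_SIZE = 60000, 256
train_dataset = (tf.data.Dataset.from_tensor_slices(train_images)
                 .shuffle(BUFFER_SIZE)
                 .batch(BATCH_SIZE))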

Deep Convolutional GAN (DCGAN) was proposed by Radford, Metz, and Chintala, researchers at indico Research and Facebook AI Research. It is widely used in many convolution-based generative techniques. It focuses on making the training of GANs stable; hence, DCGAN proposes several architectural changes for computer vision problems. In this experiment, we use a DCGAN on the MNIST dataset to generate images of handwritten digits.

Need for DCGANs:

DCGANs were introduced to reduce the problem of mode collapse. Mode collapse occurs when the generator becomes biased towards a few outputs and cannot produce outputs covering every variation in the dataset. For example, in the case of the MNIST digits dataset (digits from 0 to 9), we want the generator to generate all types of digits, but sometimes the generator becomes biased towards two or three digits and produces only those. Because of that, the discriminator also becomes optimized towards those particular digits only; this state is known as mode collapse. This problem can be reduced by using DCGANs.

Architecture:

DCGAN, or Deep Convolutional GAN, is a generative adversarial network architecture. It follows a set of guidelines, in particular:
• Replacing any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
• Using batchnorm in both the generator and the discriminator.
• Removing fully connected hidden layers for deeper architectures.
• Using ReLU activation in the generator for all layers except for the output, which uses tanh.
• Using LeakyReLU activation in the discriminator for all layers.

The generator of the DCGAN architecture takes a 100-dimensional noise vector, sampled from a uniform or normal distribution, as input. First, it projects and reshapes this vector to 4x4x1024 and then applies a fractionally strided convolution four times with a stride of 1/2 (each application doubles the spatial dimensions while reducing the number of output channels). The generated output has dimensions of (64, 64, 3). Some architectural changes are proposed in the generator, such as the removal of all fully connected layers and the use of Batch Normalization, which helps stabilize training. The authors use the ReLU activation function in all layers of the generator except for the output layer, which uses tanh.
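A minimal Keras sketch of a generator along these lines is given below. The 1024-512-256-128 filter progression and the 64x64x3 output follow the description above; the kernel size and other details are illustrative assumptions, and for the 28x28x1 MNIST digits the shapes would be adapted accordingly.

# A sketch of the DCGAN generator described above (Keras assumed).
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(noise_dim=100):
    model = tf.keras.Sequential([
        # Project and reshape the noise vector to 4x4x1024.
        layers.Dense(4 * 4 * 1024, use_bias=False, input_shape=(noise_dim,)),
        layers.Reshape((4, 4, 1024)),
        layers.BatchNormalization(),
        layers.ReLU(),

        # Each stride-2 transposed convolution doubles the spatial size.
        layers.Conv2DTranspose(512, 5, strides=2, padding="same", use_bias=False),  # 8x8
        layers.BatchNormalization(),
        layers.ReLU(),

        layers.Conv2DTranspose(256, 5, strides=2, padding="same", use_bias=False),  # 16x16
        layers.BatchNormalization(),
        layers.ReLU(),

        layers.Conv2DTranspose(128, 5, strides=2, padding="same", use_bias=False),  # 32x32
        layers.BatchNormalization(),
        layers.ReLU(),

        # Output layer uses tanh, producing a 64x64x3 image in [-1, 1].
        layers.Conv2DTranspose(3, 5, strides=2, padding="same", activation="tanh"),  # 64x64x3
    ])
    return model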

The role of the discriminator here is to determine whether an image comes from the real dataset or from the generator. The discriminator can be designed much like a convolutional neural network that performs an image classification task. However, the authors of this paper suggested some changes in the discriminator architecture: instead of fully connected layers, they used only strided convolutions with LeakyReLU as the activation function. The input of the discriminator is a single image, either taken from the dataset or produced by the generator, and the output is a score that determines whether the image is real or generated.
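A matching Keras sketch of the discriminator is given below; it uses only strided convolutions with LeakyReLU and ends in a single real/fake score (a logit). The layer widths and kernel size are illustrative assumptions.

# A sketch of the DCGAN discriminator described above (Keras assumed).
import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator(image_shape=(64, 64, 3)):
    model = tf.keras.Sequential([
        # Strided convolutions downsample instead of pooling layers.
        layers.Conv2D(128, 5, strides=2, padding="same", input_shape=image_shape),  # 32x32
        layers.LeakyReLU(0.2),

        layers.Conv2D(256, 5, strides=2, padding="same"),   # 16x16
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),

        layers.Conv2D(512, 5, strides=2, padding="same"),   # 8x8
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),

        layers.Flatten(),
        layers.Dense(1),   # raw real/fake score; use from_logits=True in the loss
    ])
    return model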

Conclusion:
Thus, we have implemented a Deep Convolutional Generative Adversarial Network (DCGAN) model for the image-based MNIST handwritten digits dataset.
