Introduction Generative Adversarial Networks

The document discusses Generative Adversarial Networks (GANs), highlighting their ability to model data distributions and generate new samples, unlike traditional discriminative models. It explains the architecture of GANs, which consists of a generator and a discriminator that compete against each other to improve their performance. Additionally, it covers various applications of GANs, including image generation, style transfer, and biomedical segmentation.


Introduction

Generative Adversarial
Networks
Why Generative Models?

• So far we have only seen discriminative models

  Given an image X, predict a label Y
  Estimate P(Y|X)

• Discriminative models have several key limitations

  They cannot model P(X), i.e. the probability of seeing a certain image
  Thus they cannot sample from P(X), i.e. they cannot generate new images

• Generative models address both of these limitations

  They can model P(X)
  They can generate new images
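To make the contrast concrete, here is a minimal sketch in which the "generative model" is simply a Gaussian fitted to 1-D data; once P(X) is modelled, new samples can be drawn from it. The distribution and its parameters are illustrative, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 1-D samples from an "unknown" distribution.
data = rng.normal(loc=5.0, scale=2.0, size=10_000)

# A (trivial) generative model: fit a Gaussian to the data, i.e. model P(X).
mu, sigma = data.mean(), data.std()

# Because we modelled P(X), we can now *sample* new data points,
# something a purely discriminative model of P(Y|X) cannot do.
new_samples = rng.normal(loc=mu, scale=sigma, size=5)

print(mu, sigma)          # close to the true parameters (5.0, 2.0)
print(new_samples.shape)  # (5,)
```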
Why Generative Models?
• Generative modelling aims to model the distribution that a given set of data (e.g. images, audio) came from.

Illustration of sampling a distribution


Generative Models
• A discriminator network takes samples of real and generated data and tries to classify them as accurately as possible, while a generator network is trained to fool the discriminator by generating realistic-looking images.

• As the name "adversarial networks" implies, the two networks try to beat each other and, in doing so, both become better and better.
Generative Models
• At each iteration of the training process, the weights of the generator network are updated to increase the classification error (gradient ascent over the generator's parameters),

• whereas the weights of the discriminator network are updated to decrease this error (gradient descent over the discriminator's parameters).
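The two updates above act on the same objective. As a minimal numeric sketch, the discriminator loss and the non-saturating generator loss from Goodfellow et al. (2014) can be written as follows; the discriminator outputs are hypothetical numbers chosen for illustration.

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator loss: maximize log D(x) + log(1 - D(G(z))),
    i.e. minimize the negative of the GAN value function."""
    return -(np.log(d_real) + np.log(1.0 - d_fake)).mean()

def g_loss(d_fake):
    """Non-saturating generator loss -log D(G(z)):
    the generator wants the discriminator to output D(G(z)) -> 1."""
    return -np.log(d_fake).mean()

# Hypothetical discriminator outputs (probabilities of "real"):
d_real = np.array([0.9, 0.8])   # confident on real samples
d_fake = np.array([0.2, 0.1])   # confident the fakes are fake

print(d_loss(d_real, d_fake))   # low: the discriminator is doing well
print(g_loss(d_fake))           # high: the generator is being caught
```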
Adversarial Learning
• A training process where the two models try to weaken each other
and, as a result, improve each other is called adversarial learning.
Adversarial Learning
• Consider a game of "busting fake bill" (by police) and "making better
fake bill" (by criminals)
GANs Background
• GANs are designed based on the idea of adversarial learning.

• A GAN learns a generative model that can sample from a distribution without explicitly estimating it.

• GANs were proposed by Goodfellow et al. in 2014.

• GANs have become the most thriving and popular method in the ML community for synthesizing audio, text, images, video, and 3D models.
Applications
• Image generation using deep convolutional GANs
• Generation of anime characters using GANs
• Sketch-to-color photograph generation using GANs
• Unpaired image-to-image translation using CycleGANs
• Text-to-image synthesis with stacked GANs
• Generation of new human poses using GANs
• Single-image super-resolution using GANs
• GAN-based inpainting of photographs


Which one is Computer Generated?

Which Face is Real?

Which one is real, and which is computer generated?

Image Generation

StyleGAN is used…
GAN Application: Complicated and creative tasks
Single Image Super Resolution
Deepfakes... Privacy at stake
Image to Image Translation
Basics of GAN
A GAN contains two different networks: a generator network and a discriminator network.

• The generator network typically takes random noise as input and generates fake samples.

• The discriminator is, in fact, a classification network whose job is to tell whether a given sample is fake or real.

• The generator tries its best to trick the discriminator into accepting the generated fake samples as real, while the discriminator tries its best to distinguish the fake samples from real ones.
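The two networks described above can be sketched as follows: untrained single-layer numpy functions standing in for real neural networks. The layer sizes and random weights are illustrative assumptions, not the architecture of any particular GAN.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical tiny "networks": one linear layer each, random weights.
NOISE_DIM, SAMPLE_DIM = 8, 16
W_g = rng.normal(size=(NOISE_DIM, SAMPLE_DIM))   # generator weights
W_d = rng.normal(size=(SAMPLE_DIM, 1))           # discriminator weights

def generator(z):
    """Map random noise z to a fake sample."""
    return np.tanh(z @ W_g)

def discriminator(x):
    """Return the probability that x is a real sample."""
    return sigmoid(x @ W_d)

z = rng.normal(size=(4, NOISE_DIM))   # a batch of 4 noise vectors
fake = generator(z)                   # fake samples from noise
p_real = discriminator(fake)          # discriminator's verdict

print(fake.shape)    # (4, 16)
print(p_real.shape)  # (4, 1), probabilities in (0, 1)
```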
GAN’s Architecture
Training Discriminator
Training Generator
Basics of GAN

• Convolutional Neural Networks for Deep Learning

ConvNet architectures make the explicit assumption that the inputs are images and encode certain properties into the architecture.

• Efficient to implement (suitable for GPU-based parallel implementation)

• Reduced number of parameters (lower computational complexity)
Basics of GAN
• Spatial arrangement: Unlike a regular Neural Network, the layers of a ConvNet have
neurons arranged in 3 dimensions: width, height, depth. This implementation fits well with
image data.

• Parameter sharing:
  • Consider an image of size 200 × 200 × 3: each fully connected neuron would have 200 × 200 × 3 = 120,000 weights.
  • This full connectivity is wasteful, and the huge number of parameters would quickly lead to overfitting.
  • A ConvNet instead uses filters for feature extraction that share parameters across the image.
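The parameter-sharing argument above can be checked with simple arithmetic; the 3×3 filter size is an assumption for illustration.

```python
# Parameter count for one neuron fully connected to a 200 x 200 x 3 image,
# versus one shared convolutional filter.
H, W, C = 200, 200, 3

fc_weights_per_neuron = H * W * C   # 120,000 weights for a single neuron
conv_filter_weights = 3 * 3 * C     # an assumed 3x3 filter, shared everywhere

print(fc_weights_per_neuron)  # 120000
print(conv_filter_weights)    # 27
```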
DCGAN

• Deep Convolutional GAN

• DCGANs were the first GAN model to learn to generate high resolution images in a single
shot.

• DCGANs are able to generate high-quality images when trained on restricted domains of images, such as images of bedrooms.

• DCGANs also clearly demonstrated that GANs learn to use their latent code in meaningful
ways, with simple arithmetic operations in latent space having clear interpretation.
DCGAN

The generator network used by a DCGAN


Bedroom images generated by DCGAN
DCGAN

DCGANs demonstrated that GANs can learn a distributed representation that disentangles the concept of gender from the concept of wearing glasses.
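The disentanglement above is usually shown via latent-space arithmetic, e.g. the "man with glasses − man + woman ≈ woman with glasses" example reported for DCGANs. A sketch with stand-in vectors (in a real DCGAN these would be averaged latent codes of images with the given attribute; here they are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in latent codes; purely illustrative placeholders.
dim = 100
z_man_glasses = rng.normal(size=dim)
z_man = rng.normal(size=dim)
z_woman = rng.normal(size=dim)

# Vector arithmetic in latent space:
# "man with glasses" - "man" + "woman" -> "woman with glasses"
z_result = z_man_glasses - z_man + z_woman

print(z_result.shape)  # (100,) fed to the generator to decode an image
```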
Cycle-GAN

• Cycle Consistent GAN

• Task is to perform style transfer; transform a given image to match a different style.

• It is unlikely that we have many pairs of images in both styles (e.g. a photograph and a van Gogh painting that matches it). So let's assume we have unpaired data, i.e. collections of unrelated images in the two styles.

Goal of Cycle-GAN:
• To produce outputs which are plausible images of the target style, and
• To preserve the structure of the original images.
Cycle-GAN
Understanding Cycle GAN

• Imagine you learn a new word in a foreign language, and then you translate it
back to your native language to see if it still means the same thing.

• If the translation matches the original word then you’ve learned the translation
correctly.
How CycleGAN Works

• Original image (photo) → Generator G → Transformed image (painting)

• Transformed image (painting) → Generator F → Reconstructed image (photo)

• In CycleGAN we treat the problem as an image reconstruction problem:
  • We first take an input image (x) and use generator G to convert it into the transformed image.
  • Then we reverse this process, mapping the transformed image back to the original domain using generator F.
  • Then we calculate the cycle-consistency loss (the L1 difference in the original paper) between the real and reconstructed image.
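The cycle x → G(x) → F(G(x)) and its reconstruction loss can be sketched numerically. The stand-in "generators" below are hypothetical toy maps, chosen only so the cycle closes; the L1 form of the loss follows the original CycleGAN paper.

```python
import numpy as np

def cycle_consistency_loss(x, reconstructed):
    """L1 cycle-consistency loss between the original and reconstructed image."""
    return np.abs(reconstructed - x).mean()

# Hypothetical stand-in generators, illustrative only.
def G(photo):      # photo -> painting
    return photo + 0.1

def F(painting):   # painting -> photo
    return painting - 0.1

x = np.ones((4, 4))        # a tiny stand-in "photo"
reconstructed = F(G(x))    # photo -> painting -> photo

print(cycle_consistency_loss(x, reconstructed))  # ~0: the cycle is consistent
```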
CycleGAN
• The most important feature of CycleGAN is that it can perform this image translation on unpaired data, where no correspondence exists between the input and output images.
Architecture of CycleGAN

• CycleGAN has two main parts: Generator and Discriminator just like
other GANs.

• The Generator’s job is to create images that look like they belong to a
certain style or category.

• The Discriminator’s job is to figure out if the image is real (from the
original style) or fake (created by the Generator).
• Generators:
  CycleGAN has two generators, G and F.
  • G transforms images from domain X (e.g., photos) to domain Y (e.g., artwork).
  • F transforms images from domain Y back to domain X.
  • The generator mapping functions are G: X → Y and F: Y → X,
  • where X is the input image distribution and Y is the desired output distribution (such as Van Gogh styles).
CycleGAN
• Discriminators:
There are two discriminators—Dₓ and Dᵧ.
• Dₓ distinguishes between real images from X and generated images from F(y).

• Dᵧ distinguishes between real images from Y and generated images from G(x)

• To further regularize the mappings, CycleGAN uses two more loss functions in addition to the adversarial loss: a cycle-consistency loss and an identity loss.
Segmentation

• Biomedical Applications

• Glaucoma is a group of eye conditions that damage the optic nerve, the
health of which is vital for good vision.

• This damage is often caused by abnormally high pressure in the eye.

• Glaucoma is one of the leading causes of blindness for people over the age
of 60. It can occur at any age but is more common in older adults.
Segmentation

• The optic disc (OD) is considered one of the main features of a retinal fundus image.

• A change in the shape, color, or depth of the OD is an indicator of various ophthalmic pathologies, especially glaucoma.

• Can we accurately detect and segment the disc region?
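Whatever method answers this question, segmentation quality is commonly evaluated with the Dice coefficient between the predicted and ground-truth disc masks. A toy sketch (the masks are made-up 6×6 examples, not fundus data):

```python
import numpy as np

def dice(pred, truth):
    """Dice coefficient between two binary masks (1 = disc pixel)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

# Toy masks: a 3x3 ground-truth disc vs a prediction shifted by one pixel.
truth = np.zeros((6, 6), dtype=int)
truth[1:4, 1:4] = 1                 # 9 disc pixels
pred = np.zeros((6, 6), dtype=int)
pred[2:5, 2:5] = 1                  # overlaps the truth in 4 pixels

print(dice(pred, truth))  # 2*4 / (9+9) ≈ 0.444
```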


Segmentation

Block diagram of disc region segmentation


