
DL 5

Deep generative models are machine learning models that generate new data resembling their training data, such as images, text, and audio. They include various architectures like GANs, VAEs, and Boltzmann Machines, each with unique mechanisms for learning and generating data. Applications range from art generation and medical imaging to data augmentation and video prediction.

Uploaded by

ffhunter7666

Unit V: Deep Generative Models

🔷 What is a Generative Model?

A generative model is a type of machine learning model that can generate new data that looks like the data it was
trained on.

 Imagine you give a model lots of pictures of cats, and later it can create new pictures of cats that are completely
original — that's what a generative model does.

 It learns the patterns and structure of the training data, and then creates new samples that follow those
patterns.

🔷 Discriminative vs. Generative Models

Aspect   | Discriminative Model                      | Generative Model
Goal     | Classify data (e.g., is it a cat or dog?) | Generate data (e.g., create new cat images)
Learns   | Boundary between classes                  | How the data is distributed
Examples | Logistic Regression, SVM, CNN             | GANs, VAEs, Boltzmann Machines
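
The contrast between the two kinds of model can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the notes: the data values are made up, each class is modelled as a single Gaussian, and a midpoint threshold stands in for the boundary that a method like logistic regression or an SVM would learn.

```python
import math
import random
import statistics

# Toy 1-D data: a made-up "size" feature for two classes (illustrative values).
cats = [4.1, 3.8, 4.5, 4.0, 4.3]
dogs = [7.9, 8.4, 7.5, 8.1, 8.0]

# Generative view: model how each class's data is distributed (one Gaussian per class).
cat_mu, cat_sigma = statistics.mean(cats), statistics.stdev(cats)
dog_mu, dog_sigma = statistics.mean(dogs), statistics.stdev(dogs)

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def generative_classify(x):
    # Equal class priors are assumed, so the class-conditional densities are compared directly.
    if gaussian_pdf(x, cat_mu, cat_sigma) > gaussian_pdf(x, dog_mu, dog_sigma):
        return "cat"
    return "dog"

def generate_cat(rng):
    # Because the distribution itself was learned, new samples can be drawn from it.
    return rng.gauss(cat_mu, cat_sigma)

# Discriminative view: only a decision boundary is kept (here, a midpoint threshold).
boundary = (cat_mu + dog_mu) / 2

def discriminative_classify(x):
    return "cat" if x < boundary else "dog"

rng = random.Random(0)
new_cat = generate_cat(rng)   # a new "cat-like" value; the threshold model cannot do this
```

Both models can classify a new point, but only the generative one can produce new data, because it stores a description of the data rather than just a boundary.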

🔷 Deep Generative Models

A deep generative model is a generative model built using deep learning techniques (neural networks with many
layers).

These models can generate:

 Realistic images

 Natural-sounding speech

 Human-like text

 Even videos

🔷 Why Are They Called “Generative”?

Because they can generate new examples that look like the training data.

For example:

 If trained on human faces → it can generate realistic human faces.

 If trained on handwritten digits → it can create new digits like “2” or “7” that look handwritten.

🔷 How Do They Work? (Basic Idea)

Many deep generative models (GANs and the decoder of a VAE, for example) generate data in these basic steps:

1. Input: Random noise (e.g., a vector of random numbers).

2. Generator Network: Transforms that noise into something meaningful (like an image).

3. Output: A new data sample that looks real.
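
The three steps can be sketched as follows. This is a minimal illustration, not a trained model: the weights are random stand-ins for what training would learn, and the names `NOISE_DIM`, `OUTPUT_DIM` and `generator` are hypothetical.

```python
import math
import random

rng = random.Random(42)

NOISE_DIM = 4     # length of the random input vector (assumed)
OUTPUT_DIM = 9    # e.g. a flattened 3x3 "image" (assumed)

# Randomly initialised weights stand in for a trained generator.
weights = [[rng.gauss(0, 1) for _ in range(NOISE_DIM)] for _ in range(OUTPUT_DIM)]
biases = [0.0] * OUTPUT_DIM

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def generator(noise):
    # One linear layer followed by a sigmoid, so every "pixel" is a value between 0 and 1.
    return [sigmoid(sum(w * z for w, z in zip(row, noise)) + b)
            for row, b in zip(weights, biases)]

# Step 1: random noise. Step 2: generator network. Step 3: output sample.
noise = [rng.gauss(0, 1) for _ in range(NOISE_DIM)]
sample = generator(noise)
```

With untrained weights the output is meaningless; training is what makes the mapping from noise to sample produce realistic data.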


🔸 Common Deep Generative Models:

1. Boltzmann Machines

2. Deep Belief Networks (DBN)

3. Variational Autoencoders (VAE)

4. Generative Adversarial Networks (GANs)

🔷 Simple Example:

Let’s say you train a deep generative model on images of handwritten digits (like the MNIST dataset).
After training:

 You give it a random input (noise).

 The model generates a new image that looks like a handwritten “4” or “8” — even though it’s not copied from
the training data.

🔷 Where Are Deep Generative Models Used?

 Art Generation: AI-generated paintings, faces, music.

 Face Generation: Deepfake videos.

 Text-to-Image: Generate images from text descriptions (e.g., "a cat riding a bike").

 Data Augmentation: Create synthetic data for training other models.

 Medical Imaging: Generate more examples of rare diseases for training.


Boltzmann Machine Explained in Simple Words

A Boltzmann Machine is a type of neural network used in deep learning. It is designed to learn patterns and
relationships in data without needing labels (unsupervised learning).

Key Points:

 It is made up of nodes (also called units or neurons). Each node can be either ON (1) or OFF (0).

 Every node is connected to every other node. These connections have weights that determine how strongly
nodes influence each other.

 There are two types of nodes:

o Visible nodes: These are the input data you can observe.

o Hidden nodes: These are not directly seen but help the machine learn patterns in the data.

 The network is called “stochastic” because the nodes make random (probabilistic) decisions about whether to
turn ON or OFF.

 The goal is to find patterns or features in the data by adjusting the connection weights.
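
The "stochastic" behaviour of a node can be sketched in code. The notes do not give a formula, so this sketch assumes the usual rule in which a node turns ON with probability sigmoid(bias + weighted sum of the other nodes). The 3-node network, its weights, and the function names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def prob_on(i, state, weights, biases):
    # Total input to node i from every other node, plus its bias.
    total = biases[i] + sum(weights[i][j] * state[j]
                            for j in range(len(state)) if j != i)
    return sigmoid(total)

def update_unit(i, state, weights, biases, rng):
    # Stochastic decision: turn ON with probability prob_on, otherwise turn OFF.
    state[i] = 1 if rng.random() < prob_on(i, state, weights, biases) else 0

# Hypothetical 3-node network with symmetric weights and no self-connections.
weights = [[0.0, 2.0, -1.0],
           [2.0, 0.0, 0.5],
           [-1.0, 0.5, 0.0]]
biases = [0.0, 0.0, 0.0]
state = [1, 0, 1]

rng = random.Random(0)
for sweep in range(10):             # ten passes over all nodes
    for i in range(len(state)):
        update_unit(i, state, weights, biases, rng)
```

A strong positive weight makes two nodes likely to be ON together; a negative weight makes that combination less likely.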

How Does It Work?

 The Boltzmann Machine assigns an “energy” to every pattern of ON/OFF nodes. Training adjusts the weights so
that patterns resembling the data receive low energy, which makes them the most likely patterns once the network
settles into a balance (like thermal equilibrium in physics).

 It does this by repeatedly turning nodes ON or OFF and updating the weights based on how well the network
represents the input data.

 The learning process is slow, because the network has to be sampled many times before the weights can be
updated, but it helps the machine discover interesting features in the data.

Simple Example:
Suppose you have a dataset of black-and-white images (each pixel is either black or white). You feed these images into
the visible nodes. The hidden nodes try to learn patterns, such as “vertical lines” or “corners.” After learning, the
Boltzmann Machine can generate new images that follow the same patterns as your original images.

Applications:

 Finding important features in data (feature learning)

 Reducing the number of variables in data (dimensionality reduction)

 Filling in missing parts of data (pattern completion)

 Helping with classification tasks after learning features

🔷 Now Technically (But Still Simple):

A Boltzmann Machine is:

 A type of neural network

 That learns patterns in the data

 By assigning a value called energy to each pattern

 Low energy = likely pattern

 High energy = unlikely pattern

It keeps adjusting itself (its connections and weights) until it gives lowest energy to the patterns it sees often in your
data.
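
The link between energy and likelihood can be checked on a tiny network. The notes do not state the energy formula, so this sketch assumes the standard one: energy is minus the sum of weight × node × node over all pairs, minus the sum of bias × node, and the probability of a pattern is proportional to exp(-energy). The 3-node weights are hypothetical.

```python
import itertools
import math

# Hypothetical symmetric weights for a 3-node network (no self-connections).
weights = [[0.0, 2.0, -1.0],
           [2.0, 0.0, 0.5],
           [-1.0, 0.5, 0.0]]
biases = [0.0, 0.0, 0.0]

def energy(state):
    n = len(state)
    # Each pair of nodes is counted once (i < j).
    pair_term = sum(weights[i][j] * state[i] * state[j]
                    for i in range(n) for j in range(i + 1, n))
    bias_term = sum(b * s for b, s in zip(biases, state))
    return -(pair_term + bias_term)

# Enumerate all 2^3 ON/OFF patterns and turn energies into probabilities.
states = list(itertools.product([0, 1], repeat=3))
unnormalised = {s: math.exp(-energy(s)) for s in states}
Z = sum(unnormalised.values())                      # normalising constant
probability = {s: u / Z for s, u in unnormalised.items()}

lowest = min(states, key=energy)                    # pattern with the lowest energy
most_likely = max(states, key=probability.get)      # pattern with the highest probability
```

Here `lowest` and `most_likely` are the same pattern, (1, 1, 0), which is the "low energy = likely pattern" rule in action. Enumerating every pattern only works for very small networks, which is why real Boltzmann Machines rely on sampling instead.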

🔷 Why Do We Use It?

Because it can:

 Learn hidden patterns in the data (like what's common in your examples)

 Generate new samples that are like the real ones

 Be used in recommendation systems, feature extraction, and as the base for Deep Belief Networks

🔷 Example in Real Life:

Imagine Netflix wants to recommend movies.

 They have data: which users watched which movies.

 The Boltzmann Machine learns patterns in this data (e.g., users who like sci-fi also like action).

 Then it can predict which movies a user has not watched yet but might like, based on the movies that user has already watched.

Deep Belief Networks (DBNs)

🔷 What is a Deep Belief Network?

A Deep Belief Network (DBN) is a deep generative model made by stacking multiple Restricted Boltzmann Machines
(RBMs) on top of each other.

It learns to recognize patterns and generate data similar to what it was trained on — like handwritten digits, images, etc.

🔷 How is DBN built?

A DBN is built like a layered sandwich of neural networks:


Input Layer → RBM 1 → RBM 2 → RBM 3 → ... → Output Layer

Each RBM learns to extract features from the previous layer.

🔷 Quick Recap: What is an RBM?

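 Restricted Boltzmann Machine (RBM) is a simple neural network with:

o One visible layer (input data)

o One hidden layer (learns patterns)

o No connections between nodes in the same layer (this is the “restriction” compared with a full Boltzmann
Machine, and it makes training much faster)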

 It learns patterns in data and can reconstruct or generate similar data.
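
How an RBM reconstructs data can be sketched as one pass from the visible layer to the hidden layer and back. This is a sketch under assumptions: the weights are random (untrained), the layer sizes are arbitrary, and the function names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_hidden(visible, W, hidden_bias, rng):
    # W[i][j] connects visible node i to hidden node j.
    probs = [sigmoid(hidden_bias[j] + sum(visible[i] * W[i][j] for i in range(len(visible))))
             for j in range(len(hidden_bias))]
    return [1 if rng.random() < p else 0 for p in probs], probs

def sample_visible(hidden, W, visible_bias, rng):
    probs = [sigmoid(visible_bias[i] + sum(hidden[j] * W[i][j] for j in range(len(hidden))))
             for i in range(len(visible_bias))]
    return [1 if rng.random() < p else 0 for p in probs], probs

rng = random.Random(1)
n_visible, n_hidden = 6, 2
W = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]  # untrained
visible_bias = [0.0] * n_visible
hidden_bias = [0.0] * n_hidden

v0 = [1, 1, 0, 0, 1, 0]                                   # input pattern
h0, h_probs = sample_hidden(v0, W, hidden_bias, rng)      # visible -> hidden
v1, v_probs = sample_visible(h0, W, visible_bias, rng)    # hidden -> visible (reconstruction)
```

Because nodes within a layer are not connected to each other, all hidden nodes can be sampled at once given the visible layer, and vice versa.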

🔷 DBN = Stack of RBMs

Each RBM learns features and passes those to the next layer:

 First RBM learns low-level features (like edges in an image)

 Second RBM learns higher-level features (like shapes)

 Third RBM learns even more abstract features (like object identity)

This process is called greedy layer-wise unsupervised pretraining: each RBM is trained on its own, one layer at a time.

🔷 Why use Deep Belief Networks?

Because they can:

 Learn from unlabeled data (unsupervised learning)

 Understand deep features of the input

 Be used for both generative tasks (generate new data) and discriminative tasks (classification)

🔷 How DBN Learns? (Simplified Steps)

1. Train the first RBM using input data.

2. Use its hidden layer as input to train the second RBM.

3. Repeat step 2 for more layers (stacking).

4. Fine-tune the entire network using supervised learning (optional).
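
Steps 1 to 3 can be sketched as a short training loop. The notes do not say how each RBM is trained; this sketch assumes one-step contrastive divergence (CD-1), a commonly used method. The class `RBM`, the function `train_dbn`, the learning rate, the number of epochs, and the toy data are all hypothetical, and the optional fine-tuning in step 4 is left out.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
        self.vb = [0.0] * n_visible
        self.hb = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.hb[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.hb))]

    def visible_probs(self, h):
        return [sigmoid(self.vb[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.vb))]

    def sample(self, probs):
        return [1 if self.rng.random() < p else 0 for p in probs]

    def train(self, data, epochs=50, lr=0.1):
        # CD-1: compare statistics of the data with those of a one-step reconstruction.
        for _ in range(epochs):
            for v0 in data:
                ph0 = self.hidden_probs(v0)
                h0 = self.sample(ph0)
                v1 = self.visible_probs(h0)       # reconstruction (probabilities)
                ph1 = self.hidden_probs(v1)
                for i in range(len(v0)):
                    for j in range(len(ph0)):
                        self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
                    self.vb[i] += lr * (v0[i] - v1[i])
                for j in range(len(ph0)):
                    self.hb[j] += lr * (ph0[j] - ph1[j])

def train_dbn(data, layer_sizes, rng):
    rbms, layer_input = [], data
    n_in = len(data[0])
    for n_hidden in layer_sizes:
        rbm = RBM(n_in, n_hidden, rng)
        rbm.train(layer_input)                                     # steps 1 and 2
        layer_input = [rbm.hidden_probs(v) for v in layer_input]   # hidden layer feeds the next RBM
        rbms.append(rbm)
        n_in = n_hidden                                            # step 3: repeat for the next layer
    return rbms

# Toy binary data: 6 "pixels" per example (illustrative values).
data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1], [1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1]]
dbn = train_dbn(data, layer_sizes=[4, 2], rng=random.Random(0))
```

Each RBM only ever sees the output of the layer below it, which is what makes the training "layer-wise".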

🔷 Example: Handwritten Digit Recognition

Suppose you train a DBN on the MNIST dataset (images of digits 0–9):

1. First RBM: Learns edges or pixel patterns.

2. Second RBM: Learns shapes like loops or curves.

3. Third RBM: Learns abstract features that help tell digit identities apart (like 0, 3, 9).

4. You can now use this model to generate new digit images or, after supervised fine-tuning, to recognize digits.

🔷 Applications of DBNs

 Image recognition

 Speech recognition

 Data generation

 Dimensionality reduction

 Pretraining for deep networks

Generative Adversarial Network (GAN) Explained Simply

A Generative Adversarial Network (GAN) is a special type of deep learning model used to create new, realistic data that
looks similar to the data it was trained on. GANs are famous for generating images, but they can also create music, text,
and more.

How does a GAN work?

A GAN has two parts, both of which are neural networks:

 Generator: This network tries to create fake data that looks real. For example, it might try to make a fake photo
that looks like a real person.

 Discriminator: This network tries to tell the difference between real data (from the training set) and fake data
(from the generator).

These two networks compete with each other in a kind of game:

 The generator tries to fool the discriminator by making better and better fake data.

 The discriminator tries to get better at spotting which data is fake and which is real.

This process continues until the generator gets so good that the discriminator can no longer easily tell the difference
between real and fake data.
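
The game between the two networks is commonly written as a single objective. The formula below is the standard one from the original GAN formulation rather than something stated in these notes. Here D(x) is the discriminator's estimate that x is real, G(z) is the generator's output for noise z, p_data is the distribution of the real data, and p_z is the noise distribution.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to make V large by scoring real data near 1 and generated data near 0. The generator tries to make V small by producing samples that the discriminator scores near 1.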

Simple Example

Suppose you have a dataset of real photos of cats:

 The generator creates a new image (starting with random noise).

 The discriminator looks at both real cat photos and the fake image and tries to guess which is which.

 If the discriminator is correct, the generator tries to improve.

 If the generator fools the discriminator, it gets rewarded.

 Over time, the generator learns to create very realistic cat photos, even though those exact photos never existed
before.
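
The alternating game can be sketched with a deliberately tiny GAN on single numbers instead of cat photos. Everything here is an assumption made for illustration: the "real" data is drawn from a normal distribution centred on 4, the generator is a straight line `a * z + b`, the discriminator is a single sigmoid unit, and the gradients are written out by hand. It only shows the order of the updates; toy GANs like this often oscillate, so no convergence is claimed.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

rng = random.Random(0)
a, b = 1.0, 0.0      # generator: G(z) = a * z + b
w, c = 0.0, 0.0      # discriminator: D(x) = sigmoid(w * x + c)
lr = 0.01            # learning rate (assumed)

for step in range(2000):
    x_real = rng.gauss(4.0, 1.0)          # "real" data: assumed N(4, 1)
    z = rng.gauss(0.0, 1.0)               # random noise
    x_fake = a * z + b                    # generator output

    # Discriminator step: minimise -log D(x_real) - log(1 - D(x_fake)).
    d_real, d_fake = sigmoid(w * x_real + c), sigmoid(w * x_fake + c)
    grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
    grad_c = -(1.0 - d_real) + d_fake
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator step: minimise -log D(G(z)), i.e. get "rewarded" for fooling D.
    d_fake = sigmoid(w * x_fake + c)
    grad_x = -(1.0 - d_fake) * w          # gradient of the loss with respect to the fake sample
    a -= lr * grad_x * z
    b -= lr * grad_x
```

Note that the generator never sees the real data directly. Its only learning signal is the gradient that flows back through the discriminator.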

Summary Table

Part          | What It Does
Generator     | Makes fake data (tries to look real)
Discriminator | Judges data as real or fake

In Simple Words:
A GAN is like a forger (generator) and a detective (discriminator) competing. The forger tries to make fake art that fools
the detective, and the detective tries to catch the fakes. Over time, both get better at their jobs, and the forger ends up
making very convincing art.

Discriminator Network (in GANs) Explained Simply

A discriminator network is one of the two main parts of a Generative Adversarial Network (GAN). Its job is to tell the
difference between real data (from the training set) and fake data (created by the generator network).

How does the discriminator work?

 It receives both real and generated (fake) data as input.

 It uses deep learning techniques to analyze the data and decide if it is real or fake.

 The output is usually a single value: closer to 1 means "real," closer to 0 means "fake".

 It acts like a judge in a contest, constantly improving its ability to spot fake data as training continues.
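
How the discriminator is scored can be sketched with a small loss function. The notes do not name a loss, so this assumes binary cross-entropy, the usual choice for a real-versus-fake output between 0 and 1. The function name and the example scores are hypothetical.

```python
import math

def discriminator_loss(real_scores, fake_scores, eps=1e-12):
    # real_scores: discriminator outputs for real samples (should be near 1).
    # fake_scores: discriminator outputs for generated samples (should be near 0).
    # eps avoids log(0) if a score is exactly 0 or 1.
    real_term = sum(-math.log(s + eps) for s in real_scores) / len(real_scores)
    fake_term = sum(-math.log(1.0 - s + eps) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

# A discriminator that separates real from fake well gets a low loss ...
good = discriminator_loss(real_scores=[0.9, 0.95], fake_scores=[0.1, 0.05])
# ... while one that answers 0.5 for everything gets a higher loss.
confused = discriminator_loss(real_scores=[0.5, 0.5], fake_scores=[0.5, 0.5])
```

A discriminator that outputs 0.5 for every input is effectively guessing, which is the situation the generator is trying to bring about.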

Why is the discriminator important?

 It provides feedback to the generator, helping it get better at making realistic data.

 The competition between the generator and discriminator pushes both networks to improve, leading to more
convincing fake data over time.

Example:
If you are training a GAN to generate images of handwritten digits:

 The discriminator looks at a mix of real digit images and fake ones from the generator.

 It tries to correctly label each image as "real" or "fake."

 The better the discriminator gets, the harder the generator has to work to fool it.

Generator Network (in GANs) Explained Simply

A generator network is one of the two main parts of a Generative Adversarial Network (GAN). Its main job is to create
new data that looks as real as possible, based on what it has learned from real data.

How does the generator work?

 The generator starts with random input, often called “noise” (just a bunch of random numbers).

 It uses a neural network to transform this noise into a data sample, such as an image, that tries to look like it
came from the real dataset.

 At first, the generator’s outputs are usually not very realistic.

 The generator sends its fake data to the discriminator, which tries to tell if it’s real or fake.

 If the discriminator easily spots the fake, the generator gets feedback and learns to improve.

 Over many rounds, the generator gets better at making data that can fool the discriminator.
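
The feedback the generator receives can be sketched as a loss computed from the discriminator's scores on its fake data. This assumes the commonly used "non-saturating" form, minus the log of the discriminator's score, which is not specified in the notes. The function name and example scores are hypothetical.

```python
import math

def generator_loss(fake_scores, eps=1e-12):
    # fake_scores: discriminator outputs for generated samples.
    # The loss is small when the discriminator scores the fakes close to 1 ("real").
    return sum(-math.log(s + eps) for s in fake_scores) / len(fake_scores)

early = generator_loss([0.05, 0.10])   # early training: fakes are easily spotted
late = generator_loss([0.60, 0.70])    # later: fakes often pass as real
```

As the generator's samples fool the discriminator more often, its loss falls, which is the sense in which it "learns to improve".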

Example:
If you want a GAN to generate pictures of faces:

 The generator takes random noise and tries to turn it into a picture that looks like a real face.

 The discriminator checks if the picture is real or fake.

 The generator keeps learning and gets better at making faces that look real.

Key Points:

 The generator’s goal is to make fake data that is so realistic, the discriminator can’t tell it’s fake.

 It learns by getting feedback from the discriminator and improving its own network.

 Over time, the generator can produce very convincing data, like realistic images, sounds, or text.

Types of GANs (Generative Adversarial Networks)

There are several types of GANs, each designed for different tasks or improvements over the basic model. Here are some
of the most important and commonly used types:

1. Vanilla GAN

 This is the original and simplest form of GAN.

 It uses simple fully connected networks (multilayer perceptrons) for both the generator and discriminator.

 It is mainly used for understanding and experimenting with the basic GAN concept.

2. Conditional GAN (cGAN)

 In a conditional GAN, both the generator and discriminator receive extra information, like labels or specific
conditions.

 This allows the GAN to generate data that matches a certain condition, such as creating images of a specific type
(e.g., only cats or only dogs).

3. Deep Convolutional GAN (DCGAN)

 DCGAN uses convolutional neural networks (ConvNets), which are especially good for image data.

 These networks help the GAN generate more realistic and higher-quality images.

4. Super-Resolution GAN (SRGAN)

 SRGAN is designed to take low-resolution images and generate high-resolution, detailed versions of them.

 It is useful for improving the quality of images, such as making blurry images clearer.

5. CycleGAN

 CycleGAN is used to transform images from one style to another without needing paired examples.

 For example, it can turn photos of horses into photos of zebras, or summer landscapes into winter ones.

6. StyleGAN

 StyleGAN is designed for generating very high-quality, realistic images, such as faces that look real but do not
belong to any real person.

 It allows for control over the “style” of the generated images, such as age, hair color, or facial expression.

7. Laplacian Pyramid GAN (LAPGAN)

 LAPGAN uses a pyramid structure to generate images in stages, starting from low resolution and adding more
details at each stage.

 This helps in creating high-quality images by refining them step by step.
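
For the Conditional GAN described above, one simple way to give the generator its "extra information" is to attach a one-hot label to the noise vector. This is a sketch of that idea only. The function names, the noise length of 5, and the 10 classes are assumptions, and real cGANs may feed in the condition in other ways.

```python
import random

def one_hot(label, num_classes):
    # A vector of zeros with a single 1.0 at the position of the label.
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def conditional_generator_input(noise, label, num_classes):
    # Concatenate the noise vector with the one-hot label.
    return noise + one_hot(label, num_classes)

rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(5)]
# Ask for class 3 (for example, the digit "3" if the classes are the digits 0-9).
g_input = conditional_generator_input(noise, label=3, num_classes=10)
```

The discriminator would receive the same label alongside each sample, so it can check not only whether the sample looks real but also whether it matches the requested condition.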

Applications of GAN Networks

Generative Adversarial Networks (GANs) have a wide range of real-world applications across many fields. Here are some
of the most important and common uses:

1. Image Generation

 GANs can create realistic images from scratch, including human faces, animals, objects, and scenes that do not
exist in reality.

2. Image-to-Image Translation

 They can convert images from one style or domain to another, such as turning sketches into colored photos,
black-and-white images into color, or day scenes into night scenes.

3. Text-to-Image Translation

 GANs can generate images based on text descriptions, such as creating a picture of “a red bird sitting on a
branch” from that sentence.

4. Super Resolution

 GANs can take low-resolution images and generate high-resolution versions, making blurry or pixelated photos
clearer and sharper.

5. Video Prediction and Generation

 They can predict future frames in a video or even generate entirely new video sequences, which is useful in
animation, video editing, and surveillance.

6. 3D Object Generation

 GANs can create 3D models of objects from images or sketches, useful in design, gaming, and virtual reality.

7. Data Augmentation

 They can generate extra training data for machine learning models, especially when real data is limited, such as
creating synthetic medical images or rare event samples for fraud detection.

8. Face Editing and Manipulation

 GANs are used for face aging, changing facial expressions, swapping faces, or editing features in photos.

9. Style Transfer

 They can apply the artistic style of one image (like a painting) to another image, such as making a photo look like
it was painted by Van Gogh.

10. Text, Speech, and Audio Generation

 GANs can generate lifelike speech from text (text-to-speech), create new music, or synthesize realistic sound
effects.

11. Photo Inpainting

 GANs can fill in missing or damaged parts of an image, such as restoring old photos or removing unwanted
objects from pictures.

12. Deepfake Creation

 GANs can generate highly realistic fake videos or images, such as making someone appear to say or do
something they never did. This has both creative and ethical implications.

13. Business and Design

 GANs are used in industrial design, interior design, clothing, and product prototyping to generate new design
ideas and visualizations.
