DGM 2023 Endterm Solution
Informatics
Technical University of Munich
Note:
• During the attendance check a sticker containing a unique code will be put on this exam.
• This code contains a unique number that associates this exam with your registration number.
• This number is printed both next to the code and to the signature field in the attendance check list.
Exam: CIT4230003 / Endterm
Date: Monday 31st July, 2023
Examiner: Prof. Dr. Stephan Günnemann
Time: 13:30 – 14:30
Working instructions
• This exam consists of 12 pages with a total of 4 problems.
Please make sure now that you received a complete copy of the exam.
• Allowed resources:
– one A4 sheet of handwritten notes, two sides.
• There is scratch paper at the end of the exam (after problem 4).
• Write your answers only in the provided solution boxes or the scratch paper.
• If you solve a task on the scratch paper, clearly reference it in the main solution box.
• For problems that say “Justify your answer” you only get points if you provide a valid explanation.
• For problems that say “Derive” you only get points if you provide a valid mathematical derivation.
• For problems that say “Prove” you only get points if you provide a valid mathematical proof.
• If a problem does not say “Justify your answer”, “Derive” or “Prove”, it is sufficient to only provide the
correct answer.
Problem 1 Normalizing flows (5 credits)
In this task, we will focus on the reverse parametrization for normalizing flows on $\mathbb{R}^d$.

a) The transformation is defined as

$$A = \frac{1}{2}\, a^T a, \qquad z = \sigma(A x).$$
Please state whether this transformation leads to a valid normalizing flow. Justify your answer accordingly.
No, this transformation is not invertible. $A$ trivially does not have full rank, so $\det A = 0$ and the linear map $x \mapsto Ax$ is not injective. Therefore, the overall transformation is non-invertible and does not define a valid normalizing flow.
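A quick NumPy sketch (not part of the official solution) can illustrate this; it assumes $a$ is a single row vector, so that $a^T a$ is a rank-one outer product:

```python
# Minimal numerical sketch (illustration only): for a random row vector a,
# A = 0.5 * a^T a has rank one, so x -> sigmoid(A x) cannot be injective.
import numpy as np

rng = np.random.default_rng(0)
d = 4
a = rng.normal(size=(1, d))          # row vector, so a.T @ a is d x d
A = 0.5 * a.T @ a

print(np.linalg.matrix_rank(A))      # 1  (< d, so A is singular)

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

x1 = rng.normal(size=d)
# Add any vector from the null space of A: the output does not change.
null_dir = np.linalg.svd(A)[2][-1]   # right singular vector with singular value 0
x2 = x1 + 3.0 * null_dir

print(np.allclose(sigmoid(A @ x1), sigmoid(A @ x2)))  # True, yet x1 != x2
```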
b) Now, let $x \in \mathbb{R}^3$ and the transformation is defined as follows:

$$z_1 = \tfrac{1}{2}(x_3 + x_2)^3, \qquad z_2 = x_1^4\, x_2 + x_3, \qquad z_3 = e^{x_3}.$$
Please state whether this transformation leads to a valid normalizing flow. Justify your answer accordingly.
No, this is not a valid transformation. To disprove the bijectivity of the transformation, we can find a counterexample: for any assignment of $x$ with $x_1 \neq 0$, the different assignment $(-x_1, x_2, x_3)$ maps to the same $z$, because $x_1$ only enters through $x_1^4$. Hence the transformation is not injective.
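A quick numerical check of the counterexample (illustration only, using the reconstructed transformation above):

```python
# Minimal numerical sketch: two distinct inputs that the transformation of
# part b) maps to the same output, so it is not injective.
import numpy as np

def f(x):
    x1, x2, x3 = x
    return np.array([0.5 * (x3 + x2) ** 3,   # z1
                     x1 ** 4 * x2 + x3,      # z2
                     np.exp(x3)])            # z3

x_a = np.array([1.3, -0.7, 0.2])
x_b = np.array([-1.3, -0.7, 0.2])            # flip the sign of x1

print(np.allclose(f(x_a), f(x_b)))           # True, although x_a != x_b
```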
c) Lastly, let's assume you are given a transformation $f : \mathbb{R} \to \mathbb{R}$, where we know that the Jacobian determinant of its inverse is equal to 1. How does this affect the normalizing flow?
Please use the change of variables formula and a possible parametrization of $f^{-1}$ to explain.
A flow whose Jacobian determinant is equal to 1 is volume-preserving, i.e., the change of variables formula reduces to $p_2(x) = p_1(f^{-1}(x)) \cdot 1$. In one dimension, a Jacobian determinant of one means $(f^{-1})'(x) = 1$ everywhere, so a possible parametrization is $f^{-1}(x) = x + b$ with $b \in \mathbb{R}$, which includes the identity map and any translation. Such a flow is therefore not expressive: it can only translate the base distribution $p_1(z)$ and cannot change its shape.
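A small sketch (illustration only) that confirms the translation behaviour empirically, assuming the parametrization $f^{-1}(x) = x + b$:

```python
# Minimal sketch: with f^{-1}(x) = x + b the flow merely translates the base
# distribution, exactly as p2(x) = p1(f^{-1}(x)) * 1 predicts.
import numpy as np
from scipy.stats import norm

b = 2.0
z = np.random.default_rng(0).normal(size=200_000)   # samples from p1 = N(0, 1)
x = z - b                                            # f(z) = z - b, so f^{-1}(x) = x + b

# Empirical density of x at a few points vs. the predicted density p1(x + b)
grid = np.array([-3.0, -2.0, -1.0, 0.0])
hist, edges = np.histogram(x, bins=200, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
empirical = np.interp(grid, centers, hist)
predicted = norm.pdf(grid + b)                       # p1(f^{-1}(x))

print(np.round(empirical, 2))
print(np.round(predicted, 2))                        # close to the empirical values
```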
Problem 2 Variational Inference & Variational Autoencoder (9 credits)
We want to draw samples from a log-normal distribution log N (µ, σ 2 ), where µ, σ ∈ R, with reparametrization.
The probability density function of the log-normal distribution is defined as:
$$q_{\mu,\sigma^2}(z) = \begin{cases} \dfrac{1}{z\,\sigma\sqrt{2\pi}} \exp\!\left(-\dfrac{(\ln z - \mu)^2}{2\sigma^2}\right) & \text{if } z > 0 \\[2mm] 0 & \text{otherwise} \end{cases}$$

Its cumulative density function is given as:

$$Q_{\mu,\sigma^2}(a) = \Pr(z \leq a) = \int_{-\infty}^{a} q_{\mu,\sigma^2}(z)\, dz = \frac{1}{2}\left(1 + \operatorname{erf}\!\left(\frac{\ln a - \mu}{\sigma\sqrt{2}}\right)\right)$$

Recall that the error function $\operatorname{erf}(z)$ is an invertible function that is defined as $\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z \exp(-t^2)\, dt$.
a) Suppose you have access to an algorithm that produces samples $\epsilon$ from a standard normal distribution $\mathcal{N}(0, 1)$. Find a deterministic transformation $T : \mathbb{R} \to \mathbb{R}_{>0}$ that transforms a sample $\epsilon \sim \mathcal{N}(0, 1)$ into a sample from the log-normal distribution $\log\mathcal{N}(0, 1)$.

Hint: The cumulative density function of a normal distribution $\mathcal{N}(\mu, \sigma^2)$ is given as:

$$F_{\mu,\sigma^2}(a) = \Pr(z \leq a) = \frac{1}{2}\left(1 + \operatorname{erf}\!\left(\frac{a - \mu}{\sigma\sqrt{2}}\right)\right)$$
We want to find $T$ such that $\Pr(T(\epsilon) \leq a) = Q_{0,1}(a)$. Since $\operatorname{erf}(z)$ is invertible, we just have to match the arguments of the error function:

$$\Pr(T(\epsilon) \leq a) = \Pr(\epsilon \leq T^{-1}(a)) = F_{0,1}(T^{-1}(a)) \overset{!}{=} Q_{0,1}(a) \;\Rightarrow\; \frac{T^{-1}(a)}{\sqrt{2}} = \frac{\ln a}{\sqrt{2}} \;\Rightarrow\; T(\epsilon) = \exp(\epsilon)$$
b) Now suppose you have access to an algorithm that produces samples $z$ from a log-normal distribution $\log\mathcal{N}(0, 1)$. Find a deterministic transformation $M_{\mu,\sigma^2} : \mathbb{R}_{>0} \to \mathbb{R}_{>0}$ that transforms a sample $z \sim \log\mathcal{N}(0, 1)$ into a sample from the log-normal distribution $\log\mathcal{N}(\mu, \sigma^2)$.

We want to find $M_{\mu,\sigma^2}$ such that $\Pr(M_{\mu,\sigma^2}(z) \leq a) = Q_{\mu,\sigma^2}(a)$.
Similar to before, we match the arguments of the error function:

$$\ln M_{\mu,\sigma^2}^{-1}(a) = \frac{\ln a - \mu}{\sigma}$$
$$\sigma \ln M_{\mu,\sigma^2}^{-1}(a) + \mu = \ln a$$
$$M_{\mu,\sigma^2}^{-1}(a)^{\sigma} \exp(\mu) = a$$
$$\Rightarrow\; M_{\mu,\sigma^2}(z) = z^{\sigma} \exp(\mu)$$
c) Now suppose you have access to an algorithm that produces samples $\epsilon$ from a standard normal distribution $\mathcal{N}(0, 1)$. Find a deterministic transformation $C_{\mu,\sigma^2} : \mathbb{R} \to \mathbb{R}_{>0}$ that transforms a sample $\epsilon \sim \mathcal{N}(0, 1)$ into a sample from the log-normal distribution $\log\mathcal{N}(\mu, \sigma^2)$.
$$C_{\mu,\sigma^2}(\epsilon) = (M_{\mu,\sigma^2} \circ T)(\epsilon) = M_{\mu,\sigma^2}(T(\epsilon)) = \exp(\epsilon)^{\sigma} \exp(\mu) = \exp(\sigma\epsilon + \mu)$$
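A short sanity check (illustration only) that the reparametrization produces samples matching $\log\mathcal{N}(\mu, \sigma^2)$; scipy's `lognorm` with `s=sigma, scale=exp(mu)` serves as the reference:

```python
# Minimal sketch: the reparametrization from c) should produce samples whose
# empirical moments match log N(mu, sigma^2).
import numpy as np
from scipy.stats import lognorm

mu, sigma = 0.4, 0.7
rng = np.random.default_rng(0)

eps = rng.normal(size=500_000)          # eps ~ N(0, 1)
samples = np.exp(sigma * eps + mu)      # C_{mu,sigma^2}(eps) = exp(sigma * eps + mu)

# scipy parametrizes the log-normal via s = sigma and scale = exp(mu)
ref = lognorm(s=sigma, scale=np.exp(mu))
print(np.round([samples.mean(), ref.mean()], 3))   # both ~ exp(mu + sigma^2 / 2)
print(np.round([samples.var(),  ref.var()],  3))
```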
d) We want to model the distribution of data samples $p(x)$ using a Variational Autoencoder. Recall that this assumes a latent variable structure $p(x, z) = p(x \mid z)\, p(z)$, and we need to model the distribution $p_\theta(x \mid z)$ and the variational distribution $q_\phi(z)$ respectively. We learn the parameters of our model by optimizing the ELBO:

$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{z \sim q_\phi(z)}\!\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\!\left(q_\phi(z) \,\|\, p(z)\right)$$

The KL-divergence $\mathrm{KL}(q_\phi(z) \,\|\, p(z))$ is defined as:

$$\mathrm{KL}\!\left(q_\phi(z) \,\|\, p(z)\right) = \int_{\mathbb{R}} q_\phi(z) \log \frac{q_\phi(z)}{p(z)}\, dz \tag{2.1}$$

Assume the prior is the log-normal distribution $p(z) = \log\mathcal{N}(0, 1)$ and the variational distribution is parametrized as a normal distribution $q_\phi(z) = \mathcal{N}(\mu, \sigma^2)$. What problem arises when optimizing the ELBO, and how can you resolve it?
Since $p(z)$ will be zero for $z \leq 0$ while $q_\phi(z) > 0$, the KL-divergence term diverges to $\infty$, which prevents gradient-based optimization.
If we instead parametrize $q_\phi(z) = \log\mathcal{N}(\mu, \sigma^2)$, both arguments of the KL-divergence have the same support, ensuring finite values. By employing the reparametrization scheme of c), we can backpropagate through sampling from $q_\phi(z)$.
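A minimal PyTorch sketch (illustration only) of how the reparametrization from c) lets gradients flow into the variational parameters:

```python
# Minimal sketch of a reparametrized log-normal sample: gradients flow into
# mu and log_sigma, so the ELBO can be optimized with gradient descent.
import torch

mu = torch.tensor(0.3, requires_grad=True)
log_sigma = torch.tensor(-0.5, requires_grad=True)

eps = torch.randn(1024)                         # eps ~ N(0, 1), no gradient needed
z = torch.exp(log_sigma.exp() * eps + mu)       # z ~ log N(mu, sigma^2), differentiable

# Any downstream loss (here just a placeholder) can be backpropagated.
loss = z.mean()
loss.backward()
print(mu.grad, log_sigma.grad)                  # finite, non-zero gradients
```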
Problem 3 Generative Adversarial Networks (8 credits)
For $\pi = \frac{1}{2}$, GANs are trained by optimizing the model parameters $\theta$ according to

$$\min_\theta \max_\phi \;\; \underbrace{\tfrac{1}{2}\, \mathbb{E}_{p^*(x)}\!\left[\log D_\phi(x)\right]}_{E_1} + \underbrace{\tfrac{1}{2}\, \mathbb{E}_{p(z)}\!\left[\log\!\left(1 - D_\phi(f_\theta(z))\right)\right]}_{E_2}.$$
a) Explain what the terms $E_1$ and $E_2$ reward, and why discriminator and generator are adversaries.
• E1 : Rewards the discriminator for recognizing samples from the data distribution
• E2 : Rewards the discriminator for rejecting samples from the generated distribution and the
generator for fooling the discriminator
• Discriminator and generator are adversaries because they optimize the same objective in opposite directions
b) Show that the loss

$$L = \max_\phi \;\; \tfrac{1}{2}\, \mathbb{E}_{p^*(x)}\!\left[\log D_\phi(x)\right] + \tfrac{1}{2}\, \mathbb{E}_{p(z)}\!\left[\log\!\left(1 - D_\phi(f_\theta(z))\right)\right]$$

from the GAN objective is equivalent to the Jensen-Shannon divergence (JSD) between the data distribution $p^*$ and the learned, generated distribution $p_\theta$, i.e.

$$L = \mathrm{JSD}(p^*, p_\theta) + c$$

for some constant $c \in \mathbb{R}$ that does not depend on $p^*$ or $\theta$. The JSD between two probability densities $p$ and $q$ is defined as

$$\mathrm{JSD}(p, q) = \tfrac{1}{2}\left(\mathrm{KL}\!\left(p \,\big\|\, \tfrac{1}{2}(p + q)\right) + \mathrm{KL}\!\left(q \,\big\|\, \tfrac{1}{2}(p + q)\right)\right)$$

Hint: For GANs, it holds for functions $h$ that

$$\mathbb{E}_{p(z)}\!\left[h(f_\theta(z))\right] = \mathbb{E}_{p_\theta(x)}\!\left[h(x)\right].$$
The inner maximization over $\phi$ is attained by the optimal discriminator

$$D_\phi^*(x) = \frac{p^*(x)}{p^*(x) + p_\theta(x)}.$$

Plugging $D_\phi^*$ into the loss and using the hint to rewrite the expectation over $p(z)$ as an expectation over $p_\theta(x)$:

$$L = \tfrac{1}{2}\, \mathbb{E}_{p^*(x)}\!\left[\log \frac{p^*(x)}{p^*(x) + p_\theta(x)}\right] + \tfrac{1}{2}\, \mathbb{E}_{p_\theta(x)}\!\left[\log\!\left(1 - \frac{p^*(x)}{p^*(x) + p_\theta(x)}\right)\right] \tag{3.3}$$

$$= \tfrac{1}{2}\, \mathbb{E}_{p^*(x)}\!\left[\log \frac{p^*(x)}{p^*(x) + p_\theta(x)}\right] + \tfrac{1}{2}\, \mathbb{E}_{p_\theta(x)}\!\left[\log \frac{p_\theta(x)}{p^*(x) + p_\theta(x)}\right] \tag{3.4}$$

$$= \tfrac{1}{2}\, \mathbb{E}_{p^*(x)}\!\left[\log \frac{p^*(x)}{\frac{p^*(x) + p_\theta(x)}{2}}\right] + \tfrac{1}{2}\, \mathbb{E}_{p_\theta(x)}\!\left[\log \frac{p_\theta(x)}{\frac{p^*(x) + p_\theta(x)}{2}}\right] - \log(2) \tag{3.5}$$

$$= \tfrac{1}{2}\, \mathrm{KL}\!\left(p^* \,\Big\|\, \frac{p^* + p_\theta}{2}\right) + \tfrac{1}{2}\, \mathrm{KL}\!\left(p_\theta \,\Big\|\, \frac{p^* + p_\theta}{2}\right) - \log(2) \tag{3.6}$$

$$= \mathrm{JSD}(p^*, p_\theta) - \log(2),$$

which is the JSD up to the constant $c = -\log(2)$ that depends neither on $p^*$ nor on $\theta$.
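A quick numerical check (illustration only) on two discrete toy distributions that the loss with the optimal discriminator indeed equals $\mathrm{JSD}(p^*, p_\theta) - \log 2$:

```python
# Minimal numerical sketch: for two discrete toy distributions, the loss with
# the optimal discriminator equals JSD - log 2.
import numpy as np

p_star = np.array([0.1, 0.2, 0.3, 0.4])      # "data" distribution
p_theta = np.array([0.25, 0.25, 0.25, 0.25]) # "generated" distribution
m = 0.5 * (p_star + p_theta)

kl = lambda p, q: np.sum(p * np.log(p / q))
jsd = 0.5 * (kl(p_star, m) + kl(p_theta, m))

d_opt = p_star / (p_star + p_theta)          # optimal discriminator
loss = 0.5 * np.sum(p_star * np.log(d_opt)) + 0.5 * np.sum(p_theta * np.log(1 - d_opt))

print(np.isclose(loss, jsd - np.log(2)))     # True
```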
Problem 4 Denoising Diffusion (6 credits)
Consider a denoising diffusion model with $N$ diffusion steps and the usual forward parametrization $q_{\phi(x_0)}$ and reverse process $p_\theta$.

$$\alpha_n = 1 - \beta_n \qquad \bar{\alpha}_n = \prod_{i=1}^{n} \alpha_i \qquad \tilde{\beta}_n = \frac{1 - \bar{\alpha}_{n-1}}{1 - \bar{\alpha}_n}\, \beta_n$$

$$q_{\phi(x_0)}(z_n) = \mathcal{N}\!\left(\sqrt{\bar{\alpha}_n}\, x_0,\; (1 - \bar{\alpha}_n)\, I\right)$$

$$q_{\phi(x_0)}(z_{n-1} \mid z_n) = \mathcal{N}\!\left(\frac{\sqrt{\alpha_n}\,(1 - \bar{\alpha}_{n-1})}{1 - \bar{\alpha}_n}\, z_n + \frac{\sqrt{\bar{\alpha}_{n-1}}\, \beta_n}{1 - \bar{\alpha}_n}\, x_0,\; \tilde{\beta}_n I\right)$$

$$x_0 = \frac{z_n - \sqrt{1 - \bar{\alpha}_n}\; \epsilon_\theta(z_n, n)}{\sqrt{\bar{\alpha}_n}}$$
a) Why do we optimize the ELBO instead of the data log-likelihood?
The data log-likelihood $\log p(x)$ requires us to marginalize out the latent variables $z_1, \ldots, z_N$, which is intractable.
b) Why does model training fail if $\beta_n > 1$?
If $\beta_n > 1$, then $\alpha_n = 1 - \beta_n < 0$, so the product $\bar{\alpha}_{n'}$ contains at least one negative factor for $n' \geq n$ and can become negative (its sign may even oscillate). We would then have to take the square root of a negative number, $\sqrt{\bar{\alpha}_{n'}}$, during training when sampling $z_{n'} \sim q_{\phi(x_0)}(z_{n'})$.
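A tiny numerical illustration (not part of the official solution) of how a single $\beta_n > 1$ breaks the forward process:

```python
# Minimal sketch: a single beta_n > 1 makes alpha_bar negative, so
# sqrt(alpha_bar) in the forward process is undefined.
import numpy as np

betas = np.array([0.1, 0.2, 1.5, 0.3])   # beta_3 > 1
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

print(alpha_bar)                          # becomes negative from step 3 on
print(np.sqrt(alpha_bar))                 # nan entries -> training fails
```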
c) Why does the model fail if $\beta_n = 1$?

If $\beta_n = 1$, we get $\alpha_n = 1 - \beta_n = 0$ and therefore $\bar{\alpha}_{n'} = 0$ for $n' \geq n$. In training this would mean that all information from $x_0$ would be lost from the $n$-th step on. During sampling, we would also have to divide by $\sqrt{\bar{\alpha}_n} = 0$ when recovering $x_0$, which is undefined.
d) Which of these beta schedules are invalid? Justify your answer.

1. $\beta_n = \sin\!\left(\frac{n}{N}\right)$
2. $\beta_n = 1 - \frac{1}{n}$
3. $\beta_n = \log_e\!\left(1 + \frac{n}{N}\right)$
4. $\beta_n = -\cos\!\left(\frac{\pi n}{N}\right)$
Schedule 2 is invalid because $\beta_1 = 1 - \frac{1}{1} = 0$. Schedule 4 is invalid because $\beta_n \leq 0$ for $n \leq \frac{N}{2}$, e.g. $\beta_1 = -\cos\!\left(\frac{\pi}{N}\right) < 0$.
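A small script (illustration only) that evaluates the four schedules for, e.g., $N = 10$ and flags values of $\beta_n$ outside $(0, 1)$:

```python
# Minimal sketch: evaluate the four schedules and flag any beta_n outside the
# valid range (0, 1), consistent with parts b) and c).
import numpy as np

N = 10
n = np.arange(1, N + 1)

schedules = {
    "1: sin(n/N)":      np.sin(n / N),
    "2: 1 - 1/n":       1.0 - 1.0 / n,
    "3: log(1 + n/N)":  np.log(1.0 + n / N),
    "4: -cos(pi*n/N)":  -np.cos(np.pi * n / N),
}

for name, betas in schedules.items():
    valid = np.all((betas > 0) & (betas < 1))
    print(f"{name}: valid={valid}")
# Schedule 2 fails (beta_1 = 0) and schedule 4 fails (beta_n <= 0 for n <= N/2).
```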
Additional space for solutions – clearly mark the (sub)problem your answers are related to and strike out invalid solutions.