DSAIT4115 Deep Reinforcement Learning Exercise Sheet 0
The following exercises do not have to be submitted as homework, but they may help you practice
the required math and prepare for the exam. Some questions are taken from old exams and include
the rubric that was used. You will not receive points for these questions. Solutions are
available on Brightspace.
(a) Determine the parameter c ∈ R such that p(x) is indeed a probability density.
(b) Determine the expected value $\mu := \mathbb{E}_p[x]$.
(c) Determine the variance of x, Ep [(x − µ)2 ].
Prove that the variance of the empirical mean $f_n := \frac{1}{n}\sum_{i=1}^{n} x_i$, based on $n$ samples $x_i \in \mathbb{R}$ drawn
i.i.d. from the Gaussian distribution $\mathcal{N}(\mu, \sigma^2)$, is $\mathbb{V}[f_n] = \frac{\sigma^2}{n}$, without using the fact that the variance of a
sum of independent variables is the sum of the variables' variances.
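If you want to sanity-check your proof numerically, the following Monte-Carlo sketch estimates $\mathbb{V}[f_n]$ by simulation (the values of µ, σ and n below are arbitrary choices, not given by the exercise):

```python
import numpy as np

# Arbitrary illustration values; they are not part of the exercise.
mu, sigma, n, trials = 1.5, 2.0, 25, 200_000
rng = np.random.default_rng(0)

# Draw many independent data sets of n i.i.d. samples each and compute
# the empirical mean of every data set.
samples = rng.normal(mu, sigma, size=(trials, n))
f_n = samples.mean(axis=1)

print("empirical V[f_n]:", f_n.var())       # should be close to sigma**2 / n
print("sigma^2 / n     :", sigma**2 / n)    # = 0.16 for these values
```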
Let $\{x_i\}_{i=1}^{n}$ be a data set that is drawn i.i.d. from the Gaussian distribution $x_i \sim \mathcal{N}(\mu, \sigma^2)$. Let further
$\hat\mu := \frac{1}{n}\sum_{i=1}^{n} x_i$ denote the empirical mean and $\hat\sigma^2 := \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat\mu)^2$ the equivalent empirical
variance. Prove analytically that $\hat\mu$ is unbiased, i.e. $\mathbb{E}[\hat\mu] = \mu$, and that $\hat\sigma^2$ is biased, i.e. $\mathbb{E}[\hat\sigma^2] \neq \sigma^2$.
Bonus-question: Can you derive an unbiased estimator for the empirical variance?
Hint: If $x_i$ and $x_j$ are drawn i.i.d. from $\mathcal{N}(\mu, \sigma^2)$, then the following holds ∀ i:
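A small simulation sketch to check the bias claim numerically (again with arbitrary values for µ, σ and n); the gap it shows between $\mathbb{E}[\hat\sigma^2]$ and $\sigma^2$ is exactly what the bonus question asks you to remove:

```python
import numpy as np

mu, sigma, n, trials = 0.0, 1.0, 5, 500_000   # arbitrary illustration values
rng = np.random.default_rng(0)

x = rng.normal(mu, sigma, size=(trials, n))
mu_hat = x.mean(axis=1, keepdims=True)                 # empirical mean per data set
sigma2_hat = ((x - mu_hat) ** 2).mean(axis=1)          # empirical variance as defined above

print("mean of sigma2_hat:", sigma2_hat.mean())        # clearly below sigma**2 for small n
print("sigma^2           :", sigma**2)
```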
This question is designed to practice the use of Kronecker-delta functions and to become more familiar
with (discrete) probabilities. You are given 3 dice, a D6, a D8 and a D10, where Dx refers to an x-sided
fair die whose x sides are uniquely numbered 1 to x and are each rolled with the exact same probability.
(a) Prove analytically that the probability that the D6 is among the highest (including equal) numbers
when all 3 dice are rolled together is roughly ρ ≈ 19%.
(b) Prove analytically that the probability that the D8 rolls among the highest is ρ′ ≈ 38%.
(c) Prove analytically that the probability that the D10 rolls among the highest is ρ′′ ≈ 58%.
Hint: You can solve the question however you want, but you are encouraged to use Kronecker-deltas,
e.g. δ(i > 5) is 1 if i > 5 and 0 otherwise. You will find that this can simplify complex sums
enormously. If you do so, you can use the equalities $\sum_{i=1}^{n} i = \frac{n^2+n}{2}$ and $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$.
Bonus-question: Why don’t the above numbers sum up to 1?
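Since the sample space has only 6 · 8 · 10 outcomes, you can cross-check all three probabilities (and the bonus question) by brute-force enumeration; a minimal sketch:

```python
from itertools import product

sides = [6, 8, 10]                                      # D6, D8, D10
outcomes = list(product(*[range(1, s + 1) for s in sides]))

probs = []
for idx in range(len(sides)):
    # Count the outcomes in which die `idx` is among the highest numbers (ties included).
    wins = sum(1 for o in outcomes if o[idx] == max(o))
    probs.append(wins / len(outcomes))

print("D6 :", round(probs[0], 4))    # ~ 0.19
print("D8 :", round(probs[1], 4))    # ~ 0.38
print("D10:", round(probs[2], 4))    # ~ 0.58
print("sum:", round(sum(probs), 4))  # compare with the bonus question
```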
Implement the MNIST classification example from the lecture slides. Make sure you use the correct
deep CNN model architecture from Lecture 2 (p.18).
(a) Train the model $f_\theta : \mathbb{R}^{28\times 28} \to \mathbb{R}^{10}$ from the lecture slides with a cross-entropy loss for 10 epochs.
Plot the average train/test losses during each epoch (y-axis) over all epochs (x-axis). Do the same
with the average train/test accuracies during each epoch. Try to program as modularly as possible, as
you will reuse the code later (a minimal sketch of such a modular setup is given after part (c)).
(b) Change your optimization criterion to a mean-squared-error loss between the same model architecture
$f_\theta : \mathbb{R}^{28\times 28} \to \mathbb{R}^{10}$ you used in (a) and a one-hot encoding ($h_i \in \mathbb{R}^{10}$, $h_{ij} = 1$ iff $j = y_i$,
otherwise $h_{ij} = 0$) of the labels $y_i$:
$$\mathcal{L} := \frac{1}{n} \sum_{i=1}^{n} \big\| f_\theta(x_i) - h_i \big\|^2$$
Plot the same plots as in (a). Try to reuse as much of your old code as possible, e.g., by defining
the criterion (which is now different) as an external function that can be overwritten.
(c) Define a new architecture $f'_\theta : \mathbb{R}^{28\times 28} \to \mathbb{R}$, that is exactly the same as above, but with only one
output neuron instead of 10. Train it with a regression mean-squared-error loss between the model
output and the scalar class identifier:
$$\mathcal{L}' := \frac{1}{n} \sum_{i=1}^{n} \big( f'_\theta(x_i) - y_i \big)^2$$
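As referenced in part (a), here is a minimal sketch of such a modular setup in PyTorch. The network below is only a placeholder, since the CNN architecture from the Lecture 2 slides is not reproduced here; the helper names (`one_hot_mse`, `run_epoch`), the optimizer and the batch sizes are likewise arbitrary choices rather than part of the exercise:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def one_hot_mse(output, target):
    """MSE between the 10-dimensional output and a one-hot encoding of the labels (part b)."""
    h = nn.functional.one_hot(target, num_classes=10).float()
    return ((output - h) ** 2).sum(dim=1).mean()

def run_epoch(model, loader, criterion, optimizer=None):
    """One pass over `loader`; trains if an optimizer is given, otherwise only evaluates."""
    total_loss, correct = 0.0, 0
    with torch.set_grad_enabled(optimizer is not None):
        for x, y in loader:
            out = model(x)
            loss = criterion(out, y)
            if optimizer is not None:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * len(y)
            correct += (out.argmax(dim=1) == y).sum().item()
    return total_loss / len(loader.dataset), correct / len(loader.dataset)

# Placeholder model -- replace this with the CNN architecture from Lecture 2, p.18.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()   # part (a); swap in `one_hot_mse` for part (b)

tfm = transforms.ToTensor()
train_loader = DataLoader(datasets.MNIST("data", train=True, download=True, transform=tfm),
                          batch_size=128, shuffle=True)
test_loader = DataLoader(datasets.MNIST("data", train=False, download=True, transform=tfm),
                         batch_size=256)

history = []
for epoch in range(10):
    train_loss, train_acc = run_epoch(model, train_loader, criterion, optimizer)
    test_loss, test_acc = run_epoch(model, test_loader, criterion)
    history.append((train_loss, train_acc, test_loss, test_acc))
    print(epoch, train_loss, train_acc, test_loss, test_acc)
```

The `history` list collects the per-epoch values, so the loss/accuracy curves requested in (a) can be plotted from it (e.g. with matplotlib). For part (c) the accuracy bookkeeping in `run_epoch` would have to be adapted, since the scalar-output model has no argmax over classes.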
Let $\{y_t\}_{t=1}^{\infty}$ be an infinite training set drawn i.i.d. from the Gaussian distribution $\mathcal{N}(\mu, \sigma^2)$. At time $t$, the
online estimate $f_t$ of the average over the training set, starting at $f_0$, is defined as
$$f_t := (1 - \alpha)\, f_{t-1} + \alpha\, y_t$$
with learning rate $\alpha$.
(a) Show that for small t the online estimate is biased, i.e., E[ft ] ̸= µ .
(b) Prove that in the limit t → ∞ the online estimate is unbiased, i.e., E[ft ] = µ .
(c) Prove that in the limit $t \to \infty$ the variance of the online estimate is $\mathbb{E}[f_t^2] - \mathbb{E}[f_t]^2 = \frac{\alpha\,\sigma^2}{2 - \alpha}$.
Hint: You can use the geometric series $\sum_{k=0}^{t-1} r^k = \frac{1 - r^t}{1 - r}$, ∀ |r| < 1.
Bonus-question: Prove that for the decaying learning rate $\alpha_t = \frac{\alpha}{1 - (1-\alpha)^t}$ it holds that $\lim_{\alpha \to 0} \alpha_t = \frac{1}{t}$.
Hint: You can also use the binomial identity $(x + y)^t = \sum_{k=0}^{t} \binom{t}{k}\, x^k y^{t-k}$.
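A simulation sketch of the online estimate (using the update rule defined above; µ, σ, α and f₀ are arbitrary illustration values) that you can use to check your results for (b) and (c):

```python
import numpy as np

mu, sigma, alpha, f0 = 2.0, 1.0, 0.1, 0.0    # arbitrary illustration values
t_max, trials = 500, 100_000
rng = np.random.default_rng(0)

y = rng.normal(mu, sigma, size=(trials, t_max))
f = np.full(trials, f0)
for t in range(t_max):
    f = (1 - alpha) * f + alpha * y[:, t]    # online update of the estimate

print("E[f_t] ~", f.mean(), "  (mu =", mu, ")")
print("V[f_t] ~", f.var(), "  (alpha*sigma^2/(2-alpha) =", alpha * sigma**2 / (2 - alpha), ")")
```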
Let $\{x_i\}_{i=1}^{n} \subset \mathbb{R}^m$ denote a set of training samples and $\{y_i\}_{i=1}^{n} \subset \mathbb{R}$ the set of corresponding training
labels. We will use the mean squared loss $\mathcal{L} := \frac{1}{n} \sum_i (f(x_i) - y_i)^2$ to learn a function $f(x_i) \approx y_i$, ∀ i.
(a) Derive the analytical solution for parameter a ∈ Rm of a linear function f (x) := a⊤ x.
(b) We will now augment the training data by adding i.i.d. noise ϵi ∼ N (0, σ 2 ) ∈ R to the training
labels, i.e. ỹi := yi + ϵi . Show that this does not change the analytical solution of the expected loss
E[L].
(c) Let $f$ denote the function that minimizes $\mathcal{L}$ without label-noise, and let $\tilde f$ denote the function that
minimizes $\mathcal{L}$ with random noise $\epsilon_i$ added to the labels $y_i$ (but not the solution of the expected loss
$\mathbb{E}[\mathcal{L}]$). Derive the analytical variance $\mathbb{E}[(\tilde f(x) - f(x))^2]$ of the noisy solution $\tilde f$.
(d) We will now augment the training data by adding i.i.d. noise ϵi ∼ N (0, Σ) ∈ Rm to the training
samples: x̃i = xi + ϵi . Derive the analytical solution for parameter a ∈ Rm that minimizes the
expected loss E[L].
Bonus-question: Which popular regularization method is equivalent to (d) and what problem is solved?
Hint: Summarize all training samples into matrix X = [x1 , . . . , xn ]⊤ ∈ Rn×m , all training labels into
vector y = [y1 , . . . , yn ]⊤ ∈ Rn , and denote the noisy versions ỹ ∈ Rn and X̃ ∈ Rn×m .
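Once you have derived the closed-form solution for (a), you can compare it against a numerical least-squares fit on random data; a minimal sketch following the hint's matrix notation (all sizes and data below are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 3                                  # arbitrary illustration sizes
X = rng.normal(size=(n, m))                    # rows are the training samples x_i
a_true = rng.normal(size=m)
y = X @ a_true + 0.1 * rng.normal(size=n)      # labels with a little noise

# Numerical minimizer of L = (1/n) * ||X a - y||^2, to compare against the
# analytical solution you derive in (a).
a_numeric, *_ = np.linalg.lstsq(X, y, rcond=None)
print("numerical a:", a_numeric)
```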
https://xkcd.com/2343