Assignment 5 (Sol.)
Introduction to Machine Learning
Prof. B. Ravindran
1. You are given the following neural network, which takes two binary-valued inputs x1, x2 ∈ {0, 1} and uses the threshold activation function (h(x) = 1 if x > 0; 0 otherwise). Which of the following logical functions does it compute?
Figure 1: Q1
(a) OR
(b) AND
(c) NAND
(d) None of the above.
Solution: B
You can construct the truth table and check which gate the network mimics:
x1  x2  output
0   0   0
0   1   0
1   0   0
1   1   1
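Since Figure 1 is not reproduced here, the following is a minimal sketch assuming illustrative weights w1 = w2 = 1 and bias −1.5 (one choice that realizes AND); it simply enumerates the truth table above for a single threshold unit.

# Truth table of a single threshold unit (weights/bias are assumed, not taken from Figure 1)
def threshold(z):
    return 1 if z > 0 else 0

w1, w2, b = 1.0, 1.0, -1.5   # assumed parameters that implement AND

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, threshold(w1 * x1 + w2 * x2 + b))   # prints the AND truth table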
2. We have a function which takes a two-dimensional input x = (x1, x2) and has two parameters w = (w1, w2), given by f(x, w) = σ(σ(x1 w1)w2 + x2) where σ(x) = 1/(1 + e^(−x)). We use backpropagation to estimate the right parameter values. We start by setting both the parameters to 0. Assume that we are given a training point x1 = 1, x2 = 0, y = 5. Given this information, answer the next two questions. What is the value of ∂f/∂w2?
(a) 0.5
(b) -0.25
(c) 0.125
(d) -0.5
Solution: C
Write σ(x1 w1)w2 + x2 as o2 and x1 w1 as o1.

∂f/∂w2 = (∂f/∂o2)(∂o2/∂w2)
∂f/∂w2 = σ(o2)(1 − σ(o2)) × σ(o1)
∂f/∂w2 = 0.5 × 0.5 × 0.5 = 0.125
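As a quick numerical cross-check (not part of the original solution), the same value can be recovered by finite differences; the function and data below follow the question's definitions.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def f(x1, x2, w1, w2):
    # f(x, w) = sigma(sigma(x1*w1)*w2 + x2), as defined in the question
    return sigmoid(sigmoid(x1 * w1) * w2 + x2)

x1, x2 = 1.0, 0.0            # training point from the question
w1, w2 = 0.0, 0.0            # both parameters initialised to 0

# Analytic gradient: sigma(o2)*(1 - sigma(o2))*sigma(o1)
o1 = x1 * w1
o2 = sigmoid(o1) * w2 + x2
analytic = sigmoid(o2) * (1 - sigmoid(o2)) * sigmoid(o1)

# Finite-difference estimate of df/dw2
eps = 1e-6
numeric = (f(x1, x2, w1, w2 + eps) - f(x1, x2, w1, w2 - eps)) / (2 * eps)

print(analytic, numeric)     # both are approximately 0.125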
3. If the learning rate is 0.5, what will be the value of w2 after one update using the backpropagation algorithm?
(a) 0.0625
(b) -0.0625
(c) 0.5625
(d) - 0.5625
Solution: C
The update equation would be

w2 = w2 − λ ∂L/∂w2

where L is the loss function, here L = (y − f)².

w2 = w2 − λ × 2(y − f) × (−1) × ∂f/∂w2

Substituting f = σ(o2) = 0.5, y = 5, λ = 0.5 and ∂f/∂w2 = 0.125 from the previous question:

w2 = 0 − 0.5 × 2 × (5 − 0.5) × (−1) × 0.125 = 0.5625
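A minimal sketch of this single update, using the question's squared-error loss and learning rate 0.5 (variable names are mine):

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x1, x2, y = 1.0, 0.0, 5.0
w1, w2 = 0.0, 0.0
lam = 0.5                                   # learning rate

# Forward pass
o1 = x1 * w1
o2 = sigmoid(o1) * w2 + x2
f = sigmoid(o2)

# dL/dw2 for L = (y - f)^2
df_dw2 = sigmoid(o2) * (1 - sigmoid(o2)) * sigmoid(o1)
dL_dw2 = 2 * (y - f) * (-1) * df_dw2

w2 = w2 - lam * dL_dw2
print(w2)                                   # 0.5625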
Solution: C
We will write the log likelihood as the following,

L = Σ_i log( (1/(σ√(2π))) exp(−(xi − µ)²/(2σ²)) )

L = K − Σ_i (xi − µ)²/(2σ²)

where K is a constant that does not depend on µ.
Now we need to maximize this L, which we do by setting ∂L/∂µ to 0; this gives µ = (1/N) Σ_i xi, the sample mean, which is option C.
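A small numerical illustration (with made-up data) that the sample mean indeed maximizes this log likelihood:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=1000)   # synthetic sample (illustrative only)
sigma = 2.0                                     # variance assumed known

def log_likelihood(mu):
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2))

# The sample mean should score at least as well as any nearby candidate.
candidates = np.linspace(x.mean() - 1, x.mean() + 1, 201)
best = candidates[np.argmax([log_likelihood(m) for m in candidates])]
print(x.mean(), best)                           # the two values agree (up to grid resolution)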
5. Continuing with the above question, assume that the prior distribution of the mean is also a Gaussian distribution, but with parameters mean µp and variance σp². Find the MAP estimate of the mean.

(a) µ_MAP = (σ²µp + σp² Σ_{i=1}^{N} xi) / (σ² + Nσp²)
(b) µ_MAP = (σ² + σp² Σ_{i=1}^{N} xi) / (σ² + σp²)
(c) µ_MAP = (σ² + σp² Σ_{i=1}^{N} xi) / (σ² + Nσp²)
(d) µ_MAP = (σ²µp + σp² Σ_{i=1}^{N} xi) / (N(σ² + σp²))
Solution: C

For a MAP estimate, we try to maximize f(µ)f(X|µ):

f(µ)f(X|µ) = (1/(σp√(2π))) exp(−(µ − µp)²/(2σp²)) × Π_i (1/(σ√(2π))) exp(−(xi − µ)²/(2σ²))

We will maximize this with respect to µ after taking a logarithm. This will yield the following equation:

Σ_i xi/σ² + µp/σp² − µ(N/σ² + 1/σp²) = 0

Thus the solution will be C.
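As a numerical cross-check (with synthetic data of my own choosing), the value of µ obtained by solving the equation above can be compared against a direct maximization of the log posterior:

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=1.5, size=50)   # made-up observations
sigma, mu_p, sigma_p = 1.5, 0.0, 1.0          # assumed known variance and prior parameters
N = len(x)

# mu solving the derived equation: (sum(x)/sigma^2 + mu_p/sigma_p^2) / (N/sigma^2 + 1/sigma_p^2)
mu_map = (x.sum() / sigma**2 + mu_p / sigma_p**2) / (N / sigma**2 + 1 / sigma_p**2)

# Brute-force maximization of the (unnormalised) log posterior
def log_posterior(mu):
    return (-(mu - mu_p) ** 2 / (2 * sigma_p**2)
            - np.sum((x - mu) ** 2) / (2 * sigma**2))

grid = np.linspace(-5, 5, 100001)
mu_grid = grid[np.argmax([log_posterior(m) for m in grid])]

print(mu_map, mu_grid)                        # the two estimates agree to grid resolution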
6. Which of the following statement(s) about parameter estimation is/are true?

(a) MAP estimates suffer more from overfitting than maximum likelihood estimates.
(b) MAP estimates are equivalent to the ML estimates when the prior used in the MAP is a
uniform prior over the parameter space.
(c) One drawback of maximum likelihood estimation is that in some scenarios (hint: multinomial distribution), it may return probability estimates of zero.
(d) The parameter which minimizes the expected Bayesian L1 loss is the median of the posterior distribution.
Solution: B, C, D
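To make statement (d) concrete, here is a small simulation (my own illustrative setup) showing that the value minimizing the expected L1 loss over a stand-in posterior is its median:

import numpy as np

rng = np.random.default_rng(2)
theta = rng.gamma(shape=2.0, scale=1.0, size=100000)   # stand-in for samples from a posterior

def expected_l1(a):
    return np.mean(np.abs(theta - a))

candidates = np.linspace(0, 10, 2001)
best = candidates[np.argmin([expected_l1(a) for a in candidates])]

print(np.median(theta), best)   # the minimizer coincides with the median (up to grid resolution)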
7. Using the notation used in class and the tutorial document, evaluate the output of the neural network with a 3-3-1 architecture (2-dimensional input with 1 node for the bias term in both layers). The parameters are as follows:

α = [  1   0.2   0.4
      −1   0.3   0.5 ]

β = [ 0.3   0.4   0.5 ]

Using the sigmoid function as the activation function at both layers, the output of the network for an input of (0.8, 0.7) will be
(a) 0.6710
(b) 0.6617
(c) 0.6948
(d) 0.3369
Solution: C

This is a straightforward computation task. First pad x with 1 and make it the X vector,

X = [1, 0.8, 0.7]ᵀ

o1 = αX
a1 = [1, σ(o1)]ᵀ (the hidden-layer activations, padded with the bias term)
o2 = βa1
a2 = 1/(1 + e^(−o2))
a2 = 0.6948
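A minimal sketch of this forward pass (array shapes follow the α and β given above):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

alpha = np.array([[ 1.0, 0.2, 0.4],
                  [-1.0, 0.3, 0.5]])
beta = np.array([0.3, 0.4, 0.5])

X = np.array([1.0, 0.8, 0.7])              # input (0.8, 0.7) padded with the bias 1

o1 = alpha @ X                             # hidden-layer pre-activations
a1 = np.concatenate(([1.0], sigmoid(o1)))  # hidden activations, padded with the bias
o2 = beta @ a1
a2 = sigmoid(o2)

print(round(float(a2), 4))                 # 0.6948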
Solution: A, B, C

A - Neural networks are also called universal approximators because of their ability to learn complex functions by varying the number of layers and nodes.

B - The decision from any SVM is given by ŷ = sgn(Σ_i αi K(xi, x) + b), where the xi are the support vectors and K is the Gaussian kernel. This can be implemented using an RBF neural network. The first layer would be the input layer. The second layer would be the radial basis nodes, with as many nodes as there are support vectors in the SVM, and a single node in the final layer. The centers of the Gaussian basis functions would be the support vectors of the SVM, and their σ would be the same as that of the kernel. The weights connecting the hidden layer to the last layer would be given by the αi and a bias b. The activation function for the last layer would be the sgn function.

C - This is true because bad initializations might hinder the learning of the neural network; for example, if you use all zeros, the network might not be able to learn anything because of zero gradients.
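A sketch of the construction described in B, using hypothetical support vectors, coefficients αi, bias b, and kernel width γ (none of these come from the question); it simply shows that the RBF-network forward pass reproduces the SVM decision function:

import numpy as np

# Hypothetical SVM pieces (illustrative values only)
support_vectors = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
alphas = np.array([0.7, -0.4, 1.1])    # signed coefficients alpha_i
b = -0.2
gamma = 0.5                            # Gaussian (RBF) kernel width parameter

def rbf_kernel(u, v):
    return np.exp(-gamma * np.sum((u - v) ** 2))

def svm_decision(x):
    s = sum(a * rbf_kernel(sv, x) for a, sv in zip(alphas, support_vectors))
    return np.sign(s + b)

def rbf_network(x):
    # Hidden layer: one Gaussian basis node per support vector (centers = support vectors)
    hidden = np.array([rbf_kernel(c, x) for c in support_vectors])
    # Output layer: weights alpha_i, bias b, sgn activation
    return np.sign(hidden @ alphas + b)

x = np.array([0.3, 0.8])
print(svm_decision(x), rbf_network(x))   # identical by construction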