DL - Assignment 10 Solution
Deep Learning
Assignment- Week 10
TYPE OF QUESTION: MCQ/MSQ
Number of questions: 10 Total mark: 10 X 1 = 10
______________________________________________________________________________
QUESTION 1:
A neural network has 3 neurons in a hidden layer. Activations of the neurons for three batch
samples are [1, 2, 3], [0, 2, 5], [6, 9, 2] respectively. What will be the value of the mean if we
use batch normalization in this layer?
a. [2.33, 4.33, 3.33]
b. [2.00, 2.33, 5.66]
c. [1.00, 1.00, 1.00]
d. [0.00, 0.00, 0.00]
Correct Answer: a
Detailed Solution:
(1/3) × ([1, 2, 3] + [0, 2, 5] + [6, 9, 2]) = [2.33, 4.33, 3.33]
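This per-neuron mean can be checked with a short NumPy sketch (the array values are taken directly from the question; batch normalization averages each neuron's activation across the batch axis):

```python
import numpy as np

# Activations of the 3 hidden neurons for the three batch samples
# (one row per sample, one column per neuron).
activations = np.array([
    [1.0, 2.0, 3.0],   # sample 1
    [0.0, 2.0, 5.0],   # sample 2
    [6.0, 9.0, 2.0],   # sample 3
])

# Batch normalization computes the mean per neuron across the batch axis.
batch_mean = activations.mean(axis=0)
print(np.round(batch_mean, 2))  # [2.33 4.33 3.33]
```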
______________________________________________________________________________
QUESTION 2:
Given two neural networks, Neural Network A (NNA) and Neural Network B (NNB). Both
networks accept a 64-D vector as input; their configurations are as follows.
Both networks have the same weights for all layers (except for the bias term, which is absent in
neural network B). A batch of ten 64-D vectors is applied to both Neural Network A and
Neural Network B. Choose the correct statement:
a. Output of NNA is different from that of NNB as there is no Bias vector in FC layer
of NNB
b. Output of NNA and NNB is same
c. It is indeterminable whether outputs will be same or different
d. None of the above
Correct Answer: b
Detailed Solution:
The bias term gets cancelled out by the batch normalization layer, hence the outputs are the
same, i.e. BatchNorm(Wx + b) = BatchNorm(Wx), since mean(Wx + b) = mean(Wx) + b and
the standard deviation is unaffected by adding a constant to every element.
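The cancellation can be verified numerically. The sketch below uses random weights and a random batch of ten 64-D vectors (all values are arbitrary placeholders), with a minimal batch-norm function assuming gamma = 1 and beta = 0:

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(z, eps=1e-5):
    """Normalize each feature across the batch (gamma=1, beta=0)."""
    mean = z.mean(axis=0)
    var = z.var(axis=0)
    return (z - mean) / np.sqrt(var + eps)

W = rng.standard_normal((64, 64))
b = rng.standard_normal(64)
x = rng.standard_normal((10, 64))   # batch of ten 64-D vectors

out_with_bias = batch_norm(x @ W.T + b)
out_without_bias = batch_norm(x @ W.T)

# The constant bias shifts the batch mean by exactly b, so it cancels.
print(np.allclose(out_with_bias, out_without_bias))  # True
```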
QUESTION 3:
While training a neural network for an image recognition task, we plot the graphs of training
error and validation error. Which point is the best for early stopping?
a. A
b. B
c. C
d. D
Correct Answer: c
Detailed Solution:
Minimum validation point is the best for early stopping.
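Picking the minimum of the validation curve can be sketched in a few lines. The error values below are hypothetical, for illustration only:

```python
# Early stopping: choose the epoch where validation error is minimum
# (hypothetical validation-error curve for illustration).
val_errors = [0.90, 0.55, 0.40, 0.35, 0.38, 0.45, 0.60]

best_epoch = min(range(len(val_errors)), key=lambda e: val_errors[e])
print(best_epoch, val_errors[best_epoch])  # 3 0.35
```

In practice this is implemented with a "patience" counter: training stops after the validation error has failed to improve for a fixed number of epochs, and the weights from the best epoch are restored.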
______________________________________________________________________________
NPTEL Online Certification Courses
Indian Institute of Technology Kharagpur
QUESTION 4:
Which among the following is NOT a data augmentation technique?
Correct Answer: b
Detailed Solution:
Randomly shuffling all the pixels of an image distorts it beyond recognition, and the neural
network would be unable to learn anything from it. So it is not a data augmentation technique.
______________________________________________________________________________
QUESTION 5:
Batch Normalization is helpful because
Correct Answer: a
Detailed Solution:
The batch normalization layer normalizes its input.
______________________________________________________________________________
QUESTION 6:
A Batch Norm layer accepts a batch of 128-D vectors. How many parameters of the Batch
Norm layer get trained via backpropagation during the course of training?
a. 256
b. 512
c. 128
d. 1024
Correct Answer: a
Detailed Solution:
Both gamma (scale) and beta (shift) are applied channel-wise, so each is 128-dimensional,
giving 128 + 128 = 256 trainable parameters.
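The parameter count follows directly from the layer's definition, as this minimal sketch shows (variable names are illustrative):

```python
# A BatchNorm layer over D-dimensional inputs learns one gamma (scale)
# and one beta (shift) per feature: 2 * D trainable parameters in total.
D = 128
gamma = [1.0] * D   # scale, initialized to 1
beta = [0.0] * D    # shift, initialized to 0

num_trainable = len(gamma) + len(beta)
print(num_trainable)  # 256
```

Note that the running mean and variance are also per-feature buffers, but they are updated from batch statistics, not trained via backpropagation, so they do not count here.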
____________________________________________________________________________
QUESTION 7:
Which of the following is a regularization method?
a. Data augmentation
b. Dropout
c. Weight decay
d. All of the above
Correct Answer: d
Detailed Solution:
Data augmentation, dropout, and weight decay all act to reduce overfitting, so all of them are
regularization methods.
______________________________________________________________________________
QUESTION 8:
Which one of the following regularization methods induces sparsity among the trained
weights?
a. 𝐿1 regularizer
b. 𝐿2 regularizer
c. Both 𝐿1 & 𝐿2
d. None of the above
Correct Answer: a
Detailed Solution:
https://developers.google.com/machine-learning/crash-course/regularization-for-sparsity/l1-regularization
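The sparsity effect can be demonstrated on a toy regression problem. The sketch below (all data and hyperparameters are arbitrary choices for illustration) fits the same least-squares objective with an L1 proximal step (soft-thresholding, as in ISTA) versus plain L2 weight decay; L1 drives the weights of the uninformative features to exactly zero, while L2 only shrinks them:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 10))
true_w = np.zeros(10)
true_w[[0, 3]] = [2.0, -1.5]           # only 2 of 10 features are informative
y = X @ true_w + 0.01 * rng.standard_normal(50)

def fit(reg, lam=0.1, lr=0.01, steps=5000):
    w = np.zeros(10)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        if reg == "l2":
            w -= lr * (grad + lam * w)  # weight decay: shrinks, never zeroes
        else:
            w -= lr * grad
            # L1 proximal step: soft-threshold sets small weights exactly to 0
            w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w

w_l1 = fit("l1")
w_l2 = fit("l2")
print("exact zeros, L1:", int((np.abs(w_l1) < 1e-6).sum()))
print("exact zeros, L2:", int((np.abs(w_l2) < 1e-6).sum()))
```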
____________________________________________________________________________
QUESTION 9:
How do we generally calculate mean and variance during testing?
Correct Answer: c
Detailed Solution:
We calculate batch mean and variance statistics during training (typically as running averages
over minibatches) and use these stored estimates during testing.
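A common way to maintain these estimates is an exponential moving average over minibatch statistics, as in this sketch (the momentum value, batch shape, and the simulated activation distribution are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# During training, keep exponential moving averages of batch statistics.
momentum = 0.9
running_mean = np.zeros(4)
running_var = np.ones(4)

for _ in range(1000):
    batch = rng.normal(loc=3.0, scale=2.0, size=(32, 4))  # simulated activations
    running_mean = momentum * running_mean + (1 - momentum) * batch.mean(axis=0)
    running_var = momentum * running_var + (1 - momentum) * batch.var(axis=0)

# At test time, normalize with the stored estimates instead of batch statistics,
# so a single sample can be processed without a batch.
def bn_inference(x, eps=1e-5):
    return (x - running_mean) / np.sqrt(running_var + eps)

print(np.round(running_mean, 1))  # close to the true mean, [3. 3. 3. 3.]
```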
____________________________________________________________________________
QUESTION 10:
Two variant training schedules sample their minibatches in the following manner:
Training Schedule 1
Training Schedule 2
The output activations of each corresponding image are compared across Training Schedule 1
and Training Schedule 2 for a CNN with batch norm layers. Choose the correct statement:
Correct Answer: b
Detailed Solution:
Since the minibatches constructed differ across the two training schedules, the mini-batch
statistics will also differ, and therefore the activation outputs of corresponding images will be
different.
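The effect can be seen directly: normalizing the same image inside two different minibatches yields different activations, because the batch mean and variance depend on the batch composition. In this sketch the "images" are random feature vectors and the two index sets stand in for the two schedules (all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
images = rng.standard_normal((8, 4))   # 8 "images", 4 features each

def batch_norm(z, eps=1e-5):
    return (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)

# Schedule 1 puts image 0 in a batch with images 1-3;
# Schedule 2 puts the same image 0 with images 4-6.
out1 = batch_norm(images[[0, 1, 2, 3]])
out2 = batch_norm(images[[0, 4, 5, 6]])

# Image 0's normalized activation differs across the two schedules.
print(np.allclose(out1[0], out2[0]))  # False
```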
____________________________________________________________________________
************END*******