Assignment 10 2024
Deep Learning
Assignment - Week 10
TYPE OF QUESTION: MCQ/MSQ
Number of questions: 10    Total marks: 10 × 1 = 10
______________________________________________________________________________
QUESTION 1:
Correct Answer: b
Detailed Solution:
BN uses running exponential-average statistics to estimate the mean and standard deviation at test
time, whereas at training time it uses the batch mean and standard deviation in its normalization expression.
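A minimal NumPy sketch of this behaviour (the momentum value, the 1e-5 epsilon, and the toy data are illustrative assumptions, not taken from the question):

```python
import numpy as np

momentum = 0.1          # typical default; an assumption, not from the question
running_mean, running_var = 0.0, 1.0

def bn_train_step(x):
    """Training: normalize with this batch's own statistics and
    update the running (exponential moving average) estimates."""
    global running_mean, running_var
    mu, var = x.mean(), x.var()
    running_mean = (1 - momentum) * running_mean + momentum * mu
    running_var = (1 - momentum) * running_var + momentum * var
    return (x - mu) / np.sqrt(var + 1e-5)

def bn_test_step(x):
    """Testing: reuse the stored running statistics instead."""
    return (x - running_mean) / np.sqrt(running_var + 1e-5)

for _ in range(200):                          # simulate training mini-batches
    bn_train_step(np.random.randn(32) * 2 + 3)
print(running_mean, running_var)              # approach roughly 3 and 4
```
______________________________________________________________________________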
QUESTION 2:
A neural network has 3 neurons in a hidden layer. Activations of the neurons for three batches
are $\begin{bmatrix}4\\4\\3\end{bmatrix}$, $\begin{bmatrix}0\\2\\5\end{bmatrix}$, $\begin{bmatrix}8\\9\\1\end{bmatrix}$ respectively. What will be the value of the mean if we use batch normalization in
this layer?
a. $\begin{bmatrix}4\\5\\3\end{bmatrix}$
b. $\begin{bmatrix}2\\2\\5\end{bmatrix}$
c. $\begin{bmatrix}1\\1\\1\end{bmatrix}$
d. $\begin{bmatrix}0\\0\\0\end{bmatrix}$
Correct Answer: a
Detailed Solution:
$\frac{1}{3}\left(\begin{bmatrix}4\\4\\3\end{bmatrix} + \begin{bmatrix}0\\2\\5\end{bmatrix} + \begin{bmatrix}8\\9\\1\end{bmatrix}\right) = \begin{bmatrix}4\\5\\3\end{bmatrix}$
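The same computation in a few lines of NumPy, with the three activation vectors entered as given in the question:

```python
import numpy as np

# Activations of the 3 hidden neurons for the three batch samples
a1 = np.array([4, 4, 3])
a2 = np.array([0, 2, 5])
a3 = np.array([8, 9, 1])

mean = (a1 + a2 + a3) / 3
print(mean)   # [4. 5. 3.] -> option (a)
```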
______________________________________________________________________________
QUESTION 3:
How can we prevent underfitting?
Correct Answer: b
Detailed Solution:
Underfitting happens whenever a model has too few parameters to capture the data distribution.
We need to increase the number of parameters so that the data can be fitted well.
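A small illustrative sketch (the quadratic toy data and the polynomial models are assumptions for illustration only): a 2-parameter line underfits quadratic data, while adding one more parameter lets the model fit it.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 2 * x**2 + 0.1 * rng.standard_normal(50)    # quadratic ground truth

for degree in (1, 2):                            # 2 vs. 3 parameters
    coeffs = np.polyfit(x, y, degree)
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(degree, train_err)   # degree 1 underfits: large training error
```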
______________________________________________________________________________
QUESTION 4:
How do we generally calculate mean and variance during testing?
Correct Answer: c
Detailed Solution:
We calculate batch mean and variance statistics during training, and during testing we use the
mean and variance estimated from those training batches (typically as running averages).
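In PyTorch (used here purely as an illustration), this switch is controlled by the module's training mode:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3)        # tracks running mean/var by default
x = torch.randn(8, 3)

bn.train()                    # training mode: normalizes with batch statistics
y_train = bn(x)               # and updates running_mean / running_var

bn.eval()                     # test mode: uses the stored estimates instead
y_test = bn(x)

print(bn.running_mean, bn.running_var)
```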
______________________________________________________________________________
QUESTION 5:
Which one of the following is an advantage of dropout?
a. Regularization
b. Prevent Overfitting
c. Improve Accuracy
d. All of the above
Correct Answer: d
Detailed Solution:
Dropout drops random activations during training to prevent overfitting and over-reliance on
very few features.
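A minimal PyTorch sketch (the tensor of ones is an illustrative input): dropout zeroes activations only in training mode and rescales the survivors, becoming the identity at test time.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # drop each activation with probability 0.5
x = torch.ones(1, 10)

drop.train()
print(drop(x))   # about half the entries zeroed, survivors scaled by 1/(1-p) = 2

drop.eval()
print(drop(x))   # identity at test time: all ones
```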
______________________________________________________________________________
QUESTION 6:
Which of the following is True regarding layer normalization and batch normalization?
d. None of these
Correct Answer: a
Detailed Solution:
Layer normalization normalizes across the features of each individual sample, whereas batch
normalization normalizes each feature across the mini-batch; see the lectures/lecture materials.
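The difference is just the axis over which the statistics are taken, as in this NumPy sketch (the 4x3 random input is an illustrative assumption):

```python
import numpy as np

x = np.random.randn(4, 3)    # 4 samples in the batch, 3 features each
eps = 1e-5

# Batch norm: each feature is normalized across the batch (axis 0)
bn = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

# Layer norm: each sample is normalized across its features (axis 1)
ln = (x - x.mean(axis=1, keepdims=True)) / np.sqrt(x.var(axis=1, keepdims=True) + eps)

print(bn.mean(axis=0))   # ~0 for every feature
print(ln.mean(axis=1))   # ~0 for every sample
```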
______________________________________________________________________________
QUESTION 7:
Which one of the following regularization methods induces sparsity among the trained
weights?
a. 𝐿1 regularizer
b. 𝐿2 regularizer
c. Both 𝐿1 & 𝐿2
d. None of the above
Correct Answer: a
Detailed Solution:
https://developers.google.com/machine-learning/crash-course/regularization-for-sparsity/l1-regularization
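A hedged PyTorch sketch of why L1 induces sparsity (the data, the penalty strength, the learning rate, and the step count are all illustrative assumptions): when only a few input features matter, the L1 penalty drives the remaining weights toward zero.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(256, 20)
y = x[:, :3].sum(dim=1, keepdim=True)   # only 3 of the 20 features matter

model = nn.Linear(20, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
lam = 1e-2                               # illustrative L1 strength

for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss = loss + lam * model.weight.abs().sum()   # L1 penalty on the weights
    loss.backward()
    opt.step()

# Count weights that ended up (near) zero: with L1, most of the
# 17 irrelevant weights are pushed toward zero, giving a sparse solution.
print((model.weight.abs() < 0.05).sum().item())
```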
______________________________________________________________________________
QUESTION 8:
Which among the following is NOT a data augmentation technique?
Correct Answer: b
Detailed Solution:
Randomly shuffling all the pixels of an image destroys its spatial structure, and the neural
network will be unable to learn anything from it. So it is not a data augmentation technique.
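A toy NumPy contrast (the 4x4 "image" is an illustrative assumption): a horizontal flip preserves the image structure, while a random pixel shuffle destroys it.

```python
import numpy as np

img = np.arange(16).reshape(4, 4)   # toy 4x4 "image"

flipped = img[:, ::-1]              # horizontal flip: a valid augmentation,
                                    # spatial structure is preserved

shuffled = img.flatten()            # flatten() returns a copy
np.random.shuffle(shuffled)         # random pixel shuffle: structure destroyed
shuffled = shuffled.reshape(4, 4)

print(flipped)
print(shuffled)
```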
______________________________________________________________________________
QUESTION 9:
Which of the following is true about model capacity (where model capacity means the ability of
neural network to approximate complex functions)?
Correct Answer: a
Detailed Solution:
Dropout and the learning rate have nothing to do with model capacity. If the number of hidden
layers increases, the number of learnable parameters increases, and therefore the model capacity increases.
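This can be made concrete by counting parameters as hidden layers are added (the layer widths below are illustrative assumptions):

```python
import torch.nn as nn

def n_params(model):
    return sum(p.numel() for p in model.parameters())

for depth in (1, 2, 3):             # number of hidden layers
    layers, d_in = [], 10
    for _ in range(depth):
        layers += [nn.Linear(d_in, 32), nn.ReLU()]
        d_in = 32
    layers.append(nn.Linear(d_in, 1))
    print(depth, n_params(nn.Sequential(*layers)))   # 385, 1441, 2497

# Note: nn.Dropout adds no learnable parameters, and the learning
# rate is an optimizer setting, not part of the model at all.
```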
______________________________________________________________________________
QUESTION 10:
Two variants of a training schedule sample their mini-batches in the following manner:
[Figure: mini-batch composition for Training Schedule 1]
[Figure: mini-batch composition for Training Schedule 2]
The output activations of each corresponding image are compared across Training Schedule 1 and
Training Schedule 2 for a CNN with batch-norm layers. Choose the correct statement.
Correct Answer: b
Detailed Solution:
As the mini-batches constructed are different across the two training schedules, the mini-batch
statistics will also be different, and therefore the activation outputs of corresponding images will be different.
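A tiny NumPy sketch of the effect (the "image" vector and its batch companions are illustrative assumptions): the same input normalized within two different mini-batches produces different activations.

```python
import numpy as np

def bn(batch, eps=1e-5):
    """Normalize each feature using this mini-batch's own statistics."""
    return (batch - batch.mean(axis=0)) / np.sqrt(batch.var(axis=0) + eps)

img = np.array([1.0, 2.0, 3.0])            # the same "image" in both schedules
batch_1 = np.stack([img, np.zeros(3)])     # mini-batch under schedule 1
batch_2 = np.stack([img, 5 * np.ones(3)])  # mini-batch under schedule 2

print(bn(batch_1)[0])   # activation of the image under schedule 1
print(bn(batch_2)[0])   # a different activation under schedule 2
```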
______________________________________________________________________________
************END*******