DL - Assignment 10 Solution

This document contains a 10-question multiple-choice quiz on deep learning concepts such as batch normalization, data augmentation, and regularization. The questions cover calculating the mean and variance in batch normalization layers, the effect of bias terms, visualization of training and validation errors, the parameters trained by batch normalization, and regularization techniques such as dropout, weight decay, and L1/L2 regularization.

NPTEL Online Certification Courses

Indian Institute of Technology Kharagpur

Deep Learning
Assignment- Week 10
TYPE OF QUESTION: MCQ/MSQ
Number of questions: 10    Total marks: 10 × 1 = 10
______________________________________________________________________________

QUESTION 1:
A neural network has 3 neurons in a hidden layer. The activations of the neurons for three
batches are [1, 2, 3], [0, 2, 5] and [6, 9, 2] respectively. What will be the value of the mean if
we use batch normalization in this layer?

a. [2.33, 4.33, 3.33]
b. [2.00, 2.33, 5.66]
c. [1.00, 1.00, 1.00]
d. [0.00, 0.00, 0.00]
Correct Answer: a

Detailed Solution:
(1/3) × ([1, 2, 3] + [0, 2, 5] + [6, 9, 2]) = [2.33, 4.33, 3.33]
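The per-neuron mean can be checked with a few lines of NumPy (a minimal sketch using the three activation vectors from the question, with each row holding one batch's activations):

```python
import numpy as np

# Activations of the 3 neurons for the three batches (one row per batch)
batches = np.array([[1, 2, 3],
                    [0, 2, 5],
                    [6, 9, 2]], dtype=float)

# Batch normalization computes the mean per neuron across the batch axis
mean = batches.mean(axis=0)
print(np.round(mean, 2))  # [2.33 4.33 3.33]
```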

______________________________________________________________________________

QUESTION 2:
Given two neural networks, Neural Network A (NNA) and Neural Network B (NNB). Both
networks accept a 64-D vector as input, and their configurations are as follows:

NNA: Fully Connected Layer (128 neurons, with bias) -> Batch Norm (128-D) -> ReLU

NNB: Fully Connected Layer (128 neurons, without bias) -> Batch Norm (128-D) -> ReLU

Both networks have the same weights for all layers (except for the absent bias term in NNB).
A batch of ten 64-D vectors is applied to both networks. Choose the correct statement:

a. Output of NNA is different from that of NNB as there is no Bias vector in FC layer
of NNB
b. Output of NNA and NNB is same
c. It is indeterminable whether outputs will be same or different
d. None of the above

Correct Answer: b
Detailed Solution:

The bias term is cancelled out by the batch normalization layer, hence the outputs are the
same: BatchNorm(Wx + b) = BatchNorm(Wx), since mean(Wx + b) = mean(Wx) + b and the
standard deviation is unaffected by adding a constant to a population.
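The cancellation can be verified numerically. The sketch below uses random weights and a plain normalization (gamma = 1, beta = 0); the shapes match the question's setup but the values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch dimension (gamma=1, beta=0)
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

X = rng.normal(size=(10, 64))   # batch of ten 64-D input vectors
W = rng.normal(size=(64, 128))  # shared FC weights
b = rng.normal(size=128)        # bias present only in NNA

out_a = batch_norm(X @ W + b)   # NNA: FC with bias, then BatchNorm
out_b = batch_norm(X @ W)       # NNB: FC without bias, then BatchNorm

print(np.allclose(out_a, out_b))  # True - the bias cancels in the mean
```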

QUESTION 3:

While training a neural network for an image recognition task, we plot the graph of training
error and validation error. Which point is best for early stopping?

a. A
b. B
c. C
d. D

Correct Answer: c

Detailed Solution:
The point of minimum validation error is the best for early stopping.
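The rule can be sketched in a few lines. The validation curve below is illustrative (the plot referenced in the question is not reproduced here); the idea is simply to pick the epoch where validation error bottoms out:

```python
# A minimal early-stopping sketch: stop at the epoch with minimum validation error.
# The error values below are made up for illustration.
val_errors = [0.90, 0.55, 0.40, 0.35, 0.42, 0.50, 0.61]

best_epoch = min(range(len(val_errors)), key=lambda e: val_errors[e])
print(best_epoch)  # 3 - the epoch where validation error is lowest
```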
______________________________________________________________________________

QUESTION 4:
Which among the following is NOT a data augmentation technique?

a. Random horizontal and vertical flip of an image
b. Random shuffle of all the pixels of an image
c. Random color jittering
d. All the above are data augmentation techniques

Correct Answer: b

Detailed Solution:
Randomly shuffling all the pixels of an image distorts the image, and the neural network will
be unable to learn anything from it. So it is not a data augmentation technique.
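The contrast can be illustrated on a toy grayscale image (a sketch; the 4×4 array stands in for a real image). A flip rearranges pixels reversibly and keeps spatial structure, while a full pixel shuffle keeps the same pixel values but destroys their arrangement:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4))  # toy grayscale "image"

# Horizontal flip keeps spatial structure, so it is a valid augmentation
flipped = img[:, ::-1]

# Shuffling all pixels keeps the same values but destroys spatial structure
shuffled = rng.permutation(img.ravel()).reshape(img.shape)

# Same pixel values in both, but the shuffled layout is unlearnable
print(np.array_equal(np.sort(img.ravel()), np.sort(shuffled.ravel())))  # True
```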
______________________________________________________________________________

QUESTION 5:
Batch Normalization is helpful because

a. It normalizes all the input before sending it to the next layer
b. It returns the normalized mean and standard deviation of weights
c. It is a very efficient back-propagation technique
d. None of these

Correct Answer: a

Detailed Solution:
Batch normalization layer normalizes the input.
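The full batch-norm transform can be written out in a short sketch: normalize each feature to zero mean and unit variance over the batch, then apply the learnable scale (gamma) and shift (beta). The input values here are arbitrary:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch, then scale and shift
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 8))  # arbitrary batch of 8-D inputs
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))

# Each feature of y now has (approximately) zero mean and unit variance
print(np.round(y.mean(axis=0), 6))
```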

______________________________________________________________________________

QUESTION 6:
A Batch Norm layer accepts a batch of 128-D vectors. How many parameters of the Batch
Norm layer get trained via backpropagation during the course of training?

a. 256
b. 512
c. 128
d. 1024

Correct Answer: a

Detailed Solution:

Both gamma and beta are applied channel-wise and are thus 128-dimensional each, giving
128 + 128 = 256 trainable parameters.
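The count can be checked directly. The sketch below models only the backprop-trained parameters; the running mean and variance that batch norm also stores are buffers updated by moving averages, not trained by backpropagation, so they are excluded:

```python
import numpy as np

d = 128
gamma = np.ones(d)   # learnable scale, one per feature
beta = np.zeros(d)   # learnable shift, one per feature

# Running mean/var are buffers, not backprop-trained, so they are not counted
trainable = gamma.size + beta.size
print(trainable)  # 256
```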

____________________________________________________________________________

QUESTION 7:
Which of the following is a regularization method?

a. Data augmentation
b. Dropout
c. Weight decay
d. All of the above

Correct Answer: d

Detailed Solution:

Regularization is a modification to a learning algorithm that is intended to reduce its
generalization error but not its training error. All the listed methods fit that definition.

______________________________________________________________________________

QUESTION 8:
Which one of the following regularization methods induces sparsity among the trained
weights?

a. 𝐿1 regularizer
b. 𝐿2 regularizer
c. Both 𝐿1 & 𝐿2
d. None of the above

Correct Answer: a
Detailed Solution:

The L1 penalty drives many weights to exactly zero, inducing sparsity, while the L2 penalty
only shrinks weights toward zero. See:
https://developers.google.com/machine-learning/crash-course/regularization-for-sparsity/l1-
regularization

____________________________________________________________________________

QUESTION 9:
How do we generally calculate the mean and variance during testing?
a. Batch normalization is not required during testing
b. Mean and variance based on the test image
c. Mean and variance statistics estimated during training
d. None of the above

Correct Answer: c

Detailed Solution:
We estimate batch mean and variance statistics during training (typically as running
averages over the training minibatches) and use these estimates during testing.
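A common way to maintain these estimates is an exponential moving average, sketched below (the momentum value, data distribution, and iteration count are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
momentum = 0.1
running_mean, running_var = 0.0, 1.0

# Training: update exponential moving averages of the batch statistics
for _ in range(200):
    batch = rng.normal(loc=4.0, scale=2.0, size=64)
    running_mean = (1 - momentum) * running_mean + momentum * batch.mean()
    running_var = (1 - momentum) * running_var + momentum * batch.var()

# Testing: normalize with the stored running statistics, not the test batch
test_batch = rng.normal(loc=4.0, scale=2.0, size=64)
normalized = (test_batch - running_mean) / np.sqrt(running_var + 1e-5)

print(round(running_mean, 1), round(running_var, 1))  # close to 4.0 and 4.0
```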
____________________________________________________________________________

QUESTION 10:
Two variant training schedules sample their minibatches in the following manner:
Training Schedule 1
Mini batch 1 = [Image1, Image2, Image3]
Mini batch 2 = [Image4, Image5, Image6]

Training Schedule 2
Mini batch 1 = [Image1, Image4, Image3]
Mini batch 2 = [Image2, Image5, Image6]

The output activations of each corresponding image are compared across Training Schedule 1
and Training Schedule 2 for a CNN with batch norm layers. Choose the correct statement:

a. Activation outputs of corresponding images will be the same across Training
Schedule 1 and Training Schedule 2
b. Activation outputs of corresponding images will be different across Training
Schedule 1 and Training Schedule 2
c. Some activation outputs of corresponding images will be the same but some will be
different
d. None of these

Correct Answer: b

Detailed Solution:

As the minibatches constructed are different across the two training schedules, the
mini-batch statistics will also be different, and therefore the activation outputs of
corresponding images will be different.
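The effect can be demonstrated on toy data. A sketch under simple assumptions: each "image" is a random 4-D feature vector, and a plain normalization (gamma = 1, beta = 0) stands in for the batch norm layer:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch (gamma=1, beta=0)
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

rng = np.random.default_rng(3)
images = rng.normal(size=(6, 4))  # six "images", 4 features each

# Schedule 1: Image1 is normalized together with Image2 and Image3
out1 = batch_norm(images[[0, 1, 2]])

# Schedule 2: Image1 is normalized together with Image4 and Image3
out2 = batch_norm(images[[0, 3, 2]])

# Image1's activations differ because the batch statistics differ
print(np.allclose(out1[0], out2[0]))  # False
```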

______________________________________________________________________________

************END*******