Deep Learning - week 4
2) What are the benefits of using stochastic gradient descent compared to vanilla gradient descent? (1 point)
SGD converges more quickly than vanilla gradient descent.
SGD is computationally efficient for large datasets.
SGD theoretically guarantees that the descent direction is optimal.
SGD experiences less oscillation compared to vanilla gradient descent.
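As a rough illustration of the cost argument, the sketch below counts how many weight updates one pass over the data produces for a given batch size. The helper name `updates_per_epoch` and the dataset size are assumptions made for this example; vanilla gradient descent is modelled as a single batch containing every sample, and SGD as a batch size of 1.

```python
import math


def updates_per_epoch(num_samples: int, batch_size: int) -> int:
    """Weight updates in one pass over the data (a final partial batch still counts)."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    return math.ceil(num_samples / batch_size)


n = 100_000  # hypothetical dataset size
print("vanilla GD updates per epoch:", updates_per_epoch(n, n))
print("SGD updates per epoch:", updates_per_epoch(n, 1))
```

Vanilla gradient descent has to process all samples before it can move the weights once, while SGD moves them after every sample, which is why it tends to make progress sooner on large datasets even though each individual step is noisier.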
https://onlinecourses.nptel.ac.in/noc24_cs114/unit?unit=59&assessment=288 1/6
10/24/24, 10:08 AM Deep Learning - IIT Ropar - - Unit 7 - week 4
3) A team has a data set that contains 100 samples for training a feed-forward neural network. Suppose they decided to use the gradient descent algorithm to update the weights. Suppose further that they use line search algorithm for the learning rate as follows, η = [0.01, 0.1, 1, 2, 10]. How many times do the weights get updated after training the network for 10 epochs? (Note, for each weight update the loss has to decrease) (1 point)
100
5
500
10
50
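One way to read the setup above is sketched below: full-batch gradient descent computes one gradient per epoch, the line search evaluates every candidate learning rate, and only the best candidate is committed as a single update. The one-parameter linear model, the synthetic data, and the function name `train_with_line_search` are assumptions chosen to keep the example small; they are not part of the question.

```python
def train_with_line_search(xs, ys, etas, epochs):
    """Full-batch gradient descent on y ~ w * x with a line search over `etas`.

    Returns the final weight and the number of weight updates performed.
    """
    n = len(xs)
    w = 0.0
    updates = 0

    def loss(weight):
        # Mean squared error (with the conventional factor 1/2).
        return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)

    for _ in range(epochs):
        # One gradient per epoch, computed over all samples.
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / n
        # Evaluate every candidate learning rate and keep the lowest-loss result.
        best_w = min((w - eta * grad for eta in etas), key=loss)
        # Commit a single update, and only if the loss actually decreases.
        if loss(best_w) < loss(w):
            w = best_w
            updates += 1
    return w, updates


xs = [i / 100 for i in range(1, 101)]  # 100 hypothetical training inputs
ys = [3.0 * x for x in xs]             # targets generated from a made-up slope of 3
w, updates = train_with_line_search(xs, ys, [0.01, 0.1, 1, 2, 10], epochs=10)
print("weight updates:", updates)
```

Under this reading the five candidate rates cost extra loss evaluations but not extra updates, so the update count is bounded by the number of epochs and does not depend on the number of samples.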
5) What is the advantage of using mini-batch gradient descent over batch gradient descent? (1 point)
Mini-batch gradient descent is more computationally efficient than batch gradient descent.
Mini-batch gradient descent leads to a more accurate estimate of the gradient than batch gradient descent.
Mini-batch gradient descent gives us a better solution.
Mini-batch gradient descent can converge faster than batch gradient descent.
Yes, the answer is correct.
Score: 1
Accepted Answers:
Mini-batch gradient descent is more computationally efficient than batch gradient descent.
Mini-batch gradient descent can converge faster than batch gradient descent.
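The sketch below illustrates why a mini-batch gradient is a cheaper but noisier estimate of the full-batch gradient rather than a more accurate one. The one-parameter model, the data, and the helper names `gradient` and `minibatches` are assumptions made for this example.

```python
def gradient(w, batch):
    """Gradient w.r.t. w of the mean of 0.5 * (w * x - y)**2 over `batch`."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def minibatches(data, batch_size):
    """Yield consecutive slices of `data` with at most `batch_size` items each."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


data = [(i / 10, 2.0 * i / 10 + 1.0) for i in range(1, 21)]  # 20 made-up samples
w = 0.5

full_grad = gradient(w, data)
batch_grads = [gradient(w, batch) for batch in minibatches(data, 5)]
mean_of_batches = sum(batch_grads) / len(batch_grads)

print("full-batch gradient:  ", full_grad)
print("mini-batch gradients: ", batch_grads)
print("mean of mini-batches: ", mean_of_batches)
```

Each mini-batch gradient differs from the full-batch value, but with equal-sized batches their average matches it. Mini-batch training therefore trades some accuracy per step for several updates per epoch instead of one.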
6) Which of the following represents the contour plot of the function $f(x,y) = x^2 - y^2$? (1 point)
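The level sets $x^2 - y^2 = c$ are hyperbolas that open along the x-axis for $c > 0$ and along the y-axis for $c < 0$; for $c = 0$ they degenerate into the two lines $y = \pm x$, which cross at a saddle point at the origin. The sketch below prints a coarse sign map of the function on an integer grid to make that pattern visible. The helper `sign_map` and the grid size are assumptions made for this illustration.

```python
def f(x, y):
    """The function from the question: f(x, y) = x^2 - y^2."""
    return x ** 2 - y ** 2


def sign_map(size=3):
    """Rows of '+', '-', '0' for f on the integer grid [-size, size]^2, top row = largest y."""
    rows = []
    for y in range(size, -size - 1, -1):
        row = ""
        for x in range(-size, size + 1):
            value = f(x, y)
            row += "+" if value > 0 else "-" if value < 0 else "0"
        rows.append(row)
    return rows


print("\n".join(sign_map(3)))
```

The zeros fall on the two diagonals, with positive regions to the left and right and negative regions above and below, which is the signature of a saddle rather than of a bowl-shaped minimum.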
7) Consider a gradient profile $\nabla W = [1, 0.9, 0.6, 0.01, 0.1, 0.2, 0.5, 0.55, 0.56]$. Assume $v_{-1} = 0$, $\epsilon = 0$, $\beta = 0.9$ and the learning rate is $\eta_{-1} = 0.1$. Suppose that we use the Adagrad algorithm; then what is the value of $\eta_6 = \eta/\sqrt{v_t + \epsilon}$? (1 point)
0.03
0.06
0.08
0.006
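The effective learning rate can be worked out numerically with the sketch below. It assumes the standard Adagrad accumulator, in which v_t is the running sum of squared gradients, and it assumes the gradient profile is indexed from t = 0, so that η_6 uses the first seven gradients. Under that formulation β plays no role. The function name `adagrad_effective_lrs` is hypothetical.

```python
import math


def adagrad_effective_lrs(grads, eta, eps=0.0, v_init=0.0):
    """Return eta / sqrt(v_t + eps) for each step t, where v_t = v_{t-1} + g_t**2.

    With eps = 0 this divides by zero if the accumulated sum is still zero,
    so the first gradient must be non-zero in that case.
    """
    v = v_init
    rates = []
    for g in grads:
        v += g * g
        rates.append(eta / math.sqrt(v + eps))
    return rates


grads = [1, 0.9, 0.6, 0.01, 0.1, 0.2, 0.5, 0.55, 0.56]
rates = adagrad_effective_lrs(grads, eta=0.1)
print("effective learning rate at t = 6:", round(rates[6], 2))
```

Because the accumulator only grows, the effective learning rate in this sketch can never increase from one step to the next.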
8) Which of the following can help avoid getting stuck in a poor local minimum while training a deep neural network? (1 point)
9) What are the two main components of the ADAM optimizer? (1 point)
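Adam is commonly described as combining a momentum term (an exponentially weighted average of past gradients) with an adaptive per-parameter learning rate (an exponentially weighted average of past squared gradients, as in RMSProp). The sketch below shows one such update for a single scalar parameter. The function name `adam_step`, the toy loss, and the hyperparameter values are assumptions made for this illustration.

```python
import math


def adam_step(w, grad, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; `t` is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # momentum: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # adaptive scale: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero initialisation
    v_hat = v / (1 - beta2 ** t)
    w = w - eta * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


w, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * w  # gradient of the toy loss w**2
    w, m, v = adam_step(w, grad, m, v, t, eta=0.05)
print("final w:", w)
```

On the first step the bias-corrected ratio has magnitude close to 1, so the parameter moves by roughly `eta` regardless of how large the gradient is.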
Activation functions transform the output of a neuron into a non-linear function, allowing the network to learn complex patterns.
Activation functions make the network faster by reducing the number of iterations needed for training.
Activation functions are used to normalize the input data.
Activation functions are used to compute the loss function.