Congratulations! You Passed!: Shallow Neural Networks

The document is a quiz on shallow neural networks from an online Coursera course. It contains 10 multiple-choice questions testing understanding of key concepts, including:

- Notation used to represent matrices and vectors in neural networks
- Benefits of tanh activation over sigmoid for hidden units
- Correct implementation of forward propagation for a neural network layer
- Recommended activation for binary classification in the output layer
- Shape of the output from summing over an axis in NumPy
- Effects of initializing weights and biases to zero

The quiz questions cover foundational topics on neural network architecture and implementation.



Shallow Neural Networks 10/10 points (100%)


Quiz, 10 questions

Congratulations! You passed!

1/1
 points

1.
Which of the following are true? (Check all that apply.)

X is a matrix in which each row is one training example.

Un-selected is correct 

a^[2] denotes the activation vector of the 2nd layer.

Correct 

a^[2]_4 is the activation output of the 2nd layer for the 4th training example.

Un-selected is correct 

X is a matrix in which each column is one training example.

Correct 

a^[2]_4 is the activation output by the 4th neuron of the 2nd layer.

Correct 

a^[2](12) denotes the activation vector of the 2nd layer for the 12th training example.

Correct 


a^[2](12) denotes the activation vector of the 12th layer on the 2nd training example.

Un-selected is correct

1/1
 points

2.
The tanh activation usually works better than the sigmoid activation function for hidden
units because the mean of its output is closer to zero, and so it centers the data better
for the next layer. True/False?

True

Correct 
Yes. As seen in lecture, the output of tanh is between -1 and 1, so it centers the data,
which makes learning simpler for the next layer.

False
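
As a quick aside (not part of the quiz), the centering claim can be checked numerically; the snippet below is an illustrative sketch with made-up variable names:

    import numpy as np

    z = np.random.randn(100000)              # roughly zero-mean pre-activations
    print(np.tanh(z).mean())                 # close to 0: tanh output is centered
    print((1 / (1 + np.exp(-z))).mean())     # close to 0.5: sigmoid output is not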

1/1
 points

3.
Which of these is a correct vectorized implementation of forward propagation for layer l,
where 1 ≤ l ≤ L?

Z^[l] = W^[l] A^[l] + b^[l]
A^[l+1] = g^[l](Z^[l])

Z^[l] = W^[l] A^[l-1] + b^[l]
A^[l] = g^[l](Z^[l])

Correct

Z^[l] = W^[l] A^[l] + b^[l]
A^[l+1] = g^[l+1](Z^[l])

Z^[l] = W^[l-1] A^[l] + b^[l-1]
A^[l] = g^[l](Z^[l])
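
As an illustrative sketch of the correct option above (names such as forward_layer, W, A_prev and b are assumptions for this example, not course code):

    import numpy as np

    def forward_layer(W, A_prev, b, g=np.tanh):
        # Z^[l] = W^[l] A^[l-1] + b^[l],  A^[l] = g^[l](Z^[l])
        Z = np.dot(W, A_prev) + b
        A = g(Z)
        return Z, A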

1/1
 points

4.
You are building a binary classifier for recognizing cucumbers (y=1) vs. watermelons
(y=0). Which one of these activation functions would you recommend using for the
output layer?

ReLU

Leaky ReLU

sigmoid

Correct 
Yes. Sigmoid outputs a value between 0 and 1, which makes it a very good choice
for binary classification. You can classify as 0 if the output is less than 0.5 and
as 1 if the output is more than 0.5. It can be done with tanh as well, but it
is less convenient because the output is between -1 and 1.

tanh

1/1
 points

5.
Consider the following code:

import numpy as np
A = np.random.randn(4, 3)
B = np.sum(A, axis=1, keepdims=True)

What will be B.shape? (If you’re not sure, feel free to run this in Python to find out.)

(4, )

(1, 3)

(, 3)

(4, 1)

Correct 
Yes, we use keepdims=True to make sure that B.shape is (4, 1) and not (4,). It
makes our code more rigorous.
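
Running the snippet (a quick sanity check, not part of the quiz) confirms the answer:

    import numpy as np

    A = np.random.randn(4, 3)
    B = np.sum(A, axis=1, keepdims=True)
    print(B.shape)                        # (4, 1)
    print(np.sum(A, axis=1).shape)        # (4,) without keepdims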

1/1
 points

6.
Suppose you have built a neural network. You decide to initialize the weights and biases
to be zero. Which of the following statements is true?

Each neuron in the first hidden layer will perform the same computation. So
even after multiple iterations of gradient descent each neuron in the layer will
be computing the same thing as other neurons.

Correct 

Each neuron in the first hidden layer will perform the same computation in
the first iteration. But after one iteration of gradient descent they will learn to
compute different things because we have “broken symmetry”.

Each neuron in the first hidden layer will compute the same thing, but
neurons in different layers will compute different things, thus we have
accomplished “symmetry breaking” as described in lecture.

The first hidden layer’s neurons will perform different computations from
each other even in the first iteration; their parameters will thus keep evolving
in their own way.
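
A minimal sketch of the symmetry problem, assuming a tiny hidden layer with illustrative sizes and names (not course code):

    import numpy as np

    W1 = np.zeros((2, 3))                 # two hidden neurons, three inputs, all zeros
    b1 = np.zeros((2, 1))
    x = np.random.randn(3, 1)

    A1 = np.tanh(np.dot(W1, x) + b1)
    print(A1)                             # both rows are identical; the gradients for the
                                          # two rows are identical too, so they stay equal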

1/1
 points

7.
Logistic regression’s weights w should be initialized randomly rather than to all zeros,
because if you initialize to all zeros, then logistic regression will fail to learn a useful
decision boundary because it will fail to “break symmetry”. True/False?

True

False

Correct 
Yes. Logistic regression doesn't have a hidden layer. If you initialize the weights
to zeros, the first example x fed into the logistic regression will output zero, but the
derivatives of the logistic regression depend on the input x (because there's no
hidden layer), which is not zero. So at the second iteration, the weight values
follow x's distribution and are different from each other if x is not a constant
vector.
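
A sketch of one gradient-descent step from a zero initialization (the example, learning rate and names below are made up for illustration):

    import numpy as np

    x = np.random.randn(3, 1)             # one training example, not a constant vector
    y = 1
    w, b = np.zeros((3, 1)), 0.0

    a = 1 / (1 + np.exp(-(np.dot(w.T, x) + b)))   # output is 0.5 on the first pass
    dw = (a - y) * x                      # the gradient depends on x, so it is non-zero
    w = w - 0.1 * dw
    print(w.ravel())                      # entries differ, following x's values
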
1/1
 points

8.
You have built a network using the tanh activation for all the hidden units. You initialize
the weights to relatively large values, using np.random.randn(..,..)*1000. What will
happen?

This will cause the inputs of the tanh to also be very large, thus causing
gradients to be close to zero. The optimization algorithm will thus become
slow.

Correct 
Yes. tanh becomes flat for large values; this causes its gradient to be close to zero,
which slows down the optimization algorithm.

This will cause the inputs of the tanh to also be very large, causing the units to
be “highly activated” and thus speed up learning compared to if the weights
had to start from small values.

It doesn’t matter. So long as you initialize the weights randomly, gradient
descent is not affected by whether the weights are large or small.

This will cause the inputs of the tanh to also be very large, thus causing
gradients to also become large. You therefore have to set α to be very small to
prevent divergence; this will slow down learning.
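
A quick numerical sketch of the saturation effect (the values below are illustrative, not the course code):

    import numpy as np

    tanh_grad = lambda z: 1 - np.tanh(z) ** 2    # derivative of tanh
    print(tanh_grad(np.array([0.5])))            # noticeably non-zero
    print(tanh_grad(np.array([500.0])))          # essentially 0: the gradient vanishes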

1/1
 points

9.


Consider the following 1 hidden layer neural network:

[Figure not reproduced: a network with 2 input features, one hidden layer of 4 units, and a single output unit, as implied by the answer shapes below.]

Which of the following statements are True? (Check all that apply.)

W^[1] will have shape (2, 4)

Un-selected is correct 

b^[1] will have shape (4, 1)
Correct 

W^[1] will have shape (4, 2)
Correct 

b^[1] will have shape (2, 1)
Un-selected is correct 

W^[2] will have shape (1, 4)

Correct 

b^[2] will have shape (4, 1)

Un-selected is correct 

W^[2] will have shape (4, 1)

Un-selected is correct 

b^[2] will have shape (1, 1)

Correct 
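
A shape check for the network implied by these answers (2 inputs, 4 hidden units, 1 output); the variable names are illustrative:

    import numpy as np

    n_x, n_h, n_y = 2, 4, 1
    W1 = np.random.randn(n_h, n_x) * 0.01   # (4, 2)
    b1 = np.zeros((n_h, 1))                 # (4, 1)
    W2 = np.random.randn(n_y, n_h) * 0.01   # (1, 4)
    b2 = np.zeros((n_y, 1))                 # (1, 1)
    print(W1.shape, b1.shape, W2.shape, b2.shape)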

1/1
 points

10.
In the same network as the previous question, what are the dimensions of Z^[1] and A^[1]?

Z^[1] and A^[1] are (4, 2)

Z^[1] and A^[1] are (4, m)

Correct 

Z^[1] and A^[1] are (4, 1)

Z^[1] and A^[1] are (1, 4)
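
A self-contained sketch of the correct shapes for m training examples (m = 7 is arbitrary here; names are illustrative):

    import numpy as np

    n_h, n_x, m = 4, 2, 7                   # hidden units, input features, examples
    W1, b1 = np.random.randn(n_h, n_x), np.zeros((n_h, 1))
    X = np.random.randn(n_x, m)             # each column is one training example
    Z1 = np.dot(W1, X) + b1                 # broadcasting adds b1 to every column
    A1 = np.tanh(Z1)
    print(Z1.shape, A1.shape)               # both (4, 7), i.e. (4, m)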


