Deep Neural Networks

Quiz

Q1

What is stored in the 'cache' during forward propagation for later use in backward propagation?

  1. $W^{[l]}$
  2. $Z^{[l]}$
  3. $b^{[l]}$
  4. $A^{[l]}$

Answer

2

Yes. This value is useful in the calculation of $dW^{[l]}$ in the backward propagation.
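
A minimal NumPy sketch, with illustrative function and variable names (not the course's exact starter code), of a forward step that stores $Z^{[l]}$ in the cache:

```python
import numpy as np

def linear_activation_forward(A_prev, W, b):
    """One forward step for layer l, caching Z[l] for backward propagation."""
    Z = W @ A_prev + b      # Z[l] = W[l] A[l-1] + b[l]
    A = np.maximum(0, Z)    # ReLU used here as an example activation
    cache = Z               # kept so that dZ[l], and from it dW[l], can be computed later
    return A, cache
```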

Q2

We use the “cache” in implementing forward and backward propagation to pass useful values to the next layer in the forward propagation. True/False?

Answer

False

Correct. The "cache" is used in our implementation to store values computed during forward propagation so they can be used in backward propagation, not to pass values to the next layer of the forward pass.

Q3

Which ones are "hyperparameters"? (Check all that apply.)

  1. activation values $a^{[l]}$
  2. learning rate $\alpha$
  3. size of the hidden layers $n^{[l]}$
  4. weight matrices $W^{[l]}$
  5. bias vectors $b^{[l]}$
  6. number of iterations
  7. number of layers $L$ in the neural network

Answer

2, 3, 6, 7

Hyperparameters are the values we choose and tune ourselves to control training, unlike the parameters $W^{[l]}$ and $b^{[l]}$, which are learned, and the activations $a^{[l]}$, which are computed by the network.
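
A small illustrative sketch of the distinction, with hypothetical variable names; the hyperparameters are values we pick, while $W^{[l]}$, $b^{[l]}$, and $a^{[l]}$ are produced by training and forward propagation:

```python
# Hyperparameters: values we choose and tune ourselves.
learning_rate = 0.0075               # alpha
layer_dims = [12288, 20, 7, 5, 1]    # sizes n[l] of each layer, which also fixes L
num_iterations = 2500

# Not hyperparameters: W[l] and b[l] are learned by gradient descent,
# and the activations a[l] are computed during forward propagation.
```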

Q4

Which of the following are the “parameters” of a neural network? (Check all that apply.)

  1. $W^{[l]}$ the weight matrices.
  2. $g^{[l]}$ the activation functions.
  3. $L$ the number of layers of the neural network.
  4. $b^{[l]}$ the bias vector.

Answer

1, 4

Correct. The weight matrices and the bias vectors are the parameters of the network.
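
A minimal sketch, assuming the layer sizes $n^{[l]}$ are given in a list, of initializing exactly these parameters:

```python
import numpy as np

def initialize_parameters(layer_dims):
    """layer_dims[l] = n[l]; the learnable parameters are W[l] and b[l] for each layer."""
    params = {}
    for l in range(1, len(layer_dims)):
        params["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
        params["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return params
```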

Q5

Which of the following statements is true?

  1. The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.
  2. The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.

Answer

1

Correct. The earlier layers of a deep network typically detect simple features such as edges, and the deeper layers compose them into increasingly complex features of the input.

Q6

Vectorization allows us to compute $a^{[l]}$ for all the examples in a batch at the same time without using a for loop. True/False?

Answer

True

Correct. Vectorization allows us to compute the activation for all the training examples at the same time, avoiding the use of a for loop.
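
A single-layer sketch of this, assuming the examples are stacked as the columns of $A^{[l-1]}$ (shape $(n^{[l-1]}, m)$), so one matrix product handles every example at once:

```python
import numpy as np

def layer_forward(A_prev, W, b):
    """Computes A[l] for all m examples simultaneously; no loop over examples."""
    Z = W @ A_prev + b    # shape (n[l], m): every column is one example
    A = np.tanh(Z)        # element-wise activation, still no loop over examples
    return A
```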

Q7

Vectorization allows you to compute forward propagation in an $L$-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers $l=1, 2, …, L$. True/False?

Answer

False

Forward propagation propagates the input through the layers. For a shallow network we can simply write out every line explicitly, e.g. $z^{[2]} = W^{[2]}a^{[1]} + b^{[2]}$, $a^{[2]} = g^{[2]}(z^{[2]})$, but in a deeper network we cannot avoid a for loop iterating over the layers: $z^{[l]} = W^{[l]}a^{[l-1]} + b^{[l]}$, $a^{[l]} = g^{[l]}(z^{[l]})$ for $l = 1, \dots, L$.
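
A sketch of $L$-layer forward propagation with hypothetical helper names: each layer's computation is vectorized over the examples, but the loop over the layers themselves remains:

```python
import numpy as np

def L_model_forward(X, params, L):
    """Forward propagation through L layers; the loop over l cannot be vectorized away."""
    A = X
    for l in range(1, L + 1):                            # explicit loop over layers
        W, b = params["W" + str(l)], params["b" + str(l)]
        Z = W @ A + b                                    # vectorized over all examples
        A = np.maximum(0, Z) if l < L else 1.0 / (1.0 + np.exp(-Z))  # ReLU hidden, sigmoid output
    return A
```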

Q8

During forward propagation, in the forward function for layer $l$ you need to know what is the activation function in a layer (sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what the activation function for the layer $l$ is, since the gradient depends on it. True/False?

Answer

True

Yes, as you've seen in week 3, each activation has a different derivative. Thus, during backpropagation, you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
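
A sketch with illustrative names of why this matters: $dZ^{[l]} = dA^{[l]} * g'^{[l]}(Z^{[l]})$, and the derivative $g'$ is different for each activation:

```python
import numpy as np

def activation_backward(dA, Z, activation):
    """dZ[l] = dA[l] * g'[l](Z[l]); which derivative applies depends on the forward activation."""
    if activation == "relu":
        return dA * (Z > 0)                  # relu'(z) = 1 where z > 0, else 0
    if activation == "sigmoid":
        s = 1.0 / (1.0 + np.exp(-Z))
        return dA * s * (1 - s)              # sigmoid'(z) = s(1 - s)
    if activation == "tanh":
        return dA * (1 - np.tanh(Z) ** 2)    # tanh'(z) = 1 - tanh(z)^2
    raise ValueError("unknown activation")
```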

Q9

If L is the number of layers of a neural network then $dZ^{[L]} = A^{[L]}−Y$. True/False?

Answer

True

Yes. The gradient of the output layer depends on the difference between the value computed during the forward propagation process and the target values.
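
A short derivation, assuming the setting used in the course (sigmoid output unit with the binary cross-entropy loss), of why this holds:

$$\mathcal{L} = -\big(Y\log A^{[L]} + (1-Y)\log(1-A^{[L]})\big),\qquad A^{[L]} = \sigma(Z^{[L]})$$

$$dA^{[L]} = -\frac{Y}{A^{[L]}} + \frac{1-Y}{1-A^{[L]}},\qquad \frac{\partial A^{[L]}}{\partial Z^{[L]}} = A^{[L]}\,(1-A^{[L]})$$

$$dZ^{[L]} = dA^{[L]}\cdot A^{[L]}(1-A^{[L]}) = -Y(1-A^{[L]}) + (1-Y)A^{[L]} = A^{[L]} - Y$$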

Q10

For any mathematical function you can compute with an $L$-layered deep neural network with $N$ hidden units there is a shallow neural network that requires only $\log N$ units, but it is very difficult to train. True/False?

Answer

False

Correct. It is the other way around: some functions can be computed by an $L$-layer neural network with a given number of hidden units, whereas a shallow neural network computing the same function needs a number of hidden units that grows exponentially.

Q11

There are certain functions with the following properties: (i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

Answer

True

Q12

A shallow neural network with a single hidden layer and 6 hidden units can compute any function that a neural network with 2 hidden layers and 6 hidden units can compute. True/False?

Hint

As seen during the lectures, there are functions you can compute with a "small" $L$-layer deep neural network for which shallower networks require exponentially more hidden units.

Answer

False

Q13

In the general case if we are training with $m$ examples what is the shape of $A^{[l]}$?

  1. $(n^{[l]}, m)$
  2. $(m, n^{[l]})$
  3. $(m, n^{[l+1]})$
  4. $(n^{[l+1]}, m)$

Hint

The number of rows in $A^{[l]}$ corresponds to the number of units $n^{[l]}$ in the $l$-th layer.

Answer

1

Correct. $A^{[l]}$ stacks the activations of layer $l$ for all $m$ training examples as its columns, so its shape is $(n^{[l]}, m)$.
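
A quick NumPy check of this shape, using the illustrative sizes $n^{[l-1]} = 4$, $n^{[l]} = 3$, $m = 5$:

```python
import numpy as np

n_prev, n_l, m = 4, 3, 5
A_prev = np.random.randn(n_prev, m)    # (n[l-1], m): one column per example
W = np.random.randn(n_l, n_prev)       # (n[l], n[l-1])
b = np.zeros((n_l, 1))                 # (n[l], 1), broadcast across the m columns
Z = W @ A_prev + b
A = np.tanh(Z)
print(A.shape)                         # (3, 5), i.e. (n[l], m)
```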

All the information provided is based on the Deep Learning Specialization | Coursera from DeepLearning.AI
