
Neural Networks and Deep Learning (11)

by Fresh Red 2024. 12. 4.

Table of Contents

    Deep Neural Networks

    Quiz

    Q1

    What is stored in the 'cache' during forward propagation for later use in backward propagation?

    1. $W^{[l]}$
    2. $Z^{[l]}$
    3. $b^{[l]}$
    4. $A^{[l]}$

    Answer

    2

    Yes. This value is useful in the calculation of $dW^{[l]}$ in the backward propagation.

    Q2

    We use the “cache” in implementing forward and backward propagation to pass useful values to the next layer in the forward propagation. True/False?

    Answer

    False

    Correct. The "cache" is used in our implementation to store values computed during forward propagation so they can be reused during backward propagation, not to pass values to the next layer in the forward pass.
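
    To make this concrete, here is a minimal NumPy sketch of one layer's forward and backward functions, where the forward pass stores $Z^{[l]}$ (together with its inputs) in a cache that is only consumed during backward propagation. The function names and cache layout are illustrative, not the course's exact starter code.

```python
import numpy as np

def linear_activation_forward(A_prev, W, b):
    """Forward step for one layer with a ReLU activation (illustrative)."""
    Z = W @ A_prev + b          # linear part
    A = np.maximum(0, Z)        # ReLU activation
    cache = (A_prev, W, Z)      # stored now, used only in the backward pass
    return A, cache

def linear_activation_backward(dA, cache):
    """Backward step for the same layer, reusing the cached Z and A_prev."""
    A_prev, W, Z = cache
    dZ = dA * (Z > 0)                           # ReLU derivative needs the cached Z
    m = A_prev.shape[1]
    dW = (dZ @ A_prev.T) / m                    # cached A_prev is needed for dW
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db
```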

    Q3

    Which ones are "hyperparameters"? (Check all that apply.)

    1. activation values $a^{[l]}$
    2. learning rate $\alpha$
    3. size of the hidden layers $n^{[l]}$
    4. weight matrices $W^{[l]}$
    5. bias vectors $b^{[l]}$
    6. number of iterations
    7. number of layers $L$ in the neural network

    Answer

    2, 3, 6, 7

    Correct. Hyperparameters are the settings we choose ourselves to control how the network is trained (learning rate, layer sizes, number of iterations, number of layers); the parameters $W^{[l]}$ and $b^{[l]}$ are learned during training instead.

    Q4

    Which of the following are the “parameters” of a neural network? (Check all that apply.)

    1. $W^{[l]}$ the weight matrices.
    2. $g^{[l]}$ the activation functions.
    3. $L$ the number of layers of the neural network.
    4. $b^{[l]}$ the bias vector.

    Answer

    1, 4

    Correct. The weight matrices and the bias vectors are the parameters of the network.
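
    As a purely illustrative sketch of the distinction drawn in Q3 and Q4: hyperparameters are values we fix before training, while the parameters $W^{[l]}$ and $b^{[l]}$ are initialized and then learned by gradient descent. The variable names below are assumptions for illustration, not the course's exact code.

```python
import numpy as np

# Hyperparameters: chosen by us before training (Q3).
layer_dims = [4, 5, 3, 1]    # sizes of the layers n^[l], including the input layer
learning_rate = 0.01         # alpha
num_iterations = 1000
L = len(layer_dims) - 1      # number of layers (not counting the input layer)

# Parameters: learned during training (Q4).
parameters = {}
for l in range(1, L + 1):
    parameters[f"W{l}"] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    parameters[f"b{l}"] = np.zeros((layer_dims[l], 1))
```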

    Q5

    Which of the following statements is true?

    1. The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.
    2. The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.

    Answer

    1

    Correct. Earlier layers typically detect simple features of the input (such as edges), and deeper layers compose them into increasingly complex features (such as object parts or faces).
    Q6

    Vectorization allows us to compute $a^{[l]}$ for all the examples in a batch at the same time without using a for loop. True/False?

    Answer

    True

    Correct. Vectorization allows us to compute the activation for all the training examples at the same time, avoiding the use of a for loop.
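
    A small sketch of what this means in NumPy, assuming the course convention that each column of a matrix is one training example: a single matrix product computes $Z^{[l]}$ and $A^{[l]}$ for all $m$ examples at once, which is what the explicit loop below would otherwise have to do.

```python
import numpy as np

n_prev, n_l, m = 4, 3, 5                # illustrative layer sizes and batch size
A_prev = np.random.randn(n_prev, m)     # activations of layer l-1, one column per example
W = np.random.randn(n_l, n_prev)
b = np.zeros((n_l, 1))

# Vectorized: all m examples handled by one matrix multiplication.
A = np.tanh(W @ A_prev + b)             # shape (n_l, m)

# Equivalent per-example loop (what vectorization lets us avoid).
A_loop = np.zeros((n_l, m))
for i in range(m):
    A_loop[:, i] = np.tanh(W @ A_prev[:, i] + b[:, 0])

assert np.allclose(A, A_loop)
```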

    Q7

    Vectorization allows you to compute forward propagation in an $L$-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers $l=1, 2, …, L$. True/False?

    Answer

    False

    Forward propagation passes the input through the layers one after another. For a shallow network we can simply write out each layer explicitly $(z^{[2]} = W^{[2]}a^{[1]} + b^{[2]},\; a^{[2]} = g^{[2]}(z^{[2]}), \dots)$, but for a deeper network we cannot avoid an explicit for loop over the layers: $z^{[l]} = W^{[l]}a^{[l-1]} + b^{[l]},\; a^{[l]} = g^{[l]}(z^{[l]})$ for $l = 1, 2, \dots, L$.
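
    In other words, vectorization removes the loop over examples but not the loop over the layers themselves. A minimal sketch (illustrative names, ReLU used everywhere for simplicity) of an $L$-layer forward pass:

```python
import numpy as np

def L_model_forward(X, parameters, L):
    """Forward pass through L layers; the per-layer computation is vectorized
    over the examples, but the loop over l = 1, ..., L remains."""
    A = X
    caches = []
    for l in range(1, L + 1):                 # explicit loop over the layers
        A_prev = A
        W, b = parameters[f"W{l}"], parameters[f"b{l}"]
        Z = W @ A_prev + b
        A = np.maximum(0, Z)                  # ReLU for simplicity
        caches.append((A_prev, W, Z))         # cached for backward propagation
    return A, caches

# Illustrative usage with randomly initialized parameters for a [4, 5, 3, 1] network.
layer_dims = [4, 5, 3, 1]
params = {f"W{l}": np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
          for l in range(1, len(layer_dims))}
params.update({f"b{l}": np.zeros((layer_dims[l], 1)) for l in range(1, len(layer_dims))})
AL, caches = L_model_forward(np.random.randn(4, 10), params, L=len(layer_dims) - 1)
```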

    Q8

    During forward propagation, in the forward function for layer $l$ you need to know what is the activation function in a layer (sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what the activation function for the layer $l$ is, since the gradient depends on it. True/False?

    Answer

    True

    Yes, as you've seen in week 3, each activation has a different derivative. Thus, during backpropagation, you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
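
    This is why an implementation typically dispatches on the activation name in the backward pass as well: each activation has its own derivative, applied to the cached $Z^{[l]}$. A minimal sketch with an illustrative helper name:

```python
import numpy as np

def activation_backward(dA, Z, activation):
    """Convert dA into dZ; the formula depends on which activation was used
    in the forward pass for this layer."""
    if activation == "sigmoid":
        s = 1 / (1 + np.exp(-Z))
        return dA * s * (1 - s)              # sigmoid'(z) = s(z) * (1 - s(z))
    if activation == "tanh":
        return dA * (1 - np.tanh(Z) ** 2)    # tanh'(z) = 1 - tanh(z)^2
    if activation == "relu":
        return dA * (Z > 0)                  # relu'(z) = 1 if z > 0, else 0
    raise ValueError(f"unknown activation: {activation}")
```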

    Q9

    If $L$ is the number of layers of a neural network, then $dZ^{[L]} = A^{[L]} - Y$. True/False?

    Answer

    True

    Yes. The gradient of the output layer depends on the difference between the value computed during the forward propagation process and the target values.
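
    For reference, here is a short derivation in the setting assumed here (a sigmoid output unit trained with the binary cross-entropy loss, as in the course): for one example with $a^{[L]} = \sigma(z^{[L]})$ and loss $\mathcal{L} = -\,y\log a^{[L]} - (1-y)\log(1-a^{[L]})$,

    $$\frac{\partial \mathcal{L}}{\partial z^{[L]}} = \frac{\partial \mathcal{L}}{\partial a^{[L]}}\,\sigma'(z^{[L]}) = \left(-\frac{y}{a^{[L]}} + \frac{1-y}{1-a^{[L]}}\right)a^{[L]}\bigl(1-a^{[L]}\bigr) = a^{[L]} - y.$$

    Stacking the $m$ examples as columns then gives $dZ^{[L]} = A^{[L]} - Y$.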

    Q10

    For any mathematical function you can compute with an L-layered deep neural network with $N$ hidden units, there is a shallow neural network that requires only $\log N$ units, but it is very difficult to train. True/False?

    Answer

    False

    Correct. On the contrary, there are functions that an L-layered neural network can compute with a given number of hidden units, but for which a shallow neural network would need an exponentially larger number of hidden units.

    Q11

    There are certain functions with the following properties: (i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

    Answer

    True

    Q12

    A shallow neural network with a single hidden layer and 6 hidden units can compute any function that a neural network with 2 hidden layers and 6 hidden units can compute. True/False?

    Hint

    As seen during the lectures, there are functions you can compute with a "small" L-layer deep neural network that shallower networks require exponentially more hidden units to compute.

    Answer

    False

    Q13

    In the general case, if we are training with $m$ examples, what is the shape of $A^{[l]}$?

    1. $(n^{[l]}, m)$
    2. $(m, n^{[l]})$
    3. $(m, n^{[l+1]})$
    4. $(n^{[l+1]}, m)$

    Hint

    The number of rows in $A^{[l]}$ corresponds to the number of units in the $l$-th layer.

    Answer

    1

    Correct. Each column of $A^{[l]}$ holds the layer-$l$ activations for one example, so $A^{[l]}$ has shape $(n^{[l]}, m)$.
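
    A tiny NumPy check of this convention (assuming, as in the course, that examples are stacked as columns): with $W^{[l]}$ of shape $(n^{[l]}, n^{[l-1]})$ and $A^{[l-1]}$ of shape $(n^{[l-1]}, m)$, the activation $A^{[l]}$ comes out with shape $(n^{[l]}, m)$.

```python
import numpy as np

n_prev, n_l, m = 4, 3, 7                # illustrative sizes: n^[l-1], n^[l], and m examples
A_prev = np.random.randn(n_prev, m)     # A^[l-1], shape (n^[l-1], m)
W = np.random.randn(n_l, n_prev)        # W^[l],   shape (n^[l], n^[l-1])
b = np.zeros((n_l, 1))                  # b^[l],   broadcast across the m columns

A = np.tanh(W @ A_prev + b)
assert A.shape == (n_l, m)              # i.e. (n^[l], m) -- option 1
```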
    All the information provided is based on the Deep Learning Specialization | Coursera from DeepLearning.AI
