
Neural Networks and Deep Learning (3)

by Fresh Red 2024. 11. 14.

Table of Contents

    Neural Networks Basics

    Logistic Regression as a Neural Network


    Computation Graph

    [Figure] Blue arrows: forward propagation; red arrows: backward propagation.

    A computation graph organizes the computation of a function (such as a neural network's cost) into a forward pass (propagation) that evaluates the output step by step, followed by a backward pass (propagation) that computes the gradients of that output.

    This is like looking at a math formula and calculating step by step to get the answer.
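
    As a concrete illustration, consider J(a, b, c) = 3(a + bc), the small example used in the lecture videos. Below is a minimal Python sketch of one forward pass and one backward pass over this graph (the intermediate names u and v and the dvar-style names follow the course's conventions; the input values are just example numbers):

        # Forward pass: evaluate J = 3 * (a + b * c) one node at a time
        a, b, c = 5, 3, 2
        u = b * c          # u = 6
        v = a + u          # v = 11
        J = 3 * v          # J = 33

        # Backward pass: derivatives of J with respect to each node,
        # computed right to left with the chain rule (dvar = dJ/dvar)
        dv = 3             # dJ/dv
        da = dv * 1        # dJ/da = dJ/dv * dv/da = 3
        du = dv * 1        # dJ/du = dJ/dv * dv/du = 3
        db = du * c        # dJ/db = dJ/du * du/db = 6
        dc = du * b        # dJ/dc = dJ/du * du/dc = 9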

     

    One step of ________ propagation on a computation graph yields the derivative of the final output variable.

    Answer:

    Backward

    Derivatives with a Computation Graph


    We compute the derivative of the output variable (the function) with respect to each variable in the graph.

    By doing so, we can see how much each variable affects the output.

    This step of computing derivatives for each variable is called backpropagation.

    The point of backpropagation is to compute the derivatives of the output with respect to all variables, see how much each variable impacts the output, and then update the parameters using gradient descent.

    To calculate the derivative of a variable far away from the output, we use the chain rule.

    The chain rule comes from calculus: the derivative of a composite function is the product of the partial derivatives along the path from the output back to the desired variable.
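
    For example, in the J = 3(a + bc) graph sketched above, the derivative of J with respect to b multiplies the local derivatives along the path from b to J:

        \frac{\partial J}{\partial b}
        = \frac{\partial J}{\partial v} \cdot \frac{\partial v}{\partial u} \cdot \frac{\partial u}{\partial b}
        = 3 \cdot 1 \cdot c

    With c = 2, this gives dJ/db = 6, matching the code sketch above.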

     

    In this class, what does the coding convention dvar represent?

    Answer:

    The derivative of a final output variable with respect to various intermediate quantities.

    Logistic Regression Gradient Descent


    What is the simplified formula for the derivative of the loss with respect to z?

    Answer:

    a - y
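
    A minimal sketch of how this simplification shows up in code for a single training example (assuming two features x1 and x2 as in the lecture; all input values here are hypothetical):

        import math

        # One training example with two features (hypothetical values)
        x1, x2, y = 1.0, 2.0, 1
        w1, w2, b = 0.5, -0.5, 0.0

        # Forward pass: z = w1*x1 + w2*x2 + b, then the sigmoid activation
        z = w1 * x1 + w2 * x2 + b
        a = 1 / (1 + math.exp(-z))

        # Backward pass, using the simplification dL/dz = a - y
        dz = a - y
        dw1 = x1 * dz      # dL/dw1
        dw2 = x2 * dz      # dL/dw2
        db = dz            # dL/db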

    Gradient Descent on m Examples


    We take the average of the derivatives of each variable over the m examples.

    Compute the derivative of each variable on each example, sum them up, and divide by m.

     

    In the for loop depicted in the video, why is there only one dw variable (i.e. no i superscripts in the for loop)?

    Answer:

    The value of dw in the code is cumulative.
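
    A minimal sketch of that loop (the dataset values are hypothetical; note that dw and db are accumulated across all m examples and averaged afterwards, which is why they carry no i superscript):

        import numpy as np

        # Hypothetical dataset: m = 3 examples, n = 2 features
        X = np.array([[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]])
        Y = np.array([1, 0, 1])
        m, n = X.shape

        w, b = np.zeros(n), 0.0
        J, dw, db = 0.0, np.zeros(n), 0.0

        for i in range(m):
            z = np.dot(w, X[i]) + b
            a = 1 / (1 + np.exp(-z))
            J += -(Y[i] * np.log(a) + (1 - Y[i]) * np.log(1 - a))
            dz = a - Y[i]
            dw += X[i] * dz    # one cumulative dw for the whole dataset
            db += dz

        J, dw, db = J / m, dw / m, db / m   # average over the m examples

        # One gradient descent step (the learning rate is a hypothetical choice)
        alpha = 0.01
        w -= alpha * dw
        b -= alpha * db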

    Derivative of dL/dz

    Those interested in the math behind computing the dL/dz derivative can refer to the articles below.
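
    For reference, a short sketch of that derivation, using the course's cross-entropy loss L(a, y) = -(y log a + (1 - y) log(1 - a)) with a = σ(z):

        \frac{\partial L}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a},
        \qquad
        \frac{\partial a}{\partial z} = a(1 - a)

        \frac{\partial L}{\partial z}
        = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z}
        = \left( -\frac{y}{a} + \frac{1 - y}{1 - a} \right) a(1 - a)
        = -y(1 - a) + a(1 - y)
        = a - y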

    All the information provided is based on the Deep Learning Specialization from DeepLearning.AI on Coursera.
