Gradients and Gradient Descent
Gradients
Introduction to tangent planes
Just as computing the derivative gives the tangent line to a curve of one variable, a function of multiple variables has a tangent plane, whose slopes come from the function's derivatives.
The same slope information that defines the tangent plane is what gradient descent later uses to speed up optimization.
A tangent is a straight line or plane that touches a curve or curved surface at a point, but if extended does not cross it at that point.
To get the tangent plane, we slice the surface with vertical planes; each slice is a curve with its own tangent line, and the tangent plane is the plane that contains all of these tangent lines.
The slopes of these slices are computed with partial derivatives.
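Concretely, this construction gives the standard tangent-plane formula (stated here for reference, since the notes describe it only in words): at a point $(a, b)$,

$$z = f(a, b) + {\partial f\over \partial x}(a, b)\,(x - a) + {\partial f\over \partial y}(a, b)\,(y - b)$$

Each partial derivative is the slope of the tangent line in one slice, and the plane combines both slopes.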
Partial derivatives - Part 1
A partial derivative is the derivative of a function with respect to one variable, treating all other variables as constants.
Partial derivatives - Part 2
For $f(x, y) = 3x^2y^3$ (the function these quiz answers imply): ${\partial f\over \partial x} = 6xy^3$
Correct! Keeping $y$ constant and differentiating with respect to $x$ gives this result!

${\partial f\over \partial y} = 9x^2y^2$
Correct! Keeping $x$ constant and differentiating with respect to $y$ gives this result!
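As a quick check, here is a minimal sympy sketch (assuming the function $f(x, y) = 3x^2y^3$ recovered above) that reproduces both quiz answers:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = 3 * x**2 * y**3  # the function recovered from the quiz answers above

# Partial derivative: differentiate with respect to one variable,
# treating the other variable as a constant
df_dx = sp.diff(f, x)
df_dy = sp.diff(f, y)

print(df_dx)  # 6*x*y**3
print(df_dy)  # 9*x**2*y**2
```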

Gradients
A gradient is simply a vector that collects all of a function's partial derivatives.
Quiz answer: $\nabla f = \left[\begin{array}{c}4\\6\end{array}\right]$
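As a sketch, reusing the example function from the previous section (not the quiz above), the gradient just stacks the partial derivatives into a vector, which can then be evaluated at any point:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = 3 * x**2 * y**3  # example function from the previous section

# The gradient collects all partial derivatives into one vector
grad_f = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])

print(grad_f)                     # Matrix([[6*x*y**3], [9*x**2*y**2]])
print(grad_f.subs({x: 1, y: 1}))  # the gradient evaluated at (1, 1): [6, 9]
```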

Gradients and maxima/minima
To find a maximum/minimum point, we set the slope to zero in every direction, i.e. we solve ${\partial f\over \partial x} = 0$ and ${\partial f\over \partial y} = 0$ simultaneously (equivalently, $\nabla f = \mathbf{0}$).
Optimization with gradients: An example
Quiz 1: Using the expanded form of the function $f(x, y) = 85 - {1\over 90}x^2(x-6)y^2(y-6)$, written as $f(x, y) = 85-{1\over 90}x^3y^3 + {1\over 15}x^3y^2+{1\over 15}x^2y^3 - {2\over 5}x^2y^2$, find ${\partial f\over \partial x}$.
$-{1\over 30}x^2y^3 + {1\over 5}x^2y^2+{2\over 15}xy^3 - {4\over 5}xy^2$
Correct! This is the same expression the lecturer will show, though the lecture presents it in factored form, which makes it easier to find the zeros!
Quiz 2: Using the expanded form of the function $f(x, y) = 85 - {1\over 90}x^2(x-6)y^2(y-6)$, written as $f(x, y) = 85-{1\over 90}x^3y^3 + {1\over 15}x^3y^2+{1\over 15}x^2y^3 - {2\over 5}x^2y^2$, find ${\partial f\over \partial y}$.
$-{1\over 30}x^3y^2 + {2\over 15}x^3y+{1\over 5}x^2y^2 - {4\over 5}x^2y$
Correct! This is the same expression the lecturer will show, though the lecture presents it in factored form, which makes it easier to find the zeros!
Setting both partial derivatives to zero, we can solve for $x$ and $y$ and then substitute the solutions back into $f$ to identify the minimum point.
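A minimal sympy sketch of this step: factoring each partial derivative exposes its zeros, just as the lecture's factored form does, and substituting the interior critical point back into $f$ gives the minimum value.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = 85 - sp.Rational(1, 90) * x**2 * (x - 6) * y**2 * (y - 6)

# Factoring each partial derivative makes its zeros easy to read off
print(sp.factor(sp.diff(f, x)))  # -x*y**2*(x - 4)*(y - 6)/30
print(sp.factor(sp.diff(f, y)))  # -x**2*y*(x - 6)*(y - 4)/30

# The interior critical point where both vanish is (4, 4);
# substituting it back into f gives the minimum value
print(f.subs({x: 4, y: 4}))  # 3313/45, approximately 73.62
```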
Optimization using gradients - Analytical method
To minimize the sum-of-squares cost $E$, we set its partial derivatives to zero and solve.
Compute ${\partial E\over \partial m}$
$28m + 12b - 42$
Compute ${\partial E\over \partial b}$
$6b+12m-20$
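Since both partial derivatives are linear in $m$ and $b$, setting them to zero gives a 2×2 linear system: $28m + 12b = 42$ and $12m + 6b = 20$. A minimal numpy sketch solving it:

```python
import numpy as np

# Setting both partial derivatives to zero gives a linear system:
#   28m + 12b = 42
#   12m +  6b = 20
A = np.array([[28.0, 12.0],
              [12.0, 6.0]])
rhs = np.array([42.0, 20.0])

m, b = np.linalg.solve(A, rhs)
print(m, b)  # m = 0.5, b = 7/3 ≈ 2.333
```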
When solving these equations analytically is hard or expensive, we use gradient descent, which finds the minimum quickly by repeatedly stepping in the direction opposite the gradient.
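A minimal gradient-descent sketch for the same cost, using the partial derivatives computed above (the learning rate and iteration count are hand-picked values for this example):

```python
import numpy as np

def grad_E(m, b):
    # Gradient of the sum-of-squares cost, from the partials above
    return np.array([28 * m + 12 * b - 42, 12 * m + 6 * b - 20])

params = np.array([0.0, 0.0])  # initial guess for (m, b)
learning_rate = 0.01           # hand-picked step size for this sketch

# Repeatedly step in the direction opposite the gradient
for _ in range(10_000):
    params = params - learning_rate * grad_E(*params)

print(params)  # approaches the analytical solution m = 0.5, b = 7/3
```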
All the information provided is based on Calculus for Machine Learning and Data Science on Coursera, from DeepLearning.AI.