Calculus

Calculus for Machine Learning and Data Science (11) Optimization in Neural Networks and Newton’s Method
Newton’s Method
Newton’s method is an alternative to gradient descent. In principle, Newton’s method is used to find the zeros of a function (where $f(x) = 0$). The update simply subtracts the ratio $f(x)/f'(x)$ from the previous point $x$, where $f'(x)$ is the slope we calculate. Since Newton’s method finds a zero of a function $f$, it behaves like the ..
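As a rough illustration of the Newton’s method excerpt above, the update can be sketched in a few lines of Python. This is a minimal sketch, not the course's own code: the function name `newton_zero`, the tolerance, the iteration cap, and the example function $f(x) = x^2 - 2$ are all assumptions chosen for illustration.

```python
def newton_zero(f, df, x0, tol=1e-10, max_iter=50):
    """Approximate a zero of f, starting from x0 (hypothetical helper)."""
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0:
            # The Newton step divides by the slope, so it is undefined here.
            raise ZeroDivisionError("derivative is zero; Newton step undefined")
        x_next = x - f(x) / slope  # subtract f(x)/f'(x) from the previous point
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


# Assumed example: the positive zero of f(x) = x^2 - 2 is sqrt(2).
root = newton_zero(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
print(root)
```

To use the same idea for minimization, one would pass the derivative of the function as `f` and its second derivative as `df`, since a minimum sits where the derivative is zero.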
Calculus for Machine Learning and Data Science (10) Optimization in Neural Networks and Newton’s Method
Quiz
Q1. Given the single-layer perceptron described in the lectures: What should replace the question mark?
1. $w_1w_2+x_1x_2+b$
2. $w_1x_1+w_2x_2+b_1+b_2$
3. $w_1x_1+w_2x_2+b$
4. $w_1x_2+w_2x_1+b$
Answer: 3. Correct! In a single-layer perceptron, we evaluate a (weighted) linear combination of the inputs plus a constant term, representing the bias!
Q2. For a Reg..
Calculus for Machine Learning and Data Science (9) Optimization in Neural Networks and Newton’s Method
Optimization in Neural Networks
Regression with a perceptron
A perceptron can be seen as linear regression: the inputs are multiplied by weights. We output a prediction using the formula $wx + b$ and optimize the weights ($w$) and bias ($b$). We can think of a perceptron as a single node that performs the computation.
Regression with a percept..
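The perceptron-as-regression idea in the excerpt above can be sketched as follows. This is only an illustrative sketch under stated assumptions: a single input, a mean squared-error loss (with a factor of 1/2), plain gradient descent, and made-up data generated from $y = 2x + 1$. The name `train_perceptron` and the hyperparameters are hypothetical, not taken from the course.

```python
def train_perceptron(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit the prediction w*x + b by gradient descent on 0.5 * mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error of the single node on every sample.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the loss with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Assumed toy data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train_perceptron(xs, ys)
print(w, b)
```

On this noise-free toy data the learned weight and bias should approach 2 and 1 respectively.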
Calculus for Machine Learning and Data Science (8) Gradients and Gradient Descent
Gradient Descent
Optimization using Gradient Descent in one variable - Part 1
It’s hard to find the lowest point using the formula above. So we can try something different: test a point in each direction and move to whichever of the points is lowest. But this also takes a long time. The lecturer introduces a term called the "learning rate", denoted by $a$ ..
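For the one-variable gradient descent excerpt above, a minimal sketch might look like the following. It assumes the standard update of moving against the derivative scaled by the learning rate; the function name `gradient_descent_1d`, the step count, and the example function $f(x) = (x - 3)^2$ are illustrative assumptions rather than the lecture's own example.

```python
def gradient_descent_1d(df, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the derivative df, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * df(x)  # new point = old point - rate * slope
    return x


# Assumed example: f(x) = (x - 3)^2 has derivative 2(x - 3) and its minimum at x = 3.
x_min = gradient_descent_1d(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

A learning rate that is too large can make the iterates overshoot and diverge, while one that is too small makes progress slow, so the value used here is just one workable choice for this function.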
Calculus for Machine Learning and Data Science (7) Gradients and Gradient Descent
Quiz
Q1. Given that $f(x,y) = x^2y+3x^2$, find its derivative with respect to $x$, i.e. find ${\partial f\over \partial x}$.
Answer: $2xy+6x$
Q2. Given that $f(x,y)=xy^2+2x+3y$, its gradient, i.e. $\nabla f(x,y)$, is:
1. $\left[\begin{array}{c} 2xy+3\\ y^2+2 \end{array}\right]$
2. $\left[\begin{array}{c} 2xy\\ 2x+3 \end{array}\right]$
3. $\left[\begin{array}{c} y^2+2\\ 2xy+3 \end{array..
Calculus for Machine Learning and Data Science (6) Gradients and Gradient Descent
Gradients
Introduction to Tangent planes
Just as we get a tangent line to a curve in one variable by computing the derivative, for a function of several variables we get a tangent plane, which plays the role of the slope. The tangent plane is found from the partial derivatives; together these form the gradient, which gradient descent uses to speed up the optimization. A tangent is a straight line or plane that touches a curve or curved ..
Calculus for Machine Learning and Data Science (5) Derivatives and Optimization
Optimization
Introduction to optimization
We use derivatives for optimization. Using the sauna example, we want to find the coolest place. Optimization is when we want to find the maximum or the minimum of a function.
Local minima: Low points with zero slope that are not the lowest point (orange arrows in the 3rd slide)
Global minima: Lowest point with zero slope (blue ..
Calculus for Machine Learning and Data Science (4) Derivatives and Optimization
Quiz
Q1. Consider the following lines. What can be said about the slopes at their intersection?
1. Slope(Line 1) > Slope(Line 2).
2. Slope(Line 1) < Slope(Line 2).
3. Slope(Line 1) = Slope(Line 2).
4. It is impossible to infer anything from the given information.
Answer: 2. Correct! Line 2 is steeper than Line 1, therefore its slope is higher.
Q2. Given the following graph, what is the slope of the line? You..
