Derivatives and Optimization
Learning Objectives
After completing this course, you will be able to:
- Perform gradient descent in neural networks with different activation and cost functions
- Visually interpret the differentiation of different types of functions commonly used in machine learning
- Approximately optimize different types of functions commonly used in machine learning using first-order (gradient descent) and second-order (Newton’s method) iterative methods
- Analytically optimize different functions commonly used in machine learning using properties of derivatives and gradients.
Derivatives
Machine Learning Motivation
Derivatives are used to optimize functions, that is, to find their maximum or minimum.
Model training is an optimization problem: it searches for the parameters that give the best result.
Training starts with a random line and repeatedly adjusts it until it either fits the points as closely as possible (a regression problem) or separates the points (a classification problem). This process is called optimization.
Motivation to Derivatives - Part I
The idea is similar to velocity: a car's velocity can change from moment to moment, and its velocity at one particular instant is called the instantaneous velocity, which is a derivative.
A derivative is the instantaneous rate of change of a function.
In the velocity example, the function is the distance traveled over time and its derivative is the velocity.
No
You’re right! The car is not moving at a constant speed: between 10 and 15 seconds (a 5-second interval) it traveled 80 meters, but between 15 and 20 seconds (also a 5-second interval) it traveled 73 meters.
No
Correct! You do not have enough data to determine the velocity at t = 12.5 seconds; you can only tell the average velocity between t = 10 and t = 15 seconds.
Choice
- Velocity = 2.5 m/s
- Velocity = 16 m/s
- Velocity = 80 m/s
- Velocity = 17.2 m/s
Answer
2
Correct. Velocity = distance traveled / time. To find the average velocity between 10 and 15 seconds, use the distances the table gives at t = 10 s and t = 15 s: velocity = (y(15) - y(10)) / 5 s = (202 m - 122 m) / 5 s = 80 m / 5 s = 16 m/s.
To estimate the velocity at t = 12.5 seconds, use the distance traveled from t = 12 s to t = 13 s: (y(13) - y(12)) / 1 s = 15 m/s.
This is only a rough estimate of the velocity at t = 12.5 seconds. To get better estimates, we need data at finer time intervals to calculate the slope.
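The effect of finer intervals can be sketched in Python. The distance function below is a hypothetical stand-in (it is not the car data from the table), and the helper name `average_velocity` is an assumption made for illustration. As the interval starting at t = 12.5 shrinks, the average velocity settles toward the instantaneous velocity, which for this made-up function is 12.5 m/s.

```python
def distance(t):
    """Hypothetical distance in meters after t seconds (not the course data)."""
    return 0.5 * t ** 2


def average_velocity(f, t_start, t_end):
    """Distance traveled divided by elapsed time over [t_start, t_end]."""
    return (f(t_end) - f(t_start)) / (t_end - t_start)


t = 12.5
for width in (5.0, 1.0, 0.1, 0.001):
    # Average velocity over an interval that starts at t and keeps shrinking.
    v = average_velocity(distance, t, t + width)
    print(f"interval width {width:>6}: average velocity = {v:.4f} m/s")
```

The wide interval gives a noticeably different value from the narrow one, which is why a 5-second or 1-second table can only give a rough estimate.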
Derivatives and Tangents
A measure of how fast the distance is changing with respect to time is called the instantaneous rate of change, and it is the slope of the distance curve at a given point.
The instantaneous rate of change is a measure of how fast the relation between two variables is changing at any point.
Together, these describe the derivative: the derivative of a function at a point is precisely the slope of the tangent to the function's graph at that point.
A tangent is a straight line or plane that touches a curve or curved surface at a point, but if extended does not cross it at that point.
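As a minimal sketch of the link between the derivative and the tangent, the snippet below estimates the slope of a curve at a point and builds the tangent line from it. The curve `f(x) = x**2`, the step size `h`, and the helper names `derivative` and `tangent_line` are all assumptions chosen for illustration; the slope is approximated with a central difference rather than computed exactly.

```python
def f(x):
    """Hypothetical example curve."""
    return x ** 2


def derivative(f, x, h=1e-5):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def tangent_line(f, x0):
    """Return the tangent line to f at x0 as a function of x."""
    slope = derivative(f, x0)
    y0 = f(x0)
    return lambda x: y0 + slope * (x - x0)


line = tangent_line(f, 3.0)
print(derivative(f, 3.0))   # close to 6, the slope of x**2 at x = 3
print(line(3.0), f(3.0))    # the line and the curve agree at x = 3
```

The tangent line shares both the value and the slope of the curve at the chosen point, which is what makes it the best straight-line approximation there.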
Slopes, maxima, and minima
Where was the velocity of your car zero?
Choice
- t = 10 min
- t = 16 min
- t = 22 min
- t = 24 min
- t = 34 min
- t = 40 min
Answer
All except t = 24 min
Correct. The velocity at each of these times is zero.
At what time was the car farthest from its starting point? (When the line moves upward the car is going forward; when it moves downward the car is going backward.)
Choice
- t = 10 min
- t = 16 min
- t = 22 min
- t = 24 min
- t = 35 min
- t = 40 min
Answer
1
Correct. The distance here is 50 km, measured from the car’s starting point.
The car is farthest from its starting point at a moment when it is stopped, that is, when its velocity is zero. In general, if we want to find the maximum or minimum of a smooth function (away from the endpoints of its interval), it occurs at one of the points where the derivative is zero.
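This idea can be sketched in Python: find the points where the derivative is zero, then compare the function values at those points. The `position` function is hypothetical (it is not the graph from the course), and the helpers `derivative` and `find_zero` are illustrative names. The sketch uses a numerical derivative and bisection, and it ignores the endpoints of the interval.

```python
def position(t):
    """Hypothetical distance from the starting point at time t."""
    return t ** 3 - 6 * t ** 2 + 9 * t


def derivative(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def velocity(t):
    """Velocity is the derivative of position."""
    return derivative(position, t)


def find_zero(g, lo, hi, tol=1e-9):
    """Bisection: locate a zero of g in [lo, hi], assuming g changes sign there."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if (g_lo < 0) == (g_mid < 0):
            lo, g_lo = mid, g_mid   # zero lies in the upper half
        else:
            hi = mid                # zero lies in the lower half
    return (lo + hi) / 2


# Scan a coarse grid for sign changes in the velocity, then refine each one.
grid = [0.25 + 0.5 * k for k in range(7)]  # 0.25, 0.75, ..., 3.25
stops = [
    find_zero(velocity, a, b)
    for a, b in zip(grid, grid[1:])
    if (velocity(a) < 0) != (velocity(b) < 0)
]

# Not every stop is a maximum, so compare the positions at the stops.
farthest = max(stops, key=position)
print("velocity is zero near:", stops)
print("farthest from the start near t =", farthest)
```

For this made-up function the velocity is zero near t = 1 and t = 3, but only the first is a maximum; the second is a minimum. A zero derivative marks a candidate, and the function values decide which candidate wins.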
Derivatives and their notation
All the information provided is based on the Coursera course Calculus for Machine Learning and Data Science from DeepLearning.AI.