About this Course
In the second course of the Deep Learning Specialization, you will open the deep learning black box to systematically understand the processes that drive performance and generate good results.
By the end, you will learn the best practices to train and develop test sets and analyze bias/variance for building deep learning applications; be able to use standard neural network techniques such as initialization, L2 and dropout regularization, hyperparameter tuning, batch normalization, and gradient checking; implement and apply a variety of optimization algorithms, such as mini-batch gradient descent, Momentum, RMSprop and Adam, and check for their convergence; and implement a neural network in TensorFlow.
The Deep Learning Specialization is our foundational program that will help you understand the capabilities, challenges, and consequences of deep learning and prepare you to participate in developing leading-edge AI technology. It provides a pathway for you to gain the knowledge and skills to apply machine learning to your work, level up your technical career, and take the definitive step in the world of AI.
Practical Aspects of Deep Learning
Discover and experiment with a variety of different initialization methods, apply L2 regularization and dropout to avoid model overfitting, and then apply gradient checking to identify errors in a fraud detection model.
Learning Objectives
- Give examples of how different types of initializations can lead to different results
- Examine the importance of initialization in complex neural networks
- Explain the difference between train/dev/test sets
- Diagnose the bias and variance issues in your model
- Assess the right time and place for using regularization methods such as dropout or L2 regularization
- Explain Vanishing and Exploding gradients and how to deal with them
- Use gradient checking to verify the accuracy of your backpropagation implementation
- Apply zeros initialization, random initialization, and He initialization
- Apply regularization to a deep-learning model
Setting up your Machine Learning Application
Train / Dev / Test sets
Applying machine learning to a problem is highly iterative: no single set of hyperparameters is optimal across all datasets and models.
A set of hyperparameters that performs very well on image data is not guaranteed to do the same on other datasets.
One tip for experimenting efficiently is to start small and scale up.
Your dataset can be divided into 3 parts:
- Train set: Data that you use to train a model
- Hold out cross-validation (development “dev”) set: Data that you use to validate how well you trained your model
- Test set: Final evaluation of the model on unseen data to estimate how well it generalizes
Small data (roughly 100–10,000 examples): traditionally a 60/20/20 split was used
Big data (1 million+ examples): instead of a 60/20/20 split, about 10,000 examples each can be enough for the dev and test sets, giving a ratio of roughly 98/1/1 so most of the data goes to training (see the split sketch below)
Training sets can contain, for example, higher-resolution cat pictures scraped from web pages.
Dev/test sets can contain lower-resolution cat pictures uploaded by users of your app (we want the dev/test distribution to match the inputs the model will see once deployed).
- Make sure the dev and test sets come from the same distribution
- The training set doesn't necessarily have to follow the same distribution as the dev/test sets, since training is data-hungry
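Below is a minimal sketch of these splits in NumPy. The dataset sizes, feature dimension, and exact fractions are illustrative assumptions, not values from the course.

```python
import numpy as np

def split_dataset(X, y, dev_frac, test_frac, seed=0):
    """Shuffle the examples once, then carve out dev and test slices."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_dev = int(len(X) * dev_frac)
    n_test = int(len(X) * test_frac)
    dev_idx = idx[:n_dev]
    test_idx = idx[n_dev:n_dev + n_test]
    train_idx = idx[n_dev + n_test:]
    return (X[train_idx], y[train_idx]), (X[dev_idx], y[dev_idx]), (X[test_idx], y[test_idx])

# Small dataset (~10,000 examples): a traditional 60/20/20 split.
X_small, y_small = np.random.rand(10_000, 64), np.random.randint(0, 2, 10_000)
train, dev, test = split_dataset(X_small, y_small, dev_frac=0.20, test_frac=0.20)

# Large dataset (1M+ examples): ~10,000 examples each for dev and test (~98/1/1).
X_big, y_big = np.random.rand(1_000_000, 64), np.random.randint(0, 2, 1_000_000)
train, dev, test = split_dataset(X_big, y_big, dev_frac=0.01, test_frac=0.01)
```

Shuffling once before slicing avoids any ordering bias from how the data was collected.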
Bias / Variance
High bias = underfitting
High variance = overfitting
In deep learning, we diagnose bias and variance from the train set and dev set errors, since we can't plot the decision boundary for high-dimensional data.
Example:
| Cat classifier | High variance | High bias | High bias & high variance | Low bias & low variance |
|---|---|---|---|---|
| Train set error | 1% | 15% | 15% | 0.5% |
| Dev set error | 11% | 16% | 30% | 1% |
How we read these figures depends on the optimal (Bayes) error:
- If the Bayes error is about 0%, the last column (0.5% / 1%) is the ideal case
- If the Bayes error is about 15% (e.g. very blurry images), then the high-bias column (15% / 16%) is actually fine as well
High bias together with high variance may sound odd, since we usually think of bias and variance as a trade-off.
In high dimensions, however, a model can underfit some regions of the data while overfitting others, so both can occur at once.
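As a hedged illustration, the rule of thumb behind the table can be written as a small helper; the `gap` threshold and the default Bayes error of 0 are assumptions made for this example, not values prescribed by the course.

```python
def diagnose(train_error, dev_error, bayes_error=0.0, gap=0.05):
    """Rough bias/variance read-out from train/dev errors (values in [0, 1]).

    `gap` is an illustrative threshold for what counts as a large difference."""
    high_bias = (train_error - bayes_error) > gap      # avoidable bias: far from Bayes error
    high_variance = (dev_error - train_error) > gap    # generalization gap: train vs. dev
    if high_bias and high_variance:
        return "high bias & high variance"
    if high_bias:
        return "high bias (underfitting)"
    if high_variance:
        return "high variance (overfitting)"
    return "low bias & low variance"

# The four columns of the table above, assuming Bayes error ~ 0%:
print(diagnose(0.01, 0.11))   # high variance
print(diagnose(0.15, 0.16))   # high bias
print(diagnose(0.15, 0.30))   # high bias & high variance
print(diagnose(0.005, 0.01))  # low bias & low variance
```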
Basic Recipe for Machine Learning
When you find high bias (poor training set performance):
- Try a bigger network (more hidden units, more layers)
- Train longer
- Try a different neural network architecture
Once the bias is low enough, check the variance; if it is high (poor dev set performance):
- Get more data
- Add regularization
- Try a different neural network architecture
Bias/variance trade-off:
In traditional machine learning, reducing one tended to increase the other: if bias goes down, variance typically goes up, and vice versa. In the deep learning era, a bigger (well-regularized) network and more data let you drive down bias and variance more independently, which is why the recipe above treats them as two separate problems.
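The recipe can also be read as a simple decision rule over the current error rates. A minimal sketch follows; the `tol` tolerance and the assumption of Bayes error near 0% are illustrative choices, not part of the course.

```python
def basic_recipe_step(train_error, dev_error, bayes_error=0.0, tol=0.02):
    """One decision in the recipe: address bias first, then variance (tol is an illustrative tolerance)."""
    if train_error - bayes_error > tol:
        return "reduce bias: bigger network, train longer, or try another architecture"
    if dev_error - train_error > tol:
        return "reduce variance: more data, regularization, or try another architecture"
    return "stop: training and dev performance are both acceptable"

# Walking the cat-classifier columns through the recipe, assuming Bayes error ~ 0%:
for tr, dv in [(0.15, 0.30), (0.01, 0.11), (0.005, 0.01)]:
    print(f"train {tr:.1%}, dev {dv:.1%} -> {basic_recipe_step(tr, dv)}")
```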
All the information provided is based on the Deep Learning Specialization | Coursera from DeepLearning.AI