[Advanced Learning Algorithms] Diagnostics / Bias & Variance - 3
ghwangbo · 2024. 1. 16. 20:24

1. Advice for applying machine learning
1-1. Intuition
In order to build an efficient machine learning model, we have to make good decisions.
Then, how do we make a good decision?
Ex)
Problem: We implemented regularized linear regression on housing prices, but the accuracy of the model is low.
Possible answers: get more training examples, try smaller or larger sets of features, add polynomial features, or increase or decrease lambda (the regularization parameter).
Among these possible answers, we have to carry out diagnostics to find the best answer.
1-2. Ways of Diagnostics
1. Evaluating a model
Case 1: The model fits the training data well but fails to generalize to new examples that are not in the training set.
Way 1: Split the examples into a training set and a test set (e.g. 70% / 30%), then compute the test error:
J_test(w, b) = 1 / (2 * m_test) * Σ_{i=1}^{m_test} (f_{w,b}(x_test^(i)) - y_test^(i))^2
Application: For classification, we can instead compute the fraction of the test set and of the training set that was misclassified.
Way 2: Split the examples into three sets: a training set, a cross-validation set, and a test set.
Application: Pick the model with the lowest cross-validation error, then estimate its generalization error on the test set.
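The three-way split and model selection above can be sketched in numpy. This is a minimal illustration on synthetic data: the dataset, the 60/20/20 split, and the degree range 1–4 are all my own illustrative choices, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (illustrative, not from the course)
X = rng.uniform(0, 4, 100)
y = 1.5 * X**2 - X + rng.normal(0, 1.0, 100)

# 60% / 20% / 20% split: training / cross-validation / test
idx = rng.permutation(len(X))
tr, cv, te = idx[:60], idx[60:80], idx[80:]

def j(w, x, t):
    """Squared-error cost J = 1/(2m) * sum((f(x) - y)^2)."""
    return np.mean((np.polyval(w, x) - t) ** 2) / 2

# Fit polynomial models of degree 1..4 on the training set and
# select the one with the lowest cross-validation error
best = min(
    (np.polyfit(X[tr], y[tr], d) for d in range(1, 5)),
    key=lambda w: j(w, X[cv], y[cv]),
)

# Report generalization error only once, on the held-out test set
print("J_test =", j(best, X[te], y[te]))
```

Keeping the test set out of model selection is the point of the third split: the cross-validation error already steered a choice, so only the untouched test set gives an unbiased estimate.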
2. Bias and Variance
2-1 Diagnosing Bias and Variance
There are two ways to diagnose a bias or variance problem in a machine learning model.
Way 1: By looking at a plot of the fitted model.
Way 2: By looking at the model's performance on the training set and the cross-validation set.
Underfit / Bias: J_train is high, J_cv is high
Good fit: J_train is low, J_cv is low
Overfit / Variance: J_train is low, J_cv is high
A plot of J_train and J_cv against the degree of the polynomial shows the model's performance clearly.
As the degree of the polynomial increases, the model overfits.
As the degree of the polynomial decreases, the model underfits.
Summary
High Bias (Underfit)
- J_train will be high
- J_train ≈ J_cv
High Variance (Overfit)
- J_cv >> J_train; J_train may be low
High Bias and High Variance
- J_train will be high
- J_cv >> J_train
2-2 Establishing a baseline level of performance
But how are we going to determine whether J_train or J_cv is high or low?
We can measure them by establishing a baseline level of performance.
Example) Speech Recognition
After training and testing the model, we got a training error of 10.8% and a cross-validation error of 14.8%.
Q: How can we judge the model's performance from these errors?
A: We have to take into account how high the training error is compared to human-level performance.
What is the level of error you can reasonably hope to get?
- Human level of performance
- Competing algorithms performance
- Guess based on experience
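The baseline comparison can be written as a small helper: compare the training error to the baseline (bias), and the cross-validation error to the training error (variance). The baseline value and the 0.5-point threshold below are my own illustrative choices, not figures from the course.

```python
def diagnose(baseline, j_train, j_cv, gap=0.5):
    """Classify bias/variance by comparing errors (in percentage points)
    against a baseline level of performance, e.g. human-level error.
    `gap` is an illustrative threshold, not a value from the course."""
    high_bias = (j_train - baseline) > gap   # train error far above baseline
    high_variance = (j_cv - j_train) > gap   # CV error far above train error
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias (underfit)"
    if high_variance:
        return "high variance (overfit)"
    return "looks fine"

# Speech-recognition example from the text, with a hypothetical 10.6%
# human-level baseline: the 0.2-point gap to baseline is small, but the
# 4.0-point train-to-CV gap is large.
print(diagnose(10.6, 10.8, 14.8))  # → high variance (overfit)
```

With a baseline in hand, a 10.8% training error stops looking "high": the real problem is the gap between 10.8% and 14.8%.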
3. Learning Curve
A learning curve is a graph that plots the training error and the cross-validation error as a function of the training set size.
We can easily diagnose bias and variance problems using this graph.
Underfitting Case
This graph uses a quadratic function as an example.
As the training set size increases, the training error tends to flatten out beyond a certain point.
This implies that the algorithm's performance will stop improving past that point, no matter how much you increase the training set.
It also illustrates that as the training set grows, it becomes harder for the algorithm to fit every training example, so J_train rises.
Overfitting Case
As the learning curve for this case shows, increasing the training set size can help improve the model's performance.
***Downside of plotting learning curves
- It can be too expensive to train many different models, each on a different subset of the training set.
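A learning curve can be generated by exactly the expensive procedure noted above: one full training run per subset size. Here is a minimal numpy sketch on synthetic linear data (the data and the subset sizes are illustrative assumptions of mine).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 4, 200)
y = 2.0 * X + rng.normal(0, 1.0, 200)  # data a straight line fits well

# Hold out a fixed cross-validation set
X_tr, y_tr, X_cv, y_cv = X[:150], y[:150], X[150:], y[150:]

def mse(w, x, t):
    """Squared-error cost J = 1/(2m) * sum((f(x) - y)^2)."""
    return np.mean((np.polyval(w, x) - t) ** 2) / 2

# Train a degree-1 model on growing subsets of the training data and
# record both errors -- one complete training run per subset size.
for m in (10, 40, 80, 150):
    w = np.polyfit(X_tr[:m], y_tr[:m], 1)
    print(f"m={m:3d}  J_train={mse(w, X_tr[:m], y_tr[:m]):.3f}  "
          f"J_cv={mse(w, X_cv, y_cv):.3f}")
```

Typically J_train rises and J_cv falls as m grows, and the two curves converge toward the irreducible noise level.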
4. Different approaches to improve the model
Problem:
The algorithm makes large errors in its predictions. What do you try next?
Answer:
- Get more training examples → fixes high variance
- Try smaller sets of features → fixes high variance
- Try getting additional features → fixes high bias
- Try adding polynomial features → fixes high bias
- Try decreasing lambda (regularization parameter) → fixes high bias
- Try increasing lambda (regularization parameter) → fixes high variance
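The two lambda remedies above can be seen directly in a regularization sweep. This sketch fits degree-6 polynomial features to a noisy quadratic using closed-form ridge regression; the data, degrees, and lambda values are illustrative choices of mine, not from the course.

```python
import numpy as np

rng = np.random.default_rng(2)

# Noisy quadratic data with degree-6 polynomial features, so the
# unregularized fit tends to overfit (high variance).
x = rng.uniform(-1, 1, 30)
y = x**2 + rng.normal(0, 0.1, 30)
X = np.vander(x, 7)                      # columns x^6 ... x^0
x_cv = rng.uniform(-1, 1, 30)
y_cv = x_cv**2 + rng.normal(0, 0.1, 30)
X_cv = np.vander(x_cv, 7)

def ridge(X, y, lam):
    """Closed-form regularized linear regression: (X^T X + lam*I)^-1 X^T y.
    (Sketch: regularizes the intercept term too, for simplicity.)"""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Sweep lambda and watch the train/CV errors trade off
for lam in (0.0, 0.01, 1.0, 100.0):
    w = ridge(X, y, lam)
    j_tr = np.mean((X @ w - y) ** 2) / 2
    j_cv = np.mean((X_cv @ w - y_cv) ** 2) / 2
    print(f"lambda={lam:>6}  J_train={j_tr:.4f}  J_cv={j_cv:.4f}")
```

Increasing lambda always raises J_train (it constrains the fit), but a moderate value typically lowers J_cv, which is why lambda moves bias and variance in opposite directions.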
5. Bias/Variance and Neural Networks
When creating a model, we have to balance its complexity carefully (between high bias and high variance).
However, neural networks offer a way out of this dilemma.
- Large neural networks are low-bias machines.
- If we make the neural network large enough, it will fit the training set accurately.
- A larger neural network will usually do as well as or better than a smaller one, so long as regularization is chosen appropriately.
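The last point can be tried out with scikit-learn's `MLPRegressor`, whose `alpha` parameter is L2 regularization. The data and all hyperparameters below are illustrative assumptions of mine, not values from the course.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(0, 0.05, 200)

# A small network vs. a much larger one with stronger L2 regularization
small = MLPRegressor(hidden_layer_sizes=(4,), alpha=1e-4,
                     max_iter=1000, random_state=0).fit(X, y)
large = MLPRegressor(hidden_layer_sizes=(64, 64), alpha=1e-2,
                     max_iter=1000, random_state=0).fit(X, y)

print("small net R^2:", small.score(X, y))
print("large net R^2:", large.score(X, y))
```

On a proper cross-validation set the well-regularized large network usually matches or beats the small one; the main cost is computation, not accuracy.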
Neural Network Diagnostic Guideline
void neuralnetwork() {
if(Does it do well on the training set?) {
if(Does it do well on the CV set?)
//Done
return;
}
else {
//More data
neuralnetwork();
}
}
else {
//need bigger network
neuralnetwork();
}
}