[Advanced Learning Algorithms] Diagnostics / Bias & Variance - 3
ghwangbo · 2024. 1. 16. 20:24

1. Advice for applying machine learning
1-1. Intuition
In order to build an efficient machine learning model, we have to make good decisions.
Then, how do we make a good decision?
Ex)
Problem: We implemented regularized linear regression on housing prices, but the accuracy of the model is low.
Possible answers: get more training examples, try smaller or larger sets of features, add polynomial features, or increase or decrease lambda (the regularization parameter).
Among these possible answers, we have to carry out diagnostics to find the best answer.
1-2. Ways of Diagnostics
1. Evaluating a model
Case 1: The model fits the training data well but fails to generalize to new examples that are not in the training set.
Way 1: Split the examples into a training set and a test set (e.g. 70% / 30%), then compute the test error:
J_test(w, b) = 1 / (2 * m_test) * Σ_{i=1}^{m_test} (f_{w,b}(x_test^(i)) - y_test^(i))^2
Application: For classification, we can instead compute the fraction of the test set and of the training set that was misclassified.
Way 2: Split the examples into three sets: a training set, a cross-validation set, and a test set.
Application: Pick the model with the lowest cross-validation error, then estimate its generalization error on the test set.
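The three-way split and model selection above can be sketched in numpy. This is a minimal illustration on synthetic data: the dataset, the 60/20/20 split, and the degree range 1–4 are all my own illustrative choices, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (illustrative, not from the course)
X = rng.uniform(0, 4, 100)
y = 1.5 * X**2 - X + rng.normal(0, 1.0, 100)

# 60% / 20% / 20% split: training / cross-validation / test
idx = rng.permutation(len(X))
tr, cv, te = idx[:60], idx[60:80], idx[80:]

def j(w, x, t):
    """Squared-error cost J = 1/(2m) * sum((f(x) - y)^2)."""
    return np.mean((np.polyval(w, x) - t) ** 2) / 2

# Fit polynomial models of degree 1..4 on the training set and
# select the one with the lowest cross-validation error
best = min(
    (np.polyfit(X[tr], y[tr], d) for d in range(1, 5)),
    key=lambda w: j(w, X[cv], y[cv]),
)

# Report generalization error only once, on the held-out test set
print("J_test =", j(best, X[te], y[te]))
```

Keeping the test set out of model selection is the point of the third split: the cross-validation error already steered a choice, so only the untouched test set gives an unbiased estimate.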
2. Bias and Variance
2-1 Diagnosing Bias and Variance
There are two ways to diagnose a bias or variance problem in a machine learning model.
Way 1: By looking at a plot of the fitted model.
Way 2: By looking at the model's performance on the training set and the cross-validation set.
Underfit / Bias: J_train is high, J_cv is high
Good fit: J_train is low, J_cv is low
Overfit / Variance: J_train is low, J_cv is high
A plot of J_train and J_cv against the degree of the polynomial shows the model's performance clearly.
As the degree of the polynomial increases, the model overfits.
As the degree of the polynomial decreases, the model underfits.
Summary
High Bias (Underfit)
- J_train will be high
- J_train ≈ J_cv
High Variance (Overfit)
- J_cv >> J_train; J_train may be low
High Bias and High Variance
- J_train will be high
- J_cv >> J_train
2-2 Establishing a baseline level of performance
But how are we going to determine whether J_train or J_cv is high or low?
We can measure them by establishing a baseline level of performance.
Example) Speech Recognition
After training and testing the model, we got a training error of 10.8% and a cross-validation error of 14.8%.
Q: How can we judge the model's performance from these errors?
A: We have to take into account how high the training error is compared to human-level performance.
What is the level of error you can reasonably hope to get?
- Human level of performance
- Competing algorithms performance
- Guess based on experience
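The baseline comparison can be written as a small helper: compare the training error to the baseline (bias), and the cross-validation error to the training error (variance). The baseline value and the 0.5-point threshold below are my own illustrative choices, not figures from the course.

```python
def diagnose(baseline, j_train, j_cv, gap=0.5):
    """Classify bias/variance by comparing errors (in percentage points)
    against a baseline level of performance, e.g. human-level error.
    `gap` is an illustrative threshold, not a value from the course."""
    high_bias = (j_train - baseline) > gap   # train error far above baseline
    high_variance = (j_cv - j_train) > gap   # CV error far above train error
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias (underfit)"
    if high_variance:
        return "high variance (overfit)"
    return "looks fine"

# Speech-recognition example from the text, with a hypothetical 10.6%
# human-level baseline: the 0.2-point gap to baseline is small, but the
# 4.0-point train-to-CV gap is large.
print(diagnose(10.6, 10.8, 14.8))  # → high variance (overfit)
```

With a baseline in hand, a 10.8% training error stops looking "high": the real problem is the gap between 10.8% and 14.8%.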
3. Learning Curve
A learning curve is a graph that plots the training error and the cross-validation error as a function of the training set size.
We can easily diagnose bias and variance problems using this graph.
Underfitting Case
This graph uses a quadratic function as an example.
As the training set size increases, the training error tends to flatten out beyond a certain point.
This implies that the algorithm's performance will stop improving past that point, no matter how much you increase the training set.
It also illustrates that as the training set grows, it becomes harder for the algorithm to fit every training example, so J_train rises.
Overfitting Case
As the learning curve for this case shows, increasing the training set size can help improve the model's performance.
***Downside of plotting learning curves
- It can be too expensive to train many different models, each on a different subset of the training set.
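A learning curve can be generated by exactly the expensive procedure noted above: one full training run per subset size. Here is a minimal numpy sketch on synthetic linear data (the data and the subset sizes are illustrative assumptions of mine).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 4, 200)
y = 2.0 * X + rng.normal(0, 1.0, 200)  # data a straight line fits well

# Hold out a fixed cross-validation set
X_tr, y_tr, X_cv, y_cv = X[:150], y[:150], X[150:], y[150:]

def mse(w, x, t):
    """Squared-error cost J = 1/(2m) * sum((f(x) - y)^2)."""
    return np.mean((np.polyval(w, x) - t) ** 2) / 2

# Train a degree-1 model on growing subsets of the training data and
# record both errors -- one complete training run per subset size.
for m in (10, 40, 80, 150):
    w = np.polyfit(X_tr[:m], y_tr[:m], 1)
    print(f"m={m:3d}  J_train={mse(w, X_tr[:m], y_tr[:m]):.3f}  "
          f"J_cv={mse(w, X_cv, y_cv):.3f}")
```

Typically J_train rises and J_cv falls as m grows, and the two curves converge toward the irreducible noise level.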
4. Different approaches to improve the model
Problem:
The algorithm makes large errors in its predictions. What do you try next?
Answer:
- Get more training examples → fixes high variance
- Try smaller sets of features → fixes high variance
- Try getting additional features → fixes high bias
- Try adding polynomial features → fixes high bias
- Try decreasing lambda (regularization parameter) → fixes high bias
- Try increasing lambda (regularization parameter) → fixes high variance
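The two lambda remedies above can be seen directly in a regularization sweep. This sketch fits degree-6 polynomial features to a noisy quadratic using closed-form ridge regression; the data, degrees, and lambda values are illustrative choices of mine, not from the course.

```python
import numpy as np

rng = np.random.default_rng(2)

# Noisy quadratic data with degree-6 polynomial features, so the
# unregularized fit tends to overfit (high variance).
x = rng.uniform(-1, 1, 30)
y = x**2 + rng.normal(0, 0.1, 30)
X = np.vander(x, 7)                      # columns x^6 ... x^0
x_cv = rng.uniform(-1, 1, 30)
y_cv = x_cv**2 + rng.normal(0, 0.1, 30)
X_cv = np.vander(x_cv, 7)

def ridge(X, y, lam):
    """Closed-form regularized linear regression: (X^T X + lam*I)^-1 X^T y.
    (Sketch: regularizes the intercept term too, for simplicity.)"""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Sweep lambda and watch the train/CV errors trade off
for lam in (0.0, 0.01, 1.0, 100.0):
    w = ridge(X, y, lam)
    j_tr = np.mean((X @ w - y) ** 2) / 2
    j_cv = np.mean((X_cv @ w - y_cv) ** 2) / 2
    print(f"lambda={lam:>6}  J_train={j_tr:.4f}  J_cv={j_cv:.4f}")
```

Increasing lambda always raises J_train (it constrains the fit), but a moderate value typically lowers J_cv, which is why lambda moves bias and variance in opposite directions.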
5. Bias/Variance and Neural Networks
When creating a model, we have to balance its complexity carefully (between high bias and high variance).
However, neural networks offer a way out of this dilemma.
- Large neural networks are low-bias machines.
- If we make the neural network large enough, it will fit the training set accurately.
- A larger neural network will usually do as well as or better than a smaller one, so long as regularization is chosen appropriately.
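The last point can be tried out with scikit-learn's `MLPRegressor`, whose `alpha` parameter is L2 regularization. The data and all hyperparameters below are illustrative assumptions of mine, not values from the course.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(0, 0.05, 200)

# A small network vs. a much larger one with stronger L2 regularization
small = MLPRegressor(hidden_layer_sizes=(4,), alpha=1e-4,
                     max_iter=1000, random_state=0).fit(X, y)
large = MLPRegressor(hidden_layer_sizes=(64, 64), alpha=1e-2,
                     max_iter=1000, random_state=0).fit(X, y)

print("small net R^2:", small.score(X, y))
print("large net R^2:", large.score(X, y))
```

On a proper cross-validation set the well-regularized large network usually matches or beats the small one; the main cost is computation, not accuracy.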
Neural Network Diagnostic Guideline
void neuralnetwork() {
if(Does it do well on the training set?) {
if(Does it do well on the CV set?)
//Done
return;
}
else {
//More data
neuralnetwork();
}
}
else {
//need bigger network
neuralnetwork();
}
}